This commit adds a test where we trigger a situation which would
previously make the link think it was never in sync, and potentially
create a lot of empty state updates. This would happen if we were
waiting for a revocation, while still receiving updates from the remote.
Since in this case we could not ACK the updates because of the exhausted
revocation window, our local commitchain would extend, while the remote
chain would stall. When we finally got the revocation the local
commitment height would be far larger than the remote, and FullySynced
would return false from that point on.
This commit adds a new test that makes sure we don't try to send
commitments for states where there are now new updates. Before the
recent change to FullySynced we would do this in this test scenario, as
the local an remote commitment heights would differ.
The test makes the local commitment chain extend by 1 vs the remote,
which would earlier trigger another state update, and checks taht no
such state update is made.
This commit removes a faulty check we did to determine if the channel
commitments were fully synced. We assumed that if out local commitment
chain had a height higher than the remote, then we would have state
updates present in our chain but not in theirs, and owed a commitment.
However, there were cases where this wasn't true, and we would send a
new commitment even though we had no new updates to sign. This is a
protocol violation.
Now we don't longer check the heights to determine if we are fully
synced. A consequence of this is that we also need to check if we have
any pending fee updates that are nopt yet signed, as those are
considered non-empty updates.
This commit alters the neutrino chainview such that it
caches the filter entries corresponding to watched
outpoints at the moment they are added to the filter.
Previously, we would rederive each filter entry when
reconstructing the relevant filter entries, which
would lead to unnecessary work on the gc. Now, each is
created at most once, and reused across subsequent
reconstructions.
This commit makes the call to forwardBatch after locking
in Adds synchronous. This ensures that circuits for any Add
packets are added to the switch in the same order that they
are prescribed in the channel state. Though it is very unlikely
this case would arise, it may happen under more greater loads.
In addition, this also makes some trivial optimizations wrt. to
not spawning unnecessary goroutines if no settle/fail packets
are locked in.
This commit adds a simple scheduling mechanism for
resolving potential deadlocks when dropping a stale
connection (via pubkey inspection).
Ideally, we'd like to wait to activate a new peer until
the previous one has exited entirely. However, the current
logic attempts to disconnect (and wait) until the peer
has been cleaned up fully, which can result in
deadlocks with other portions of the codebase, since
other blocking methods may also need acquire the mutex
before the peer can exit.
When existing connections are replaced, they now
schedule a callback that is executed inside the
peerTerminationWatcher. Since the peer now waits for
the clean exit of the prior peer, this callback is
now executed with a clean slate, adds the peer to
the server's maps, and initiates peer's Start() method.
This skips creating errChans when sending messages to
peer during broadcast. This should be a minor memory
optimization, as well as not requiring channel sends
on those which will never be read.
This commit attempts to resolve some potential deadlock
scenarios during a peer disconnect.
Currently, writeMessage returns a nil error when disconnecting.
This should have minimal impact on the writeHanlder, as the
subsequent loop selects on the quit chan, and will cause it to
exit. However, if this happens when sending the init message,
the Start() method will attempt to proceed even though the peer
has been disconnected.
In addition, this commit changes the behavior of synchronous
write errors, by using a non-blocking select. Though unlikely,
this prevents any cases where multiple errors are returned, and
the errors are not being pulled from the other side of the errChan.
This removes any naked sends on the errChan from stalling the peer's
shutdown.
Adds a new error ErrVBarrierShuttingDown that is returned
from WaitForDependants if the validation barrier's quit
chan is closed. This allows any blocked goroutines to
distinguish whether the dependent task has been completed,
or if validation should be aborted entirely.
This commit improves the shutdown of the router's
pending validation tasks, by ensuring the pending
tasks exit early if the validation barrier
receives a shutdown request.
Currently, any goroutines blocked by WaitForDependants
will continue execution after a shutdown is signaled.
This may lead to unnexpected behavior as the relation
between updates is no longer upheld. It also has the
side effect of slowing down shutdown, since we
continue to process the remaining updates.
To remedy this, WaitForDependants now returns an error
that signals if a shutdown was requested. The blocked
goroutines can exit early upon seeing this error,
without also signaling completion of their task to
the dependent tasks, which should will now properly
wait to read the validation barrier's quit signal.
In this commit, we ensure that any time we send a TempChannelFailure
that's destined for a multi-hop source sender, then we'll always package
the latest channel update along with it.
This commit make us return an error in case a restored HTLC from a
pending remote commit has an index that is different from our local
update log index. It is appended with the assumption that these indexes
are the same, and if they are not we cannot really continue.
This commit adds a call to panic in case the HTLC we are looking for is
not found in the update log. It _should_ always be there, but we have
seen crashes resulting from it not being found. Since it will crash with
a nil pointer dereference below, we instead call panic giving us a bit
more information to work with.
In this commit, we modify the NewUnilateralCloseSummary to be able to
distinguish between a unilateral closure using the lowest+highest
commitment the remote party possesses. Before this commit, if the remote
party broadcast their highest commitment, when they have a lower
unrevoked commitment, then this function would fail to find the proper
output, leaving funds on the chain.
To fix this, it's now the duty of the caller to pass remotePendingCommit
with the proper value. The caller should use the lowest unrevoked
commitment, and the height hint of the broadcast commitment to discern
if this is a pending commitment or not.
In this commit, we move a set of useful functions for testing channels
into a new file. The old createTestChannels has been improved as it will
now properly set the height hint on the first created commitments, and
also no longer accepts any arguments as the revocation window no longer
exists.
In this commit, we extend the CloseChannelSummary by also storing: the
current unrevoked revocation for the remote party, the next pending
unused revocation, and also the local channel config. We move to store
these as the provide an extra level of defense against bugs as we'll
always store information required to derive keys for any current and
prior states.
In this commit, we fix an existing flake within the set of revocation
integration tests. Right after Bob's restart, we attempt to force close
the channel. However, it may be the case that the chain arbitrator
hasn't yet been created. As a result, the request to force close the
channel will fail. We easily fix this by wrapping the force close
attempt in a WaitPredicate.
This commit is similar to a recent commit which attempts to account for
internal block races by mining a second block if the initial assertion
for HTLC state fails. This can happen again if by the time that the
sweeping transaction is broadcast, it doesn't make it into the next
block mine.
In this commit, we modify the
testMultHopRemoteForceCloseOnChainHtlcTimeout test slightly to attempt
to account for a block race between the arrival of a message betwen the
contract resolver and the utxo nursery. If this message arrives "late"
(relative to the speed with which we mine blocks in test), then it'll be
detected as such by the utxo nursery. However, since we attempt to mine
a precise number of blocks, if this happens, then we'll never actually
mine that extra block to trigger a broadcast of the sweep transaction.
In this commit, we fix a race in the set of TestChannelLinkTrimCircuits*
tests. Before this commit, we would trim the circuits in the htlcManager
goroutine. However, this was problematic as the scheduling order of
goroutines isn't predictable. Instead, we'll now trim the circuits in
the Start method.
Additionally, we fix a series of off-by-2 bugs in the tests themselves.
In this commit, we fix a bug that could at times cause a deadlock when a
peer is attempting to disconnect. The issue was that when a peer goes to
disconnect, it needs to stop any active msgStream instances. The Stop()
method of the msgStream would block until an atomic variable was set to
indicate that the stream had fully exited. However, in the case that we
disconnected lower in the msgConsumer loop, we would never set the
streamShutdown variable, meaning that msgStream.Stop() would never
unblock.
The fix for this is simple: set the streamShutdown variable within the
quit case of the second select statement in the msgConsumer goroutine.