In this commit, we fix a recently introduced bug. The issue is that
while we're failing the link, the peer we're attempting to force close
on may disconnect. If the peerTerminationWatcher, which waits on the
peer's wait group, exits before we can add to that wait group, then
we'll run into a panic, as we'd be incrementing the wait group while
another goroutine is calling wait.
The fix is to first check that the server isn't shutting down, and then
use the server's wait group rather than the peer to synchronize
goroutines.
Fixes #1285.
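The pattern described above can be sketched roughly as follows. This is
a minimal illustration, not the actual patch: the server type, its
fields, and the launch helper are hypothetical names. The mutex is an
added assumption that makes the shutdown check and the wait group
increment a single step.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical stand-in for the real server type; only the
// pieces needed to show the synchronization pattern are modeled.
type server struct {
	mu       sync.Mutex
	stopping bool
	wg       sync.WaitGroup
}

// launch runs fn in a goroutine tracked by the server's wait group. It
// returns false without starting anything if the server is shutting
// down. Holding the mutex across the check and wg.Add ensures Add is
// never called after Stop has begun waiting.
func (s *server) launch(fn func()) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.stopping {
		return false
	}

	s.wg.Add(1)
	go func() {
		defer s.wg.Done()
		fn()
	}()
	return true
}

// Stop marks the server as shutting down and waits for every goroutine
// started through launch to exit.
func (s *server) Stop() {
	s.mu.Lock()
	s.stopping = true
	s.mu.Unlock()

	s.wg.Wait()
}

func main() {
	s := &server{}

	started := s.launch(func() { fmt.Println("force closing channel") })
	s.Stop()

	// Once Stop has been called, new goroutines are rejected.
	fmt.Println(started, s.launch(func() {}))
}
```

Tying the goroutine to the longer-lived server rather than the peer
means a peer disconnect can no longer race with the increment.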
In this commit, we add a new index to the HTLC log. This new index is
meant to ensure that we don't attempt to modify an HTLC twice. An HTLC
modification is either a fail or a settle. This is the first in a series
of commits to fix an existing bug in the state machine that can cause a
panic if a remote node attempts to settle an HTLC twice.
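One way such an index might look is sketched below. The updateLog type,
its fields, and the method names are hypothetical and heavily
simplified; the real log stores far more state per HTLC.

```go
package main

import (
	"errors"
	"fmt"
)

// errHtlcAlreadyModified is a hypothetical error returned when a second
// settle or fail is attempted for the same HTLC.
var errHtlcAlreadyModified = errors.New("HTLC already settled or failed")

// updateLog is a hypothetical, stripped-down HTLC log.
type updateLog struct {
	// htlcs is the set of HTLC indexes currently in the log.
	htlcs map[uint64]struct{}

	// modifiedHtlcs is the new index: the set of HTLC indexes that
	// already have a settle or fail recorded against them.
	modifiedHtlcs map[uint64]struct{}
}

func newUpdateLog() *updateLog {
	return &updateLog{
		htlcs:         make(map[uint64]struct{}),
		modifiedHtlcs: make(map[uint64]struct{}),
	}
}

// addHtlc registers a new HTLC in the log.
func (u *updateLog) addHtlc(index uint64) {
	u.htlcs[index] = struct{}{}
}

// modifyHtlc records a settle or fail for the given HTLC. It returns an
// error instead of panicking if the HTLC is unknown or was already
// modified.
func (u *updateLog) modifyHtlc(index uint64) error {
	if _, ok := u.htlcs[index]; !ok {
		return fmt.Errorf("unknown HTLC index %d", index)
	}
	if _, ok := u.modifiedHtlcs[index]; ok {
		return errHtlcAlreadyModified
	}

	u.modifiedHtlcs[index] = struct{}{}
	return nil
}

func main() {
	l := newUpdateLog()
	l.addHtlc(0)

	fmt.Println(l.modifyHtlc(0)) // first settle is accepted
	fmt.Println(l.modifyHtlc(0)) // second settle is rejected
}
```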
Before the previous commit, we assumed the HTLC's timeout transaction
would be the only transaction in the mempool. In reality, after mining
some blocks for the HTLC to expire and waiting for the timeout
transaction to arrive in the mempool, we would at times detect the
funding output's sweeping transaction instead and proceed with the test
under this incorrect assumption. We then had to mine extra blocks to
include the HTLC sweeping transaction. This has been resolved in the
previous commit, so this fix is no longer needed.
This reverts commit e54f1ea4dbe59b2e53a94774995ae1711746c2f8.
In this commit, we address an existing flake that would be triggered
when testing HTLC timeouts. After force closing a channel and generating
enough blocks to expire an HTLC, we would wait for a transaction to
arrive in the mempool and assume it was the timeout transaction. At
times, we'd instead detect the funding output sweep transaction and
proceed with the test under the incorrect assumption that the timeout
transaction had been broadcast.
This commit adds an integration test that checks that in case a channel
counterparty tries to settle an HTLC with the wrong preimage, the
channel is failed and force closed.
This commit makes the peer aware of the LinkFailureErrors that can
happen during link operation, and makes it start a goroutine to
properly remove the link and force close the channel.
This commit adds a new closure, OnChannelFailure, to the link config,
which is called when the link fails. The closure should use the given
LinkFailureError to properly force close the channel, send an error to
the peer, and disconnect the peer.
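A rough sketch of how such a closure could be wired up is shown below.
Only the names OnChannelFailure and LinkFailureError come from the
commit message; the struct fields, the link type, the fail method, and
newFailureHandler are assumptions made for illustration, and the
handler only records the actions it would take.

```go
package main

import "fmt"

// LinkFailureError here is a reduced, hypothetical version carrying
// just enough information for the handler to decide what to do.
type LinkFailureError struct {
	ForceClose bool   // whether the channel should be force closed
	SendData   []byte // optional error payload to send to the peer
}

// linkConfig is a hypothetical slice of the link configuration.
type linkConfig struct {
	// OnChannelFailure is invoked when the link fails.
	OnChannelFailure func(chanID uint64, failure LinkFailureError)
}

// link is a hypothetical channel link.
type link struct {
	cfg    linkConfig
	chanID uint64
	failed bool
}

// fail marks the link as failed and invokes the callback at most once.
func (l *link) fail(failure LinkFailureError) {
	if l.failed {
		return
	}
	l.failed = true

	if l.cfg.OnChannelFailure != nil {
		l.cfg.OnChannelFailure(l.chanID, failure)
	}
}

// newFailureHandler returns a closure that appends the steps it would
// perform to actions, in the order described in the commit message.
func newFailureHandler(actions *[]string) func(uint64, LinkFailureError) {
	return func(chanID uint64, failure LinkFailureError) {
		if failure.ForceClose {
			*actions = append(*actions, "force close channel")
		}
		if len(failure.SendData) > 0 {
			*actions = append(*actions, "send error to peer")
		}
		*actions = append(*actions, "disconnect peer")
	}
}

func main() {
	var actions []string
	l := &link{
		chanID: 1,
		cfg:    linkConfig{OnChannelFailure: newFailureHandler(&actions)},
	}

	l.fail(LinkFailureError{
		ForceClose: true,
		SendData:   []byte("invalid update"),
	})
	fmt.Println(actions)
}
```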
This commit introduces a new error type LinkFailureError which is used
to distinguish the different kinds of errors that we can encounter
during link operation. It encapsulates the information necessary to
decide how we should handle the error.
In this commit, we rewrite the node announcement integration test to no
longer depend on a sleep interval. Instead, we use graph topology
updates in order to be notified exactly when we receive the node
announcement.
In this commit, we fix a race condition in which we open a channel
between two parties and immediately try to send payments over it. At
times this would fail because the channel link was not yet fully
registered in the HTLC switch.
The pending state definition in ChannelCloseSummary was slightly changed
so that channels whose commitment has been broadcast are no longer
considered "pending close". They now instead stay in the open chan
bucket with the ChanStatus "CommitmentBroadcasted" until their
commitment is confirmed. This commit updates the IsPending godoc to
reflect this.
In this commit, we fix an existing source of a panic that could at
times lead to a deadlock. If the circuit returned from closeCircuit
didn't have an outgoing key (as it was an incomplete forward), then we
would attempt to dereference a nil pointer. This would trigger a panic,
and the runtime would start to unwind the stack, executing each defer in
turn. A deadlock can arise here because the defer at the root of the
goroutine needs to grab the fwdingEventMtx, which we already hold at the
panic site.
We fix this issue by ensuring we only attempt to add the event if it's a
_settle_ and also actually has an outgoing circuit (which it should
already, just a defensive check).
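The defensive check could be sketched along these lines. The types and
the forwardingEventFor helper are hypothetical simplifications; the
point is only that the outgoing key is tested for nil before it is
dereferenced.

```go
package main

import "fmt"

// circuitKey is a hypothetical identifier for one side of a circuit.
type circuitKey struct {
	ChanID uint64
	HtlcID uint64
}

// paymentCircuit is a hypothetical, reduced circuit. Outgoing stays nil
// for an incomplete forward.
type paymentCircuit struct {
	Incoming circuitKey
	Outgoing *circuitKey
}

// forwardingEvent is a hypothetical record of a completed forward.
type forwardingEvent struct {
	IncomingChanID uint64
	OutgoingChanID uint64
}

// forwardingEventFor returns an event only for a settle on a circuit
// that actually has an outgoing key, so a nil pointer is never
// dereferenced.
func forwardingEventFor(c *paymentCircuit,
	isSettle bool) (forwardingEvent, bool) {

	if !isSettle || c == nil || c.Outgoing == nil {
		return forwardingEvent{}, false
	}

	return forwardingEvent{
		IncomingChanID: c.Incoming.ChanID,
		OutgoingChanID: c.Outgoing.ChanID,
	}, true
}

func main() {
	complete := &paymentCircuit{
		Incoming: circuitKey{ChanID: 1},
		Outgoing: &circuitKey{ChanID: 2},
	}
	incomplete := &paymentCircuit{Incoming: circuitKey{ChanID: 1}}

	fmt.Println(forwardingEventFor(complete, true))
	fmt.Println(forwardingEventFor(incomplete, true))
	fmt.Println(forwardingEventFor(complete, false))
}
```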