In this commit, we fix an existing bug that could cause lnd to crash if
we sent a payment, and the *destination* sent a temp channel failure
error message. When handling such a message, we’ll look in the nextHop
map to see which channel was *after* the node that sent the payment.
However, if the destination sends this error, then there’ll be no entry
in this map.
To address this case, we now add a prevHop map. If we attempt to lookup
a node in the nextHop map, and they don’t have an entry, then we’ll
consult the prevHop map.
We also update the set of tests to ensure that we’re properly setting
both the prevHop map and the nextHop map.
This commit adds an additional check in GetUtxo that
tests for the nil-ness of the spend report returned by
the neutrino backend. Previously, a nil error and
spend report could be returned if the rescan did not
find the output at or above the start height. This
was observed to have cause a nil pointer dereference
when the returning line attempted to access the output.
This case is now handled by returning a distinct error
signaling that the output was not found.
In this commit we fix a newly introduce bug wherein we would close the
transaction subscription twice on shutdown. This would lead to a
shutdown, but an unclean one as it would panic due to closing a channel
twice.
We fix this my removing a defer statement such that, we’ll only cancel
the subscription once.
This commit adds synchronization around the processing
of multiple ChannelEdgePolicy updates for the same
channel ID at the same time.
This fixes a bug that could cause the database access
HasChannelEdge to be out of date when the goroutine
came to the point where it was calling UpdateEdgePolicy.
This happened because a second goroutine would have
called UpdateEdgePolicy in the meantime.
This bug was quite benign, as if this happened at
runtime, we would eventually get the ChannelEdgePolicy
we had lost again, either from a peer sending it to
us, or if we would fail a payment since we were using
outdated information. However, it would cause some of
the tests to flake, since losing routing information
made payments we expected to go through fail if this
happened.
This is fixed by introducing a new mutex type, that
when locking and unlocking takes an additional
(id uint64) parameter, keeping an internal map
tracking what ID's are currently locked and the
count of goroutines waiting for the mutex. This
ensure we can still process updates concurrently,
only avoiding updates with the same channel ID from
being run concurrently.
This commit makes the gossiper aware of the timestamps
of ChannelUpdates and NodeAnnouncements, such that it
only keeps the newest message when deduping. Earlier
it would always keep the last received message, which
in some cases could be outdated.
This commit adds a line of text including a test case's
name to Alice's and Bob's logfiles during integration
tests, making it easier to seek for the place in the log
where the specific tests start.
This commit fixes a nasty bug that has been lingering within lnd, and
has been noticed due to the added retransmission logic. Before this
commit, upon a restart, if we had an active HTLC and received a new
commitment update, then we would re-forward ALL active HTLC’s. This
could at times lead to a nasty cycle:
* We re-forward an HTLC already processed.
* We then notice that the time-lock is out of date (retransmitted
HTLC), so we go to fail it.
* This is detected as a replay attack, so we send an
UpdateMalformedHTLC
* This second failure ends up creating a nil entry in the log,
leading to a panic.
* Remote party disconnects.
* Upon reconnect we send again as we need to retransmit the changes,
this goes on forever.
In order to fix this, we now ensure that we only forward HTLC’s that
have been newly locked in at this next state. With this, we now avoid
the loop described above, and also ensure that we don’t accidentally
attempt an HTLC replay attack on our selves.
Fixes#528.
Fixes#545.
In this commit, we modify the pruning semantics of the missionControl
struct. Before this commit, on each payment attempt, we would fetch a
new graph pruned view each time. This served to instantly propagate any
detected failures to all outstanding payment attempts. However, this
meant that we could at times get stuck in a retry loop if sends take a
few second, then we may prune an edge, try another, then the original
edge is now unpruned.
To remedy this, we now introduce the concept of a paymentSession. The
session will start out as a snapshot of the latest graph prune view.
Any payment failures are now reported directly to the paymentSession
rather than missionControl. The rationale for this is that
edges/vertexes pruned as result of failures will never decay for a
local payment session, only for the global prune view. With this in
place, we ensure that our set of prune view only grows for a session.
Fixes#536.
Before this commit, if the htlcManager unexpectedly exited (due to a
protocol error, etc), the underlying block epoch notification intent
that was created for it would never be cancelled. This would result in
tens, or hundreds of goroutine leaks as the client would never consume
those notifications.
To fix this, we move cancellation of the block epoch intent from the
Stop() method of the channel link, to the defer statement at the top of
the htlcManager.
In this commit, we add an additional case when handling a failed
commitment signature. If we detect that it’s a InvalidCommitSigError,
then we’ll send over an lnwire.Error message with the full details. We
don’t yet properly dispatch this error on the reciting side, but that
will be done in a follow up a commit.
In this commit, we add a new detailed error that’s to be returned
when/if the remote peer sends us an invalid commit signature. The new
error contains the transaction that we attempted to validate the
signature over, the sighs, and the state number. Returning this
additional information will serve to aide in debugging any
cross-implementation issues.
In this commit, we modify the logic within the Stop() method for
msgStream to ensure that the main goroutine properly exits. It has been
observed on running nodes with tens of connections, that if a node is
very flappy, then the node can end up with hundreds of leaked
goroutines.
In order to fix this, we’ll continually signal the msgConsumer to wake
up after the quit channel has been closed. We do this until the
msgConsumer sets a bool indicating that it has exited atomically.
In this commit, we fix a lingering protocol level bug when reporting
errors encountered during onion blob processing. The spec states that
if one sends an UpdateFailMalformedHtlc, then the error reason MUST
have the BadOnion bit set. Before this commit, we would return
CodeTemporaryChannelFailure. This is incorrect as this doesn’t have the
BadOnio bit set.
Prior to this commit, the final close summary we added to the database
for the initiator of the channel was incorrect. This is due to the fact
that before, we would use the final snapshot to determine how many
coins the local party was delivered as a result of the cooperative
closure transaction. This is incorrect, as the local party pays fees on
the closure transaction if it’s the initiator.
To remedy this, we’ll now use the new return value of
CompleteCooperativeClose to properly note our final balance in the
database.
In this commit, add an additional return value to
CompleteCooperativeClose. We’ll now report to the caller our final
balance in the cooperative closure transaction. We report this as
depending on if we’re the initiator or not, our final balance may not
exactly match the balance we had in the last state.
In this commit, we remove the blocks_till_open from
PendingChannelsResponse as in its current state, the values that are
assigned to this field don’t accurately reflect the naming. This has
caused a good bit of confusion amongst users lately. As a result, we’re
temporarily removing this field until we have proper incremental
notifications within the chain notifier.
In this commit, we add a new runtime assertion to ensure that the
backed btcd node (if this mode is active) has the proper indexes set
up. Atm, if btcd isn’t running with the txindex active, then the
current ChainNotifier implementation will be unable to properly handle
certain classes of historical notification dispatches.
In order to test that the running btcd node is configured properly,
we’ll fetch the latest block, then try to query a transaction within
that block using the txindex. If btcd isn’t running with this mode
active, then the request will fail. In this case, we’ll then fail to
start lnd with an error.
Fixes#525.
This commit adds a new test, that in a small network
of 4 nodes, tests that a private channel can be used
for routing payments by the endpoints of the channel,
while the existence of the channel is not known to
the rest of the network.
Peers are treated as transient by default. When a peer is disconnected,
no attempt is made to reconnect. However, if we have a channel open
with a peer that peer will be added as persistent. If a persistent peer
becomes disconnected then we will attempt to reconnect.
This behavior implies that a fresh node - those without any channels -
will fall off the network over time as remote nodes restart or due to
connectivity changes. This change marks bootstrap peers as persistent
and ensures that the node remains connected to those specific peers over
time. This does not keep the node connected in the case that all
bootstrap peers are down.
Fixes#451.