This commit adds a test ensuring that the fix applied in the previous
commit works as expected. The test exercises the scenario where the
HTLCs on the local, remote and pending remote commitment differ, and we
attempt to restore the update logs. We now check that in this case the
logs before and after restart are equivalent.
remoteUpdateLog from localCommit
This commit fixes a bug within channel.go that would lead to the
content of the update logs and their indexes getting out of sync during
restores.
The scenario that could occur was that the localUpdateLog was initiated
with a log index taken from the localCommitment. Updates we send (which
are added to the localUpdateLog) will be added to the remote commitment
first. The problem happened when an update was sent and added to the
remote commitment, but not ACKed. Since it was not ACKed, we would not
add it to our local commitment. During a restart/restore we would init
the localUpdateLog with a height too low, such that when going through
the outgoing HTLCs on the remote commitment, we would restore an HTLC at
an index higher than our local log HTLC counter.
The symmetric change is done to the remoteUpdateLog.
In this commit, we introduce a new method to the channel router's config
struct: QueryBandwidth. This method allows the channel router to query
for the up-to-date available bandwidth of a particular link. In the case
that this link emanates from/to us, then we can query the switch to see
if the link is active (if not bandwidth is zero), and return the current
best estimate for the available bandwidth of the link. If the link,
isn't one of ours, then we can thread through the total maximal
capacity of the link.
In order to implement this, the missionControl struct will now query the
switch upon creation to obtain a fresh bandwidth snapshot. We take care
to do this in a distinct db transaction in order to now introduced a
circular waiting condition between the mutexes in bolt, and the channel
state machine.
The aim of this change is to reduce the number of unnecessary failures
during HTLC payment routing as we'll now skip any links that are
inactive, or just don't have enough bandwidth for the payment. Nodes
that have several hundred channels (all of which in various states of
activity and available bandwidth) should see a nice gain from this w.r.t
payment latency.
In this commit, we fix an issue where sometimes transactions wouldn't
provide enough of a fee to be relayed by the backend node. This would
especially cause issues when sweeping outputs after a contract breach,
etc.
Now, we'll fetch the minimum relay fee from the backend node and ensure
it is set as a lower bound if the estimated fee ends up dipping below
it.
This commit adds a test for the case where the ChannelArbitrator fails
to broadcast its commitment during a force close because of
ErrDoubleSpend. We test that in this case it will still wait for a
commitment getting confirmed in-chain, then resolve.
In this commit, we address a timeout issue on our Travis builds within
the update channel policy integration test. Before asserting the policy
for the channel between Alice and Bob, we need to make sure we receive
the updates responsible for modifying this policy first. At times, the
update wasn't received in time before checking the policy, which led to
us checking the old policy. We fix this by explicitly making sure we
wait for the updates first.
This commit attempts to fix a flake we have seen, where a routing hints
for an inactive channel was unexpectedly added to the invoice. This
happened because of a race between shutting down the endpoint of this
channel (Eve) and creating the invoice. This commit fixes this by moving
the call to AddInvoice with corresponding route hint check inside a
WaitPredicate.
This commit makes sure old log files are deleted before running the
integration tests. Previously new logs would be added to the existing
ones, causing a big mess. This was especially messy during flake
hunting, as all logs would be kept.
In addition, we now print the date before every start of the integration
tests, which is useful to see how long they have been running, for
instance in case of a deadlock.
This commit adds a test where we trigger a situation which would
previously make the link think it was never in sync, and potentially
create a lot of empty state updates. This would happen if we were
waiting for a revocation, while still receiving updates from the remote.
Since in this case we could not ACK the updates because of the exhausted
revocation window, our local commitchain would extend, while the remote
chain would stall. When we finally got the revocation the local
commitment height would be far larger than the remote, and FullySynced
would return false from that point on.
This commit adds a new test that makes sure we don't try to send
commitments for states where there are now new updates. Before the
recent change to FullySynced we would do this in this test scenario, as
the local an remote commitment heights would differ.
The test makes the local commitment chain extend by 1 vs the remote,
which would earlier trigger another state update, and checks taht no
such state update is made.
This commit removes a faulty check we did to determine if the channel
commitments were fully synced. We assumed that if out local commitment
chain had a height higher than the remote, then we would have state
updates present in our chain but not in theirs, and owed a commitment.
However, there were cases where this wasn't true, and we would send a
new commitment even though we had no new updates to sign. This is a
protocol violation.
Now we don't longer check the heights to determine if we are fully
synced. A consequence of this is that we also need to check if we have
any pending fee updates that are nopt yet signed, as those are
considered non-empty updates.
This commit alters the neutrino chainview such that it
caches the filter entries corresponding to watched
outpoints at the moment they are added to the filter.
Previously, we would rederive each filter entry when
reconstructing the relevant filter entries, which
would lead to unnecessary work on the gc. Now, each is
created at most once, and reused across subsequent
reconstructions.
This commit makes the call to forwardBatch after locking
in Adds synchronous. This ensures that circuits for any Add
packets are added to the switch in the same order that they
are prescribed in the channel state. Though it is very unlikely
this case would arise, it may happen under more greater loads.
In addition, this also makes some trivial optimizations wrt. to
not spawning unnecessary goroutines if no settle/fail packets
are locked in.
This commit adds a simple scheduling mechanism for
resolving potential deadlocks when dropping a stale
connection (via pubkey inspection).
Ideally, we'd like to wait to activate a new peer until
the previous one has exited entirely. However, the current
logic attempts to disconnect (and wait) until the peer
has been cleaned up fully, which can result in
deadlocks with other portions of the codebase, since
other blocking methods may also need acquire the mutex
before the peer can exit.
When existing connections are replaced, they now
schedule a callback that is executed inside the
peerTerminationWatcher. Since the peer now waits for
the clean exit of the prior peer, this callback is
now executed with a clean slate, adds the peer to
the server's maps, and initiates peer's Start() method.