In this commit, we add an additional degree of isolation to the set of
integration tests. A bug was recently fixed to ensure that the wallet
always starts rescans from _after_ it's birthday. In the past it would
miss some funds that were deposited _right_ before the birthday of the
wallet. Fixing this bug exposed a test flake wherein the btcd node would
itself rescan back and collect some of the funds that were last sent to
the bitcoind node.
In order to fix this, we now ensure that each backend will use a unique
HD seed such that the tests are still deterministic for each backend and
role.
In this commit, we ensure that we de-duplicate the set of channel edges
returned from ChanUpdatesInHorizon. Other subsystems within lnd use this
method to retrieve and send all the channels with updates within a time
series to network peers. However, since the method looks at the edge
update index, which can include up to two entries per edge for each
policy, it's possible that we'd send channel announcements and updates
twice, causing extra bandwidth.
timestamps
In this commit, we ensure policies for edges we create in
TestChanUpdatesInHorizon have different update timestamps. This ensures
that there are two entries per edge in the edge update index. Because of
this, the test will fail because ChanUpdatesInHorizon will return
duplicate channel edges due to looking at all the entries within the
edge update index. This will be addressed in a future commit to allow
the set of tests to pass once again.
In this commit, we introduce a migration to fix some of the recent
issues found w.r.t. the edge update index. The migration attempts to fix
two things:
1) Edge policies include an extra byte at the end due to reading an
extra byte for the node's public key from the serialized node info.
2) Properly prune all stale entries within the edge update index.
As a result of this migration, nodes will have a slightly smaller in
size channeldb. We will also no longer send stale edges to our peers in
response to their gossip queries, which should also fix the fetching
channel announcement for closed channels issue.
In this commit, we extend TestChannelEdgePruningUpdateIndexDeletion test
to include one more update for each edge. By doing this, we can
correctly determine whether old entries were properly pruned from the
index once a new update has arrived.
Due to entries within the edge update index having a nil value, the
tests need to be modified to account for this. Previously, we'd assume
that if we were unable to retrieve a value for a certain key that the
entry was non-existent, which is why the improper pruning bug was not
caught. Instead, we'll assert the number of entries to be the expected
value and populate a lookup map to determine whether the correct entries
exist within it.
In this commit, we fix a lingering issue within the edge update index
where entries were not being properly pruned due to an incorrect
calculation of the offset of an edge's last update time. Since the
offset is being determined from the end to the start, we need to
subtract all the fields after an edge policy's last update time from the
total amount of bytes of the serialized edge policy to determine the
correct offset. This was also slightly off as the edge policy included
an extra byte, which has been fixed in the previous commit.
Instead of continuing the slicing approach however, we'll switch to
deserializing the raw bytes of an edge's policy to ensure this doesn't
happen in the future when/if the serialization methods change or extra
data is included.
In this commit, we fix an off-by-one error when slicing the public key
from the serialized node info byte slice. This would cause us to write
an extra byte to all edge policies. Even though the values were read
correctly, when attempting to calculate the offset of an edge's update
time going backwards, we'd always be incorrect, causing us to not
properly prune the edge update index.
This commit removes the fallback in fetchGossipSyncer
that creates a gossip syncer if one is not registered
w/in the gossiper. Now that we register gossip syncers
explicitly before reading any gossip query messages,
this should not longer be required. The fallback also
did not honor the cfg.NoChanUpdates flag, which may
have led to inconsistencies between configuration and
actual behavior.
This commit moves the gossip sync dispatch
such that it is more tightly coupled to the
life cycle of the peer. In testing, I noticed
that the gossip syncer needs to be dispatched
before the first gossip messages come across
the wire.
The prior spawn location in the server happens
after starting all of the peer's goroutines,
which could permit an ordering where the
gossip syncer has not yet been registered.
The new location registers the gossip syncer
within the read handler such that the call is
blocks before any messages are read.
In this commit, we implement an optimization to the autopilot agent to
ensure that we don't spin and waste CPU when we either have a large
graph, or a high max channel target for the agent. Before this commit,
each time we went to read the state of a channel from disk, we would
decompress the EC Point each time. However, for the case of the instal
ChannlEdge struct to feed to the agent, we only actually need to obtain
the pubkey, and can save the potentially expensive point decompression
for each directional channel in the graph.
In this commit, we defer creating the base lnd directory until all flag
parsing is done. We do this as it's possible that the config file
specifies a lnddir, but it isn't actually used as the directory has
already been created.
In this commit, we remove signaling for initial routing
dumps, which create unnecessary log spam, bandwidth, and
CPU. Now that gossip syncing is in full force, we will
instead opt to use the more efficient querying/set
reconciliation. Other nodes may still request initial
gossip sync from us, and we will respond.
This commit modifies the connection peer backoff
logic such that it will always backoff for "unstable"
peers. Unstable in this context is determined by
connections whose duration is shorter than 10
minutes. If a disconnect happens with a peer
whose connection lasts longer than 10 minutes,
we will scale back our stored backoff for that peer.
This resolves an issue that would result in a tight
connection loop with remote peers. This stemmed
from the connection duration being very short,
and always driving the backoff to the default
backoff of 1 second. Short connections like
this are now caught by the stable connection
threshold.
This also modifies the computation on the
backoff relaxation to subtract the connection
duration after applying randomized exponential
backoff, which offers better stability when
the connection duration and backoff are roughly
equal.
We do this to avoid a huge amount of goroutines piling up when autopilot
is trying to open many channels, as they will all block trying to send
the update on the stateUpdates channel. Now we instead send them on a
buffered channel, similar to what is done with the nodeUpdates.
We do this to avoid a huge amount of goroutines piling up on initial
graph sync, as they will all block trying to send the node update on the
stateUpdates channel. Now we instead make a new buffered channel
nodeUpdates, and just return immediately if there is already a signal in
the channel waiting to be processed.
In this commit, we update to the latest versions of btcwallet+neutrino
that fix a number of bugs within lnd itself. Namely, we ensure that we
no longer print out garbage bytes, properly reconnect btcd after being
disconnected, ensure we don't add duplicate utxos, and finally ensure
that we always start the rescan from the wallet's initial birthday.
Fixes#1775.
Fixes#1494.
Fixes#444.
In this commit, we add a compatibility mode for older version of
clightning to ensure that we're able to properly parse all their channel
updates. An older version of c-lightning would send out encapsulated
onion error message with an additional type byte. This would throw off
our parsing as we didn't expect the type byte, and so we always 2 bytes
off. In order to ensure that we're able to parse these messages and make
adjustments to our path finding, we'll first check to see if the type
byte is there, if so, then we'll snip off two bytes from the front and
continue with parsing. if the bytes aren't found, then we can proceed as
normal and parse the request.
This commit prevents an error that I've seen on travis,
wherein the test fails because a call to Fatal happens
after the test finishes. The root cause is that we call
Fatal in a goroutine that is reading from the subscribe
graph rpc call.
To fix this, we now pass an err chan back into the main
test context, where we can receive any errors and fail
the test if one comes through.
After joining the two forked chains, it is necessary to ensure they both agree on the same best hash before proceeding to UnsafeStart the notifier.
This is because when the BitcoindClient starts, it retrieves its best known block then calls GetBlockHeaderVerbose on the hash of the retrieved block. This block could be a reorged block if JoinNodes has not completed sync. If it is the case that the best block retrieved has been reorged out of the chain, GetBlockHeaderVerbose errors because bitcoind sets the number of confirmations to -1 on reorged blocks, and the btcd rpc client panics when parsing a block whose number of confirmations is negative.
This parsing error is expected to be fixed, and as a more permanent solution chain backends should ensure that the `best block` they retrieve during startup has not been reorged out of the chain.