This commit moves the gossip sync dispatch
such that it is more tightly coupled to the
life cycle of the peer. In testing, I noticed
that the gossip syncer needs to be dispatched
before the first gossip messages come across
the wire.
The prior spawn location in the server happens
after starting all of the peer's goroutines,
which could permit an ordering where the
gossip syncer has not yet been registered.
The new location registers the gossip syncer
within the read handler such that the call is
blocks before any messages are read.
In this commit, we implement an optimization to the autopilot agent to
ensure that we don't spin and waste CPU when we either have a large
graph, or a high max channel target for the agent. Before this commit,
each time we went to read the state of a channel from disk, we would
decompress the EC Point each time. However, for the case of the instal
ChannlEdge struct to feed to the agent, we only actually need to obtain
the pubkey, and can save the potentially expensive point decompression
for each directional channel in the graph.
In this commit, we defer creating the base lnd directory until all flag
parsing is done. We do this as it's possible that the config file
specifies a lnddir, but it isn't actually used as the directory has
already been created.
restransmitStaleChannels
In this commit, we add an additional error check for
ErrNoGraphEdgesFound when restransmitting stale channels during the
gossiper's startup. We do this to prevent benign log messages as we'll
log that we were unable to retransmit stale channels when we didn't have
any channels in our graph to begin with.
In this commit, we remove signaling for initial routing
dumps, which create unnecessary log spam, bandwidth, and
CPU. Now that gossip syncing is in full force, we will
instead opt to use the more efficient querying/set
reconciliation. Other nodes may still request initial
gossip sync from us, and we will respond.
This commit modifies the connection peer backoff
logic such that it will always backoff for "unstable"
peers. Unstable in this context is determined by
connections whose duration is shorter than 10
minutes. If a disconnect happens with a peer
whose connection lasts longer than 10 minutes,
we will scale back our stored backoff for that peer.
This resolves an issue that would result in a tight
connection loop with remote peers. This stemmed
from the connection duration being very short,
and always driving the backoff to the default
backoff of 1 second. Short connections like
this are now caught by the stable connection
threshold.
This also modifies the computation on the
backoff relaxation to subtract the connection
duration after applying randomized exponential
backoff, which offers better stability when
the connection duration and backoff are roughly
equal.
We do this to avoid a huge amount of goroutines piling up when autopilot
is trying to open many channels, as they will all block trying to send
the update on the stateUpdates channel. Now we instead send them on a
buffered channel, similar to what is done with the nodeUpdates.
We do this to avoid a huge amount of goroutines piling up on initial
graph sync, as they will all block trying to send the node update on the
stateUpdates channel. Now we instead make a new buffered channel
nodeUpdates, and just return immediately if there is already a signal in
the channel waiting to be processed.
In this commit, we update to the latest versions of btcwallet+neutrino
that fix a number of bugs within lnd itself. Namely, we ensure that we
no longer print out garbage bytes, properly reconnect btcd after being
disconnected, ensure we don't add duplicate utxos, and finally ensure
that we always start the rescan from the wallet's initial birthday.
Fixes#1775.
Fixes#1494.
Fixes#444.
In this commit, we add a compatibility mode for older version of
clightning to ensure that we're able to properly parse all their channel
updates. An older version of c-lightning would send out encapsulated
onion error message with an additional type byte. This would throw off
our parsing as we didn't expect the type byte, and so we always 2 bytes
off. In order to ensure that we're able to parse these messages and make
adjustments to our path finding, we'll first check to see if the type
byte is there, if so, then we'll snip off two bytes from the front and
continue with parsing. if the bytes aren't found, then we can proceed as
normal and parse the request.
This commit prevents an error that I've seen on travis,
wherein the test fails because a call to Fatal happens
after the test finishes. The root cause is that we call
Fatal in a goroutine that is reading from the subscribe
graph rpc call.
To fix this, we now pass an err chan back into the main
test context, where we can receive any errors and fail
the test if one comes through.
After joining the two forked chains, it is necessary to ensure they both agree on the same best hash before proceeding to UnsafeStart the notifier.
This is because when the BitcoindClient starts, it retrieves its best known block then calls GetBlockHeaderVerbose on the hash of the retrieved block. This block could be a reorged block if JoinNodes has not completed sync. If it is the case that the best block retrieved has been reorged out of the chain, GetBlockHeaderVerbose errors because bitcoind sets the number of confirmations to -1 on reorged blocks, and the btcd rpc client panics when parsing a block whose number of confirmations is negative.
This parsing error is expected to be fixed, and as a more permanent solution chain backends should ensure that the `best block` they retrieve during startup has not been reorged out of the chain.
In this commit, we modify the Node interface to return a set of raw
bytes, rather than the full pubkey struct. We do this as within the
package, commonly we only require the pubkey bytes for fingerprinting
purposes. Before this commit, we were forced to _always_ decompress the
pubkey which can be expensive done thousands of times a second.
In this commit, we remove the disconnection logic within the
chanController when failing to open a channel with a peer. We do this as
it's already done within the autopilot agent, where it should be, and
because it's possible that we were already connected to this node and we
happened to disconnect them anyway.
In this commit, we modify the balanceUpdate autopilot signal to update
the balance according to what's returned to the WalletBalance callback
rather than explicitly tracking the balance. This gives the agent a
better sense of what the wallet's balance actually is.
In this commit, we avoid logging an error when the links associated with
a peer are not found within its termination watcher. We do this to
prevent a benign log message as the links have already been removed from
the switch.
In this commit, we aim to resolve an issue with nodes requesting for
channel announcements when receiving a channel update for a channel
they're not aware of. This can happen if a node is not caught up with
the chain or if they receive updates for zombie channels. This would
lead to a spam issue, as if a node is not caught up with the chain,
every new update they receive is premature, causing them to manually
request the backing channel announcement. Ideally, we should be able to
detect this as a potential DoS vector and ban the node responsible, but
for now we'll simply remove this functionality.