In this commit, we implement an optimization to the autopilot agent to
ensure that we don't spin and waste CPU when we either have a large
graph, or a high max channel target for the agent. Before this commit,
each time we went to read the state of a channel from disk, we would
decompress the EC Point each time. However, for the case of the instal
ChannlEdge struct to feed to the agent, we only actually need to obtain
the pubkey, and can save the potentially expensive point decompression
for each directional channel in the graph.
In this commit, we defer creating the base lnd directory until all flag
parsing is done. We do this as it's possible that the config file
specifies a lnddir, but it isn't actually used as the directory has
already been created.
In this commit, we remove signaling for initial routing
dumps, which create unnecessary log spam, bandwidth, and
CPU. Now that gossip syncing is in full force, we will
instead opt to use the more efficient querying/set
reconciliation. Other nodes may still request initial
gossip sync from us, and we will respond.
This commit modifies the connection peer backoff
logic such that it will always backoff for "unstable"
peers. Unstable in this context is determined by
connections whose duration is shorter than 10
minutes. If a disconnect happens with a peer
whose connection lasts longer than 10 minutes,
we will scale back our stored backoff for that peer.
This resolves an issue that would result in a tight
connection loop with remote peers. This stemmed
from the connection duration being very short,
and always driving the backoff to the default
backoff of 1 second. Short connections like
this are now caught by the stable connection
threshold.
This also modifies the computation on the
backoff relaxation to subtract the connection
duration after applying randomized exponential
backoff, which offers better stability when
the connection duration and backoff are roughly
equal.
We do this to avoid a huge amount of goroutines piling up when autopilot
is trying to open many channels, as they will all block trying to send
the update on the stateUpdates channel. Now we instead send them on a
buffered channel, similar to what is done with the nodeUpdates.
We do this to avoid a huge amount of goroutines piling up on initial
graph sync, as they will all block trying to send the node update on the
stateUpdates channel. Now we instead make a new buffered channel
nodeUpdates, and just return immediately if there is already a signal in
the channel waiting to be processed.
In this commit, we update to the latest versions of btcwallet+neutrino
that fix a number of bugs within lnd itself. Namely, we ensure that we
no longer print out garbage bytes, properly reconnect btcd after being
disconnected, ensure we don't add duplicate utxos, and finally ensure
that we always start the rescan from the wallet's initial birthday.
Fixes#1775.
Fixes#1494.
Fixes#444.
In this commit, we add a compatibility mode for older version of
clightning to ensure that we're able to properly parse all their channel
updates. An older version of c-lightning would send out encapsulated
onion error message with an additional type byte. This would throw off
our parsing as we didn't expect the type byte, and so we always 2 bytes
off. In order to ensure that we're able to parse these messages and make
adjustments to our path finding, we'll first check to see if the type
byte is there, if so, then we'll snip off two bytes from the front and
continue with parsing. if the bytes aren't found, then we can proceed as
normal and parse the request.
This commit prevents an error that I've seen on travis,
wherein the test fails because a call to Fatal happens
after the test finishes. The root cause is that we call
Fatal in a goroutine that is reading from the subscribe
graph rpc call.
To fix this, we now pass an err chan back into the main
test context, where we can receive any errors and fail
the test if one comes through.
In this commit, we modify the Node interface to return a set of raw
bytes, rather than the full pubkey struct. We do this as within the
package, commonly we only require the pubkey bytes for fingerprinting
purposes. Before this commit, we were forced to _always_ decompress the
pubkey which can be expensive done thousands of times a second.
In this commit, we remove the disconnection logic within the
chanController when failing to open a channel with a peer. We do this as
it's already done within the autopilot agent, where it should be, and
because it's possible that we were already connected to this node and we
happened to disconnect them anyway.
In this commit, we modify the balanceUpdate autopilot signal to update
the balance according to what's returned to the WalletBalance callback
rather than explicitly tracking the balance. This gives the agent a
better sense of what the wallet's balance actually is.
In this commit, we avoid logging an error when the links associated with
a peer are not found within its termination watcher. We do this to
prevent a benign log message as the links have already been removed from
the switch.
In this commit, we aim to resolve an issue with nodes requesting for
channel announcements when receiving a channel update for a channel
they're not aware of. This can happen if a node is not caught up with
the chain or if they receive updates for zombie channels. This would
lead to a spam issue, as if a node is not caught up with the chain,
every new update they receive is premature, causing them to manually
request the backing channel announcement. Ideally, we should be able to
detect this as a potential DoS vector and ban the node responsible, but
for now we'll simply remove this functionality.
In this commit, we add a quit channel to the AddMsg method of the
msgStream struct. Before this commit, if the queue was full, the
readHandler would block and be unable to exit. We remedy this by
leveraging the existing quit channel of the peer as an additional select
case within the AddMsg method.
This commit fixes a bug in the integration test, that
reliably fails after disabling the height hint cache.
The test originally asserted that the htlc funds were
in limbo, but was reading a stale copy of the force
close information. Recently, the test was amended to
provided a valid read of the force close in
96a079873a6da2f0a93adc30945971fc07c5e610. However,
the issue was not apparent until build against the
disabled height hint cache.
The test is now correct to assert that there are no
funds in limbo, as the commitment output has been
swept, but the htlcs are still in the contract
court, so the nursery is unaware of them. We also
add another sanity check to validate that there are
no pending htlcs on the force close at that point
in time.