Commit Graph

8779 Commits

Author SHA1 Message Date
Wilmer Paulino
3f58c2dea4
channeldb/graph: dedup channel edges returned from ChanUpdatesInHorizon
In this commit, we ensure that we de-duplicate the set of channel edges
returned from ChanUpdatesInHorizon. Other subsystems within lnd use this
method to retrieve and send all the channels with updates within a time
series to network peers. However, since the method looks at the edge
update index, which can include up to two entries per edge for each
policy, it's possible that we'd send channel announcements and updates
twice, wasting bandwidth.
2018-09-04 18:48:21 -07:00
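
The de-duplication described in 3f58c2dea4 can be sketched as follows. This is a minimal illustration, not the lnd implementation: the `indexEntry` type and the `dedupChanIDs` helper are hypothetical stand-ins for the edge update index and the lookup that walks it.

```go
package main

import "fmt"

// indexEntry is a hypothetical stand-in for one entry of the edge update
// index: an update timestamp paired with the channel it belongs to.
type indexEntry struct {
	UpdateTime int64
	ChanID     uint64
}

// dedupChanIDs returns every channel ID with an update inside [start, end],
// each exactly once, in the order it was first seen.
func dedupChanIDs(entries []indexEntry, start, end int64) []uint64 {
	seen := make(map[uint64]struct{})
	var chanIDs []uint64
	for _, e := range entries {
		if e.UpdateTime < start || e.UpdateTime > end {
			continue
		}
		// An edge can appear once per policy, so skip repeats.
		if _, ok := seen[e.ChanID]; ok {
			continue
		}
		seen[e.ChanID] = struct{}{}
		chanIDs = append(chanIDs, e.ChanID)
	}
	return chanIDs
}

func main() {
	entries := []indexEntry{
		{UpdateTime: 10, ChanID: 1}, // first policy of channel 1
		{UpdateTime: 11, ChanID: 1}, // second policy of channel 1
		{UpdateTime: 12, ChanID: 2},
	}
	fmt.Println(dedupChanIDs(entries, 0, 20))
}
```

Tracking seen channel IDs in a map keeps the scan linear in the number of index entries.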
Wilmer Paulino
d3cf3168d2
channeldb/graph_test: ensure policies for an edge have different timestamps
In this commit, we ensure policies for edges we create in
TestChanUpdatesInHorizon have different update timestamps. This ensures
that there are two entries per edge in the edge update index. As a
result, the test will now fail, since ChanUpdatesInHorizon returns
duplicate channel edges due to looking at all the entries within the
edge update index. This will be addressed in a future commit to allow
the set of tests to pass once again.
2018-09-04 18:36:25 -07:00
Wilmer Paulino
85ea08fd17
channeldb: add migration to properly prune edge update index
In this commit, we introduce a migration to fix some of the recent
issues found w.r.t. the edge update index. The migration attempts to fix
two things:

1) Edge policies include an extra byte at the end due to reading an
extra byte for the node's public key from the serialized node info.

2) Properly prune all stale entries within the edge update index.

As a result of this migration, nodes will have a slightly smaller
channeldb. We will also no longer send stale edges to our peers in
response to their gossip queries, which should also fix the issue of
fetching channel announcements for closed channels.
2018-09-04 18:33:43 -07:00
Wilmer Paulino
c1633da252
channeldb/graph_test: extend prune edge update index test to update edges
In this commit, we extend TestChannelEdgePruningUpdateIndexDeletion test
to include one more update for each edge. By doing this, we can
correctly determine whether old entries were properly pruned from the
index once a new update has arrived.
2018-09-04 18:33:41 -07:00
Wilmer Paulino
8dec659e10
channeldb/graph_test: properly check entries within edge update index
Due to entries within the edge update index having a nil value, the
tests need to be modified to account for this. Previously, we'd assume
that if we were unable to retrieve a value for a certain key that the
entry was non-existent, which is why the improper pruning bug was not
caught. Instead, we'll assert the number of entries to be the expected
value and populate a lookup map to determine whether the correct entries
exist within it.
2018-09-04 18:33:40 -07:00
Wilmer Paulino
2f22e6c35f
channeldb/graph: properly determine old update timestamp for an edge
In this commit, we fix a lingering issue within the edge update index
where entries were not being properly pruned due to an incorrect
calculation of the offset of an edge's last update time. Since the
offset is being determined from the end to the start, we need to
subtract all the fields after an edge policy's last update time from the
total amount of bytes of the serialized edge policy to determine the
correct offset. This was also slightly off as the edge policy included
an extra byte, which has been fixed in the previous commit.

Instead of continuing the slicing approach however, we'll switch to
deserializing the raw bytes of an edge's policy to ensure this doesn't
happen in the future when/if the serialization methods change or extra
data is included.
2018-09-04 18:33:38 -07:00
Wilmer Paulino
492d581df6
channeldb/graph: fix off-by-one public key slice
In this commit, we fix an off-by-one error when slicing the public key
from the serialized node info byte slice. This would cause us to write
an extra byte to all edge policies. Even though the values were read
correctly, when attempting to calculate the offset of an edge's update
time going backwards, we'd always be incorrect, causing us to not
properly prune the edge update index.
2018-09-04 18:31:38 -07:00
Wilmer Paulino
06344da62e
channeldb/graph: refactor UpdateEdgePolicy to use existing db transaction 2018-09-04 18:31:37 -07:00
Wilmer Paulino
aa3e2b6ba4
channeldb/graph: identify edge chan id on failure 2018-09-04 18:31:36 -07:00
Conner Fromknecht
8e94e55839
discovery/gossiper: require explicit gossip syncer init
This commit removes the fallback in fetchGossipSyncer
that creates a gossip syncer if one is not registered
w/in the gossiper. Now that we register gossip syncers
explicitly before reading any gossip query messages,
this should no longer be required. The fallback also
did not honor the cfg.NoChanUpdates flag, which may
have led to inconsistencies between configuration and
actual behavior.
2018-09-04 17:32:25 -07:00
Conner Fromknecht
51090a41b5
peer: log disconnect to info, remove go-errors pkg 2018-09-04 17:28:51 -07:00
Conner Fromknecht
adf6b8619f
peer: dispatch gossip sync in peer start
This commit moves the gossip sync dispatch
such that it is more tightly coupled to the
life cycle of the peer. In testing, I noticed
that the gossip syncer needs to be dispatched
before the first gossip messages come across
the wire.

The prior spawn location in the server happens
after starting all of the peer's goroutines,
which could permit an ordering where the
gossip syncer has not yet been registered.
The new location registers the gossip syncer
within the read handler such that the call
blocks before any messages are read.
2018-09-04 17:28:50 -07:00
Conner Fromknecht
2415675c3d
server: move gossip dispatch to peer
See next commit msg for more detail.
2018-09-04 17:28:50 -07:00
Olaoluwa Osuntokun
1217992d9d
autopilot: optimize heavily loaded agent by fetching raw bytes for ChannelEdge
In this commit, we implement an optimization to the autopilot agent to
ensure that we don't spin and waste CPU when we either have a large
graph, or a high max channel target for the agent. Before this commit,
each time we went to read the state of a channel from disk, we would
decompress the EC point. However, for the case of the initial
ChannelEdge struct to feed to the agent, we only actually need to obtain
the pubkey, and can save the potentially expensive point decompression
for each directional channel in the graph.
2018-09-04 16:43:07 -07:00
Wilmer Paulino
fd61c3c9b0
config: defer creating the base lnd dir until all flag parsing is done
In this commit, we defer creating the base lnd directory until all flag
parsing is done. We do this as it's possible that the config file
specifies a lnddir, but it isn't actually used as the directory has
already been created.
2018-09-04 16:36:15 -07:00
Wilmer Paulino
58ab6c1912
config: ensure ZMQ options when read from the config file are not equal 2018-09-04 16:36:14 -07:00
Wilmer Paulino
6f0fad7946
discovery/gossiper: check ErrNoGraphEdgesFound for retransmitStaleChannels
In this commit, we add an additional error check for
ErrNoGraphEdgesFound when retransmitting stale channels during the
gossiper's startup. We do this to prevent benign log messages as we'll
log that we were unable to retransmit stale channels when we didn't have
any channels in our graph to begin with.
2018-09-04 16:22:06 -07:00
Olaoluwa Osuntokun
12aadd4978
Merge pull request #1837 from Roasbeef/btcsuite-dep-catchup-5
build: update to latest versions of neutrino+btcwallet
2018-09-04 15:36:25 -07:00
Conner Fromknecht
21a4e21863
server: stop requesting initial graph sync
In this commit, we remove signaling for initial routing
dumps, which create unnecessary log spam and waste bandwidth and
CPU. Now that gossip syncing is in full force, we will
instead opt to use the more efficient querying/set
reconciliation. Other nodes may still request initial
gossip sync from us, and we will respond.
2018-09-04 05:04:21 -07:00
Conner Fromknecht
8d7eb41d48
server: always backoff for unstable peers
This commit modifies the peer connection backoff
logic such that it will always back off for "unstable"
peers. A peer is considered unstable in this context
if its connection lasts less than 10 minutes.
whose connection lasts longer than 10 minutes,
we will scale back our stored backoff for that peer.

This resolves an issue that would result in a tight
connection loop with remote peers. This stemmed
from the connection duration being very short,
and always driving the backoff to the default
backoff of 1 second. Short connections like
this are now caught by the stable connection
threshold.

This also modifies the computation on the
backoff relaxation to subtract the connection
duration after applying randomized exponential
backoff, which offers better stability when
the connection duration and backoff are roughly
equal.
2018-09-04 03:40:08 -07:00
Johan T. Halseth
0d4df54118
autopilot/agent: signal balanceUpdates on own channel 2018-09-04 10:18:15 +02:00
Johan T. Halseth
a9a9c9aeb4
autopilot/agent: signal chanPendingOpenUpdates on own channel 2018-09-04 10:17:59 +02:00
Johan T. Halseth
186e6d4da4
autopilot/agent: signal chanOpenFailureUpdates on own channel
We do this to avoid a huge amount of goroutines piling up when autopilot
is trying to open many channels, as they will all block trying to send
the update on the stateUpdates channel. Now we instead send them on a
buffered channel, similar to what is done with the nodeUpdates.
2018-09-04 10:17:33 +02:00
Johan T. Halseth
4a88c61a90
autopilot/agent: signal nodeUpdates on own channel
We do this to avoid a huge amount of goroutines piling up on initial
graph sync, as they will all block trying to send the node update on the
stateUpdates channel. Now we instead make a new buffered channel
nodeUpdates, and just return immediately if there is already a signal in
the channel waiting to be processed.
2018-09-04 10:17:09 +02:00
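
The signalling pattern described in 4a88c61a90 and 186e6d4da4 is a non-blocking send on a buffered channel. A minimal sketch, with the `signalUpdate` helper being a hypothetical name:

```go
package main

import "fmt"

// signalUpdate tries to queue a signal on a buffered channel without
// blocking. It reports whether the signal was queued; false means a
// signal is already waiting to be processed, so the caller can return.
func signalUpdate(updates chan struct{}) bool {
	select {
	case updates <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	// Capacity 1: at most one pending signal at any time.
	nodeUpdates := make(chan struct{}, 1)

	fmt.Println(signalUpdate(nodeUpdates)) // queued
	fmt.Println(signalUpdate(nodeUpdates)) // dropped, one already pending

	<-nodeUpdates // the consumer processes the pending signal

	fmt.Println(signalUpdate(nodeUpdates)) // queued again
}
```

Since the sender never blocks, no goroutines pile up behind a slow consumer; the trade-off is that bursts of signals are coalesced into one.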
Johan T. Halseth
3e992f094d
autopilot/agent: move signal processing out of select 2018-09-04 10:06:15 +02:00
Conner Fromknecht
d706e40ff7
contractcourt/channel_arbitrator: handle onchain close race on restart 2018-09-03 23:12:57 -07:00
Conner Fromknecht
9aaf046481
Makefile: add tags env argument to build and install 2018-09-03 20:15:18 -07:00
Conner Fromknecht
89113654fe
routing/conf: add experimental assume valid conf 2018-09-03 20:15:18 -07:00
Conner Fromknecht
af0265f8fa
config: add experimental assumechanvalid flag 2018-09-03 20:15:18 -07:00
Conner Fromknecht
f33cfdaa07
server: pass AssumeChannelValid to router 2018-09-03 20:15:18 -07:00
Conner Fromknecht
1e473b2364
routing/router: add assume chan valid 2018-09-03 20:15:12 -07:00
Olaoluwa Osuntokun
4f43c1c943
Merge pull request #1811 from Roasbeef/autopilot-cpu-usage-fix
autopilot: modify the graph interface to return raw bytes for node pubkeys, not entire key
2018-09-03 19:56:51 -07:00
Olaoluwa Osuntokun
6aebd053a3
Merge pull request #1823 from cfromknecht/dont-dc-on-link-fail
peer: ensure link failures are processed in peer life cycle
2018-09-03 19:56:19 -07:00
Olaoluwa Osuntokun
450cabf0d8
Merge pull request #1822 from Roasbeef/chan-update-compat
lnwire: add new compatibility parsing mode for onion error chan updates
2018-09-03 19:24:40 -07:00
Olaoluwa Osuntokun
8d02d74e0f
Merge pull request #1810 from wpaulino/switch-no-links-found
htlcswitch+server: avoid logging error if no links are found within peerTerminationWatcher
2018-09-03 19:18:33 -07:00
Olaoluwa Osuntokun
99a5fd9672
Merge pull request #1809 from wpaulino/autopilot-balance-update
autopilot/agent: use updateBalance rather than tracking balance explicitly
2018-09-03 19:18:03 -07:00
Olaoluwa Osuntokun
32b0f3ff95
Merge pull request #1770 from cfromknecht/prevent-goroutine-fail
lnd_test: Prevent calling Fatal in goroutine
2018-09-03 19:17:00 -07:00
Olaoluwa Osuntokun
07247720d4
build: update to latest versions of neutrino+btcwallet
In this commit, we update to the latest versions of btcwallet+neutrino
that fix a number of bugs within lnd itself. Namely, we ensure that we
no longer print out garbage bytes, properly reconnect btcd after being
disconnected, ensure we don't add duplicate utxos, and finally ensure
that we always start the rescan from the wallet's initial birthday.

Fixes #1775.
Fixes #1494.
Fixes #444.
2018-09-03 18:24:00 -07:00
Conner Fromknecht
09992f3fb0
server: remove unused lightningID field 2018-09-03 18:11:25 -07:00
Conner Fromknecht
9c35528fce
server: attempt reconnection to all known addresses 2018-09-03 18:11:21 -07:00
Olaoluwa Osuntokun
edf304ad8b
peer: ensure we unlock the msgCond during peer msgConsumer exit 2018-09-03 17:03:05 -07:00
Olaoluwa Osuntokun
2b448be048
Merge pull request #1812 from cfromknecht/wait-predicate-pending-channels
lnd_test: add wait predicates to pending channel checks
2018-08-31 19:58:39 -07:00
Conner Fromknecht
95a98d86f6
lnd_test: add wait predicates to pending channel checks 2018-08-31 18:34:33 -07:00
Olaoluwa Osuntokun
31e92c4ff2
lnwire: add new compatibility parsing mode for onion error chan updates
In this commit, we add a compatibility mode for older versions of
c-lightning to ensure that we're able to properly parse all their channel
updates. An older version of c-lightning would send out encapsulated
onion error messages with an additional type byte. This would throw off
our parsing as we didn't expect the type byte, and so we were always 2
bytes off. In order to ensure that we're able to parse these messages and
make adjustments to our path finding, we'll first check to see if the
type byte is there. If so, we'll snip off two bytes from the front and
continue with parsing. If the bytes aren't found, then we can proceed as
normal and parse the request.
2018-08-31 17:24:26 -07:00
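
The compatibility check described in 31e92c4ff2 might look roughly like the sketch below. The commit message does not say how the prefix is detected; comparing the first two bytes against the BOLT 7 `channel_update` message type (258) is an assumption, and `stripTypePrefix` is a hypothetical helper.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// msgChannelUpdate is the BOLT 7 message type of channel_update.
const msgChannelUpdate = 258

// stripTypePrefix removes a leading two-byte channel_update type from the
// payload if one is present, and returns the payload unchanged otherwise.
// Detecting the prefix by value is an assumption made for this sketch.
func stripTypePrefix(payload []byte) []byte {
	if len(payload) >= 2 &&
		binary.BigEndian.Uint16(payload[:2]) == msgChannelUpdate {

		return payload[2:]
	}
	return payload
}

func main() {
	withPrefix := []byte{0x01, 0x02, 0xaa, 0xbb}
	withoutPrefix := []byte{0xaa, 0xbb}

	fmt.Println(len(stripTypePrefix(withPrefix)))
	fmt.Println(len(stripTypePrefix(withoutPrefix)))
}
```

A value check like this can misfire if a payload without the prefix happens to begin with the same two bytes, so a real parser would likely need a stronger signal than the one used here.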
Conner Fromknecht
c569c40cef
lnd_test: prevent calling Fatal in goroutine
This commit prevents an error that I've seen on travis,
wherein the test fails because a call to Fatal happens
after the test finishes. The root cause is that we call
Fatal in a goroutine that is reading from the subscribe
graph rpc call.

To fix this, we now pass an err chan back into the main
test context, where we can receive any errors and fail
the test if one comes through.
2018-08-31 17:11:17 -07:00
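
The fix described in c569c40cef follows a common Go pattern: a helper goroutine reports failures over a channel, and only the main test context decides to fail. A minimal sketch outside the `testing` package, with `watchUpdates` and `runWatcher` as hypothetical names:

```go
package main

import "fmt"

// watchUpdates simulates a goroutine reading from a subscription. Rather
// than failing the test itself, it reports the first error on errChan.
func watchUpdates(updates <-chan int, errChan chan<- error, done chan<- struct{}) {
	defer close(done)
	for u := range updates {
		if u < 0 {
			errChan <- fmt.Errorf("bad update: %d", u)
			return
		}
	}
}

// runWatcher plays the role of the main test context: it starts the
// goroutine, waits for it to finish, and returns any error it reported.
func runWatcher(values []int) error {
	updates := make(chan int, len(values))
	for _, v := range values {
		updates <- v
	}
	close(updates)

	// Buffered so the goroutine never blocks while reporting.
	errChan := make(chan error, 1)
	done := make(chan struct{})
	go watchUpdates(updates, errChan, done)

	<-done
	select {
	case err := <-errChan:
		return err
	default:
		return nil
	}
}

func main() {
	fmt.Println(runWatcher([]int{1, 2, 3}))
	fmt.Println(runWatcher([]int{1, -1, 3}))
}
```

In an actual test, the caller of `runWatcher` would be the place to call `t.Fatalf`, since that runs on the test's own goroutine.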
Valentine Wallace
98d1482942
chainntnfs/interface_test: fix unreliable historical block ntfns test
After joining the two forked chains, it is necessary to ensure they both agree on the same best hash before proceeding to UnsafeStart the notifier.
This is because when the BitcoindClient starts, it retrieves its best known block then calls GetBlockHeaderVerbose on the hash of the retrieved block. This block could be a reorged block if JoinNodes has not completed sync. If it is the case that the best block retrieved has been reorged out of the chain, GetBlockHeaderVerbose errors because bitcoind sets the number of confirmations to -1 on reorged blocks, and the btcd rpc client panics when parsing a block whose number of confirmations is negative.

This parsing error is expected to be fixed, and as a more permanent solution chain backends should ensure that the `best block` they retrieve during startup has not been reorged out of the chain.
2018-08-31 16:18:25 -07:00
Conner Fromknecht
48dc38d9f9
peer: ensure link failures are processed in peer life cycle 2018-08-30 17:36:13 -07:00
Olaoluwa Osuntokun
8f843c5eaa
discovery: update autopilot.Node usage to match recent API changes 2018-08-29 15:45:39 -07:00
Olaoluwa Osuntokun
a429c56c10
autopilot: update the Node interface to return raw bytes, not the key
In this commit, we modify the Node interface to return a set of raw
bytes, rather than the full pubkey struct. We do this as within the
package, commonly we only require the pubkey bytes for fingerprinting
purposes. Before this commit, we were forced to _always_ decompress the
pubkey, which can be expensive when done thousands of times a second.
2018-08-29 15:44:47 -07:00
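
The idea behind a429c56c10 can be sketched as an interface that exposes the serialized key only. The exact method set of the autopilot `Node` interface is not given in the commit message, so the `Node`, `NodeID`, and `rawNode` definitions below are assumptions made for illustration.

```go
package main

import "fmt"

// NodeID is a fixed-size compressed public key, usable as a map key.
type NodeID [33]byte

// Node is a hypothetical, reduced version of a graph node interface that
// hands out raw key bytes instead of a parsed public key.
type Node interface {
	PubKey() [33]byte
}

// rawNode stores the serialized key exactly as read from disk.
type rawNode struct {
	key [33]byte
}

func (n rawNode) PubKey() [33]byte { return n.key }

// uniqueNodes fingerprints nodes by their raw key bytes. No elliptic
// curve point is ever decompressed along the way.
func uniqueNodes(nodes []Node) map[NodeID]struct{} {
	set := make(map[NodeID]struct{}, len(nodes))
	for _, n := range nodes {
		set[NodeID(n.PubKey())] = struct{}{}
	}
	return set
}

func main() {
	a := rawNode{key: [33]byte{0x02, 0x01}}
	b := rawNode{key: [33]byte{0x03, 0x02}}
	fmt.Println(len(uniqueNodes([]Node{a, b, a})))
}
```

Callers that do need a full public key can still parse the bytes on demand, which confines the decompression cost to the few places that require it.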
Wilmer Paulino
0a2986355e
pilot: improve error when unable to reach any of a peer's addresses 2018-08-29 02:06:03 -07:00