Commit Graph

31 Commits

Author SHA1 Message Date
Conner Fromknecht
58e924ad1c
discovery: don't historical sync when NumActiveSyncers == 0
Currently when numgraphsyncpeers=0, lnd will still attempt to perform
an initial historical sync. We change this behavior here to forgo
historical sync entirely when numgraphsyncpeers is zero, since the
routing table isn't being updated anyway while the node is active.

This permits a no-graph lnd mode where no syncing occurs at all.
2021-02-10 09:35:45 -08:00
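
The guard described in this commit can be pictured with a minimal sketch. The type `syncConfig` and the function `shouldStartHistoricalSync` are hypothetical names chosen for illustration and are not taken from lnd's source. Only the rule itself (no historical sync when the active syncer count is zero) comes from the message above.

```go
package main

import "fmt"

// syncConfig is a hypothetical stand-in for the relevant sync settings.
type syncConfig struct {
	// NumActiveSyncers mirrors the numgraphsyncpeers setting.
	NumActiveSyncers int
}

// shouldStartHistoricalSync reports whether an initial historical sync
// should be attempted. With zero active syncers it is skipped entirely.
func shouldStartHistoricalSync(cfg syncConfig, alreadyStarted bool) bool {
	if cfg.NumActiveSyncers == 0 {
		return false
	}
	return !alreadyStarted
}

func main() {
	fmt.Println(shouldStartHistoricalSync(syncConfig{NumActiveSyncers: 0}, false))
	fmt.Println(shouldStartHistoricalSync(syncConfig{NumActiveSyncers: 3}, false))
}
```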
Conner Fromknecht
b1fee734ec
discovery/sync_manager: remove unneeded markGraphSyncing
AFAICT it's not possible to flip back from being synced_to_chain, so we
remove the underlying call that could reflect this. The method is moved
into the test file since it's still used to test correctness of other
portions of the flow.
2021-01-29 00:19:48 -08:00
Conner Fromknecht
e42301dee2
lntest: call markGraphSynced from gossipSyncer
Rather than performing this call in the SyncManager, we give each
gossipSyncer the ability to mark the first sync completed. This permits
pinned syncers to contribute towards the rpc-level synced_to_graph
value, allowing the value to be true after the first pinned syncer or
regular syncer completes. Unlike regular syncers, pinned syncers can
proceed in parallel, possibly decreasing the waiting time if consumers
rely on this field before proceeding to load their application.
2021-01-29 00:19:48 -08:00
Conner Fromknecht
340414356d
discovery: perform initial historical sync for pinned peers 2021-01-29 00:19:47 -08:00
Conner Fromknecht
2f0d56d539
discovery: add support for PinnedSyncers
A pinned syncer is an ActiveSyncer that is configured to always remain
active for the lifetime of the connection. Pinned syncers do not count
towards the NumActiveSyncers total; only syncers counted in that total are rotated periodically.

This feature allows nodes to more tightly synchronize their routing
tables by ensuring they are always receiving gossip from a distinguished
subset of peers.
2021-01-29 00:19:47 -08:00
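
A rough sketch of how pinned syncers might sit alongside the active syncer limit. Every name below (`manager`, `assign`, the `syncType` constants) is a hypothetical illustration, not lnd's API. The only behavior taken from the commit message is that pinned peers stay active and do not consume one of the NumActiveSyncers slots.

```go
package main

import "fmt"

// syncType is a hypothetical stand-in for the syncer modes discussed above.
type syncType int

const (
	passiveSync syncType = iota
	activeSync
	pinnedSync
)

// manager tracks how many rotating active slots are in use.
type manager struct {
	numActiveSyncers int             // configured limit for rotating active syncers
	activeInUse      int             // rotating active slots currently taken
	pinned           map[string]bool // peers configured as pinned
}

// assign picks a sync type for a newly connected peer.
func (m *manager) assign(peer string) syncType {
	if m.pinned[peer] {
		// Pinned peers stay active and do not consume an active slot.
		return pinnedSync
	}
	if m.activeInUse < m.numActiveSyncers {
		m.activeInUse++
		return activeSync
	}
	return passiveSync
}

func main() {
	m := &manager{numActiveSyncers: 1, pinned: map[string]bool{"alice": true}}
	for _, p := range []string{"alice", "bob", "carol"} {
		fmt.Println(p, m.assign(p))
	}
}
```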
Conner Fromknecht
9e932f2a64
discovery/sync_manager: Pause/Resume HistoricalSyncTicker
This gives each initial historical syncer an equal amount of time before
being rotated, even if some fail.
2021-01-29 00:19:47 -08:00
Conner Fromknecht
ef0cd82c1f
discovery/sync_manager: make setHistoricalSyncer closure 2021-01-29 00:19:46 -08:00
Conner Fromknecht
72fbd1283b
discovery/sync_manager: break out IsGraphSynced check 2021-01-29 00:19:46 -08:00
Wilmer Paulino
a4f33ae63c
discovery: adhere to proper channel chunk splitting for ReplyChannelRange 2020-12-08 15:18:07 -08:00
Wilmer Paulino
c5fc7334a4
discovery: limit NumBlocks to best known height for outgoing QueryChannelRange
This is done to ensure we don't receive replies for channels in blocks
not currently known to us, which we wouldn't be able to process.
2020-12-08 15:18:06 -08:00
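
One way to picture the clamp described above: given a starting height and the best known height, the number of blocks requested never reaches past the tip. The helper `numBlocksToQuery` is a hypothetical name, and the inclusive-range arithmetic is an assumption for illustration rather than lnd's exact code.

```go
package main

import "fmt"

// numBlocksToQuery returns how many blocks to request starting at
// firstHeight so that the range ends at bestHeight (inclusive).
// It returns 0 when firstHeight is already past the best known height.
func numBlocksToQuery(firstHeight, bestHeight uint32) uint32 {
	if firstHeight > bestHeight {
		return 0
	}
	return bestHeight - firstHeight + 1
}

func main() {
	fmt.Println(numBlocksToQuery(100, 150))
	fmt.Println(numBlocksToQuery(200, 150))
}
```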
Joost Jager
3d7de2ad39
multi: remove dead code 2019-09-10 17:21:59 +02:00
Wilmer Paulino
c405e89197
discovery: check non-nil syncer upon historical sync tick 2019-08-13 18:23:05 -07:00
Wilmer Paulino
977c139f3c
discovery: handle graph synced status after stalled initial historical sync
This ensures that the graph synced status is marked true at some point
once a historical sync has completed. Before this commit, a stalled
historical sync could cause us to never mark the graph as synced.
2019-08-06 17:56:55 -07:00
Wilmer Paulino
af4234f680
discovery: allow the SyncManager to report whether the graph is synced 2019-08-06 17:56:54 -07:00
Conner Fromknecht
a3e690e253
discovery/sync_manager: init all syncers with IgnoreHistoricalFilters 2019-07-30 17:25:31 -07:00
Johan T. Halseth
526486ae24
discovery/sync_manager: restart historical sync on first connected peer
To handle the case where we have been without peers, and get a new
connection, we reset the historical scan booleans when the first active
syncer is connected to trigger another historical sync.
2019-05-24 11:05:29 +02:00
Olaoluwa Osuntokun
985902be27
Merge pull request #2916 from cfromknecht/split-syncer-query-reply
discovery: make gossip replies synchronous
2019-04-29 17:40:13 -07:00
Johan T. Halseth
ee257fd0eb
multi: move Route to sub-pkg routing/route 2019-04-29 14:52:33 +02:00
Conner Fromknecht
bf4543e2bd
discovery/syncer: make gossip sends synchronous
This commit makes all replies in the gossip syncer synchronous, meaning
that they will wait for each message to be successfully written to the
remote peer before attempting to send the next. This helps throttle
messages the remote peer has requested, preventing unintended
disconnects when the remote peer is slow to process messages. This
change also helps reduce congestion in the peer by forcing the syncer to
buffer the messages instead of dumping them into the peer's queue.
2019-04-26 20:05:10 -07:00
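
The idea of synchronous replies can be sketched as a loop that waits for each write to finish before sending the next message. The function `sendAllSync`, the `write` callback, and `errPeerGone` are hypothetical names. The real syncer's interfaces are not shown in this log.

```go
package main

import (
	"errors"
	"fmt"
)

// errPeerGone is a hypothetical error used only for this illustration.
var errPeerGone = errors.New("peer disconnected")

// sendAllSync writes messages one at a time. The next message is only
// attempted once write has returned for the previous one. It returns the
// number of messages successfully written and the first error, if any.
func sendAllSync(msgs []string, write func(string) error) (int, error) {
	for i, m := range msgs {
		if err := write(m); err != nil {
			return i, err
		}
	}
	return len(msgs), nil
}

func main() {
	sent, err := sendAllSync([]string{"a", "b", "c"}, func(m string) error {
		if m == "c" {
			return errPeerGone
		}
		return nil
	})
	fmt.Println(sent, err)
}
```

Because the caller holds the unsent messages until each write returns, a slow peer naturally throttles the sender instead of having its queue flooded.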
Wilmer Paulino
d68842ee9e
discovery: queue active syncers until initial historical sync signal
In this commit, we begin to queue any active syncers until the initial
historical sync has completed. We do this to ensure we can properly
handle any new channel updates at tip. This is required for fresh nodes
that are syncing the channel graph for the first time. If we begin
accepting updates at tip while the initial historical sync is still
ongoing, then we risk not processing certain updates since we've yet to
learn of the channels themselves.
2019-04-24 13:20:57 -07:00
Wilmer Paulino
07136a5bc2
discovery: handle initial historical sync disconnection
In this commit, we add logic to handle a peer with whom we're performing
an initial historical sync disconnecting. This is required to ensure we
get as much of the graph as possible when starting a fresh node. It will
also be useful for ensuring we do not get stalled once we prevent active
GossipSyncers from starting until the initial historical sync has
completed.
2019-04-24 13:20:55 -07:00
Wilmer Paulino
72e9674cff
discovery: simplify chooseRandomSyncer helper 2019-04-24 13:20:16 -07:00
Wilmer Paulino
29baa12254
discovery: synchronize new/stale GossipSyncers with syncerHandler
Now that the roundRobinHandler is no longer present, this commit aims to
clean up and simplify some of the logic surrounding initializing/tearing
down new/stale GossipSyncers from the SyncManager. Along the way, we
also synchronize these calls with the syncerHandler, which will be
useful in future work that allows us to recover from initial historical
sync disconnections.
2019-04-24 13:19:09 -07:00
Wilmer Paulino
5db2cf6273
discovery+server: remove roundRobinHandler and related code
Since ActiveSync GossipSyncers no longer synchronize our state with the
remote peers, none of the logic surrounding the round-robin is required
within the SyncManager.
2019-04-24 13:19:07 -07:00
Wilmer Paulino
4bb4b0fe4e
discovery: increase DefaultHistoricalSyncInterval to one hour
Assuming a graph size of 50,000 channels, an interval of 20 minutes
would cause nodes to consume about 600MB per month in bandwidth doing
these routine historical sync spot checks. In this commit, we increase
to one hour, which consumes about 300MB per month.
2019-04-11 15:46:22 -07:00
Olaoluwa Osuntokun
3e5a6f1022
Merge pull request #2905 from cfromknecht/split-chunk-size
discovery: make batch size distinct from chunk size, reduce to 500
2019-04-09 19:27:20 -07:00
Conner Fromknecht
a4b4fe666a
discovery: make batch size distinct from chunk size, reduce to 500
This commit reduces the number of channels a syncer will request from
the remote node in a single QueryShortChanIDs message. The current size
is derived from the chunkSize, which is meant to signal the maximum
number of short chan ids that can fit in a single ReplyChannelRange
message. For EncodingSortedPlain, this number is 8000, and we use the
same number to dictate the size of the batch from the remote peer.

We modify this by introducing a separately configurable batchSize, so
that both can be tuned independently. The value is chosen to reduce the
amount of buffering the remote party will perform, only requiring them
to queue 500 responses, as opposed to 8000. In turn, this reduces large
spikes in allocation on the remote node at the expense of a few extra
round trips for the control messages. However, this will be negligible since
the control messages are much smaller than the messages being returned.
2019-04-06 15:27:26 -07:00
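
A minimal sketch of batching short channel IDs separately from the chunk size. The helper `splitIntoBatches` is a hypothetical name, and representing IDs as plain `uint64` values is an assumption for illustration. Only the batch size of 500 comes from the commit message.

```go
package main

import "fmt"

// batchSize is the per-query limit named in the commit message.
const batchSize = 500

// splitIntoBatches cuts ids into consecutive slices of at most size
// elements. The final batch may be shorter. It returns nil for a
// non-positive size.
func splitIntoBatches(ids []uint64, size int) [][]uint64 {
	if size <= 0 {
		return nil
	}
	var batches [][]uint64
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	return batches
}

func main() {
	ids := make([]uint64, 1200)
	for _, b := range splitIntoBatches(ids, batchSize) {
		fmt.Println(len(b))
	}
}
```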
Conner Fromknecht
9df6af237e
discovery/sync_manager: restore lazy gossip sends 2019-04-06 03:32:03 -07:00
Wilmer Paulino
00338c5ec2
discovery: properly handle SyncManager shutdown signal 2019-04-03 19:32:56 -07:00
Wilmer Paulino
46ceaf8cf6
discovery: only replace stale active syncer if disconnected
In this commit, we address a bug where we'd attempt to replace the
stale active syncer when it transitioned to a passive syncer. This
replacement logic is only intended to happen when the active syncer
disconnects, as rotateActiveSyncerCandidate chooses and queues its own
replacement.
2019-04-03 16:43:31 -07:00
Wilmer Paulino
a188657b2f
discovery: introduce gossiper SyncManager subsystem
In this commit, we introduce a new subsystem for the gossiper: the
SyncManager. This subsystem is a major overhaul of the way the daemon
performs the graph query sync state machine with peers.

Along with this subsystem, we also introduce the concept of an active
syncer. An active syncer is simply a GossipSyncer currently operating
under an ActiveSync sync type. Before this commit, all GossipSyncers
would act as active syncers, which means that we were receiving new
graph updates from all of them. This isn't necessary, as it greatly
increases bandwidth usage as the network grows. The SyncManager changes
this by requiring a specific number of active syncers. Once we reach
this specified number, any future peers will have a GossipSyncer with a
PassiveSync sync type.

The SyncManager is responsible for three main things:

1. Choosing different peers randomly to receive graph updates from to
ensure we don't only receive them from the same set of peers.

2. Choosing different peers to force a historical sync with to ensure we
have as much of the public network as possible. The first syncer
registered with the manager will also attempt a historical sync.

3. Managing an in-order queue of active syncers where the next cannot be
started until the current one has completed its state machine to ensure
they don't overlap and request the same set of channels, which
significantly reduces bandwidth usage and addresses a number of issues.
2019-04-03 15:08:32 -07:00
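
The in-order queue from the third item can be sketched as a FIFO in which only one syncer runs at a time. The type `syncQueue` and its methods are hypothetical names for illustration. The log does not show lnd's actual implementation.

```go
package main

import "fmt"

// syncQueue is a hypothetical FIFO of peers waiting to run as active syncers.
type syncQueue struct {
	pending []string // peers waiting for their turn, in arrival order
	current string   // peer currently syncing; empty when idle
}

// enqueue adds a peer and starts it immediately if nothing is running.
func (q *syncQueue) enqueue(peer string) {
	q.pending = append(q.pending, peer)
	q.startNext()
}

// startNext promotes the oldest pending peer, but only when idle.
func (q *syncQueue) startNext() {
	if q.current != "" || len(q.pending) == 0 {
		return
	}
	q.current = q.pending[0]
	q.pending = q.pending[1:]
}

// done marks the current syncer as finished and starts the next one.
func (q *syncQueue) done() {
	q.current = ""
	q.startNext()
}

func main() {
	q := &syncQueue{}
	q.enqueue("alice")
	q.enqueue("bob")
	fmt.Println(q.current)
	q.done()
	fmt.Println(q.current)
}
```

Since the second peer is only started after the first calls `done`, the two never run their state machines at the same time, which is the property the commit message relies on to avoid requesting the same channels twice.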