In this commit, we address a bug where we'd attempt to replace the
stale active syncer when it transitioned to a passive syncer. This
replacement logic is only intended to happen when the active syncer
disconnects, as rotateActiveSyncerCandidate chooses and queues its own
replacement.
As required by the spec:
> SHOULD send all gossip messages whose timestamp is greater or equal to
first_timestamp, and less than first_timestamp plus timestamp_range.
In this commit, we introduce a new subsystem for the gossiper: the
SyncManager. This subsystem is a major overhaul on the way the daemon
performs the graph query sync state machine with peers.
Along with this subsystem, we also introduce the concept of an active
syncer. An active syncer is simply a GossipSyncer currently operating
under an ActiveSync sync type. Before this commit, all GossipSyncer's
would act as active syncers, which means that we were receiving new
graph updates from all of them. This isn't necessary, as it greatly
increases bandwidth usage as the network grows. The SyncManager changes
this by requiring a specific number of active syncers. Once we reach
this specified number, any future peers will have a GossipSyncer with a
PassiveSync sync type.
It is responsible for three main things:
1. Choosing different peers randomly to receive graph updates from to
ensure we don't only receive them from the same set of peers.
2. Choosing different peers to force a historical sync with to ensure we
have as much of the public network as possible. The first syncer
registered with the manager will also attempt a historical sync.
3. Managing an in-order queue of active syncers where the next cannot be
started until the current one has completed its state machine to ensure
they don't overlap and request the same set of channels, which
significantly reduces bandwidth usage and addresses a number of issues.
In this commit, we introduce another feature to the GossipSyncer in
which it can deliver a signal to an external caller once it reaches its
terminal chansSynced state. This is yet to be used, but will serve
useful with a round-robin sync mechanism, where we wait for to finish
syncing with a specific peer before moving on to the next.
In this commit, we introduce the ability for gossip syncers to perform
historical syncs. This allows us to reconcile any channels we're missing
that the remote peer has starting from the genesis block of the chain.
This commit serves as a prerequisite to the SyncManager, introduced in a
later commit, where we'll be able to make spot checks by performing
historical syncs with peers to ensure we have as much of the graph as
possible.
In this commit, we introduce the ability for GossipSyncer's to
transition their sync type. This allows us to be more flexible with our
gossip syncers, as we can now prevent them from receiving new graph
updates at any time. It's now possible to transition between the
different sync types, as long as the GossipSyncer has reached its
terminal chansSynced sync state. Certain transitions require some
additional wire messages to be sent, like in the case of an ActiveSync
GossipSyncer transitioning to a PassiveSync type.
With the introduction of the gossip sync manager in a later commit,
retrieving the backlog of updates within the last hour is no longer
necessary as we'll be forcing full syncs periodically.
In this commit, we introduce a new type: SyncerType. This type denotes
the type of sync a GossipSyncer is currently under. We only introduce
the two possible entry states, ActiveSync and PassiveSync. An ActiveSync
GossipSyncer will exchange channels with the remote peer and receive new
graph updates from them, while a PassiveSync GossipSyncer will not and
will only response to the remote peer's queries.
This commit does not modify the behavior and is only meant to be a
refactor.
In this commit, we address an assumption of the gossiper's recently
introduce reliable sender. The reliable sender is currently only used
for messages of unannounced channels. This makes sense as peers should
be able to retrieve messages from the network if they've previously
announced. However, within isMsgStale, we assumed that the reliable
sender would be used for every ChannelUpdate being sent, even if the
channel is already announced. Due to this, checking if the policy is
stale was unnecessary. But since this isn't the case, we should actually
be checking whether it is stale to prevent sending it later on.
In this commit, we address an issue with our router mock in which it was
not properly storing and retrieving edge policies. Previously, they were
being appended to a slice of policies, but this doesn't always work like
when you attempt to update the same edge twice. Instead, the slice can
only contain up to two entries, each one being the latest version of
each direction.
In this commit, we leverage the recently introduced zombie edge index to
quickly reject announcements for edges we've previously deemed as
zombies. Care has been taken to ensure we don't reject fresh updates for
edges we've considered zombies.
In this commit, we also allow channel updates for our channels to be
sent reliably to our channel counterparty. This is especially crucial
for private channels, since they're not announced, in order to ensure
each party can receive funds from the other side.
In this commit, we implement a new subsystem for the gossiper that
uses some of the existing logic for resending channel announcement
signatures and implements it in a way to make it message-agnostic,
meaning that any type of message can be resent. Along the way we also
modify the way this works to prevent multiple goroutines per peer _and_
message.
A peerHandler will be spawned for each peer for which we attempt to send
a message reliably to. This handler is responsible for managing requests
to reliably send messages to a peer while also taking the peer's
connection lifecycle into account by requesting notifications for when
the peer connects/disconnects. A peer connection notification is first
requested to determine when we should attempt to send any pending
messages. After the messages are sent, a peer disconnection notification
is requested to ensure we don't continue to request connection
notifications while the peer remains connected. Once there are no more
pending messages left to be sent for a given peer, the peerHandler can
be torn down.
In this commit, we add a new store within the database that'll be
responsible for storing gossip messages which we need to reliably send
to peers. This aims to replace the current messageStore that exists
within the gossiper, so much of this logic is borrowed from there.
One of the main differences between the two is that we now index
messages with a new key format in which we take into account the
message's type. This allows us to store different messages for a
specific channel with a peer. The old key format is still supported in
order to prevent a database migration.
In this commit, we alter the ValidateChannelUpdateAnn function in
ann_validation to validate a remote ChannelUpdate's message flags
and max HTLC field. If the message flag is set but the max HTLC
field is not set or vice versa, the ChannelUpdate fails validation.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
In this commit, we alter the gossiper test's helper method
that creates channel updates to include the max htlc field
in the ChannelUpdates it creates.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
In this commit, we modify the mockGraphSource's `AddEdge`
method to set the capacity of the edge it's adding to be a large
capacity.
This will enable us to test the validation of each ChannelUpdate's
max HTLC, since future validation checks will ensure the specified
max HTLC is less than total channel capacity.
In this commit:
* we partition lnwire.ChanUpdateFlag into two (ChanUpdateChanFlags and
ChanUpdateMsgFlags), from a uint16 to a pair of uint8's
* we rename the ChannelUpdate.Flags to ChannelFlags and add an
additional MessageFlags field, which will be used to indicate the
presence of the optional field HtlcMaximumMsat within the ChannelUpdate.
* we partition ChannelEdgePolicy.Flags into message and channel flags.
This change corresponds to the partitioning of the ChannelUpdate's Flags
field into MessageFlags and ChannelFlags.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
A recent commit modified the `IsNodeStale` method in the mocks to mirror
the actual implementation in the gossiper. As a result, we now expect
one less node announcement to be broadcast.
In this commit, we allow the gossiper to also broadcast the
corresponding node announcements, if we know of them, of a channel when
constructing its full proof. We do this to ensure peers (other than our
remote peer) receive all the relevant announcements for a channel.
The tests changes were made to ensure the new behavior introduced works
as intended. Previously, the node announcements for each test channel
announcement were not processed, so they never existed from the
gossiper's point of view.
This also addresses an existing flake in the integration test
`testNodeAnnouncement`. This problem arose due to the node announcement
being sent before the connection between Dave (node announcement sender)
and Alice (node announcement receiver) was initiated and the full
channel proof was constructed.
This commit passes the peer's quit signal to the
gossipSyncer when attempt to hand off gossip query
messages. This allows a rate-limited peer's read
handler to break out immediately, which would
otherwise remain stuck until the rate-limited
gossip syncer pulled the message.
This commit restructures the delivery of gossip
query related messages, such that they are delivered
directly to the gossip syncers. Gossip query rate
limiting was introduced in #1824 on a per-peer basis.
However, since all gossip query messages were being
delivered in the main event loop, the end result is
that one rate-limited peer could stall all other
peers.
In addition, since no other peers would be able to
submit gossip-related messages through the blocked
event loop, the back pressure would eventually rate
limit the read handlers of all peers as well.
The end result would be lengthy delays in reading
messages related to htlc forwarding.
The fix is to lift the delivery of gossip query
messages outside of the main event loop. With
this change, the rate limiting backpressure is
delivered only to the intended peer.
To mimic the current behaviour of the router's IsStaleNode, we make the
mockGraphSource consider a unknown node with no channels in the graph as
stale.