This log can be "spammy" while nodes throughout the network have yet to
upgrade to v0.13.0-beta, which includes several enhancements to prevent
the broadcast of zombie edges/updates.
In this commit, we update the existing zombie resurrection test to
ensure that if we prune an edge and the other pubkey is marked as nil,
we only accept a resurrection channel update from the node we
originally pruned when the pruning decision was one sided.
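A rough sketch of that acceptance rule, using a hypothetical zombie
entry that records the node keys stored at pruning time (not the actual
graph database layout):

    // zombieEntry is an illustrative stand-in for the data stored when an
    // edge is marked as a zombie: the keys of the node(s) whose disabled
    // updates caused the pruning. A zero-valued key means that side did
    // not contribute to the pruning decision.
    type zombieEntry struct {
    	key1, key2 [33]byte
    }

    // canResurrect reports whether a channel update signed by updateKey may
    // resurrect the zombie edge.
    func canResurrect(entry zombieEntry, updateKey [33]byte) bool {
    	var empty [33]byte
    	switch {
    	// Both nodes were pruned: either of them may resurrect the edge.
    	case entry.key1 != empty && entry.key2 != empty:
    		return updateKey == entry.key1 || updateKey == entry.key2

    	// One sided pruning: only the node we originally pruned may
    	// resurrect the edge.
    	case entry.key1 != empty:
    		return updateKey == entry.key1
    	default:
    		return updateKey == entry.key2
    	}
    }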
In this commit, we make a change to always add channel announcements to
the reject cache unless we rejected them for already existing. Without
this change, if we end up rejecting a channel announcement, say because
the channel is already spent or the funding transaction doesn't exist,
then we'll continually re-validate the same set of channels we know
will fail whenever peers send them to us.
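A minimal sketch of the intended caching behavior; the names below are
illustrative rather than the gossiper's actual fields:

    // maybeCacheRejection records a rejected channel announcement so peers
    // resending it don't trigger another round of validation.
    func maybeCacheRejection(scid uint64, rejectedForExisting bool,
    	rejectCache map[uint64]struct{}) {

    	// A rejection caused by the channel already existing isn't a
    	// permanent failure, so it shouldn't poison the cache.
    	if rejectedForExisting {
    		return
    	}

    	// Everything else (spent channel, missing funding tx, etc.) is
    	// cached so we skip re-validating it next time.
    	rejectCache[scid] = struct{}{}
    }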
Fixes#5191
In order to be consistent with other subsystems, an error is now
returned from the Stop functions.
This also allows writing a generic cleanup mechanism to stop all
subsystems in case of a failure.
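A minimal sketch of the pattern, with an illustrative interface and
helper rather than the actual lnd subsystem types:

    // stopper is satisfied by any subsystem whose Stop returns an error.
    type stopper interface {
    	Stop() error
    }

    // stopAll attempts to stop every subsystem, returning the first error
    // encountered while still letting the remaining subsystems shut down.
    func stopAll(subsystems ...stopper) error {
    	var firstErr error
    	for _, s := range subsystems {
    		if err := s.Stop(); err != nil && firstErr == nil {
    			firstErr = err
    		}
    	}
    	return firstErr
    }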
Previously, we would always allow dependent jobs to be processed,
regardless of the result of their parent job's validation. This isn't
correct, as a parent job contains actions necessary to successfully
process a dependent job. A prime example of this can be found within the
AuthenticatedGossiper, where an incoming channel announcement and update
are both processed, but if the channel announcement job fails to
complete, then the gossiper is unable to properly validate the update.
This commit aims to address this by preventing the dependent jobs from
running.
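A simplified sketch of the intended behavior, assuming each job exposes
a done channel and a result error (lnd's actual ValidationBarrier
bookkeeping differs):

    // jobResult signals whether a parent job's validation succeeded.
    type jobResult struct {
    	done chan struct{} // closed once the parent job finishes
    	err  error         // set before done is closed
    }

    // waitForParent blocks until the parent job finishes and returns its
    // error. A dependent job only proceeds when this returns nil; before
    // this change it would run regardless of the parent's outcome.
    func waitForParent(parent *jobResult) error {
    	<-parent.done
    	return parent.err
    }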
Messages:
- UpdateFulfillHTLC
- UpdateFee
- UpdateFailMalformedHTLC
- UpdateFailHTLC
- UpdateAddHTLC
- Shutdown
- RevokeAndAck
- ReplyShortChanIDsEnd
- ReplyChannelRange
- QueryShortChanIDs
- QueryChannelRange
- NodeAnnouncement
- Init
- GossipTimestampRange
- FundingSigned
- FundingLocked
- FundingCreated
- CommitSig
- ClosingSigned
- ChannelUpdate
- ChannelReestablish
- ChannelAnnouncement
- AnnounceSignatures
lnwire: update quickcheck tests, use constant for Error
multi: update unit tests to pass deep equal assertions with messages
In this commit, we update a series of unit tests in the code base to now
pass due to the new wire message encode/decode logic. In many instances,
we'll now manually set the extra bytes to an empty byte slice to avoid
comparisons that fail due to one message having an empty byte slice and
the other having a nil slice.
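The underlying Go gotcha is that reflect.DeepEqual distinguishes a nil
byte slice from an empty one:

    package main

    import (
    	"fmt"
    	"reflect"
    )

    func main() {
    	var a []byte         // nil slice, e.g. a message decoded with no extra bytes
    	b := make([]byte, 0) // empty but non-nil slice, e.g. set manually in a test

    	fmt.Println(reflect.DeepEqual(a, b)) // false: DeepEqual treats nil != empty
    	fmt.Println(len(a) == len(b))        // true: both have zero length
    }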
In order to prep for allowing TLV extensions for the `ReplyChannelRange`
and `QueryChannelRange` messages, we'll need to remove the existing
struct embedding. If we don't remove it, then we'll attempt to decode
TLV extensions from both the embedded and the outer struct.
All relevant call sites have been updated to reflect this minor change.
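To illustrate the double-decoding problem, a simplified before/after of
the message structs (fields trimmed, exact layout illustrative):

    // Simplified stand-in for the query message.
    type QueryChannelRange struct {
    	ChainHash        [32]byte
    	FirstBlockHeight uint32
    	NumBlocks        uint32
    	ExtraData        []byte // trailing TLV extensions
    }

    // Before: the reply embeds the query. Once both carry their own
    // ExtraData field, decoding would try to consume the trailing TLV
    // stream for the embedded message and again for the outer one.
    type replyChannelRangeBefore struct {
    	QueryChannelRange
    	Complete  uint8
    	ExtraData []byte
    }

    // After: the shared fields are declared directly, so there is exactly
    // one place the TLV extensions are decoded into.
    type replyChannelRangeAfter struct {
    	ChainHash        [32]byte
    	FirstBlockHeight uint32
    	NumBlocks        uint32
    	Complete         uint8
    	ExtraData        []byte
    }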
The recently added gossip throttling was shown to be too aggressive,
especially with our automatic channel enable/disable signaling. We
switch to a token bucket based system instead, as it's based on time
rather than block height, which doesn't advance at a constant rate.
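A token bucket keyed on time can be expressed with
golang.org/x/time/rate; a rough sketch with illustrative rate and burst
values (not lnd's defaults):

    package main

    import (
    	"fmt"
    	"time"

    	"golang.org/x/time/rate"
    )

    func main() {
    	// Allow one gossip message per 5 seconds on average with a burst
    	// of 10, replenishing with wall-clock time rather than block
    	// height.
    	limiter := rate.NewLimiter(rate.Every(5*time.Second), 10)

    	if limiter.Allow() {
    		fmt.Println("relay the update")
    	} else {
    		fmt.Println("drop or defer the update")
    	}
    }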
Since the batch interval can potentially be long, adding local updates
to the graph could be slow. This would slow down operations like adding
our own channel update and announcements during the funding process, and
updating edge policies for local channels.
Now we instead check whether the update is remote or not, and only use
the SchedulerOption to lazily add remote updates to the graph.
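A condensed sketch of that dispatch decision, using simplified
stand-ins for the batch scheduler's option types rather than lnd's
actual batch package:

    // SchedulerOption and LazyAdd are simplified stand-ins; edgeUpdate and
    // applyUpdate are placeholders for illustration only.
    type schedulerOpts struct{ lazy bool }

    type SchedulerOption func(*schedulerOpts)

    // LazyAdd marks a request as able to tolerate the full batch
    // interval's latency before being committed.
    func LazyAdd() SchedulerOption {
    	return func(o *schedulerOpts) { o.lazy = true }
    }

    type edgeUpdate struct{}

    func applyUpdate(u edgeUpdate, opts ...SchedulerOption) error { return nil }

    // addEdgeUpdate commits local updates immediately, but lets remote
    // updates be batched lazily since no local operation is blocked on them.
    func addEdgeUpdate(u edgeUpdate, isRemote bool) error {
    	var opts []SchedulerOption
    	if isRemote {
    		opts = append(opts, LazyAdd())
    	}
    	return applyUpdate(u, opts...)
    }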
We do this instead of using the source of the AnnounceSignatures
message, as we filter out the source when broadcasting any
announcements, leading to the remote node not receiving our channel
update. Note that this is done more for the sake of correctness and to
address a flake within the integration tests, as channel updates are
sent directly and reliably to channel counterparts.
Currently when numgraphsyncpeers=0, lnd will still attempt to perform
an initial historical sync. We change this behavior here to forgo
historical sync entirely when numgraphsyncpeers is zero, since the
routing table isn't being updated anyway while the node is active.
This permits a no-graph lnd mode where no syncing occurs at all.
AFAICT it's not possible to flip back from being synced_to_chain, so we
remove the underlying call that could reflect this. The method is moved
into the test file since it's still used to test correctness of other
portions of the flow.
Rather than performing this call in the SyncManager, we give each
gossipSyncer the ability to mark the first sync completed. This permits
pinned syncers to contribute towards the rpc-level synced_to_graph
value, allowing the value to be true after the first pinned syncer or
regular syncer completes. Unlike regular syncers, pinned syncers can
proceed in parallel, possibly decreasing the waiting time if consumers
rely on this field before proceeding to load their application.
A pinned syncer is an ActiveSyncer that is configured to always remain
active for the lifetime of the connection. Pinned syncers do not count
towards the total NumActiveSyncer count, the set of active syncers that
is rotated periodically. This feature allows nodes to more tightly
synchronize their routing tables by ensuring they are always receiving
gossip from a distinguished subset of peers.
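For example, a node could pin syncers to specific peers at startup;
the option name below (gossip.pinned-syncers) and its command-line form
are assumptions about how the setting is exposed, with placeholder
public keys:

    lnd --gossip.pinned-syncers=<hex_node_pubkey_1> \
        --gossip.pinned-syncers=<hex_node_pubkey_2>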
As similarly done with premature channel announcements, we'll no longer
allow premature channel updates to be rebroadcast once mature. This is
no longer necessary as channel announcements that we're not aware of are
usually broadcast to us with their accompanying channel updates.
In this commit, we add a new option to toggle gossip rate limiting. This
new option can be useful in contexts that require near-instant
propagation of gossip messages, such as integration tests.
This change was largely motivated by an increase in disk usage as a
result of channel update spam. With an in-memory graph, this would've
gone mostly undetected except for the increased bandwidth usage, which
this change doesn't aim to solve yet. To minimize the effect on disks,
we begin to rate limit channel updates in two ways. Keep-alive updates,
those which only increase their timestamps to signal liveness, are now
limited to one per lnd's rebroadcast interval (current default of 24H).
Non-keep-alive updates are now limited to one per block per direction.
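A condensed sketch of the two limits described above; the structure and
parameter names are illustrative, not the gossiper's actual code:

    // shouldRateLimit reports whether an incoming channel update should be
    // dropped. Keep-alive updates (ones that only bump the timestamp) are
    // limited to one per rebroadcast interval; other updates are limited
    // to one per block per direction. Timestamps are unix seconds.
    func shouldRateLimit(prevTimestamp, newTimestamp uint32, keepAlive bool,
    	lastUpdateHeight, bestHeight uint32,
    	rebroadcastIntervalSecs uint32) bool {

    	if keepAlive {
    		// Too soon since the last keep-alive for this direction.
    		return newTimestamp-prevTimestamp < rebroadcastIntervalSecs
    	}

    	// We already accepted a non-keep-alive update for this direction
    	// in the current block.
    	return lastUpdateHeight >= bestHeight
    }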
This allows for 1,000 different validation operations to proceed
concurrently. Now that we are batching operations at the db level, the
average number of outstanding requests will be higher since the commit
latency has increased. To compensate, we allow for more outstanding
requests to keep the gossiper busy while batches are constructed.
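The cap acts like a semaphore on validation work; a minimal sketch
using a buffered channel, where only the 1,000 figure comes from the
text above:

    // validationSemaphore bounds the number of in-flight validation
    // operations to 1,000.
    var validationSemaphore = make(chan struct{}, 1000)

    // runValidation blocks until a slot is free, then runs the job.
    func runValidation(job func()) {
    	validationSemaphore <- struct{}{}        // acquire a slot
    	defer func() { <-validationSemaphore }() // release it when done
    	job()
    }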
Similarly to kvdb.View, this commit adds a reset closure to the
kvdb.Update call in order to be able to reset external state if the
underlying db backend needs to retry the transaction.
This commit adds a reset() closure to the kvdb.View function which will
be called before each retry (including the first) of the view
transaction. The reset() closure can be used to reset external state
(e.g. slices or maps) where the view closure puts intermediate results.
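A usage sketch of the reset closure; the forEachChannel helper is
hypothetical and the exact kvdb signatures should be treated as
illustrative:

    var chanIDs []uint64

    err := kvdb.View(db, func(tx kvdb.RTx) error {
    	// Accumulate intermediate results in the captured slice. If the
    	// backend retries the transaction, this closure runs again.
    	return forEachChannel(tx, func(id uint64) error {
    		chanIDs = append(chanIDs, id)
    		return nil
    	})
    }, func() {
    	// reset(): invoked before every attempt, including the first, so
    	// a retried transaction starts from a clean slate instead of
    	// appending duplicates to chanIDs.
    	chanIDs = nil
    })
    if err != nil {
    	return err
    }

kvdb.Update takes the same trailing reset closure, so read-write
transactions can clear their external state before each retry as well.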