In this commit, we add strict zombie pruning as a config level param.
This allow us to add the option for those that want a tighter graph, and
not change the default composition of the channel graph for most users
over night.
In addition, we expand the test case slightly by testing that the self
node won't be pruned, but also that if there's a node with only a single
known stale edge, then both variants will prune that edge.
This change was largely motivated by an increase in high disk usage as a
result of channel update spam. With an in memory graph, this would've
gone mostly undetected except for the increased bandwidth usage, which
this doesn't aim to solve yet. To minimize the effects to disks, we
begin to rate limit channel updates in two ways. Keep alive updates,
those which only increase their timestamps to signal liveliness, are now
limited to one per lnd's rebroadcast interval (current default of 24H).
Non keep alive updates are now limited to one per block per direction.
Similarly as with kvdb.View this commits adds a reset closure to the
kvdb.Update call in order to be able to reset external state if the
underlying db backend needs to retry the transaction.
This commit adds a reset() closure to the kvdb.View function which will
be called before each retry (including the first) of the view
transaction. The reset() closure can be used to reset external state
(eg slices or maps) where the view closure puts intermediate results.
The explicit `bbolt` dep is gone, as we depend on `kvdb`, which is
actually `walletdb`, which has its own module that defines the proper
`bbolt` version.
In this commit, we migrate all the code in `channeldb` to only reference
the new `kvdb` package rather than `bbolt` directly.
In many instances, we need to add two version to fetch a bucket as both
read and write when needed. As an example, we add a new
`fetchChanBucketRw` function. This function is identical to
`fetchChanBucket`, but it will be used to fetch the main channel bucket
for all _write_ transactions. We need a new method as you can pass a
write transaction where a read is accepted, but not the other way around
due to the stronger typing of the new `kvdb` package.
Previously we would return nil features when we didn't have a node
announcement or a given node. With this change, we can always assume the
feature vector is populated during pathfinding.
This commit adds an index bucket, disabledEdgePolicyBucket, for those
ChannelEdgePolicy with disabled bit on.
The main purpose is to be able to iterate over these fast when prune is
needed without the need for iterating the whole graph.
The entry points for accessing this index are:
1. When updating ChannelEdgePolicy - insert an entry.
2. When deleting ChannelEdge - delete the associated entries.
3. When querying for disabled channels - implemented DisabledChannelIDs
function
This commit modifies the nodeWithDist struct to use a route.Vertex
instead of a *channeldb.LightningNode. This change, coupled with
the new ForEachNodeChannel function, allows the findPath Djikstra's
algorithm to cut down on database lookups since we no longer need
to call the FetchOtherNode function.
In this commit, we refactor DeleteChannelEdge to use ChannelIDs rather
than ChannelPoints. We do this as the only use of DeleteChannelEdge is
when we are pruning zombie channels from our graph. When running under a
light client, we are unable to obtain the ChannelPoint of each edge due
to the expensive operations required to do so. As a stop-gap, we'll
resort towards using an edge's ChannelID instead, which is already
gossiped between nodes.
This commit removes the MarkEdgeZombie method from channeldb. This
method is currently not used in any live code paths in production, and
is only used in unit tests. However, incorrect usage of this method
could result in an edge being present in both the zombie and channel
indexes, which deviates from any state we would expect to see in
production. Removing the method will help mitigate the potential for
writing incorrect unit tests in the future, by forcing zombie edges to
be created via the relevant, production APIs, e.g. DeleteChannelEdge.
The existing unit tests that use this method have been modified to use
the DeleteChannelEdge instead. No regressions were discovered in the
process.
This commit modifies FetchChanInfos to skip any channels that are not in
the graph at the time of the call. Currently the entire call will fail
if the edge is not found, which stalls a gossip sync in the following
scenario:
1. Remote peer queries for a channel range
2. We return the set of channel ids in that range
3. A channel from that set is removed from the graph, e.g. via close.
4. Remote peer queries for removed edge, causing the query to fail.
To remedy this, we will now skip any edges that are not known in the
database at the time of the query. This prevents the syncer state
machines from halting, which otherwise could only be resolved by
disconnecting and reconnecting.
This commit modifies FilterKnownChanIDs to skip edges that
we ourselves have deemed zombies. This prevents us from requesting
the updates from them, as this wastes bandwidth and cpu cycles.
In this commit, we extend the graph's FetchChannelEdgesByID and
HasChannelEdge methods to also check the zombie index whenever the edge
to be looked up doesn't exist within the edge index. We do this to
signal to callers that the edge is known, but only as a zombie, and the
only information that we have about the edge are the node public keys of
the two parties involved in the edge.
In the event that an edge does exist within the zombie index, we make
an additional check on edge policies to ensure they are not within the
router's pruning window, indicating that it is a fresh update.
We mark the edges as zombies when pruning them to ensure we don't
attempt to reprocess them later on. This also applies to channels that
have been removed from the graph due to being stale.
In this commit, we add a zombie edge index to the database. This allows
us to quickly determine across restarts whether we're attempting to
process an edge we've previously deemed as zombie.
In this commit, we convert all the `Update` calls which are serial, to
use `Batch` calls which are optimistically batched together for
concurrent writers. This should increase performance slightly during the
initial graph sync, and also updates at tip as we can coalesce more of
these individual transactions into a single transaction.