In this commit, we introduce a new subsystem for the gossiper: the
SyncManager. This subsystem is a major overhaul on the way the daemon
performs the graph query sync state machine with peers.
Along with this subsystem, we also introduce the concept of an active
syncer. An active syncer is simply a GossipSyncer currently operating
under an ActiveSync sync type. Before this commit, all GossipSyncer's
would act as active syncers, which means that we were receiving new
graph updates from all of them. This isn't necessary, as it greatly
increases bandwidth usage as the network grows. The SyncManager changes
this by requiring a specific number of active syncers. Once we reach
this specified number, any future peers will have a GossipSyncer with a
PassiveSync sync type.
It is responsible for three main things:
1. Choosing different peers randomly to receive graph updates from to
ensure we don't only receive them from the same set of peers.
2. Choosing different peers to force a historical sync with to ensure we
have as much of the public network as possible. The first syncer
registered with the manager will also attempt a historical sync.
3. Managing an in-order queue of active syncers where the next cannot be
started until the current one has completed its state machine to ensure
they don't overlap and request the same set of channels, which
significantly reduces bandwidth usage and addresses a number of issues.
In this commit, we introduce another feature to the GossipSyncer in
which it can deliver a signal to an external caller once it reaches its
terminal chansSynced state. This is yet to be used, but will serve
useful with a round-robin sync mechanism, where we wait for to finish
syncing with a specific peer before moving on to the next.
In this commit, we introduce the ability for gossip syncers to perform
historical syncs. This allows us to reconcile any channels we're missing
that the remote peer has starting from the genesis block of the chain.
This commit serves as a prerequisite to the SyncManager, introduced in a
later commit, where we'll be able to make spot checks by performing
historical syncs with peers to ensure we have as much of the graph as
possible.
In this commit, we introduce the ability for GossipSyncer's to
transition their sync type. This allows us to be more flexible with our
gossip syncers, as we can now prevent them from receiving new graph
updates at any time. It's now possible to transition between the
different sync types, as long as the GossipSyncer has reached its
terminal chansSynced sync state. Certain transitions require some
additional wire messages to be sent, like in the case of an ActiveSync
GossipSyncer transitioning to a PassiveSync type.
With the introduction of the gossip sync manager in a later commit,
retrieving the backlog of updates within the last hour is no longer
necessary as we'll be forcing full syncs periodically.
In this commit, we introduce a new type: SyncerType. This type denotes
the type of sync a GossipSyncer is currently under. We only introduce
the two possible entry states, ActiveSync and PassiveSync. An ActiveSync
GossipSyncer will exchange channels with the remote peer and receive new
graph updates from them, while a PassiveSync GossipSyncer will not and
will only response to the remote peer's queries.
This commit does not modify the behavior and is only meant to be a
refactor.
In this commit, we modify the set of default signals we attempt to catch in
order to execute a graceful shutdown. Before this commit, we would attempt
to catch/register for `SIGSTOP`. There're two issues with this
1. `SIGSTOP` is meant to be used in combination with `SIGCONCT` to allow a
process to be paused/resumed. Therefore, our action of shutting down once
received was incorrect.
2. `SIGSTOP` doesn't exist on windows, so users aren't able to compile on
this platform without modifying the codebase.
This commit fixes a bug that would cause the
notifier not to commit spend hints for items
that are not found. This is done by calling
UpdateSpendDetails with a nil detail, permitting
the notifier to begin updating the spend hints
with new blocks that arrive at tip. The change
is designed to mimic the behavior for historical
confirmation dispatch.
The symptom of this bug is needing to do many
long rescans on startup, even if new blocks
arrive after the rescan had completed. With
this change, nodes will have to do the scans
once more before their hints will be properly
updated. Restarts from then on should not
have this behavior.
This commit introduces the Validator interface, which
is intended to be implemented by any sub configs. It
specifies a Validate() error method that should fail
if a sub configuration contains any invalid or insane
parameters.
In addition, a package-level Validate method can be
used to check a variadic number of sub configs
implementing the Validator interface. This allows the
primary config struct to be extended via targeted
and/or specialized sub configs, and validate all of
them in sequence without bloating the main package
with the actual validation logic.
In this commit, we add 4 new itests for exercising the SCB restore
process via 4 primary scenarios: recover from backup using RPC, recover
from file using RPC, recover channels during init/creation, recover
channels during unlock. With all fields populated there're a total of 24
new scenarios to cover. At the time of authoring of this commit, the
other scenarios (bits are: initiator, updates, private) have been left
out for now, as they increased the run time of the integration tests
significantly.
In this commit, we modify the core testDataLossProtection test to
extract the primary DLP assertion logic into a new function. We do this,
as the upcoming SCB tests will fallback to this test after some initial
set up.
In this commit, we update all uses of the `getChanPointFundingTxid` to
match the new function signature. We no longer need to convert to a
chainhash.Hash, as the method does so underneath now.
In this commit, we modify the `RestoreNodeWithSeed` and `RestartNode`
methods to also accept an SCB. This will be useful in new integration
tests to properly exercise the various restore/restart scenarios using
static channel backups.
During the restore process, it may be possible that we have already
heard about our prior edge from a node on the network (or our channel
peers). As a result, we shouldn't exit if this happens, and instead
should continue with the rest of the restoration process.
In this commit, we convert the server's Start/Stop methods to use the
sync.Once. We do this in order to fix concurrency issues that would
allow certain queries to be sent to the server before it has actually
fully start up. Before this commit, we would set started to 1 at the
very top of the method, allowing certain queries to pass before the rest
of the daemon was had started up.
In order to fix this issue, we've converted the server to using a
sync.Once, and two new atomic variables for clients to query to see if
the server has fully started up, or is in the process of stopping.