In some rare instances it can happen that the nodes don't find each
other again after one of them has been re-created and the other one has
been restarted in the SCB tests. By making sure the re-created has the
same P2P port again as before, we make sure they can connect to each
other again successfully for executing DLP.
Since there is a lot of connecting and disconnecting between nodes in
the channel backup tests, we try to speed up that process by lowering
the min backoff from 1 second to 50 milliseconds. We also make sure we
never wait more than 1 second if it does take multple attempts. This
should sum up and hopefully speed up our tests a bit.
The Windows virtual machine that Travis runs the integration tests on
seems to be slower than the other machines. We try to increase the
stability of the tests by cutting the number of parallel running suites
in half. This will come at the cost of longer execution time but
hopefully with a better stability in return.
It seems #5246 introduced a subtle bug that lead to the error "out of
order block: expecting height=1, got height=XXX" some times during
startup. Apparently it can happen that during pruning of the graph tip
some blocks can come in before we start our chain view and the new block
subscription. By querying the chain backend for the best height before
syncing with the graph we ensure that we never miss a block.
The router has a lot of work to do for each block. So it might be
possible that it isn't yet up to date with the most recent block,
even if the wallet is. This can happen in environments with high CPU
load (such as parallel itests). Since the `synced_to_chain` flag in
the response of this call is used by many wallets (and also our
itests) to make sure everything's up to date, we add the router's
state to it. So the flag will only toggle to true once the router was
also able to catch up.
The router subsystem has its own goroutine that receives chain updates
and then does its (quite time consuming) work on each new block. To make
it possible to find out what block the router currently is synced to, we
export its internal best height through a new method.
closed
This commit makes the handoff procedure between the breachabiter and
chainwatcher use a function closure to mark the channel pending closed
in the DB. Doing it this way we know that the channel has been markd
pending closed in the DB when ProcessACK returns.
The reason we do this is that we really need a "two-way ACK" to have the
breacharbiter know it can go on with the breach handling. Earlier it
would just send the ACK on the channel and continue. This lead to a race
where breach handling could finish before the chain watcher had marked
the channel pending closed in the database, which again lead to the
breacharbiter failing to mark the channel fully closed.
We saw this causing flakes during itests.
This commit adds a new "waiting to start" state which may be used to
query if we're still waiting to become the cluster leader. Once leader
we advance the state to "wallet not exist" or "wallet locked" given
wallet availablity.
This commit also changes the order of DB init to be run after the RPC
server is up. This will allow us to later add an RPC endpoint to be used
to query leadership status.