In #5364 we added a new error path in the `StopDaemon` method to return
an error if shutdown was attempted while a rescan/recover instance was
in progress. Since the wallet actually won't fully stop (atm)
mid-recovery, the call effectively didn't do anything in that scenario,
so we started to return an error to properly reflect that. However this
causes certain itests to fail, as during recovery, the stop attempt will
fail leading to the test itself failing.
In this commit, we wrap the calls to stop a running daemon within a
`wait.NoError` call so we'll continually try to shut down the daemon
rather than quit on the first try.
Fixes#5423.
In some rare instances it can happen that the nodes don't find each
other again after one of them has been re-created and the other one has
been restarted in the SCB tests. By making sure the re-created has the
same P2P port again as before, we make sure they can connect to each
other again successfully for executing DLP.
This commit adds a new "waiting to start" state which may be used to
query if we're still waiting to become the cluster leader. Once leader
we advance the state to "wallet not exist" or "wallet locked" given
wallet availablity.
This commit replaces most of the hard coded 10, 15, 20 and 30 second
timeouts with the default timeout. This should allow darwin users to
successfully run the parallel itests locally as well.
With the new btcd version we can specify our own listen address
generator function for any btcd nodes. This should reduce flakiness as
the previous way of getting a free port was based on just picking a
random number which lead to conflicts.
We also double the default values for connection retries and timeouts,
effectively waiting up to 4 seconds in total now.
In some tests we moved channeld.db to a temp location in order to
"time travel". This commit extends the existing semantics by moving all
files, including embedded etcd db too besides the channeld.db file.
This commit adds the icase name to the log filename, to make it simpler
to find problematic tests. Additionally after this commit we'll restart
Alice and Bob (the base harness nodes) before each icase to start with a
clean state.
To allow running multiple test tranches in parallel, we need a way to
make sure the TCP ports don't collide. We'll work with offsets for the
ports, using a different offset for each tranche.
This changes the wait during node connection to check both for the
existance as well as for the validity of the tls cert and macaroon
files.
This ensures that nodes in the process of starting up don't inadvertedly
cause a connection error due to not yet having written the entire file.
During the channel_backup_restore/restore_during_unlock itest, the node
is restored from seed and immediately restarted. Depending on specific
timing of the machine, the test harness might not have had the graph
subscription processed before the node shuts down, causing the harness
to trigger a panic.
Reducing this to a synchronous subscription attempt means node
initialization necessarily waits until the subscription is done before
attempting to restart, reducing flakiness and ensuring correct behavior.
This forces the Dial attempt to succeed or fail before proceeding with
node setup.
We also log on the node a failure to establish the graph subscription
before panicking so that we can more easily find issues.
This change makes sure that all macaroons are stored in the same
folder. This makes it possible to use the lntest package in external
projects that use loop's lndclient library which currently assumes
that the admin macaroon and subserver macaroons are in the same sub
folder of lnd's data directory.
Integration tests in external projects might not have the same folder
structure as lnd does. Therefore we want to allow the path to the
lnd itest binary to be configurable.
When using the lntest package for itests in external projects, it
is necessary to access a harness node's configuration, for example
to get its data directory on disk. This commit exports that
configuration.
This changes the HarnessNode structure to hold onto the client grpc
connection made during startup so that it can close it during shutdown.
This is needed because the grpc.Dial function spins a new goroutine that
attempts to maintain an open connection to the target endpoint and
without calling Close() in the connection while shutting down the node
we leak this goroutine to the rest of the tests.
This changes TCP port selection in integration tests from being
sequential, based on the node ID to being sequential but tested before
assigment.
This should reduce the number of flaky tests that fail due to the port
already being used by another process in the CI server.