Follow up labelling of external transactions with labels for the
transaction types we create within lnd. Since these labels will live
a life of string matching, a version number and rigid format is added
so that string matching is less painful. We start out with channel ID,
where available, and a transaction "type". External labels, added in a
previous PR, are not updated to this new versioned label because they
are not lnd-initiated transactions. Label matching can check this case,
then check for a version number.
This reduces the flakiness of the CPFP test by asserting the wallet has
seen the unspent output before attempting to perform the walletkit's
BumpFee method.
Previously the attempt to bump the fee of the target transaction could
be made before the wallet had had a chance to fully process the
transaction, causing a flaky error.
This switches a few call sites that used a different timeout when
openening channels to the correct openChannelTimeout, which better deal
with flakes in the CI.
This replaces an outstanding sleep for a check for a specific state
during the test for watchtower use: specifically, that the backup has
been sent to the watchtower prior to shutting down Dave.
This reduces flakiness in the test that could occur if the Dave shutdown
without the backup being comitted to the watchtower, causing the rest of
the test to fail.
This changes the wait during node connection to check both for the
existance as well as for the validity of the tls cert and macaroon
files.
This ensures that nodes in the process of starting up don't inadvertedly
cause a connection error due to not yet having written the entire file.
During the channel_backup_restore/restore_during_unlock itest, the node
is restored from seed and immediately restarted. Depending on specific
timing of the machine, the test harness might not have had the graph
subscription processed before the node shuts down, causing the harness
to trigger a panic.
Reducing this to a synchronous subscription attempt means node
initialization necessarily waits until the subscription is done before
attempting to restart, reducing flakiness and ensuring correct behavior.
This forces the Dial attempt to succeed or fail before proceeding with
node setup.
We also log on the node a failure to establish the graph subscription
before panicking so that we can more easily find issues.
This improves the error reporting for the harness' CloseChannel so that
the exact step where closure fails can be better indicated.
This is to help debug some flaky failures in the CI.
In this commit, we split off the protocol options into a normal and
legacy sub-config. The legacy sub-config protected by a built tag, and
will only be populated if thet tag is set. Legacy options now have a
`legacy` prefix. So `--protocol.legacyonion` is now `--protocol.onion`,
and `--protocol.committweak`, is now `--protocol.legacy.committweak`.
We also create a new experimental protocol feature sub-config for newer
features that may not yet been fully complete, so they require a build
tag.
This is useful when we wish to have a channel frozen for a specific
amount of blocks after its confirmation. This could also be done with an
absolute thaw height, but it does not suit cases where a strict block
delta needs to be enforced, as it's not possible to know for certain
when a channel will be included in the chain. To work around this, we
add a relative interpretation of the field, where if its value is below
500,000, then it's interpreted as a relative height. This approach
allows us to prevent further database modifications to account for a
relative thaw height.
In this commit we add the ability to intercept forwarded htlc packets
straight from the RPC layer. The RPC layer handles a bidrectional stream
that comminucates to the client the intercepted packets and handles its
response by coordinating with the interceptable switch.
As part of the preparation to the switch interceptor feature, this
function is changed to return error instead of error channel that
is closed automatically.
Returning an error channel has become complex to maintain and
implement when adding more asynchronous flows to the switch.
The change doesn't affect the current behavior which logs the
errors as before.
This commit extends the link with a new synchronous delivery point for
local UpdateAddHTLC messages. The switch method SendHTLC is updated to
use this delivery point and thereby becomes a synchronous call.
For MPP payments, synchronous hand-off is important. Otherwise the next
pathfinding round could start without the channel balance updated yet.
This fixes an issue where the contract court could leave a completely
swept commit tx unresolved if it was swept by the remote party.
This could happen if (our) commit tx just published was actually a
previously revoked state, in which case the remote party would claim the
funds via a justice transaction.
This manifested itself in the testRevokedCloseRetribution integration
test where at the end of the test Bob was left with a pending channel
that never resolved itself.
The message in the response stream changed. Rename the calls themselves,
to prevent older applications from getting decode errors. Especially
troublesome is the case where the request is executed (send payment),
but the application can't read the outcome (payment sent or not?)
This commit fixes the inconsistency between the payment state as
reported by routerrpc.SendPayment/routerrpc.TrackPayment and the main
rpc ListPayments call.
In addition to that, payment state changes are now sent out for every
state change. This opens the door to user interfaces giving more
feedback to the user about the payment process. This is especially
interesting for multi-part payments.
This commit removes the overflowQueue from the link. We do so in order
to promote better UX for senders, so that HTLCs are failed faster when
the commitment is full. This gives the sender the opportunity to try
another, more open path, rather than perceive the HTLC as being stuck.
At the same time, we remove the total number of active goroutines in lnd
by a factor of N where N is the number of active channels.
This is mainly motivated by a now fixed bug in the wallet in which
change addresses could at times be created outside of the default key
scopes. Recovery only used to be performed on the default key scopes, so
ideally this test case would've caught the bug earlier.
Move enum out of CloseSummary struct for more general use. This does
not change the encoding of the enum, and will only cause compile time
errors for existing clients. This enum has not been included in a
release yet, so we can make this move without much disruption.
In #4130, OpenChannel was changed to assert that the wallet is fully
synced before allowing a channel open. This introduced flakes on travis,
which are resolved here by using a wait predicate when calling
OpenChannel.
Note there is one existing call that was not converted, because it is
interested in the returned error. This call does not have a wait
predicate surrounding it, but this shouldn't cause a flake because other
channels are opened earlier in the test that will have already waited
for the wallet to sync up.
testSendToRouteMultiPath tests that we are able to successfully route a
payment using multiple shards across different paths, by using SendToRoute.
Co-authored-by: Joost Jager <joost.jager@gmail.com>
In preparation for MPP we return the terminal errors recorded with the
control tower. The reason is that we cannot return immediately when a
shard fails for MPP, since there might be more shards in flight that we
must wait for. For that reason we instead mark the payment failed in the
control tower, then return this error when we inspect the payment,
seeing it has been failed and there are no shards in flight.
These tests exercise the different ways of sweeping a commitment, so
we'll cover the modified scripts used for anchor commitments and
spending the anchor itself by both parties.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
Fixes a subtle bug where the outer scope predErr was hidden when the
return value of findForceClosedChannel was stored in a newly
defined variable with the same name.
Start anchor sweep attempts immediately after the commitment transaction
has been published. This makes the anchor known to the sweeper and
allows the user to bump the fee on it to get their commitment
transaction confirmed in case the fee committed too is insufficient for
timely confirmation.
In this commit, we extend the current SCB recovery tests to also cover
the new anchor commitment type. We only add a single test that covers
the most common case to avoid needing to tests all cases for all
commitment types which is being done in a follow up PR.
The synchronous call to get all channel backups also include
channels that are pending at the moment of the call. A previous
commit added pending channels to the file based backup as well. So
this is the last backup method that needs to be adjusted to also
contain unconfirmed channels.
Update channel updates and subscription itest to check that close
initiator is appropriately set for cooperative and force closes for the
local and remote party.
This commit adds PendingOpenChannel to SubscribeChannelEvents stream in
the gRPC API.
This is useful for keeping track of channel openings that Autopilot does.
It can also be used for the non-initator side of a channel opening to keep
track of channel openings.
To ensure lnd is able to pick up an on-chain preimage properly after a
restart, we suspend Alice and check that the payment is listed correctly
as succeeded after a restart.
This change makes sure that all macaroons are stored in the same
folder. This makes it possible to use the lntest package in external
projects that use loop's lndclient library which currently assumes
that the admin macaroon and subserver macaroons are in the same sub
folder of lnd's data directory.
Integration tests in external projects might not have the same folder
structure as lnd does. Therefore we want to allow the path to the
lnd itest binary to be configurable.
When using the lntest package for itests in external projects, it
is necessary to access a harness node's configuration, for example
to get its data directory on disk. This commit exports that
configuration.
This commit constructs a helper closure assertAmountSent that can be
reused by other functions. The closure returns an error so that it can
be used with wait.NoError or the new wait.InvariantNoError. The latter
is added since the predicate could otherwise pass immediately for the
sphinx_replay_persistence tests, but change shortly after. It also
rounds out the wait package so that we offer all combinations of
predicate and no-error style waits.
This changes the HarnessNode structure to hold onto the client grpc
connection made during startup so that it can close it during shutdown.
This is needed because the grpc.Dial function spins a new goroutine that
attempts to maintain an open connection to the target endpoint and
without calling Close() in the connection while shutting down the node
we leak this goroutine to the rest of the tests.
Since CSV locked outputs specifies the first block where they are
allowed to be included, they can actually be added one block earlier
into the mempool.
This led to a flake, where the sweep tx was already in the mempool at
the time we mined the last block, causing the next mempool check to
fail.
This commit adds an itest assertion to check that a coop closed
channel's status is properly refelcted in list channels. We also fix a
race condition that prevented the rpc from being externally consistent
by marking the close sooner in the pipeline.
This changes TCP port selection in integration tests from being
sequential, based on the node ID to being sequential but tested before
assigment.
This should reduce the number of flaky tests that fail due to the port
already being used by another process in the CI server.
Refresh channel memory state whenever the short channel id is refreshed.
This is to make the in-memory channel consistent with the disk data.
Fixes#3765.
This fixes an issue that would lead to a flake during intergration
tests. Carol would start up with a outdated state and attempt to force
close the channel. At the same time she would connect to Dave,
triggering the dataloss protection. Dave would respond by force closing
the channel, and Dave transaction would in some cases have a higher fee,
resulting Carol's tx being replaced.
We fix this by suspending Dave until Carol's close tx is mined.
In this commit, we update the `AbandonChannel` method to also remove the
state from the countract court as well as the channel graph. Abandoning
a channel is now a three step process: remove from the open channel
state, remove from the graph, remove from the contract court. Between
any step it's possible that the users restarts the process all over
again. As a result, each of the steps below are intended to be
idempotent.
We also update the integration test to assert that no channel is found
in the graph any longer. Before this commit, this test would fail as the
channel was still found in the graph, which can cause other issues for
an operational daemon.
Fixes#3716.
This commit beings the process of deprecating unsafe-disconnect. Many
moons ago this was disallowed to prevent concurency bugs surrounding
reconnect. Despite the name, it has been safe to enable this feature for
well over a year, as several PRs have been merged that addressed the
possible issues that existed when the feature was added.
In this commit, we refactor the testSingleHopSendToRoute test to support
table driven tests for various endpoints and payment types. Currently
only the main rpcserver's SendToRoute is tested, so we also add
support the SendToRouteSync and the routerrpc's SendToRoute.
The tests are also modified to have each endpoint perform a single-hop,
single-shot MPP payment. This asserts that the Hop messages are being
properly unmarshalled and that setting correctly yields a successful
payment. At the momemnt the receiver does not actually verify or use the
MPP fields presented in the onion, though this test will be expanded
later as those pieces are assembled.
We might hit a connection refused error in cases where the peer connects
to us exactly as we try to connect to it. We retry the connection within
a wait predicat, as it should be the case that the other peer
establishes the connection, and the two peers actually connects.
We add a wait predicate to make sure the node's on-chain balance is
restored before continuing the restore test case.
This is needed since the DLP test scenario includes several restarts of
the node, and if the node isn't done scanning for on-chain balance
before the restart happens, it would be unlocked without a recovery
window, causing funds to be left undiscovered.
Since the ErrorCodes are not part of the spec, they cannot be read by
other implementations.
Instead of only sending the error code we therefore send the complete
error message. This will have the same effect at the client, as it will
just get the full error instead of the code indicating which error it
is. It will also be compatible with other impls.
Note that the GRPC error codes will change, since we don't set them
anymore.
This changes the defer function in the test for channel backups to
correctly close over the 'dave' variable.
Without this closure, the shutdownAndAssert call would attempt to
shutdown the original (non-restored) dave instead of the most recently
created (restored) dave, causing a leak of a node during tests.