This commit adds a brief delay when sending our channel reestablish
message if the link contains a restored channel to ensure we first have
a stable connection. Sending the message will cause the remote peer to
force close the channel, which currently may not be resumed reliably if
the connection is being torn town simultaneously. This delay can be
removed after the force close is reliable, but in the meantime it
improves the reliability of successfully closing out the channel and
allows the `channel_backup_restore/restore_during_creation` to pass
reliably.
In this commit, we modify the starting link logic to always send the
chan sync message to the remote peer in a synchronous manner. Otherwise,
it's possible that we fail very quickly below this block, and don't ever
send the message to the remote peer.
The idea of the batch counter is to increase it for commit tx updates,
so that if the commit tx cannot be updated immediately (revocation
window exhausted), the batch ticker makes sure it happens later.
The batch counter was increased for forwarded htlcs, but not for exit hop
resolutions.
This lead to the situation where the commitment tx would not be updated,
even though the htlc was settled locally. When no other changes happen
on the channel, the htlc eventually reaches its expiry and the channel
is force closed.
This commits exposes the various parameters around going to chain and
accepting htlcs in a clear way.
In addition to this, it reverts those parameters to what they were
before the merge of commit d1076271456bdab1625ea6b52b93ca3e1bd9aed9.
This commit increase the expiry grace delta to a value above the
broadcast delta. This prevents htlcs from being accepted that would
immediately trigger a channel force close.
A correct delta is generated in server.go where there is access to
the broadcast delta and passed via the peer to the links.
Co-authored-by: Jim Posen <jim.posen@gmail.com>
Previously there was no minimum remaining blocks requirement on
forwarded htlcs, which may cause channel arbitrator to force
close the channel directly after forwarding the htlc.
Co-authored-by: Jim Posen <jim.posen@gmail.com>
This commit modifies the invoice registry to handle invoices for which
the preimage is not known yet (hodl invoices). In that case, the
resolution channel passed in from links and resolvers is stored until we
either learn the preimage or want to cancel the htlc.
This commit detaches signaling the invoice registry that an htlc was
locked in from the actually settling of the htlc.
It is a preparation for hodl invoices.
In further commits the behaviour of invoice registry becomes more
intrinsically connected to the link. This commit prepares for that by
allowing link and registry to be tested as a single unit.
In the TestChannelLinkMultiHopUnknownPaymentHash test, a preimage was
modified to trigger an unknown payment hash failure. The way the mock is
implemented, it would take the hash of that modified preimage and store
it. It basically stores a completely different invoice. For this test,
it is just as good to store no invoice at all.
Previously it could happen that an invoice was open at the time of the
LookupInvoice call, the htlc was settled because of that, but when the
SettleInvoice call was made eventually, it would fail because the
invoice was canceled in the mean time. The htlc would then be settled,
but the invoice not marked as such.
In this commit, we set a default max HTLC in the forwarding
policies of newly open channels.
The ForwardingPolicy's MaxHTLC field (added in this commit)
will later be used to decide whether an HTLC satisfies our policy before
forwarding it.
To ensure the ForwardingPolicy's MaxHTLC default matches the max HTLC
advertised in the ChannelUpdate sent out for this channel, we also add
a MaxPendingAmount() function to the lnwallet.Channel.
This commit makes use of the batched AddWitness
method of the WitnessCache, in order to avoid
performing one write for each accepted preimage.
Additionally, this fixes an existing hole in the
consistency guarantees since the batched writes
are now guaranteed to take place before accepting
the next CommitSig. Previously, these writes were
processed in an unsynchronized go routine that
could be delayed arbitrarily long before being
executed.
With this change, the async_payments_benchmarks
actually shows a slight improvement in
performance, presumably because we no longer do
an individual write per preimage, even though
the execution is now explicitly in the critical
path. There is likely also a marginal performance
improvement from the reduction in goroutine
overhead.
In this commit, we modify the WitnessCache's
AddPreimage method to accept a variadic number
of preimages. This enables callers to batch
preimage writes in performance critical areas
of the codebase, e.g. the htlcswitch.
Additionally, we lift the computation of the
witnesses' keys outside of the db transaction.
This saves us from having to do hashing inside
and blocking other callers, and limits extraneous
blocking at the call site.
This function will be used in the switch to retrieve the channel point for a link,
allowing the switch to retrieve individual channels from the database.
This commit is a step to split the lnwallet package. It puts the Input
interface and implementations in a separate package along with all their
dependencies from lnwallet.
In this commit, we deprecate the `IncorrectHtlcAmount` onion error.
We'll still decode this error to use when retrying paths, but we'll no
longer send this ourselves. The `UnknownPaymentHash` error has been
amended to also include the value of the payment as well. This allows us
to worry about one less error.
This commit modifies the behavior of the
HasActiveLink method within the switch to
only return true if the link is in the
link index and is eligible to forward
HTLCs.
The prior version returns true whenever
the link is found in the link index,
which may return true for pending
channels that are not actually active.
This commit adds conversion between the lnwire.UpdateFee message and the
new FeeUpdate PaymentDescriptor. We re-purpose the existing Amount field
in the PaymentDescriptor stuct to hold the feerate.
This commit is a preparation for the addition of new invoice
states. A database migration is not needed because we keep
the same field length and values.
In this commit, we fix a minor discrepancy with the spec. We should
return a FinalFailExpiryTooSoon error, rather than a
FinalFailIncorrectCltvExpiry error, when the last HTLC of a route (exit
hop) has an expiration height that is deemed too soon by the final
destination of the HTLC.
In this commit, we remove the per channel `sigPool` within the
`lnwallet.LightningChannel` struct. With this change, we ensure that as
the number of channels grows, the number of gouroutines idling in the
sigPool stays constant. It's the case that currently on the daemon, most
channels are likely inactive, with only a hand full actually
consistently carrying out channel updates. As a result, this change
should reduce the amount of idle CPU usage, as we have less active
goroutines in select loops.
In order to make this change, the `SigPool` itself has been publicly
exported such that outside callers can make a `SigPool` and pass it into
newly created channels. Since the sig pool now lives outside the
channel, we were also able to do away with the Stop() method on the
channel all together.
Finally, the server is the sub-system that is currently responsible for
managing the `SigPool` within lnd.
Per team decision all tests should run with "debug" flag.
Makefile updated accordingly.
A developer may still run the test from command line by "go test ...."
The commit protects against this and issue an error if needed.
In this commit we add a check to HtlcSatifiesPolicy to verify that the
time lock for the outgoing htlc that is requested in the onion packet
isn't too far in the future.
Without this check, anyone could force an unreasonably long time lock on
the forwarding node.
In this commit, we increase the fwdpkg gc interval
to avoid having it conflict with switch tests that
inspect forwarding packages. The current timeout is
a little too short on travis, and sporadically fails
TestChannelLinkCleanupSpuriousResponses, which was
added recently.
This commit moves the logic handling responses to
locally-initiated payments to be asynchronous. The
reordering of operations into handleLocalDispatch
brings a serious performance burden to the switch's
main event loop. However, the at-most once semantics
of circuit map and idempotency of cleanup methods
allows concurrent operations to run in parallel.
Prior to this commit, the async_payments_benchmark
would timeout due to the forcibly serial nature of
the prior design. With this change, there is no
perceptible difference in the benchmark OMM, even
though we've added two extra db calls.
Composes the new payment status helper methods such that
we only require one db txn per state transition. This
also allows us to remove the exclusive lock from the
control tower, and enable more concurrent requests.
In this commit, we address an issue that could arise when using the
SendToRoute RPC. In this RPC, we specify the exact hops that a payment
should take. However, within the switch, we would set a constraint for
the first hop to be any hop as long as the first peer was at the end of
it. This would cause discrepancies when attempting to use the RPC as the
payment would actually go through another hop with the same peer. We fix
this by explicitly specifying the channel ID of the first hop.
Fixes#1500.
Fixes#1515.
In this commit, we modify the readHandler w/in
the mock peer to drop messages if it is unable
to find the target link. This has led to observed
race conditions related to removing a link and still
attempting to deliver messages. By removing this,
the readHandler shouldn't fail the test as a result.
This commit increases the fwdpkg garbage collection
interval to 15s, to mitigate the likelihood of it
interfering with our unit tests related to fwdpkgs.
This commit removes the concept of "circuit deletion
forgivness" from the link. This was originally
implemented due to the strict semantics of the original
DeleteCircuit implementation, which would fail if we tried
to delete unknown circuits. Forgivness is used on startup
to ignore this error in case the circuits had already been
deleted before shutting down.
Now that the circuit deletion has been relaxed, this
behavior is no longer necessary, as requests to delete
unknown (or previously deleted) circuits will be ignored.
This is necessary for future changes regarding switch
cleanup, which may attempt to cleanup already deleted
circuits.
Previously, we would only allow deletion of circuits if all circuit keys
were found in the pending map.
In this commit, we relax this to allow for deletion of any circuits
that are found pending, and ignore those that are not found. This
is a preliminary step to cleaning up duplicate forwards that get caught
by the switch. It also allows us to gracefully handle any nodes that
are still afflicted by the split mailbox issue.
Replaces the log statement in CommitCircuits so that
it prints the circuit key of the incoming channel. This way
we avoid spewing the secp curve stored in the ErrorEncrypter.
In this commit, we thread through a link's quit channel into
routeAsync, the primary helper method allowing links to send
htlcPackets through the switch. This is intended to remove
deadlocks from happening, where the link is synchronously
blocking on forwarding packets to the switch, but also
needs to shutdown.
This commit adds a test that verifies Stop does not block
if the link is concurrently forwarding incoming Adds to
the switch. This test fails prior to the commits that
thread through the link's quit channel.
This commit modifies the default BatchTicker
implementation such that it will generate a
new ticker with each call to Start(). This
allows us to create a new ticker after
releasing an old one due to the batch
being empty.
In this commit, we prevent the htlcManager from
being woken up by the batchTicker when there is no
work to be done. Profiling has shown a significant
portion of CPU time idling, since the batch ticker
endlessly demands resources. We resolve this by only
selecting on the batch ticker when we have a
non-empty batch of downstream packets from the
switch.
This commit corrects our exit hop logic to return
FailFinalExpiryTooSoon if the following check is true:
pd.Timeout-expiryGraceDelta <= heightNow
Previously we returned FailFinalIncorrectCltvExpiry, which
should only be returned if the packet was misconstructed.
This commit replaces the debug Config struct with an empty
one, so that the command line flags are hidden in production
builds.
Production help before commit:
Tor:
--tor.active
--tor.socks=
--tor.dns=
--tor.streamisolation
--tor.control=
--tor.v2
--tor.v2privatekeypath=
--tor.v3
hodl:
--hodl.exit-settle
--hodl.add-incoming
--hodl.settle-incoming
--hodl.fail-incoming
--hodl.add-outgoing
--hodl.settle-outgoing
--hodl.fail-outgoing
--hodl.commit
--hodl.bogus-settle
Help Options:
-h, --help
Production help after commit:
Tor:
--tor.active
--tor.socks=
--tor.dns=
--tor.streamisolation
--tor.control=
--tor.v2
--tor.v2privatekeypath=
--tor.v3
Help Options:
-h, --help
In this commit, we modify the InvoiceDatabase slightly to allow the link
to record what the final payment about for an invoice was. It may be the
case that the invoice actually had no specified value, or that the payer
paid more than necessary. As a result, it's important that our on-disk
records properly reflect this.
To fix this issue, the SettleInvoice method now also accepts the final
amount paid.
Fixes#856.
This commit corrects the distribution used to
schedule a link's randomized backoff for fee
updates. Currently, our algorithm biases the
lowest value in the range, with probability
equal to lower/upper, or the ratio of the lower
bound to the upper. This distribution is skewed
more heavily as lower approaches upper.
The solution is to sample a random value in the
range upper-lower, then add this to our lower
bound. The effect is a uniformly distributed
timeout in [lower, upper).
In this commit, we modify the existing logic that would attempt to read
the min CLTV information from the invoice directly. With this route, we
avoid any sort of DB index modifications, as this information is already
stored within the payment request, which is already available to the
outside callers. By modifying the InvoiceDatabase interface, we avoid
having to make the switch aware of what the "primary" chain is.
This commit fixes a bug in the
TestDecayedLogPersistentGarbageCollector unit test.
The test generates a second hash prefix, which is never
added to the log, and used to query for the final
existence check. This commit reverts the behavior so
that the same hash prefix is used throughout the test.
In this commit, we fix a lingering bug within the link when we're the
exit node for a particular payment. Before this commit, we would assert
that the invoice gives us enough of a delta based on our current routing
policy. However, if the invoice was generated with a lower delta, or
we've changed from the default routing policy, then this would case us
to fail back any payments sent to us.
We fix this by instead using the newly available final CLTV delta
information within the extracted invoice.
Fixes#1431.
In this commit, we extract the time lock policy verification logic from
the processRemoteAdds method to the HtlcSatifiesPolicy method. With this
change, we fix a lingering bug within the link: we'll no longer verify
time lock polices within the incoming link, instead we'll verify it at
forwarding time like we should. This is a bug left over from the switch
of what the CLTV delta denotes in the channel update message we made
within the spec sometime last year.
In this commit, we extend the existing HtlcSatifiesPolicy method to also
accept timelock and height information. This is required as an upcoming
commit will fix an existing bug in the forwarding logic wherein we use
the time lock policies of the incoming node rather than that of the
outgoing node.
In this commit, we fix a bug in the generateHops helper function. Before
this commit, it erroneously used the CLTV delta of the current hop,
rather than that of the prior hop when computing the payload. This was
incorrect, as when computing the timelock for the incoming hop, we need
to factor in the CTLV delta of the outgoing lock, not the incoming lock.
In this commit, we add a new test to the switch:
TestForwardingAsymmetricTimeLockPolicies. This test ensures that a link
has two channels, one of which has a greater CLTV delta than the latter,
that a payment will successfully be routed across the channels. Atm, the
test fails (including the fix to hop payload generation included in the
next commit).
Atm, due to the way that we check forwarding policies, we'll reject this
payment as we're attempting to enforce the policy of the incoming link
(cltv delta of 7), instead of that of the outgoing link (cltv delta of
6). As a result, atm, the incoming link checks if (incoming_timeout -
delta < outgoing_timeout). For the values in the test case: 112 - 7 <
106 -> 105 < 106, this check fails. The payload is proper, but the check
itself should be applied at the outgoing hop.
In this commit, we modify the removeLink method to be more asynchronous.
Before this commit, we would attempt to block until the peer exits.
However, it may be the case that at times time, then target link is
attempting to forward a batch of packets to the switch (forwardBatch).
Atm, this method doesn't pass in an external context/quit, so we can't
cancel this message/request. As a result, we'll now ensure that
`removeLink` doesn't block, so we can resume the switch's main loop as
soon as possible.
In this commit, we move the block height dependency from the links in
the switch to the switch itself. This is possible due to a recent change
on the links no longer depending on the block height to update their
commitment fees.
We'll now only have the switch be alerted of new blocks coming in and
links will retrieve the height from it atomically.
In this commit, we modify the behavior of links updating their
commitment fees. Rather than attempting to update the commitment fee for
each link every time a new block comes in, we'll use a timer with a
random interval between 10 and 60 minutes for each link to determine
when to update their corresponding commitment fee. This prevents us from
oscillating the fee rate for our various commitment transactions.
In this commit, we fix a bug in the way we handle removing items from
the interfaceIndex. Before this commit, we would delete all items items
with the target public key that of the peer that owns the link being
removed. However, this is incorrect as the peer may have other links
sill active.
In this commit, we fix this by first only deleting the link from the
peer's index, and then checking to see if the index is empty after this
deletion. Only if so do we delete the index for the peer all together.
In this commit, we modify the interfaceIndex to no longer key the second
level of the index by the ChannelLink. Instead, we'll use the chan ID as
it's a stable identifier, unlike a reference to an interface.
In this commit, we ensure that the packet queue will always exit, by
continually signalling the main goroutine until it atomically sets a
bool that indicates its has been fully shutdown. It has been observed
that at times the main goroutine will wake up (due to the signal), but
then bypass the select and actually miss the quit signal, as a result
another signal is required. We'll continue to signals in a lazy loop
until the goroutine has fully exited.
This commit removes a possible deadlock in the switch,
which can be triggered under certain failure conditions.
Previously, we would acquire the link's read lock for
the duration of HtlcSatisfiesPolicy, though we only
need to use it grab the current policy. The deadlock could
be caused in the cases where we attempt to log the failure,
which access the read-protected ShortChanID method.
In this commit, we add and enforce a min fee rate for commitment
transactions created, and also any updates we propose to the remote
party. It's important to note that this is only a temporary patch, as
nodes can dynamically raise their min fee rate whenever their mempool is
saturated.
Fixes#1330.
This commit changes the decayed log tests to create
a new temporary database for each test. Previously, all
instances referenced the same db path. Since the tests
are run in parallel, the tests would create/delete the
shared db out from under each other, causing flakes in
the unit tests.
Adds a new closure OnChannelFailure to the link config, which is called
when the link fails. This function closure should use the given
LinkFailureError to properly force close the channel, send an error to
the peer, and disconnect the peer.
This commit introduces a new error type LinkFailureError which is used
to distinguish the different kinds of errors that we can encounter
during link operation. It encapsulates the information necessary to
decide how we should handle the error.
In this commit, we fix an existing source of a panic, that could at
times lead to a deadlock. If the circuit returned from closeCircuit
didn't have an outgoing key (as it was an incomplete forward), then we
would attempt to de-ref a nil pointer. This would trigger a panic, and
the runtime would start to unwind the stack, and execute each defer in
line. A deadlock can arise here, as in the defer at the root goroutine,
we need to grab the fwdingEventMtx. However, we already have it at the
panic site.
We fix this issue by ensuring we only attempt to add the event if it's a
_settle_ and also actually has an outgoing circuit (which it should
already, just a defensive check).
This commit adds a test where we trigger a situation which would
previously make the link think it was never in sync, and potentially
create a lot of empty state updates. This would happen if we were
waiting for a revocation, while still receiving updates from the remote.
Since in this case we could not ACK the updates because of the exhausted
revocation window, our local commitchain would extend, while the remote
chain would stall. When we finally got the revocation the local
commitment height would be far larger than the remote, and FullySynced
would return false from that point on.
This commit adds a new test that makes sure we don't try to send
commitments for states where there are now new updates. Before the
recent change to FullySynced we would do this in this test scenario, as
the local an remote commitment heights would differ.
The test makes the local commitment chain extend by 1 vs the remote,
which would earlier trigger another state update, and checks taht no
such state update is made.
This commit makes the call to forwardBatch after locking
in Adds synchronous. This ensures that circuits for any Add
packets are added to the switch in the same order that they
are prescribed in the channel state. Though it is very unlikely
this case would arise, it may happen under more greater loads.
In addition, this also makes some trivial optimizations wrt. to
not spawning unnecessary goroutines if no settle/fail packets
are locked in.
In this commit, we ensure that any time we send a TempChannelFailure
that's destined for a multi-hop source sender, then we'll always package
the latest channel update along with it.
In this commit, we fix a race in the set of TestChannelLinkTrimCircuits*
tests. Before this commit, we would trim the circuits in the htlcManager
goroutine. However, this was problematic as the scheduling order of
goroutines isn't predictable. Instead, we'll now trim the circuits in
the Start method.
Additionally, we fix a series of off-by-2 bugs in the tests themselves.
This commit inserts an initial set of HodlFlags into
their correct places within the switch. In lieu of the
existing HtlcHodl mode, it is been replaced with a
configurable HodlMask, which is a bitvector representing
the desired breakpoints. This will allow for fine grained
testing of the switch's internals, since we can create
arbitrary delays inside a otherwise asynchronous system.
In this commit, we update the testUpdateChannelPolicy to exercise the
recent set of changes within the switch. If one applies this test to a
fresh branch (without those new changes) it should fail. This is due to
the fact that before, Bob would attempt to apply the constraints of the
incoming link (which we updated) instead of the outgoing link. With the
recent set of changes, the test now properly passes.
In this commit, we fix the TestUpdateForwardingPolicy test case after
the recent modification in the way we handling validating constraints
within the link. After the recent set of changes, Bob will properly use
his outgoing link to validate the set of fee related constraints rather
than the incoming link. As a result, we need to modify the second
channel link, not the first for the test to still be applicable.
In this commit, we simplify the switch's code a bit. Rather than having
a set of channels we use to mutate or query for the set of current
links, we'll instead now just use a mutex to guard a set of link
indexes. This serves to simplify the ode, and also make it such that we
don't need to block forwarding in order to add/remove a link.
In this commit, we fix a very old, lingering bug within the link. When
accepting an HTLC we are meant to validate the fee against the
constraints of the *outgoing* link. This is due to the fact that we're
offering a payment transit service on our outgoing link. Before this
commit, we would use the policies of the *incoming* link. This would at
times lead to odd routing errors as we would go to route, get an error
update and then route again, repeating the process.
With this commit, we'll properly use the incoming link for timelock
related constraints, and the outgoing link for fee related constraints.
We do this by introducing a new HtlcSatisfiesPolicy method in the link.
This method should return a non-nil error if the link can carry the HTLC
as it satisfies its current forwarding policy. We'll use this method now
at *forwarding* time to ensure that we only forward to links that
actually accept the policy. This fixes a number of bugs that existed
before that could result in a link accepting an HTLC that actually
violated its policy. In the case that the policy is violated for *all*
links, we take care to return the error returned by the *target* link so
the caller can update their sending accordingly.
In this commit, we also remove the prior linkControl channel in the
channelLink. Instead, of sending a message to update the internal link
policy, we'll use a mutex in place. This simplifies the code, and also
adds some necessary refactoring in anticipation of the next follow up
commit.
In this commit, we fix a slight bug in lnd. Before this commit, we would
send the error to the remote peer, but in an async manner. As a result,
it was possible for the connections to be closed _before_ the error
actually reached the remote party. The fix is simple: wait for the error
to be returned when sending the message. This ensures that the error
reaches the remote party before we kill the connection.
In this commit, add a new argument to the SendMessage method to allow
callers to request that the method block until the message has been sent
on the socket to the remote peer.