This commit moves the logic handling responses to
locally-initiated payments to be asynchronous. The
reordering of operations into handleLocalDispatch
brings a serious performance burden to the switch's
main event loop. However, the at-most once semantics
of circuit map and idempotency of cleanup methods
allows concurrent operations to run in parallel.
Prior to this commit, the async_payments_benchmark
would timeout due to the forcibly serial nature of
the prior design. With this change, there is no
perceptible difference in the benchmark OMM, even
though we've added two extra db calls.
Composes the new payment status helper methods such that
we only require one db txn per state transition. This
also allows us to remove the exclusive lock from the
control tower, and enable more concurrent requests.
In this commit, we address an issue that could arise when using the
SendToRoute RPC. In this RPC, we specify the exact hops that a payment
should take. However, within the switch, we would set a constraint for
the first hop to be any hop as long as the first peer was at the end of
it. This would cause discrepancies when attempting to use the RPC as the
payment would actually go through another hop with the same peer. We fix
this by explicitly specifying the channel ID of the first hop.
Fixes#1500.
Fixes#1515.
In this commit, we modify the readHandler w/in
the mock peer to drop messages if it is unable
to find the target link. This has led to observed
race conditions related to removing a link and still
attempting to deliver messages. By removing this,
the readHandler shouldn't fail the test as a result.
This commit increases the fwdpkg garbage collection
interval to 15s, to mitigate the likelihood of it
interfering with our unit tests related to fwdpkgs.
This commit removes the concept of "circuit deletion
forgivness" from the link. This was originally
implemented due to the strict semantics of the original
DeleteCircuit implementation, which would fail if we tried
to delete unknown circuits. Forgivness is used on startup
to ignore this error in case the circuits had already been
deleted before shutting down.
Now that the circuit deletion has been relaxed, this
behavior is no longer necessary, as requests to delete
unknown (or previously deleted) circuits will be ignored.
This is necessary for future changes regarding switch
cleanup, which may attempt to cleanup already deleted
circuits.
Previously, we would only allow deletion of circuits if all circuit keys
were found in the pending map.
In this commit, we relax this to allow for deletion of any circuits
that are found pending, and ignore those that are not found. This
is a preliminary step to cleaning up duplicate forwards that get caught
by the switch. It also allows us to gracefully handle any nodes that
are still afflicted by the split mailbox issue.
Replaces the log statement in CommitCircuits so that
it prints the circuit key of the incoming channel. This way
we avoid spewing the secp curve stored in the ErrorEncrypter.
In this commit, we thread through a link's quit channel into
routeAsync, the primary helper method allowing links to send
htlcPackets through the switch. This is intended to remove
deadlocks from happening, where the link is synchronously
blocking on forwarding packets to the switch, but also
needs to shutdown.
This commit adds a test that verifies Stop does not block
if the link is concurrently forwarding incoming Adds to
the switch. This test fails prior to the commits that
thread through the link's quit channel.
This commit modifies the default BatchTicker
implementation such that it will generate a
new ticker with each call to Start(). This
allows us to create a new ticker after
releasing an old one due to the batch
being empty.
In this commit, we prevent the htlcManager from
being woken up by the batchTicker when there is no
work to be done. Profiling has shown a significant
portion of CPU time idling, since the batch ticker
endlessly demands resources. We resolve this by only
selecting on the batch ticker when we have a
non-empty batch of downstream packets from the
switch.
This commit corrects our exit hop logic to return
FailFinalExpiryTooSoon if the following check is true:
pd.Timeout-expiryGraceDelta <= heightNow
Previously we returned FailFinalIncorrectCltvExpiry, which
should only be returned if the packet was misconstructed.
This commit replaces the debug Config struct with an empty
one, so that the command line flags are hidden in production
builds.
Production help before commit:
Tor:
--tor.active
--tor.socks=
--tor.dns=
--tor.streamisolation
--tor.control=
--tor.v2
--tor.v2privatekeypath=
--tor.v3
hodl:
--hodl.exit-settle
--hodl.add-incoming
--hodl.settle-incoming
--hodl.fail-incoming
--hodl.add-outgoing
--hodl.settle-outgoing
--hodl.fail-outgoing
--hodl.commit
--hodl.bogus-settle
Help Options:
-h, --help
Production help after commit:
Tor:
--tor.active
--tor.socks=
--tor.dns=
--tor.streamisolation
--tor.control=
--tor.v2
--tor.v2privatekeypath=
--tor.v3
Help Options:
-h, --help
In this commit, we modify the InvoiceDatabase slightly to allow the link
to record what the final payment about for an invoice was. It may be the
case that the invoice actually had no specified value, or that the payer
paid more than necessary. As a result, it's important that our on-disk
records properly reflect this.
To fix this issue, the SettleInvoice method now also accepts the final
amount paid.
Fixes#856.
This commit corrects the distribution used to
schedule a link's randomized backoff for fee
updates. Currently, our algorithm biases the
lowest value in the range, with probability
equal to lower/upper, or the ratio of the lower
bound to the upper. This distribution is skewed
more heavily as lower approaches upper.
The solution is to sample a random value in the
range upper-lower, then add this to our lower
bound. The effect is a uniformly distributed
timeout in [lower, upper).
In this commit, we modify the existing logic that would attempt to read
the min CLTV information from the invoice directly. With this route, we
avoid any sort of DB index modifications, as this information is already
stored within the payment request, which is already available to the
outside callers. By modifying the InvoiceDatabase interface, we avoid
having to make the switch aware of what the "primary" chain is.
This commit fixes a bug in the
TestDecayedLogPersistentGarbageCollector unit test.
The test generates a second hash prefix, which is never
added to the log, and used to query for the final
existence check. This commit reverts the behavior so
that the same hash prefix is used throughout the test.
In this commit, we fix a lingering bug within the link when we're the
exit node for a particular payment. Before this commit, we would assert
that the invoice gives us enough of a delta based on our current routing
policy. However, if the invoice was generated with a lower delta, or
we've changed from the default routing policy, then this would case us
to fail back any payments sent to us.
We fix this by instead using the newly available final CLTV delta
information within the extracted invoice.
Fixes#1431.
In this commit, we extract the time lock policy verification logic from
the processRemoteAdds method to the HtlcSatifiesPolicy method. With this
change, we fix a lingering bug within the link: we'll no longer verify
time lock polices within the incoming link, instead we'll verify it at
forwarding time like we should. This is a bug left over from the switch
of what the CLTV delta denotes in the channel update message we made
within the spec sometime last year.
In this commit, we extend the existing HtlcSatifiesPolicy method to also
accept timelock and height information. This is required as an upcoming
commit will fix an existing bug in the forwarding logic wherein we use
the time lock policies of the incoming node rather than that of the
outgoing node.
In this commit, we fix a bug in the generateHops helper function. Before
this commit, it erroneously used the CLTV delta of the current hop,
rather than that of the prior hop when computing the payload. This was
incorrect, as when computing the timelock for the incoming hop, we need
to factor in the CTLV delta of the outgoing lock, not the incoming lock.
In this commit, we add a new test to the switch:
TestForwardingAsymmetricTimeLockPolicies. This test ensures that a link
has two channels, one of which has a greater CLTV delta than the latter,
that a payment will successfully be routed across the channels. Atm, the
test fails (including the fix to hop payload generation included in the
next commit).
Atm, due to the way that we check forwarding policies, we'll reject this
payment as we're attempting to enforce the policy of the incoming link
(cltv delta of 7), instead of that of the outgoing link (cltv delta of
6). As a result, atm, the incoming link checks if (incoming_timeout -
delta < outgoing_timeout). For the values in the test case: 112 - 7 <
106 -> 105 < 106, this check fails. The payload is proper, but the check
itself should be applied at the outgoing hop.
In this commit, we modify the removeLink method to be more asynchronous.
Before this commit, we would attempt to block until the peer exits.
However, it may be the case that at times time, then target link is
attempting to forward a batch of packets to the switch (forwardBatch).
Atm, this method doesn't pass in an external context/quit, so we can't
cancel this message/request. As a result, we'll now ensure that
`removeLink` doesn't block, so we can resume the switch's main loop as
soon as possible.
In this commit, we move the block height dependency from the links in
the switch to the switch itself. This is possible due to a recent change
on the links no longer depending on the block height to update their
commitment fees.
We'll now only have the switch be alerted of new blocks coming in and
links will retrieve the height from it atomically.
In this commit, we modify the behavior of links updating their
commitment fees. Rather than attempting to update the commitment fee for
each link every time a new block comes in, we'll use a timer with a
random interval between 10 and 60 minutes for each link to determine
when to update their corresponding commitment fee. This prevents us from
oscillating the fee rate for our various commitment transactions.
In this commit, we fix a bug in the way we handle removing items from
the interfaceIndex. Before this commit, we would delete all items items
with the target public key that of the peer that owns the link being
removed. However, this is incorrect as the peer may have other links
sill active.
In this commit, we fix this by first only deleting the link from the
peer's index, and then checking to see if the index is empty after this
deletion. Only if so do we delete the index for the peer all together.
In this commit, we modify the interfaceIndex to no longer key the second
level of the index by the ChannelLink. Instead, we'll use the chan ID as
it's a stable identifier, unlike a reference to an interface.
In this commit, we ensure that the packet queue will always exit, by
continually signalling the main goroutine until it atomically sets a
bool that indicates its has been fully shutdown. It has been observed
that at times the main goroutine will wake up (due to the signal), but
then bypass the select and actually miss the quit signal, as a result
another signal is required. We'll continue to signals in a lazy loop
until the goroutine has fully exited.
This commit removes a possible deadlock in the switch,
which can be triggered under certain failure conditions.
Previously, we would acquire the link's read lock for
the duration of HtlcSatisfiesPolicy, though we only
need to use it grab the current policy. The deadlock could
be caused in the cases where we attempt to log the failure,
which access the read-protected ShortChanID method.
In this commit, we add and enforce a min fee rate for commitment
transactions created, and also any updates we propose to the remote
party. It's important to note that this is only a temporary patch, as
nodes can dynamically raise their min fee rate whenever their mempool is
saturated.
Fixes#1330.
This commit changes the decayed log tests to create
a new temporary database for each test. Previously, all
instances referenced the same db path. Since the tests
are run in parallel, the tests would create/delete the
shared db out from under each other, causing flakes in
the unit tests.
Adds a new closure OnChannelFailure to the link config, which is called
when the link fails. This function closure should use the given
LinkFailureError to properly force close the channel, send an error to
the peer, and disconnect the peer.
This commit introduces a new error type LinkFailureError which is used
to distinguish the different kinds of errors that we can encounter
during link operation. It encapsulates the information necessary to
decide how we should handle the error.
In this commit, we fix an existing source of a panic, that could at
times lead to a deadlock. If the circuit returned from closeCircuit
didn't have an outgoing key (as it was an incomplete forward), then we
would attempt to de-ref a nil pointer. This would trigger a panic, and
the runtime would start to unwind the stack, and execute each defer in
line. A deadlock can arise here, as in the defer at the root goroutine,
we need to grab the fwdingEventMtx. However, we already have it at the
panic site.
We fix this issue by ensuring we only attempt to add the event if it's a
_settle_ and also actually has an outgoing circuit (which it should
already, just a defensive check).
This commit adds a test where we trigger a situation which would
previously make the link think it was never in sync, and potentially
create a lot of empty state updates. This would happen if we were
waiting for a revocation, while still receiving updates from the remote.
Since in this case we could not ACK the updates because of the exhausted
revocation window, our local commitchain would extend, while the remote
chain would stall. When we finally got the revocation the local
commitment height would be far larger than the remote, and FullySynced
would return false from that point on.
This commit adds a new test that makes sure we don't try to send
commitments for states where there are now new updates. Before the
recent change to FullySynced we would do this in this test scenario, as
the local an remote commitment heights would differ.
The test makes the local commitment chain extend by 1 vs the remote,
which would earlier trigger another state update, and checks taht no
such state update is made.
This commit makes the call to forwardBatch after locking
in Adds synchronous. This ensures that circuits for any Add
packets are added to the switch in the same order that they
are prescribed in the channel state. Though it is very unlikely
this case would arise, it may happen under more greater loads.
In addition, this also makes some trivial optimizations wrt. to
not spawning unnecessary goroutines if no settle/fail packets
are locked in.
In this commit, we ensure that any time we send a TempChannelFailure
that's destined for a multi-hop source sender, then we'll always package
the latest channel update along with it.
In this commit, we fix a race in the set of TestChannelLinkTrimCircuits*
tests. Before this commit, we would trim the circuits in the htlcManager
goroutine. However, this was problematic as the scheduling order of
goroutines isn't predictable. Instead, we'll now trim the circuits in
the Start method.
Additionally, we fix a series of off-by-2 bugs in the tests themselves.
This commit inserts an initial set of HodlFlags into
their correct places within the switch. In lieu of the
existing HtlcHodl mode, it is been replaced with a
configurable HodlMask, which is a bitvector representing
the desired breakpoints. This will allow for fine grained
testing of the switch's internals, since we can create
arbitrary delays inside a otherwise asynchronous system.
In this commit, we update the testUpdateChannelPolicy to exercise the
recent set of changes within the switch. If one applies this test to a
fresh branch (without those new changes) it should fail. This is due to
the fact that before, Bob would attempt to apply the constraints of the
incoming link (which we updated) instead of the outgoing link. With the
recent set of changes, the test now properly passes.
In this commit, we fix the TestUpdateForwardingPolicy test case after
the recent modification in the way we handling validating constraints
within the link. After the recent set of changes, Bob will properly use
his outgoing link to validate the set of fee related constraints rather
than the incoming link. As a result, we need to modify the second
channel link, not the first for the test to still be applicable.
In this commit, we simplify the switch's code a bit. Rather than having
a set of channels we use to mutate or query for the set of current
links, we'll instead now just use a mutex to guard a set of link
indexes. This serves to simplify the ode, and also make it such that we
don't need to block forwarding in order to add/remove a link.
In this commit, we fix a very old, lingering bug within the link. When
accepting an HTLC we are meant to validate the fee against the
constraints of the *outgoing* link. This is due to the fact that we're
offering a payment transit service on our outgoing link. Before this
commit, we would use the policies of the *incoming* link. This would at
times lead to odd routing errors as we would go to route, get an error
update and then route again, repeating the process.
With this commit, we'll properly use the incoming link for timelock
related constraints, and the outgoing link for fee related constraints.
We do this by introducing a new HtlcSatisfiesPolicy method in the link.
This method should return a non-nil error if the link can carry the HTLC
as it satisfies its current forwarding policy. We'll use this method now
at *forwarding* time to ensure that we only forward to links that
actually accept the policy. This fixes a number of bugs that existed
before that could result in a link accepting an HTLC that actually
violated its policy. In the case that the policy is violated for *all*
links, we take care to return the error returned by the *target* link so
the caller can update their sending accordingly.
In this commit, we also remove the prior linkControl channel in the
channelLink. Instead, of sending a message to update the internal link
policy, we'll use a mutex in place. This simplifies the code, and also
adds some necessary refactoring in anticipation of the next follow up
commit.
In this commit, we fix a slight bug in lnd. Before this commit, we would
send the error to the remote peer, but in an async manner. As a result,
it was possible for the connections to be closed _before_ the error
actually reached the remote party. The fix is simple: wait for the error
to be returned when sending the message. This ensures that the error
reaches the remote party before we kill the connection.
In this commit, add a new argument to the SendMessage method to allow
callers to request that the method block until the message has been sent
on the socket to the remote peer.
In this commit, we fix an issue where users would be displayed negative
amounts of satoshis either as sent or received. This can happen if the
total amount of channel updates decreases due to channels being closed.
To fix this, we properly handle a negative difference of channel
updates by updating the stats logged to only include active
channels/links to the switch.
In this commit, we relax the constraints on accepting an exit hop
payment a bit. We'll now accept any incoming payment that _at least_
pays the invoice amount. This puts us further inline with the
specification, which recommends that nodes accept overpayment by a
certain margin.
Fixes#1002.
In this commit, we remove a ton of unnecessary indentation in the
processRemoteAdds method. Before this commit, we had a switch statement
on the type of the entry. This was required before when the method was
generic, but now since we already know that it’s an Add, we no longer
require such a statement.
In this commit, we remove the DecodeHopIterator method from the
ChannelLinkConfig struct. We do this as we no longer use this method,
since we only ever use the DecodeHopIterators method now.
In this commit, we fix a bug that was uncovered by the recent change to
lnwire.MilliSatoshi. Rather than manually compute the diff in fees,
we’ll directly compare the fee that is given against the fee that we
expect.
In this commit, we extend the switch as is, to record details
concerning settled payment circuits. To do this, we introduce a new
interface to the package: the ForwardingLog. This is a tiny interface
that simply lets us abstract away the details of the storage backing of
the forwarding log.
Each time we receive a successful HTLC settle, we’ll log the full
details (chans, fees, time) as a pending forwarding log entry. Every 15
seconds, we’ll then batch flush out these entries to disk. When we’re
exiting, we’ll try to flush out all entries to ensure everything gets
recorded to disk.
We’ll need this value within the link+switch in order to fully populate
the forwarding event that will be generated if this HTLC circuit is
successfully completed.
In this commit, we add the incoming+outgoing amounts if the HTLC’s that
the payment circuit consists of. With these new fields, we’ll be able
to populate the forwarding event log once the payment circuit has been
successfully completed.
This commit fixes a deadlock scenario caused when some
switch methods are waiting for a response on the
command's done/err chan. However, no such response will
be delivered if the main event loop has already exited.
This is resolved by selecting on the command's done/err chan
and the server's quit chan simultaneously.
In this commit, we fix an existing bug that would result in some
payments getting “stuck”. This would happen if one side restarted
before the channel was fully locked in. In this case, since upon
re-connection, the link will get added to the switch with a *short
channel ID of zero*. If A then tries to make a multi-hop payment
through B, B will fail to forward the payment, as it’ll mistakenly
think that the payment originated from a local-subsystem as the channel
ID is zero. A short channel ID of zero is used to map local payments
back to their caller.
With fix this by allowing the funding manager to dynamically update the
short channel ID of a link after it discovers the short channel ID.
In this commit, we fix a second instance of reported “stuck” payments
by users.
This commit updates the tests for checking a links Bandwidth()
calculation, after the change that made us use the remoteACKedIndex
instead of the logIndex when calculating it. The main result of this
change is that we never consider incoming updates before they are
acked, when calculating the bandwidth. This is because this was
inconsistent with the state we actually end up signing later on.
This commit introduces a new Ticker interface, that can be used
to control when the batch timer should tick. This is done to be
able to more easily control the ticker during tests. The batch
timer is wrapped in the new BatchTicker struct, and made part
of the config together with BatchSize.
In this commit, we add 6 new integration tests to test the various
actions that may need to be performed when either side goes on-chain to
fully resolve HTLC’s. Many of the tests are mirrors of each other as
they test sweeping/resolving HTLC’s from both commitment transactions.
In this commit, we update the failure case within handleLocalDispatch
to handle locally sourced resolutions. This is the case that we send a
payment out, but before it can even get past the first hop, we need to
go to chain (may have been a cascading failure). Once the HTLC is fully
resolved, we’ll send back a resolution message, however, that message
doesn’t have a failure reason populated. To properly handle this, we’ll
send back a permanent channel failure to the router.
In this commit, we address a lingering TODO: before this if we had a
set of HTLC’s that we knew the pre-image to on our commitment
transaction after a restart, then we wouldn’t attempt to settle them.
With this new change, we’ll check that we didn’t already retransmit the
settles for them, and check the preimage cache to see if we already
know the preimage. If we do, then we’ll immediately settle them.
In this commit, we add some additional logic to the case when we
receive a pre-image from an upstream peer. We’ll immediately add it to
the witness cache, as an incoming HTLC might be waiting on-chain to
fully resolve the HTLC with knowledge of the newly discovered
pre-image.
In this commit, we add a new method: ProcessContractResolution. This
will be used by entities of the contract court package to notify us
whenever they discover that we can resolve an incoming contract
off-chain after the outgoing contract was fully resolved on-chain.
We’ll take a contractcourt.ResolutionMsg and map it to the proper
internal package so we can fully resolve an active circuit.
Before this commit, if the htlcManager unexpectedly exited (due to a
protocol error, etc), the underlying block epoch notification intent
that was created for it would never be cancelled. This would result in
tens, or hundreds of goroutine leaks as the client would never consume
those notifications.
To fix this, we move cancellation of the block epoch intent from the
Stop() method of the channel link, to the defer statement at the top of
the htlcManager.
In this commit, we add an additional case when handling a failed
commitment signature. If we detect that it’s a InvalidCommitSigError,
then we’ll send over an lnwire.Error message with the full details. We
don’t yet properly dispatch this error on the reciting side, but that
will be done in a follow up a commit.
In this commit, we fix a lingering protocol level bug when reporting
errors encountered during onion blob processing. The spec states that
if one sends an UpdateFailMalformedHtlc, then the error reason MUST
have the BadOnion bit set. Before this commit, we would return
CodeTemporaryChannelFailure. This is incorrect as this doesn’t have the
BadOnio bit set.
In this commit, we modify the way the link handles HTLC’s that it
detects is destined for itself. Before this commit if a payment hash
came across for an invoice we’d already settled, then we’d gladly
accept the payment _again_. As we’d like to enforce the norm that an
invoice is NEVER to be used twice, this commit modifies that behavior
to instead reject an incoming payment that attempts to re-use an
invoice.
Fixes#560.
This commit fixes a lingering bug that could at times cause
incompatibilities with other implementations when attempting a
cooperative channel close. Before this commit, we would use a pointer
to the funding txin everywhere. As a result, each time we made a new
state, or verified one, we would modify the sequence field of the main
txin of the commitment transaction. Due to this if we updated the
channel, then went to do a cooperative channel closure, the sequence of
the txin would still be set to the value we used as the state hint.
To remedy this, we now copy the txin each time when making the
commitment transaction, and also the cooperative closure transaction.
This avoids accidentally mutating the txin itself.
Fixes#502.
This simplifies the pending payment handling code because it allows it
be handled in nearly the same way as forwarded HTLCs by treating an
empty channel ID as local dispatch.
The src/dest terminology for routing packets is kind of confusing
because the source HTLC may not be the source of the packet for
settles/fails traversing the circuit in the opposite direction. This
changes the nomenclature to incoming/outgoing and always references
the HTLCs themselves.
Previously, some methods on a LightningChannel like SettleHTLC and
FailHTLC would identify HTLCs by payment hash. This would not always
work correctly if there are multiple HTLCs with the same payment hash,
so instead we change these methods to identify HTLCs by their unique
identifiers instead.
This changes the circuit map internals and API to reference circuits
by a primary key of (channel ID, HTLC ID) instead of paymnet
hash. This is because each circuit has a unique offered HTLC, but
there may be multiple circuits for a payment hash with different
source or destination channels.
The constructor functions have no additional logic other than passing
function parameters into struct fields. Given the large function
signatures, it is more clear to directly construct the htlcPacket in
client code than call a function with lots of positional arguments.
In this commit, we modify the existing logic to handle
UpdateFailMalformedHLTC message from an incoming peer. Rather than fail
the Chanel if they give us an invalid failure code, we’ll instead treat
it as a temporary channel failure so we can continue to forward the
error.
This commit is a follow up to a prior commit which skipped sending the
commitment sig message (and sending out the update fee) message if the
channel wasn’t yet able to forward any HTLC’s. We’ll modify the prior
commit to not add the fee update to the channel at all. Otherwise, we
risk a state desynchronization.
This commit adds a check to `updateChannelFee` which skipssending the
`update_fee` message when the channel is not eligable for forwarding
messages (likely due to the channel's `RemoteNextRevocation` not yet
being set).
This addresses #470.
This commit fixes an existing bug wherein we would incorrectly attempt
to forward and HTLC to a link that wasn’t yet eligible for forwarding.
This would occur when we’ve added a link to the switch, but haven’t yet
received a FundingLocked message for the channel. As a result, the
channel won’t have the next revocation point available. A logic error
prior to this commit would skip tallying the largest bandwidth rather
than skipping examining the link all together.
Fixes#464.
In this commit, when selecting a candidate link to forward a payment,
we’ll ensure that it’s actually able to take on the HTLC. Otherwise,
we’ll skip over the link itself. Currently, a link is only fully
eligible for forwarding, *after* we’ve received and fully processed the
FundingLocked message.
In this commit, we add a new method to the ChanneLink interface:
EligibleToForward. This method allows a link to be added to the switch,
but in an intermediate state which indicates that it isn’t yet ready to
forward any incoming HTLC’s.
In this commit we add a new case to the main select statement within a
channel link. This select statement will serve as a Sipping Bird which
will check the network fee rate (as returned by the fee estimator) and
compare that to the fee on the commitment transaction. Using the
shouldAdjustCommitFee function, we determine if we should update the
commitment fee. If so, then we’ll send an UpdateFee message and also
trigger a new commitment update.
We also add a new unit test: TestChannelLinkUpdateCommitFee to ensure
that we update the fee accordingly if the fee increases or decreases by
a large portion.
In this commit, we add a new helper function to the link which will be
utilized in a later commit. This helper function will help us determine
if we should update the commitment fee, in response to a change in the
network fee return by our fee estimators.
In this commit we modify the primary InvoiceRegistry interface within
the package to instead return a direct value for LookupInvoice rather
than a pointer. This fixes an existing race condition wherein a caller
could modify or read the value of the returned invoice.
In this commit we add a quit case to the select statement that’s
entered once a link is created. Before this commit, upon restart it
would be possible that the deamon would never ben able to shutdown as
the link would be waiting for the messages to be sent by the other
side.
In this commit, we update getChanID to be aware of the FundingLocked
message as it will be retransmitted upon reconnect if both nodes think
that they’re at the very first commitment state.
In this commit, we’ve re-written the process of syncing the state of
channels after we reconnect. This re-write ensure correctness, and also
simplified the existing logic which would attempt to launch another
goroutine to handle requests from the switch to ensure that it doesn’t
block. This is no longer necessary as the AddPacket method that the
switch indirectly calls is non-blocking.
In this commit, we modify the existing implementation of the
Bandwidth() method on the default ChannelLink implementation to use
much tighter accounting. Before this commit, there was a bug wherein if
the link restarted with pending un-settled HTLC’s, and one of them was
settled, then the bandwidth wouldn’t properly be updated to reflect
this fact.
To fix this, we’ve done away with the manual accounting and instead
grab the current balances from two sources: the set of active HTLC’s
within the overflow queue, and the report from the link itself which
includes the pending HTLC’s and factors in the amount we’d need to (or
not need to) pay in fees for each HTLC.
In this commit, we’ve modified the link and the switch to start to use
the new mailBox in place of the existing synchronous message send
directly into the link’s upstream/downstream channels. With his change,
we no longer need to spawn a new goroutine each time an HTLC needs to
be forwarded, or a user payment is initiated.
In this commit, we add a new abstraction to the package: the mailBox.
The mailBox is a non-blocking, concurrent safe, in-order queue for
delivering messages to a given channelLink instance. With this
abstraction in place, we can now allow the switch to no longer launch a
new goroutine for each forwarded HTLC, or instantiated user payment.
After addition of the retransmission logic in the channel link, we
should make the onion blobs persistant, the proper way to do this is
include the onion blobs in the payment descriptor rather than storing
them in the distinct struct in the channel link.
In this commit BOLT№2 retranmission logic for the channel link have
been added. Now if channel link have been initialised with the
'SyncState' field than it will send the lnwire.ChannelReestablish
message and will be waiting for receiving the same message from remote
side. Exchange of this message allow both sides understand which
updates they should exchange with each other in order sync their
states.
In order to be able to properly restart switch several times we should
have the sequential process of channel link stop. In other words if we
stopped the switch we should be sure that all channel links have been
stopped too. Addition of the goroutine during the force close was added
because of the deadlock:
Trace:
1. link:force_close_notification
2. link:wipe_channel
3. peer:switch_remove_link
4. switch:stop_link
5. link:wait <-- deadlock
This commit where added as a measure to avoid the panic during several
server simultanoius fault. The panic happened becuase *t.Testing
structure is not concurrent safe.
In this commit, we address a lingering TODO within the
TestUpdateForwardingPolicy test case to ensure that Bob will reject the
payment the second time around due to an update in his fee policy.
In this commit, we update the TestLinkForwardTimelockPolicyMismatch to
instead _subtract_ time from the first HTLC extended to the initial
hop. We now subtract instead as giving intermediate hops more time
is.now permitted.
In this commit, we relax the time lock verification when we realize
we’re an intermediate hop. We no longer directly assert that the time
lock we receive is _identical_, instead we allow slow slack and will
reject iff, the incoming timelock minus the outgoing time lock doesn’t
meet our delta requirements.
This commit modifies the errors that we return within the
handleLocalDispatch method. Rather than returning a regular error, or
simply the matching error code in some instances, we now _always_
return an instance of ForwardingError. This will allow the router to
make more intelligent decisions w.r.t routing HTLC’s as with this
information it will now be able to differentiate errors that occur
within the switch (before sending out the HTLC), from errors that occur
within the HTLC route itself.
This commit adds a new field to the switch’s Config, namely the public
key of the backing lightning node. This field will soon be used to
return more detailed errors messages back to the ChannelRouter itself.
This commit adds a new field to the ForwardingError struct: ExtraMsg.
The purpose of this field is to allow the htlcswitch to tack on
additional error context to ForwardingError messages returned to the L3
router.
This commit renames the Deobfuscator interface to ErrorDecrypter and
the Obfuscator interface to ErrorEncrypter. With this rename, the
purpose of these two interfaces are a bit clearer.
Additionally, DecryptError (which was formerly Deobfuscate) now
directly returns an ForwardingError type instead of the
lnwire.FailureMessage.
This commit introduces a new type to the package: ForwardingError. It
wraps an existing lnwire.FailureMessage interface, and also includes
the _source_ of the error message. By including the source of the
message, the router can now prune the set of available routes down in
order to reduce the number of subsequent failures based on the source
of the error and the type of the error itself.
This commit fixes an existing bug, wherein if we failed to account for
the fact that if we we’re unable to add an HTLC for any reason other
than an overflown commitment transaction, then we wouldn’t properly
re-add the available bandwidth of the offending HTLC.
This commit modifies the error we return to the end user in the case of
an insufficient link capacity error when handling a local payment
dispatch. Previously we would return a
lnwire.CodeTemporaryChannelFailure, however, this isn’t necessary as
this is a local payment attempt and we don’t give up any sensitive
information by returning the best available bandwidth, and what we need
to complete the payment.
In the case where the channelLink get started and the number of
updates on this channel is zero, this means no paymenys has been
done using this channel. This might mean that the fundingLocked
never was sent successfully, so we resend to make sure this
channel gets opened correctly.
This commit fixes a bug related to swallowing an error that should go
to the switch in the case of an insufficient balance error when
attempting to add a new HTLC to the channel state machine. In this
case, an error would never be returned back to the client/switch, and
the internal processing within the channelLink would loop forever,
attempting to add an HTLC that can’t be added due to insufficient
balance to state machine itself.
We fix this issue by only treating the lnwallet.ErrMaxHTLCNumber as the
only error that prompts adding an HTLC to the overflow queue rather
than sending the error directly back to the switch.
This commit fixes a possible deadlock within the packetQueue that could
be caused by the following circular waiting dependency:
packetCoordinator woken up, grabs lock, queue isn’t empty, attempts to
send packet to link (lock still held) -> channelLink has commitment
overflow, attempts to add new item to packet queue, in AddPkt grabs
Lock -> circular wait.
We avoid this scenario by *not* holding the lock within the
packetCoordinator when we attempt to send a new packet to the switch.
Instead, we release the lock before the second select statement in the
main processing loop.
This commit adds a new test case for the default implementation of the
ChannelLink to ensure that the bandwidth is updated properly in the
face of commitment transaction overflows, and the subsequent draining
of said overflown commitment transaction.
This commit adds a new test for the current default ChannelLink
implementation to ensure that the bandwidth updates for a link are
externally consistent from the PoV of callers after a modifying action.
In this commit, we’ve moved away from the internal queryHandler within
the packetQueue entirely. We now use an internal queueLen variable
internally to allow callers to sample the queue’s size, and also for
synchronization purposes internally.
This commit also introduces a chan struct{} (freeSlots) that is used
internally as a semaphore. The current value of freeSlots reflects the
number of available slots within the commitment transaction. Within the
link, after an HTLC has been removed/modified, then a “slot” is freed
up. The main packetConsumer then interprets these messages as a signal
to attempt to free up a new slot within the queue itself by dumping off
to the commitment transaction.
This commit removes the internal queryHandler within the packetQueue
itself in order to make way for an upcoming commit which uses atomic
variables to report the length of the queue to outside callers.
Additionally, due to the recent change within the channeling, we no
longer need to report the total value of all pending HTLC’s to the
outside world.
This commit modifies the way the bandwidth of a given channel link is
tracked, and reported externally. The prior approach pushed most of the
logic for tracking channel bandwidth into the link itself, and relied
on a report from the queue in order to determine the total available
bandwidth. This approach at times could inadvertently introduce
deadlocks when working on new features as since the query was handled
internally, it required the link to be _active_ and non-blocked in
order to respond to.
We’ve now abandoned this approach in favor of lifting the bandwidth
accounting to the highest possible abstraction layer within the link
itself. We now maintain a availableBandwidth integer that’s used
atomically within the link in response to: us adding+settling an HTLC,
and the remote party failing one of our HTLC’s.
This commit completes a full re-write of the link’s packet overflow
queue with the goals of the making the code itself more understandable
and also allowing it to be more extensible in the future with various
algorithms for handling HTLC congestion avoidance and persistent queue
back pressure.
The new design is simpler and consumes much less coroutines (no longer
a new goroutine for each active HLTC). We now implement a simple
synchronized queue using a standard condition variable.
This commit adds a new debug mode for lnd
called hodlhtlc. This mode instructs a node
to refrain from settling incoming HTLCs for
which it is the exit node. We plan to use
this in testing to more precisely control
the states a node can take during
execution.
This commit fixes an existing bug in the way we perform validation of
the timelock information as the final hop in the route. Previously, we
would assert that the outgoing time lock in the per-hop payload would
exactly match our time lock delta.
Instead, we should be asserting two things:
1. That the time lock in the payload is >= the expected time lock
2. That timeout on the HTLC is exactly equal to the payload
This commit adds a new method to the HtlcSwitch:
UpdateForwardingPolicies. With this method callers are now able to
modify the forwarding policies of all, or some currently active links.
We also make a slight modification to the way that forwarding policy
updates are handled within the links themselves to ensure that we don’t
override with a zero value for any of the fields.
This commit modifies how the htlcswitch handles close requests.
Previously it could be the case that a new channel was added, but at
the same time a channel was requested to be closed. This would result
in a circular waiting dependency: the peer contacts the switch, who
tries to contact the peer.
We eliminate this possibility by ensuring that the switch handles all
close requests asynchronously. With this, the switch won't block
indefinitely in the scenario described above.
This commit implements a missing policy within the current ChannelLink
interface. If an HTLC arrives that is too close to the current block
height, then we’ll reject it. As otherwise, it may be possible for us
to lose an on-chain claim if they HTLC expires already or expires
before we’re able to get a commitment transaction in the chain.
As the exit node, we have a grace period that governs out decision. As
an intermediate node, we ensure that the HTLC isn’t close to expiry on
our outgoing link end if we forward it.
This commit temporary increases the timeout for the
TestChannelLinkBidirectionalOneHopPayments test in order to account for
the slowness of the travis instances that our tests are run on.
This commit modifies the TestChannelLinkBidirectionalOneHopPayments
test to ensure that each payment sent is safely above the dust
threshold. Note that the dust threshold itself is now higher due to the
existence of the HTLC covenant transactions which the HTLC values
themselves must cover.
This change ensure that this test operates under “normal” operation
conditions in order to catch any bugs introduced during a major change.
We can safely remove the initial revocation window extension as this
has gone away with the new state machine. We instead now just fill the
window once the channel has been opened, and then maintain a fixed
window size of 2 from there on.
In previous commits we have intoduced the onion errors. Some of this
errors include lnwire.ChannelUpdate message. In order to change
topology accordingly to the received error, from nodes where failure
have occured, we have to propogate the update to the router subsystem.
Within the network, it's important that when an HTLC forwarding failure
occurs, the recipient is notified in a timely manner in order to ensure
that errors are graceful and not unknown. For that reason with
accordance to BOLT №4 onion failure obfuscation have been added.
This commit fixes a regression introduce in the prior commit which
added full verification of the per-hop payloads to the ChannelLink
interface. When this was initially implemented, the added checks
weren’t guarded on the existence of debughtlc’s. As a result,
debughtlc’s would be rejected as they don’t match the expected invoice
value.
This commit fixes that issue by only checking the hop payload if debug
HTLC mode isn’t on.
The btclog package has been changed to defining its own logging
interface (rather than seelog's) and provides a default implementation
for callers to use.
There are two primary advantages to the new logger implementation.
First, all log messages are created before the call returns. Compared
to seelog, this prevents data races when mutable variables are logged.
Second, the new logger does not implement any kind of artifical rate
limiting (what seelog refers to as "adaptive logging"). Log messages
are outputted as soon as possible and the application will appear to
perform much better when watching standard output.
Because log rotation is not a feature of the btclog logging
implementation, it is handled by the main package by importing a file
rotation package that provides an io.Reader interface for creating
output to a rotating file output. The rotator has been configured
with the same defaults that btcd previously used in the seelog config
(10MB file limits with maximum of 3 rolls) but now compresses newly
created roll files. Due to the high compressibility of log text, the
compressed files typically reduce to around 15-30% of the original
10MB file.
This commit adds a new method to the ChannelLink interface which is
meant to allow outside sub-system to update the forwarding policy of a
channel. This can be triggered either by a new RPC method, or
automatically by some sort of control system which seeks to optimize
fee revenue, or block off channels, etc.
This commit puts a missing piece in place by properly parsing and
validating the per hop payload received in incoming HTLC’s. When
forwarding HTLC’s we ensure that the payload recovered is consistent
with our current forwarding policy. Additionally, when we’re the “exit
node” for a payment, then we ensure that the HTLC extended matches up
with our expectation w.r.t the payment amount to be received.
This commit modifies the HopIterator interface to allow nodes that
receive incoming HTLC’s to make forwarding decisions based on the
returned peer hop information, rather than just the next hop. With this
change, we can now enforce our routing policy, and reject any HTLC’s
that violate the policy.
This commit fixes a slight regression in the logic of the switch by
ensuring that the log commitment timer is only start _after_ we receive
a new commitment signature. Otherwise, the ticker will keep ticking and
possibly settle HTLC’s that’ve yet to be locked in, or waste a
signature causing us to be deprived of a revocation which is required
for us to initiate a new state transition.
Additionally, the commit performs a few minor post-merge clean ups.
This commit fixes some issues in the display of the stats logger which
resulted in: stats being printed even though no forwarding activity
took place, and underflow of integers resulting in weird outputs when
forwarding.
This commit also adds some additional comments and renames the main
forwarding goroutine to its former name.
This commit fixes an issue that would at times cause the htlcManager
which manages the link that’s the final hop to settle in an HTLC flow.
Previously, a case would arise wherein a set of HTLC’s were settled to,
but not properly committed to in the commitment transaction of the
remote node. This wasn’t an issue with HTLC’s which were added but
uncleared, as that batch was tracked independently.
In order to fix this issue, we now track pending HTLC settles
independently. This is a temporary fix, as has been noted in a TODO
within this commit.
In this commit usage of the pending packet queue have been added.
This queue will consume the downstream packets if state machine return
the error that we do not have enough capacity for htlc in commitment
transaction. Upon receiving settle/fail payment descriptors - add htlc
have been removed, we release the slot, and process pending add htlc
requests.
In this commit pending packet queue have been added. This queue
consumes the htlc packet, store it inside the golang list and send it
to the pending channel upon release notification.
Step #5 in making htlcManager (aka channelLink) testable:
Combine all that have been done so far and add test framework for channel
links which allow unit test:
* message ordering
* detect redundant messages
* single hop payment
* multihop payment
* several cancel payment scenarios
Because processing of onion blob have been moved in another place we
could get rid of the variables which are not needed any more.
NOTE: pendingBatch have been replaced with batchCounter variable, but
it should be removed at all, because number of pending batch updates
might be counted by the state machine itself.
Step №4 in making htlcManager (aka channelLink) testable:
This step consist of two:
1. Start using the hop iterator abstraction, the concrete
implementation of which will be added later, basically it will we the
same sphinx onion packet processor, but wrapped in hop iterator
abstraction.
2. The RevokAndAck processing part have been replaced by the
"processLockedInHtlcs" function which implement the same logic, but make
it a bit simpler.
Such changes will allow as to get rid of the the unnecessary variables.
Short: such abstraction give as ability to test the channel link in the
future.
Long: hop iterator represents the entity which is be able to give payment
route hop by hop. This interface will be used to have an abstraction
over the algorithm which we use to determine the next hope in htlc route
and also helps the unit test to create mock representation of such
algorithm which uses simple array of hops.
Step №2 in making htlcManager (aka channelLink) testable:
Implement the ChannelLink interface which is needed to use it in pair
with htlc switch. With this commit channel link impelements interface,
but isn't able to operate properly yet.
In this commit all initial code which will be transformed into channel
link have been added. Rather than changing the in the same commit is
better to create the standalone commit, in order to see the changes
which have been applied to relocated code.
This commit gives the start for making the htlc manager and htlc switch
testable. The testability of htlc switch have been achieved by mocking
all external subsystems. The concrete list of updates:
1. create standalone package for htlc switch.
2. add "ChannelLink" interface, which represent the previous htlc link.
3. add "Peer" interface, which represent the remote node inside our
subsystem.
4. add htlc switch config to htlc switch susbystem, which stores the
handlers which are not elongs to any of the above interfaces.
With this commit we are able test htlc switch even without having
the concrete implementation of Peer, ChannelLink structures, they will
be added later.
Add hop id structure wich represent the next lnd node in sphinx payment
route. This structure will be removed when we switch to use the channel
id as the pointers to the htlc update.