This commit moves the logic handling responses to
locally-initiated payments to be asynchronous. The
reordering of operations into handleLocalDispatch
brings a serious performance burden to the switch's
main event loop. However, the at-most once semantics
of circuit map and idempotency of cleanup methods
allows concurrent operations to run in parallel.
Prior to this commit, the async_payments_benchmark
would timeout due to the forcibly serial nature of
the prior design. With this change, there is no
perceptible difference in the benchmark OMM, even
though we've added two extra db calls.
In this commit, we address an issue that could arise when using the
SendToRoute RPC. In this RPC, we specify the exact hops that a payment
should take. However, within the switch, we would set a constraint for
the first hop to be any hop as long as the first peer was at the end of
it. This would cause discrepancies when attempting to use the RPC as the
payment would actually go through another hop with the same peer. We fix
this by explicitly specifying the channel ID of the first hop.
Fixes#1500.
Fixes#1515.
In this commit, we thread through a link's quit channel into
routeAsync, the primary helper method allowing links to send
htlcPackets through the switch. This is intended to remove
deadlocks from happening, where the link is synchronously
blocking on forwarding packets to the switch, but also
needs to shutdown.
In this commit, we modify the removeLink method to be more asynchronous.
Before this commit, we would attempt to block until the peer exits.
However, it may be the case that at times time, then target link is
attempting to forward a batch of packets to the switch (forwardBatch).
Atm, this method doesn't pass in an external context/quit, so we can't
cancel this message/request. As a result, we'll now ensure that
`removeLink` doesn't block, so we can resume the switch's main loop as
soon as possible.
In this commit, we move the block height dependency from the links in
the switch to the switch itself. This is possible due to a recent change
on the links no longer depending on the block height to update their
commitment fees.
We'll now only have the switch be alerted of new blocks coming in and
links will retrieve the height from it atomically.
In this commit, we fix a bug in the way we handle removing items from
the interfaceIndex. Before this commit, we would delete all items items
with the target public key that of the peer that owns the link being
removed. However, this is incorrect as the peer may have other links
sill active.
In this commit, we fix this by first only deleting the link from the
peer's index, and then checking to see if the index is empty after this
deletion. Only if so do we delete the index for the peer all together.
In this commit, we modify the interfaceIndex to no longer key the second
level of the index by the ChannelLink. Instead, we'll use the chan ID as
it's a stable identifier, unlike a reference to an interface.
In this commit, we fix an existing source of a panic, that could at
times lead to a deadlock. If the circuit returned from closeCircuit
didn't have an outgoing key (as it was an incomplete forward), then we
would attempt to de-ref a nil pointer. This would trigger a panic, and
the runtime would start to unwind the stack, and execute each defer in
line. A deadlock can arise here, as in the defer at the root goroutine,
we need to grab the fwdingEventMtx. However, we already have it at the
panic site.
We fix this issue by ensuring we only attempt to add the event if it's a
_settle_ and also actually has an outgoing circuit (which it should
already, just a defensive check).
In this commit, we ensure that any time we send a TempChannelFailure
that's destined for a multi-hop source sender, then we'll always package
the latest channel update along with it.
In this commit, we update the testUpdateChannelPolicy to exercise the
recent set of changes within the switch. If one applies this test to a
fresh branch (without those new changes) it should fail. This is due to
the fact that before, Bob would attempt to apply the constraints of the
incoming link (which we updated) instead of the outgoing link. With the
recent set of changes, the test now properly passes.
In this commit, we simplify the switch's code a bit. Rather than having
a set of channels we use to mutate or query for the set of current
links, we'll instead now just use a mutex to guard a set of link
indexes. This serves to simplify the ode, and also make it such that we
don't need to block forwarding in order to add/remove a link.
In this commit, we fix a very old, lingering bug within the link. When
accepting an HTLC we are meant to validate the fee against the
constraints of the *outgoing* link. This is due to the fact that we're
offering a payment transit service on our outgoing link. Before this
commit, we would use the policies of the *incoming* link. This would at
times lead to odd routing errors as we would go to route, get an error
update and then route again, repeating the process.
With this commit, we'll properly use the incoming link for timelock
related constraints, and the outgoing link for fee related constraints.
We do this by introducing a new HtlcSatisfiesPolicy method in the link.
This method should return a non-nil error if the link can carry the HTLC
as it satisfies its current forwarding policy. We'll use this method now
at *forwarding* time to ensure that we only forward to links that
actually accept the policy. This fixes a number of bugs that existed
before that could result in a link accepting an HTLC that actually
violated its policy. In the case that the policy is violated for *all*
links, we take care to return the error returned by the *target* link so
the caller can update their sending accordingly.
In this commit, we also remove the prior linkControl channel in the
channelLink. Instead, of sending a message to update the internal link
policy, we'll use a mutex in place. This simplifies the code, and also
adds some necessary refactoring in anticipation of the next follow up
commit.
In this commit, we fix an issue where users would be displayed negative
amounts of satoshis either as sent or received. This can happen if the
total amount of channel updates decreases due to channels being closed.
To fix this, we properly handle a negative difference of channel
updates by updating the stats logged to only include active
channels/links to the switch.
In this commit, we extend the switch as is, to record details
concerning settled payment circuits. To do this, we introduce a new
interface to the package: the ForwardingLog. This is a tiny interface
that simply lets us abstract away the details of the storage backing of
the forwarding log.
Each time we receive a successful HTLC settle, we’ll log the full
details (chans, fees, time) as a pending forwarding log entry. Every 15
seconds, we’ll then batch flush out these entries to disk. When we’re
exiting, we’ll try to flush out all entries to ensure everything gets
recorded to disk.
This commit fixes a deadlock scenario caused when some
switch methods are waiting for a response on the
command's done/err chan. However, no such response will
be delivered if the main event loop has already exited.
This is resolved by selecting on the command's done/err chan
and the server's quit chan simultaneously.
In this commit, we fix an existing bug that would result in some
payments getting “stuck”. This would happen if one side restarted
before the channel was fully locked in. In this case, since upon
re-connection, the link will get added to the switch with a *short
channel ID of zero*. If A then tries to make a multi-hop payment
through B, B will fail to forward the payment, as it’ll mistakenly
think that the payment originated from a local-subsystem as the channel
ID is zero. A short channel ID of zero is used to map local payments
back to their caller.
With fix this by allowing the funding manager to dynamically update the
short channel ID of a link after it discovers the short channel ID.
In this commit, we fix a second instance of reported “stuck” payments
by users.
In this commit, we update the failure case within handleLocalDispatch
to handle locally sourced resolutions. This is the case that we send a
payment out, but before it can even get past the first hop, we need to
go to chain (may have been a cascading failure). Once the HTLC is fully
resolved, we’ll send back a resolution message, however, that message
doesn’t have a failure reason populated. To properly handle this, we’ll
send back a permanent channel failure to the router.
In this commit, we add a new method: ProcessContractResolution. This
will be used by entities of the contract court package to notify us
whenever they discover that we can resolve an incoming contract
off-chain after the outgoing contract was fully resolved on-chain.
We’ll take a contractcourt.ResolutionMsg and map it to the proper
internal package so we can fully resolve an active circuit.
This simplifies the pending payment handling code because it allows it
be handled in nearly the same way as forwarded HTLCs by treating an
empty channel ID as local dispatch.
The src/dest terminology for routing packets is kind of confusing
because the source HTLC may not be the source of the packet for
settles/fails traversing the circuit in the opposite direction. This
changes the nomenclature to incoming/outgoing and always references
the HTLCs themselves.
This changes the circuit map internals and API to reference circuits
by a primary key of (channel ID, HTLC ID) instead of paymnet
hash. This is because each circuit has a unique offered HTLC, but
there may be multiple circuits for a payment hash with different
source or destination channels.
The constructor functions have no additional logic other than passing
function parameters into struct fields. Given the large function
signatures, it is more clear to directly construct the htlcPacket in
client code than call a function with lots of positional arguments.
This commit fixes an existing bug wherein we would incorrectly attempt
to forward and HTLC to a link that wasn’t yet eligible for forwarding.
This would occur when we’ve added a link to the switch, but haven’t yet
received a FundingLocked message for the channel. As a result, the
channel won’t have the next revocation point available. A logic error
prior to this commit would skip tallying the largest bandwidth rather
than skipping examining the link all together.
Fixes#464.
In this commit, when selecting a candidate link to forward a payment,
we’ll ensure that it’s actually able to take on the HTLC. Otherwise,
we’ll skip over the link itself. Currently, a link is only fully
eligible for forwarding, *after* we’ve received and fully processed the
FundingLocked message.
In this commit, we’ve modified the link and the switch to start to use
the new mailBox in place of the existing synchronous message send
directly into the link’s upstream/downstream channels. With his change,
we no longer need to spawn a new goroutine each time an HTLC needs to
be forwarded, or a user payment is initiated.
In this commit BOLT№2 retranmission logic for the channel link have
been added. Now if channel link have been initialised with the
'SyncState' field than it will send the lnwire.ChannelReestablish
message and will be waiting for receiving the same message from remote
side. Exchange of this message allow both sides understand which
updates they should exchange with each other in order sync their
states.
In order to be able to properly restart switch several times we should
have the sequential process of channel link stop. In other words if we
stopped the switch we should be sure that all channel links have been
stopped too. Addition of the goroutine during the force close was added
because of the deadlock:
Trace:
1. link:force_close_notification
2. link:wipe_channel
3. peer:switch_remove_link
4. switch:stop_link
5. link:wait <-- deadlock
This commit modifies the errors that we return within the
handleLocalDispatch method. Rather than returning a regular error, or
simply the matching error code in some instances, we now _always_
return an instance of ForwardingError. This will allow the router to
make more intelligent decisions w.r.t routing HTLC’s as with this
information it will now be able to differentiate errors that occur
within the switch (before sending out the HTLC), from errors that occur
within the HTLC route itself.
This commit adds a new field to the switch’s Config, namely the public
key of the backing lightning node. This field will soon be used to
return more detailed errors messages back to the ChannelRouter itself.
This commit renames the Deobfuscator interface to ErrorDecrypter and
the Obfuscator interface to ErrorEncrypter. With this rename, the
purpose of these two interfaces are a bit clearer.
Additionally, DecryptError (which was formerly Deobfuscate) now
directly returns an ForwardingError type instead of the
lnwire.FailureMessage.
This commit modifies the error we return to the end user in the case of
an insufficient link capacity error when handling a local payment
dispatch. Previously we would return a
lnwire.CodeTemporaryChannelFailure, however, this isn’t necessary as
this is a local payment attempt and we don’t give up any sensitive
information by returning the best available bandwidth, and what we need
to complete the payment.
This commit fixes a bug related to swallowing an error that should go
to the switch in the case of an insufficient balance error when
attempting to add a new HTLC to the channel state machine. In this
case, an error would never be returned back to the client/switch, and
the internal processing within the channelLink would loop forever,
attempting to add an HTLC that can’t be added due to insufficient
balance to state machine itself.
We fix this issue by only treating the lnwallet.ErrMaxHTLCNumber as the
only error that prompts adding an HTLC to the overflow queue rather
than sending the error directly back to the switch.
This commit adds a new method to the HtlcSwitch:
UpdateForwardingPolicies. With this method callers are now able to
modify the forwarding policies of all, or some currently active links.
We also make a slight modification to the way that forwarding policy
updates are handled within the links themselves to ensure that we don’t
override with a zero value for any of the fields.
This commit modifies how the htlcswitch handles close requests.
Previously it could be the case that a new channel was added, but at
the same time a channel was requested to be closed. This would result
in a circular waiting dependency: the peer contacts the switch, who
tries to contact the peer.
We eliminate this possibility by ensuring that the switch handles all
close requests asynchronously. With this, the switch won't block
indefinitely in the scenario described above.
In previous commits we have intoduced the onion errors. Some of this
errors include lnwire.ChannelUpdate message. In order to change
topology accordingly to the received error, from nodes where failure
have occured, we have to propogate the update to the router subsystem.
Within the network, it's important that when an HTLC forwarding failure
occurs, the recipient is notified in a timely manner in order to ensure
that errors are graceful and not unknown. For that reason with
accordance to BOLT №4 onion failure obfuscation have been added.
This commit fixes a regression introduce in the prior commit which
added full verification of the per-hop payloads to the ChannelLink
interface. When this was initially implemented, the added checks
weren’t guarded on the existence of debughtlc’s. As a result,
debughtlc’s would be rejected as they don’t match the expected invoice
value.
This commit fixes that issue by only checking the hop payload if debug
HTLC mode isn’t on.
This commit fixes some issues in the display of the stats logger which
resulted in: stats being printed even though no forwarding activity
took place, and underflow of integers resulting in weird outputs when
forwarding.
This commit also adds some additional comments and renames the main
forwarding goroutine to its former name.
This commit gives the start for making the htlc manager and htlc switch
testable. The testability of htlc switch have been achieved by mocking
all external subsystems. The concrete list of updates:
1. create standalone package for htlc switch.
2. add "ChannelLink" interface, which represent the previous htlc link.
3. add "Peer" interface, which represent the remote node inside our
subsystem.
4. add htlc switch config to htlc switch susbystem, which stores the
handlers which are not elongs to any of the above interfaces.
With this commit we are able test htlc switch even without having
the concrete implementation of Peer, ChannelLink structures, they will
be added later.