In this commit, we’ve modified the link and the switch to start to use
the new mailBox in place of the existing synchronous message send
directly into the link’s upstream/downstream channels. With his change,
we no longer need to spawn a new goroutine each time an HTLC needs to
be forwarded, or a user payment is initiated.
In this commit, we add a new abstraction to the package: the mailBox.
The mailBox is a non-blocking, concurrent safe, in-order queue for
delivering messages to a given channelLink instance. With this
abstraction in place, we can now allow the switch to no longer launch a
new goroutine for each forwarded HTLC, or instantiated user payment.
After addition of the retransmission logic in the channel link, we
should make the onion blobs persistant, the proper way to do this is
include the onion blobs in the payment descriptor rather than storing
them in the distinct struct in the channel link.
In this commit BOLT№2 retranmission logic for the channel link have
been added. Now if channel link have been initialised with the
'SyncState' field than it will send the lnwire.ChannelReestablish
message and will be waiting for receiving the same message from remote
side. Exchange of this message allow both sides understand which
updates they should exchange with each other in order sync their
states.
In order to be able to properly restart switch several times we should
have the sequential process of channel link stop. In other words if we
stopped the switch we should be sure that all channel links have been
stopped too. Addition of the goroutine during the force close was added
because of the deadlock:
Trace:
1. link:force_close_notification
2. link:wipe_channel
3. peer:switch_remove_link
4. switch:stop_link
5. link:wait <-- deadlock
This commit where added as a measure to avoid the panic during several
server simultanoius fault. The panic happened becuase *t.Testing
structure is not concurrent safe.
In this commit, we address a lingering TODO within the
TestUpdateForwardingPolicy test case to ensure that Bob will reject the
payment the second time around due to an update in his fee policy.
In this commit, we update the TestLinkForwardTimelockPolicyMismatch to
instead _subtract_ time from the first HTLC extended to the initial
hop. We now subtract instead as giving intermediate hops more time
is.now permitted.
In this commit, we relax the time lock verification when we realize
we’re an intermediate hop. We no longer directly assert that the time
lock we receive is _identical_, instead we allow slow slack and will
reject iff, the incoming timelock minus the outgoing time lock doesn’t
meet our delta requirements.
This commit modifies the errors that we return within the
handleLocalDispatch method. Rather than returning a regular error, or
simply the matching error code in some instances, we now _always_
return an instance of ForwardingError. This will allow the router to
make more intelligent decisions w.r.t routing HTLC’s as with this
information it will now be able to differentiate errors that occur
within the switch (before sending out the HTLC), from errors that occur
within the HTLC route itself.
This commit adds a new field to the switch’s Config, namely the public
key of the backing lightning node. This field will soon be used to
return more detailed errors messages back to the ChannelRouter itself.
This commit adds a new field to the ForwardingError struct: ExtraMsg.
The purpose of this field is to allow the htlcswitch to tack on
additional error context to ForwardingError messages returned to the L3
router.
This commit renames the Deobfuscator interface to ErrorDecrypter and
the Obfuscator interface to ErrorEncrypter. With this rename, the
purpose of these two interfaces are a bit clearer.
Additionally, DecryptError (which was formerly Deobfuscate) now
directly returns an ForwardingError type instead of the
lnwire.FailureMessage.
This commit introduces a new type to the package: ForwardingError. It
wraps an existing lnwire.FailureMessage interface, and also includes
the _source_ of the error message. By including the source of the
message, the router can now prune the set of available routes down in
order to reduce the number of subsequent failures based on the source
of the error and the type of the error itself.
This commit fixes an existing bug, wherein if we failed to account for
the fact that if we we’re unable to add an HTLC for any reason other
than an overflown commitment transaction, then we wouldn’t properly
re-add the available bandwidth of the offending HTLC.
This commit modifies the error we return to the end user in the case of
an insufficient link capacity error when handling a local payment
dispatch. Previously we would return a
lnwire.CodeTemporaryChannelFailure, however, this isn’t necessary as
this is a local payment attempt and we don’t give up any sensitive
information by returning the best available bandwidth, and what we need
to complete the payment.
In the case where the channelLink get started and the number of
updates on this channel is zero, this means no paymenys has been
done using this channel. This might mean that the fundingLocked
never was sent successfully, so we resend to make sure this
channel gets opened correctly.
This commit fixes a bug related to swallowing an error that should go
to the switch in the case of an insufficient balance error when
attempting to add a new HTLC to the channel state machine. In this
case, an error would never be returned back to the client/switch, and
the internal processing within the channelLink would loop forever,
attempting to add an HTLC that can’t be added due to insufficient
balance to state machine itself.
We fix this issue by only treating the lnwallet.ErrMaxHTLCNumber as the
only error that prompts adding an HTLC to the overflow queue rather
than sending the error directly back to the switch.
This commit fixes a possible deadlock within the packetQueue that could
be caused by the following circular waiting dependency:
packetCoordinator woken up, grabs lock, queue isn’t empty, attempts to
send packet to link (lock still held) -> channelLink has commitment
overflow, attempts to add new item to packet queue, in AddPkt grabs
Lock -> circular wait.
We avoid this scenario by *not* holding the lock within the
packetCoordinator when we attempt to send a new packet to the switch.
Instead, we release the lock before the second select statement in the
main processing loop.
This commit adds a new test case for the default implementation of the
ChannelLink to ensure that the bandwidth is updated properly in the
face of commitment transaction overflows, and the subsequent draining
of said overflown commitment transaction.
This commit adds a new test for the current default ChannelLink
implementation to ensure that the bandwidth updates for a link are
externally consistent from the PoV of callers after a modifying action.
In this commit, we’ve moved away from the internal queryHandler within
the packetQueue entirely. We now use an internal queueLen variable
internally to allow callers to sample the queue’s size, and also for
synchronization purposes internally.
This commit also introduces a chan struct{} (freeSlots) that is used
internally as a semaphore. The current value of freeSlots reflects the
number of available slots within the commitment transaction. Within the
link, after an HTLC has been removed/modified, then a “slot” is freed
up. The main packetConsumer then interprets these messages as a signal
to attempt to free up a new slot within the queue itself by dumping off
to the commitment transaction.
This commit removes the internal queryHandler within the packetQueue
itself in order to make way for an upcoming commit which uses atomic
variables to report the length of the queue to outside callers.
Additionally, due to the recent change within the channeling, we no
longer need to report the total value of all pending HTLC’s to the
outside world.
This commit modifies the way the bandwidth of a given channel link is
tracked, and reported externally. The prior approach pushed most of the
logic for tracking channel bandwidth into the link itself, and relied
on a report from the queue in order to determine the total available
bandwidth. This approach at times could inadvertently introduce
deadlocks when working on new features as since the query was handled
internally, it required the link to be _active_ and non-blocked in
order to respond to.
We’ve now abandoned this approach in favor of lifting the bandwidth
accounting to the highest possible abstraction layer within the link
itself. We now maintain a availableBandwidth integer that’s used
atomically within the link in response to: us adding+settling an HTLC,
and the remote party failing one of our HTLC’s.
This commit completes a full re-write of the link’s packet overflow
queue with the goals of the making the code itself more understandable
and also allowing it to be more extensible in the future with various
algorithms for handling HTLC congestion avoidance and persistent queue
back pressure.
The new design is simpler and consumes much less coroutines (no longer
a new goroutine for each active HLTC). We now implement a simple
synchronized queue using a standard condition variable.
This commit adds a new debug mode for lnd
called hodlhtlc. This mode instructs a node
to refrain from settling incoming HTLCs for
which it is the exit node. We plan to use
this in testing to more precisely control
the states a node can take during
execution.
This commit fixes an existing bug in the way we perform validation of
the timelock information as the final hop in the route. Previously, we
would assert that the outgoing time lock in the per-hop payload would
exactly match our time lock delta.
Instead, we should be asserting two things:
1. That the time lock in the payload is >= the expected time lock
2. That timeout on the HTLC is exactly equal to the payload