In this commit, we fix a goroutine leak that could occur if, while we
were loading, an error occurred in any of the steps after we created the
channel object, but before it was actually loaded into the switch. If
an error occurs at any step, we ensure that we’ll stop the channel.
Otherwise, the sigPool goroutines would still be lingering and never be
stopped.
This commit adds a set used to track channels we consider failed. This
is done to ensure we don't end up in a connect/disconnect loop when we
attempt to re-sync the channel state of a failed channel with a peer.
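A minimal sketch of such a failed-channel set follows. The names
chanID, failedSet, markFailed and shouldResync are hypothetical and
chosen only for illustration; they are not taken from the lnd code.

```go
package main

import "fmt"

// chanID is a hypothetical stand-in for a channel identifier.
type chanID [32]byte

// failedSet tracks the channels we consider failed.
type failedSet map[chanID]struct{}

// markFailed records that the channel should no longer be re-synced.
func (f failedSet) markFailed(id chanID) {
	f[id] = struct{}{}
}

// shouldResync reports whether we may attempt to re-sync channel state
// with the peer. Failed channels are skipped, which is what avoids the
// connect/disconnect loop.
func (f failedSet) shouldResync(id chanID) bool {
	_, failed := f[id]
	return !failed
}

func main() {
	failed := failedSet{}
	id := chanID{0x01}

	fmt.Println("resync before failure:", failed.shouldResync(id))
	failed.markFailed(id)
	fmt.Println("resync after failure:", failed.shouldResync(id))
}
```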
In this commit, we remove the DecodeHopIterator method from the
ChannelLinkConfig struct. We do this as we no longer use this method,
since we only ever use the DecodeHopIterators method now.
In this commit, we modify the msgStream struct to ensure that it has a
cap on how many messages it’ll buffer. Currently we have two
msgStream structs per peer: the first for the discovery messages, and
the second for any messages that modify channel state. Due to
inefficiencies in the current protocol for reconciling graph state upon
connection (just dump the entire damn thing), when a node first starts
up, this can lead to very high memory usage as all peers will
concurrently send their initial message dump which can be in the
thousands of messages on testnet.
Our fix is simple: make the message stream into a _bounded_ message
stream. The newMsgStream function now has a new argument: bufSize.
Internally, we’ll take this bufSize and create more or less an internal
semaphore for the producer. Each time the producer gets a new message,
it’ll try and read an item from the channel. If the queue still has
size, then this will succeed immediately. If not, then we’ll block
until the consumer actually finishes processing a message and then
signals by sending a new item into the channel.
We choose an initial value of 1000. This was chosen as there’s already
a max limit of outstanding adds on the commitment, and a value of 1000
should allow any incoming messages to be safely flushed and processed
by the gossiper.
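The producer-side semaphore described above can be sketched roughly as
follows. This is a simplified illustration and not the actual msgStream
code: the boundedStream type, its field names, and the string message
type are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedStream is a hypothetical, simplified bounded message stream.
type boundedStream struct {
	mtx    sync.Mutex
	cond   *sync.Cond
	msgs   []string
	closed bool

	// producerSema holds one token per free slot in the queue.
	producerSema chan struct{}

	apply func(string)
	done  chan struct{}
}

func newBoundedStream(bufSize int, apply func(string)) *boundedStream {
	s := &boundedStream{
		producerSema: make(chan struct{}, bufSize),
		apply:        apply,
		done:         make(chan struct{}),
	}
	s.cond = sync.NewCond(&s.mtx)

	// Pre-fill the semaphore so bufSize messages can be queued.
	for i := 0; i < bufSize; i++ {
		s.producerSema <- struct{}{}
	}
	go s.consume()
	return s
}

// AddMsg blocks once bufSize messages are outstanding.
func (s *boundedStream) AddMsg(msg string) {
	<-s.producerSema

	s.mtx.Lock()
	s.msgs = append(s.msgs, msg)
	s.mtx.Unlock()
	s.cond.Signal()
}

// consume processes messages in order and returns a token to the
// producer after each one has been handled.
func (s *boundedStream) consume() {
	defer close(s.done)
	for {
		s.mtx.Lock()
		for len(s.msgs) == 0 && !s.closed {
			s.cond.Wait()
		}
		if len(s.msgs) == 0 {
			s.mtx.Unlock()
			return
		}
		msg := s.msgs[0]
		s.msgs = s.msgs[1:]
		s.mtx.Unlock()

		s.apply(msg)
		s.producerSema <- struct{}{}
	}
}

// Stop lets the consumer drain what is queued and waits for it to exit.
func (s *boundedStream) Stop() {
	s.mtx.Lock()
	s.closed = true
	s.mtx.Unlock()
	s.cond.Signal()
	<-s.done
}

func main() {
	processed := 0
	s := newBoundedStream(2, func(string) { processed++ })
	for i := 0; i < 5; i++ {
		s.AddMsg(fmt.Sprintf("msg-%d", i))
	}
	s.Stop()
	fmt.Println("processed:", processed)
}
```

With a bufSize of 2, the producer in main stalls whenever two messages
are queued or in flight, and resumes as the consumer returns tokens.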
In this commit, we fix a slight miscalculation within the GetInfo call.
Before this commit, we would list any channel that the peer knew of as
active, instead of those which are, well, actually *active*. We fix
this by skipping any channels that we don’t have the remote revocation
for.
In order to reduce high CPU utilization during the initial network view
sync, we slash down the total number of active in-flight jobs that can
be launched.
In this commit, we modify the way that notifications are dispatched
within the chainWatcher. Before we would *always* wait for an ack back
before we started to clean up the database state. This would at times
lead to deadlocks. To remedy this, we now allow callers to decide if
they want notifications to be sync or not. The only current caller that
requires this is the breach arbiter.
In this commit, we modify the interaction between the chanCloser
sub-system and the chain notifier altogether. This fixes a series of
bugs as before this commit, we wouldn’t be able to detect if the remote
party actually broadcasted *any* of the transactions that we signed off
upon. This would be reflected to the user by having a “zombie” channel
close that would never actually be resolved.
Rather than the chanCloser watching for on-chain closes, we’ll now open
up a co-op close context to the chainWatcher (via a layer of
indirection via the ChainArbitrator), and report to it all possible
closes that we’ve signed. The chainWatcher will then be able to launch
a goroutine to properly update the database state once any of the
possible closure transactions confirms.
We no longer need to hand off new channels that come online as the
chainWatcher will be persistent, and always have an active signal for
the entire lifetime of the channel.
In this commit, we modify the logic within the Stop() method for
msgStream to ensure that the main goroutine properly exits. It has been
observed on running nodes with tens of connections, that if a node is
very flappy, then the node can end up with hundreds of leaked
goroutines.
In order to fix this, we’ll continually signal the msgConsumer to wake
up after the quit channel has been closed. We do this until the
msgConsumer atomically sets a bool indicating that it has exited.
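A rough sketch of this shutdown pattern is shown below. The stream
type, the shutdown flag name, and the one millisecond pause between
signals are assumptions for the example rather than details of the
actual msgStream.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// stream is a hypothetical, stripped-down message stream.
type stream struct {
	shutdown int32 // set atomically to 1 once msgConsumer has exited

	mtx  sync.Mutex
	cond *sync.Cond
	msgs []string

	quit chan struct{}
}

func newStream() *stream {
	s := &stream{quit: make(chan struct{})}
	s.cond = sync.NewCond(&s.mtx)
	return s
}

func (s *stream) Start() {
	go s.msgConsumer()
}

func (s *stream) msgConsumer() {
	defer atomic.StoreInt32(&s.shutdown, 1)

	for {
		s.mtx.Lock()
		for len(s.msgs) == 0 {
			s.cond.Wait()

			// We may have been woken up only to exit.
			select {
			case <-s.quit:
				s.mtx.Unlock()
				return
			default:
			}
		}
		s.msgs = s.msgs[1:]
		s.mtx.Unlock()
	}
}

// Stop keeps signalling the consumer until it reports that it has
// exited. A single Signal could be lost if the consumer had not yet
// reached Wait, which would leak the goroutine.
func (s *stream) Stop() {
	close(s.quit)
	for atomic.LoadInt32(&s.shutdown) == 0 {
		s.cond.Signal()
		time.Sleep(time.Millisecond)
	}
}

func (s *stream) exited() bool {
	return atomic.LoadInt32(&s.shutdown) == 1
}

func main() {
	s := newStream()
	s.Start()
	s.Stop()
	fmt.Println("consumer exited:", s.exited())
}
```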
This commit adds an overlooked case into the main type switch statement
within the peer’s readHandler. Before this commit, we would fail to
process any UpdateFailMalformedHTLC messages, possibly leading to a
commitment desynchronization. To avoid this case, we’ll now properly
process the UpdateFailMalformedHTLC message by sending the message to
an active link registered to the switch.
In this commit, we modify the logWireMessage function to ensure that we
don't attempt to nil out the LocalUnrevokedCommitPoint.Curve field
unless it's actually set. We need to do this as the field is actually
optional, and we may be reading a message from a node that doesn't
support the option.
Fixes #461.
This commit is a follow up to the prior commit: as it’s possible for
the channel_reestablish message to be sent *before* the channel has
been fully confirmed, we’ll now ensure that we process it to the link
even if the channel isn’t yet open.
In this commit, we modify the logic within loadActiveChannels to
*always* load a channel, even if it isn’t yet fully confirmed. With
this change, we ensure that we’ll always send a channel_reestablish
message upon reconnection.
Fixes #458.
In this commit, we modify the logic within the channelManager to be
able to process any retransmitted FundingLocked messages. Before this
commit, we would simply ignore any new channels sent to us, iff, we
already had an active channel with the same channel point. With the
recent change to the loadActiveChannels method in the peer, this is now
incorrect.
When a peer retransmits the FundingLocked message, it goes through to
the fundingManager. The fundingMgr will then (if we haven’t already
processed it), send the channel to the breach arbiter and also to the
peer’s channelManager. In order to handle this case properly, if we
already have the channel, we’ll check if our current channel *doesn’t*
already have the RemoteNextRevocation field set. If it doesn’t, then
this means that we haven’t yet processed the FundingLocked message, so
we’ll process it for the first time.
This new logic will properly:
  * ensure that the breachArbiter still has the most up to date channel
  * allow us to update the state of the link that has been added to the
    switch at this point
      * this link will now be eligible for forwarding after this
        sequence
In this commit we revert a prior change which was added after
FundingLocked retransmission was implemented. This prior change didn’t
factor in the fact that the FundingLocked message will *only* be
re-sent after both sides receive the ChannelReestablishment message.
With the prior code, as we never added the channel to the link, we’d
never re-send the ChannelReestablishment, meaning the other side would
never send the FundingLocked message.
By unconditionally adding the channel to the switch, we ensure that
we’ll always properly retransmit the FundingLocked message.
This new field was added as a recent modification to the spec, but the
curve parameter within the attribute wasn’t set to nil. This would
result in a large degree of spam within the logs when set to
trace mode. This commit fixes this issue by setting it to nil along
with all the other pub keys within messages.
The spec dictates that the error message should be an ASCII
string to allow other implementations to easily discern the type of
error. The other implementations do this; we don’t yet, but we’ll
go ahead and display it anyway as it’s helpful when debugging.
In this commit, we add a new ResetState method to the channel state
machine which will reset the state of the channel to `channelOpen`. We
add this as before this commit, it was possible for a channel to shift
into the closing state, and for the closing negotiation to be cancelled
for whatever reason, resulting in the channel held by the breachArbiter
being unable to act on potential on-chain events.
In this commit, we refactor the existing channel closure logic for
co-op closes to use the new channelCloser state machine. This results
in a large degree of deleted code as all the logic is now centralized
to a single state machine.
This commit addresses an issue that could occur if the
queueHandler attempted to add a message to the sendQueue
before the writeHandler had started.
If a message was sent to the queueHandler before the
writeHandler was ready to accept messages on the
sendQueue, the message would be added to the
pendingMsg queue, but no further attempt to send it
on the sendQueue would be made until a new incoming
message triggered one.
In this commit, BOLT #2 retransmission logic for the channel link has
been added. Now, if the channel link has been initialised with the
'SyncState' field, then it will send the lnwire.ChannelReestablish
message and wait to receive the same message from the remote
side. The exchange of this message allows both sides to understand which
updates they should exchange with each other in order to sync their
states.
This commit refactors the core logic of the
chanMsgStream to support an additional stream
that is used to asynchronously queue for in-order
delivery to the authenticated gossiper. The channel
streams are slightly adapted to use the more flexible
primitive. We may look to refactor this using more
isolated interfaces, but for now this provides a
minimal change to resolving known flakes.
In this commit we fix an existing bug within the msgConsumer goroutine
of the chanMsgStream that could result in a partial deadlock, as the
readHandler would no longer be able to add messages to the message
queue. The primary cause of this issue would be if we got an update for
a channel that “we don’t know of”. The main loop would continue
without releasing the mutex. We would then try to re-lock at the top of
the loop, leading to a deadlock.
We avoid this situation by properly unlocking the condition variable as
soon as we’re done modifying the condition itself.
In this commit we add the set of local features advertised as a
parameter to the newPeer function. With this change, the server will be
able to programmatically determine _which_ bits should be set on a
connection basis, rather than re-using the same global set of bits for
each peer.
This commit fills in an existing logging gap by adding a new set of
message summaries that is shown for the debug logging level.
Before this commit, if a user wanted to get a close up feel for what
lnd was doing under the covers, they had to use the trace logging
level. Trace can be very verbose, so we now provide a debug logging
level with message “summaries”. The summaries may not contain all the
data in the message, but have been crafted in order to provide
sufficient detail at a glance.
This commit fixes an existing bug within the interaction between the
queueHandler and the writeHandler. Under certain scenarios, if the
writeHandler was blocked for a non negligible period of time, then the
queueHandler would enter a very tight spinning loop. This was due to
the fact that the break statement in the inner select loop of the
queueHandler wouldn’t actually break the inner for loop, instead it
would cause the execution logic to re-enter that same select loop,
causing a very tight spin.
In this commit, we fix the issue by adding two things: we now label the
inner select loop so we can break out of it if we detect that the
writeHandler has blocked. Secondly, we introduce a new channel between
the queueHandler and the writeHandler to signal the queueHandler that
the writeHandler has finished processing the last message.
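The labeled-break part of the fix can be illustrated with the small
sketch below. The drainUntilBlocked helper and its parameters are
hypothetical; the point is only that a bare break inside a select
leaves the select, not the enclosing for loop.

```go
package main

import "fmt"

// drainUntilBlocked hands pending messages to out until out would
// block, and returns whatever is still pending.
func drainUntilBlocked(pending []string, out chan<- string) []string {
drain:
	for len(pending) > 0 {
		select {
		case out <- pending[0]:
			pending = pending[1:]
		default:
			// The receiver is busy. A bare "break" here would only
			// exit the select and immediately re-enter the loop,
			// spinning. The label lets us leave the for loop.
			break drain
		}
	}
	return pending
}

func main() {
	out := make(chan string, 2)
	remaining := drainUntilBlocked([]string{"a", "b", "c"}, out)
	fmt.Println("still pending:", remaining)
}
```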
In this commit, we add an idle timer to the readHandler itself. This
will serve to slowly prune away TCP connections that have gone inactive
as a result of the remote peer being blocked either upon reading or
writing to the socket. Our ping timer interval is 1 minute, so an idle
timer interval of 5 minutes seems reasonable.
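As a sketch of the idea, the loop below resets an idle timer after
every message and gives up once the timer fires. A channel of strings
stands in for the socket, and the readHandler signature here is an
assumption made for the example, not the real method.

```go
package main

import (
	"fmt"
	"time"
)

// readHandler processes messages until msgs is closed or no message
// arrives within idleTimeout. It returns the number of messages
// processed and whether it stopped because the connection went idle.
func readHandler(msgs <-chan string, idleTimeout time.Duration) (int, bool) {
	idleTimer := time.NewTimer(idleTimeout)
	defer idleTimer.Stop()

	processed := 0
	for {
		select {
		case _, ok := <-msgs:
			if !ok {
				return processed, false
			}
			processed++

			// Restart the idle countdown, draining a stale tick
			// if the timer fired while we were busy.
			if !idleTimer.Stop() {
				select {
				case <-idleTimer.C:
				default:
				}
			}
			idleTimer.Reset(idleTimeout)

		case <-idleTimer.C:
			// Nothing was read for a full interval: treat the
			// peer as stalled and disconnect.
			return processed, true
		}
	}
}

func main() {
	msgs := make(chan string, 2)
	msgs <- "ping"
	msgs <- "pong"

	n, idle := readHandler(msgs, 100*time.Millisecond)
	fmt.Println("processed:", n, "idle disconnect:", idle)
}
```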
This commit fixes a bug to wrap up the recently merged PR to properly
handle duplicate FundingLocked retransmissions and also ensure that we
reliably re-send the FundingLocked message if we’re unable to the first
time around.
In this commit, we skip processing a channel that does not yet have a
set remote revocation as otherwise, if we attempt to trigger a state
update, then we’ll be attempting to manipulate a nil commitment point.
Therefore, we’ll rely on the fundingManager to properly send the
channel to all relevant subsystems.
This commit adds a new debug mode for lnd
called hodlhtlc. This mode instructs a node
to refrain from settling incoming HTLCs for
which it is the exit node. We plan to use
this in testing to more precisely control
the states a node can take during
execution.
This commit adds a precautionary check for the error returned if the
channel hasn’t yet been announced when attempting to read our
current routing policy to initialize the channelLink for a channel.
Previously, if the channel wasn’t yet announced, the function would
return early instead of using the default policy.
We also include another bug fix, that avoids a possible nil pointer
panic in the case that the ChannelEdgeInfo read from the graph is
nil.
This commit fixes a bug that could arise if either we had not, or the
remote party had not advertised a routing policy for either outgoing
channel edge. In this commit, we now detect if a policy wasn’t
advertised, falling back to the default routing policy if so.
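The fallback can be sketched as below. The forwardingPolicy struct, its
fields, and the selectPolicy helper are hypothetical names used only to
illustrate the nil check.

```go
package main

import "fmt"

// forwardingPolicy is a hypothetical, trimmed-down routing policy.
type forwardingPolicy struct {
	BaseFee       int64
	FeeRate       int64
	TimeLockDelta uint32
}

// selectPolicy returns the advertised policy if there is one, and the
// default policy otherwise.
func selectPolicy(advertised *forwardingPolicy,
	defaultPolicy forwardingPolicy) forwardingPolicy {

	if advertised == nil {
		return defaultPolicy
	}
	return *advertised
}

func main() {
	def := forwardingPolicy{BaseFee: 1000, FeeRate: 1, TimeLockDelta: 144}

	// No policy was advertised for this edge, so we use the default.
	fmt.Printf("%+v\n", selectPolicy(nil, def))

	custom := &forwardingPolicy{BaseFee: 500, FeeRate: 10, TimeLockDelta: 40}
	fmt.Printf("%+v\n", selectPolicy(custom, def))
}
```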
Fixes #259.
This commit fixes a lingering bug within the logic for the
peer/htlcswitch/channellink. When the link needs to fetch the latest
update to send to a sending party due to a violation of the set routing
policy, previously it would modify the timestamp on the message read
from disk. This was incorrect as it would invalidate the signature
within the message itself. We fix this by instead
This commit adds another conditional send select statement to ensure
that when sending the finalized contract to the breach arbiter, the
peer doesn’t possibly cause the daemon to hang on shutdown.
This commit modifies the logic when we are loading all the channels
that we have with a particular peer to grab the current committed
forwarding policy from disk rather than using the default forwarding
policy. We do this as it’s now possible for active channels to have
distinct forwarding policies.
This commit fixes a possible deadlock bug that may arise during
shutdown due to an unconditional send on a channel to the breach
arbiter. We do this on two occasions within the peer: when loading a
new contract to give it the live version, and also when closing a
channel to ensure that it no longer watches over it.
Previously it was possible for these sends to block indefinitely in the
scenario that the server was shutting down (which means the breach
arbiter is as well). As a result, the channel would never be drained, meaning
the server couldn’t complete shutdown as the peer hadn’t exited yet.
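A conditional send of this kind can be sketched as follows. The
sendOrQuit helper and the string payload are assumptions for the
example; the pattern is simply a select over the send and a quit
channel.

```go
package main

import "fmt"

// sendOrQuit attempts to deliver v on ch, but gives up as soon as quit
// is closed. It reports whether the value was delivered.
func sendOrQuit(ch chan<- string, v string, quit <-chan struct{}) bool {
	select {
	case ch <- v:
		return true
	case <-quit:
		return false
	}
}

func main() {
	quit := make(chan struct{})

	// Nobody is reading from contracts, as would be the case once
	// the receiver has already shut down.
	contracts := make(chan string)
	close(quit)

	// An unconditional "contracts <- v" would block forever here.
	fmt.Println("delivered:", sendOrQuit(contracts, "chan-1", quit))
}
```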
This commit adds the fee negotiation procedure performed
on channel shutdown. The current algorithm picks an ideal
fee based on the FeeEstimator and commit weight, then
accepts the remote's fee if it is within 50%-200% of
the ideal. The fee negotiation procedure is similar
both as sender and receiver of the initial shutdown
message, and this commit also makes both sides use the
same code path for handling these messages.
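The acceptance check might look roughly like the sketch below. It
assumes the 50%-200% rule means the remote fee must lie between half
and double the ideal fee, inclusive; the feeInAcceptableRange name and
the plain int64 fee type are also assumptions for the example.

```go
package main

import "fmt"

// feeInAcceptableRange reports whether remoteFee lies between 50% and
// 200% of idealFee, inclusive. This reading of the rule is an
// assumption.
func feeInAcceptableRange(idealFee, remoteFee int64) bool {
	return remoteFee >= idealFee/2 && remoteFee <= idealFee*2
}

func main() {
	const ideal = 1000
	for _, remote := range []int64{400, 500, 1500, 2000, 2500} {
		fmt.Printf("remote fee %d acceptable: %v\n",
			remote, feeInAcceptableRange(ideal, remote))
	}
}
```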
This commit fixes a bug which was covered by the recent server
refactoring wherein the goroutine would be stuck on the send over the
message channel in the case that the handshake failed. This blockage
would create a deadlock now that the ConnectToPeer method is fully
synchronous.
We fix this issue by ensuring the goroutine properly exits.
In addition to improved synchronization between the client
and server, this commit also moves the channel snapshotting
procedure such that it is handled without submitting a query
to the primary select statement. This is primarily done as a
precaution to ensure that no deadlocks occur, as channel
snapshotting has the potential to block restarts.
This commit ensures that all references within the chanMsgStreams are
all removed and deleted when the readHandler exits. This ensures that
all objects don’t have extra references, and will properly be garbage
collected.
This commit fixes a bug that existed in the prior scheme we used to
synchronize between the funding manager and the peer’s readHandler.
Previously, it was possible for messages to be re-ordered before they
reached the target ChannelLink. This would result in commitment
failures as the state machine assumes a strict in-order message
delivery. This would be manifested due to the goroutine that was
launched in the case of a pending channel funding.
The new approach using the chanMsgStream is much simpler, and easier to
read. It should also be a bit snappier, as we’ll no longer at times
create a goroutine for each message.
This commit modifies the channel close negotiation workflow to instead
take note of the fact that with the new funding workflow, the delivery
scripts are no longer pre-committed to at the start of the funding
workflow. Instead, both sides present their delivery addresses at the
start of the shutdown process, then use those to create the final
cooperative closure transaction.
To accommodate for this new change, we now have an intermediate staging
area where we store the delivery scripts for both sides.
In this commit, the daemon has been changed to set the proper hooks in
the channel link and switch subsystems so that they can send and receive
encrypted onion errors.
In this commit, a big shift has been made in the direction of unit-testable
payment scenarios. Previously, two additional structures were added,
whose logic had been spread across the lnd package before, and now we
apply them in lnd itself:
1. ChannelLink - is an interface which represents the subsystem for
managing the incoming htlc requests, applying the changes to the
channel, and also propagating/forwarding it to htlc switch.
2. Switch - is a central messaging bus for all incoming/outgoing htlc's.
The goal of the switch is to forward the incoming/outgoing htlc messages
from one channel to another, and also propagate the settle/fail htlc
messages back to original requester.
With these abstractions the following schema becomes nearly complete:
    abstraction
        ^
        |
        | - - - - - - - - - - - - Lightning - - - - - - - - - - - - -
        |
        | (Switch)                 (Switch)                  (Switch)
        |  Alice <-- channel link --> Bob <-- channel link --> Carol
        |
        | - - - - - - - - - - - - - TCP - - - - - - - - - - - - - - -
        |
        |  (Peer)                  (Peer)                     (Peer)
        |  Alice <----- tcp conn --> Bob <---- tcp conn -----> Carol
This commit adds a set of additional comments around the new channel
closure workflow and also includes two minor fixes:
* The error when parsing a signature previously wasn’t checked and is
now.
* As a result, we should only track the new signature iff it parses
correctly and we agree to the details as specified w.r.t. the fee
for the final closing transaction.
Additionally, a set of TODOs has been added detailing the additional
work that needs to be done before the closing workflow is fully
compliant with the specification.