In this commit, we fix a bug that could at times cause a deadlock when a
peer is attempting to disconnect. The issue was that when a peer goes to
disconnect, it needs to stop any active msgStream instances. The Stop()
method of the msgStream would block until an atomic variable was set to
indicate that the stream had fully exited. However, in the case that we
disconnected lower in the msgConsumer loop, we would never set the
streamShutdown variable, meaning that msgStream.Stop() would never
unblock.
The fix for this is simple: set the streamShutdown variable within the
quit case of the second select statement in the msgConsumer goroutine.
In this commit, we might a very small change to the way writing messages
works in the peer, which should have large implications w.r.t reducing
memory usage amongst chatty nodes.
When profiling the heap on one of my nodes earlier, I noticed this
fragment:
```
Showing top 20 nodes out of 68
flat flat% sum% cum cum%
0 0% 0% 75.53MB 54.61% main.(*peer).writeHandler
75.53MB 54.61% 54.61% 75.53MB 54.61% main.(*peer).writeMessage
```
Which points to an inefficiency with the way we handle allocations when
writing new messages, drilling down further we see:
```
(pprof) list writeMessage
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeMessage in /root/go/src/github.com/lightningnetwork/lnd/peer.go
75.53MB 75.53MB (flat, cum) 54.61% of Total
. . 1104: p.logWireMessage(msg, false)
. . 1105:
. . 1106: // As the Lightning wire protocol is fully message oriented, we only
. . 1107: // allows one wire message per outer encapsulated crypto message. So
. . 1108: // we'll create a temporary buffer to write the message directly to.
75.53MB 75.53MB 1109: var msgPayload [lnwire.MaxMessagePayload]byte
. . 1110: b := bytes.NewBuffer(msgPayload[0:0:len(msgPayload)])
. . 1111:
. . 1112: // With the temp buffer created and sliced properly (length zero, full
. . 1113: // capacity), we'll now encode the message directly into this buffer.
. . 1114: n, err := lnwire.WriteMessage(b, msg, 0)
(pprof) list writeHandler
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeHandler in /root/go/src/github.com/lightningnetwork/lnd/peer.go
0 75.53MB (flat, cum) 54.61% of Total
. . 1148:
. . 1149: // Write out the message to the socket, closing the
. . 1150: // 'sentChan' if it's non-nil, The 'sentChan' allows
. . 1151: // callers to optionally synchronize sends with the
. . 1152: // writeHandler.
. 75.53MB 1153: err := p.writeMessage(outMsg.msg)
. . 1154: if outMsg.errChan != nil {
. . 1155: outMsg.errChan <- err
. . 1156: }
. . 1157:
. . 1158: if err != nil {
```
Ah hah! We create a _new_ buffer each time we want to write a message
out. This is unnecessary and _very_ wasteful (as seen by the profile).
The fix is simple: re-use a buffer unique to each peer when writing out
messages. Since we know what the max message size is, we just allocate
one of these 65KB buffers for each peer, and keep it around until the
peer is removed.
In this commit, we follow up to the prior commit by ensuring we won't
accept a co-op close request for a chennel with active HTLCs. When
creating a chanCloser for the first time, we'll check the set of HTLC's
and reject a request (by sending a wire error) if the target channel
still as active HTLC's.
In this commit, we fix a minor deviation in our implementation from the
specification. Before if we encountered an unknown error type, we would
disconnect the peer. Instead, we’ll now just continue along parsing the
remainder of the messages. This was flared up recently by some
c-lightning related incompatibilities that emerged on main net.
In this commit, we fix a goroutine leak that could occur if while we
were loading an error occurred in any of the steps after we created the
channel object, but before it was actually loaded in to the script. If
an error occurs at any step, we ensure that we’ll stop toe channel.
Otherwise, the sigPool goroutines would still be lingering and never be
stopped.
This commit adds a set used to track channels we consider failed. This
is done to ensure we don't end up in a connect/disconnect loop when we
attempt to re-sync the channel state of a failed channel with a peer.
In this commit, we remove the DecodeHopIterator method from the
ChannelLinkConfig struct. We do this as we no longer use this method,
since we only ever use the DecodeHopIterators method now.
In this commit, we modify the msgStream struct to ensure that it has a
cap at which it’ll continue to buffer messages. Currently we have two
msgStream structs per peer: the first for the discovery messages, and
the second for any messages that modify channel state. Due to
inefficiencies in the current protocol for reconciling graph state upon
connection (just dump the entire damn thing), when a node first starts
up, this can lead to very high memory usage as all peers will
concurrently send their initial message dump which can be in the
thousands of messages on testate.
Our fix is simple: make the message stream into a _bounded_ message
stream. The newMsgStream function now has a new argument: bufSize.
Internally, we’ll take this bufSize and create more or less an internal
semaphore for the producer. Each time the producer gets a new message,
it’ll try and read an item from the channel. If the queue still has
size, then this will succeed immediately. If not, then we’ll block
until the consumer actually finishes processing a message and then
signals by sending a new item into the channel.
We choose an initial value of 1000. This was chosen as there’s already
a max limit of outstanding adds on the commitment, and a value of 1000
should allow any incoming messages to be safely flushed and processed
by the gossiper.
In this commit, we fix a slight miscalculation within the GetInfo call.
Before this commit, we would list any channel that the peer knew of as
active, instead of those which are, well, actually *active*. We fix
this by skipping any channels that we don’t have the remote revocation
for.
In order to reduce high CPU utilization during the initial network view
sync, we slash down the total number of active in-flight jobs that can
be launched.
In this commit, we modify the way that notifications are dispatched
within the chainWatcher. Before we would *always* wait for an ack back
before we started to clean up he database state. This would at times
lead to deadlocks. To remedy this, we now allow callers to decide if
they want notifications to be sync or not. The only current caller that
requires this is the breach arbiter.
In this commit, we modify the interaction between the chanCloser
sub-system and the chain notifier all together. This fixes a series of
bugs as before this commit, we wouldn’t be able to detect if the remote
party actually broadcasted *any* of the transactions that we signed off
upon. This would be rejected to the user by having a “zombie” channel
close that would never actually be resolved.
Rather than the chanCloser watching for on-chain closes, we’ll now open
up a co-op close context to the chainWatcher (via a layer of
indirection via the ChainArbitrator), and report to it all possible
closes that we’ve signed. The chainWatcher will then be able to launch
a goroutine to properly update the database state once any of the
possible closure transactions confirms.
We no longer need to hand off new channels that come online as the
chainWatcher will be persistent, and always have an active signal for
the entire lifetime of the channel.
In this commit, we modify the logic within the Stop() method for
msgStream to ensure that the main goroutine properly exits. It has been
observed on running nodes with tens of connections, that if a node is
very flappy, then the node can end up with hundreds of leaked
goroutines.
In order to fix this, we’ll continually signal the msgConsumer to wake
up after the quit channel has been closed. We do this until the
msgConsumer sets a bool indicating that it has exited atomically.
This commit adds an overlooked case into the main type switch statement
within the peer’s readHandler. Before this commit, we would fail to
process any UpdateFailMalformedHTLC messages, possibly leading to a
commitment desynchronization. To avoid this case, we’ll no properly
process the UpdateFailMalformedHTLC message by sending the message to
an active link registered to the switch.
In this commit, we modify the logWireMessage function to ensure that we
don't attempt to nil out the LocalUnrevokedCommitPoint.Curve field
unless it's actually set. We need to do this as the field as actually
optional, and we may be reading a message from a node that doesn't
support the option.
Fixes#461.
This commit is a follow up to the prior commit: as it’s possible for
the channel_reestablish message to be sent *before* the channel has
been fully confirmed, we’ll now ensure that we process it to the link
even if the channel isn’t yet open.
In this commit, we modify the logic within loadActiveChannels to
*always* load a channel, even if it isn’t yet fully confirmed. With
this change, we ensure that we’ll always send a channel_reestablish
message upon reconnection.
Fixes#458.
In this commit, we modify the logic within the channelManager to be
able to process any retransmitted FundingLocked messages. Before this
commit, we would simply ignore any new channels sent to us, iff, we
already had an active channel with the same channel point. With the
recent change to the loadActiveChannels method in the peer, this is now
incorrect.
When a peer retransmits the FundingLocked message, it goes through to
the fundingManager. The fundingMgr will then (if we haven’t already
processed it), send the channel to the breach arbiter and also to the
peer’s channelManager. In order to handle this case properly, if we
already have the channel, we’ll check if our current channel *doesn’t*
already have the RemoteNextRevocation field set. If it doesn’t, then
this means that we haven’t yet processed the FundingLcoked message, so
we’ll process it for the first time.
This new logic will properly:
* ensure that the breachArbiter still has the most up to date channel
* allow us to update the state of the link has been added to the
switch at this point
* this link will now be eligible for forwarding after this
sequence
In this commit we revert a prior change which was added after
FundingLocked retransmission was implemented. This prior change didn’t
factor in the fact that the FundingLocked message will *only* be
re-sent after both sides receive the ChannelReestablishment message.
With the prior code, as we never added the channel to the link, we’d
never re-send the ChannelReestablishment, meaning the other side would
never send the FundingLocked message.
By unconditionally adding the channel to the switch, we ensure that
we’ll always properly retransmit the FundingLocked message.
This new field was added as a recent modification to the spec, but the
curve parameter within the attribute wasn’t set to nil. As a result
this would result in a large degree of spam within the logs when set to
trace mode. This commit fixes this issue by setting it to nil along
with all the other pub keys within messages.
It dictates in the spec, that the error message should be an ASCII
string to allow other implementations to easily discern the type of
error. The other implementations do this, but we don’t yet, but we’ll
go ahead and display it anyway as it’s helpful when debugging.
In this commit, we add a new ResetState method to the channel state
machine which will reset the state of the channel to `channelOpen`. We
add this as before this commit, it was possible for a channel to shift
into the closing state, the closing negotiation be cancelled for
whatever reason, resulting the the channel held by the breachArbiter
unable to act to potential on-chain events.
In this commit, we refactor the existing channel closure logic for
co-op closes to use the new channelCloser state machine. This results
in a large degree of deleted code as all the logic is now centralized
to a single state machine.
This commit addresses an issue that could occur if a
message was attempted added to the sendQueue by the
queueHandler before the writeHandler had started.
If a message was sent to the queueHandler before the
writeHandler was ready to accept messages on the
sendQueue, the message would be added to the
pendingMsg queue, but would not be attempted sent
on the sendQueue again before a new incoming message
triggered a new attempt.
In this commit BOLT№2 retranmission logic for the channel link have
been added. Now if channel link have been initialised with the
'SyncState' field than it will send the lnwire.ChannelReestablish
message and will be waiting for receiving the same message from remote
side. Exchange of this message allow both sides understand which
updates they should exchange with each other in order sync their
states.
This commit refactors the core logic of the
chanMsgStream to support an additional stream
that is used to asynchronously queue for in-order
delivery to the authenticated gossiper. The channel
streams are slightly adapted to use the more flexible
primitive. We may look to refactor this using more
isolated interfaces, but for now this provides a
minimal change to resolving known flakes.
In this commit we fix an existing bug within the msgConsumer grouting
of the chanMsgStream that could result in a partial deadlock, as the
readHandler would no longer be able to add messages to the message
queue. The primary cause of this issue would be if we got an update for
a channel that “we don’t know of”. The main loop would continue,
leaving the mutex unlocked. We would then try to re-lock at the top of
the loop, leading to a deadlock.
We avoid this situation by properly unlocking the condition variable as
soon as we’re done modifying the condition itself.
In this commit we add the set of local features advertised as a
parameter to the newPeer function. With this change, the server will be
able to programmatically determine _which_ bits should be set on a
connection basis, rather than re-using the same global set of bits for
each peer.
This commit fills in an existing logging gap by adding a new set of
message summaries that is shown for the debug logging level.
Before this commit, if a user wanted to get a close up feel for what
lnd was doing under the covers, they had to use the trace logging
level. Trace can be very verbose, so we now provide a debug logging
level with message “summaries”. The summaries may not contain all the
data in the message, hut have been crafted in order to provide
sufficient detail at a glance.
This commit fixes an existing bug within the iteration between the
queueHandler and the writeHandler. Under certain scenarios, if the
writeHandler was blocked for a non negligible period of time, then the
queueHandler would enter a very tight spinning loop. This was due to
the fact that the break statement in the inner select loop of the
queueHandler wouldn’t actually break the inner for loop, instead it
would cause the execution logic to re-enter that same select loop,
causing a very tight spin.
In this commit, we fix the issue by adding to things: we now label the
inner select loop so we can break out of it if we detect that the
writeHandler has blocked. Secondly, we introduce a new channel between
the queueHandler and the writeHandler to signal the queueHandler that
the writeHandler has finished processing the last message.
In this commit, we add an idle timer to the readHandler itself. This
will serve to slowly prune away inactive TCP connections as a result of
remote peer being blocked either upon reading or writing to the socket.
Our ping timer interval is 1 minute, so an idle timer interval of 5
minutes seem reasonable.
This commit fixes a bug to wrap up the recently merged PR to properly
handle duplicate FundingLocked retransmissions and also ensure that we
reliably re-send the FundingLocked message if we’re unable to the first
time around.
In this commit, we skip processing a channel that does not yet have a
set remote revocation as otherwise, if we attempt to trigger a state
update, then we’ll be attempting to manipulate a nil commitment point.
Therefore, we’ll rely on the fundingManager to properly send the
channel all relevant subsystems.
This commit adds a new debug mode for lnd
called hodlhtlc. This mode instructs a node
to refrain from settling incoming HTLCs for
which it is the exit node. We plan to use
this in testing to more precisely control
the states a node can take during
execution.
This commit adds a precautionary check for the error returned if the
channel hasn’t yet been announced when attempting to read the our
current routing policy to initialize the channelLink for a channel.
Previously, if the channel wasn’t they announced, the function would
return early instead of using the default policy.
We also include another bug fix, that avoids a possible nil pointer
panic in the case that the ChannelEdgeInfo reread form the graph is
nil.
This commit fixes a bug that could arise if either we had not, or the
remote party had not advertised a routing policy for either outgoing
channel edge. In this commit, we now detect if a policy wasn’t
advertised, falling back to the default routing policy if so.
Fixes#259.
This commit fixes a lingering bug within the logic for the
peer/htlcswitch/channellink. When the link needs to fetch the latest
update to send to a sending party due to a violation of the set routing
policy, previously it would modify the timestamp on the message read
from disk. This was incorrect as it would invalidate the signature
within the message itself. We fix this by instead
This commit adds another conditional send select statement to ensure
that when sending the finalized contract to the breach arbiter, the
peer doesn’t possible cause the daemon to hang on shutdown.
This commit modifies the logic when we are loading alll the channels
that we have with a particular peer to grab the current committed
forwarding policy from disk rather then using the default forwarding
policy. We do this as it’s now possible for active channels to have
distinct forwarding policies.
This commit fixes a possible deadlock bug that may arise during
shutdown due to an unconditional send on a channel to the breach
arbiter. We do this on two occasions within the peer: when loading a
new contract to give it the live version, and also when closing a
channel to ensure that it no longer watches over it.
Previously it was possible for these sends to block indefinitely in the
scenario that the server was shutting down (which means the breach
arbiter) is. As a result, the channel would never be drained, meaning
the server couldn’t complete shutdown as the peer hadn’t exited yet.
This commit adds the fee negotiation procedure performed
on channel shutdown. The current algorithm picks an ideal
a fee based on the FeeEstimator and commit weigth, then
accepts the remote's fee if it is at most 50%-200% away
from the ideal. The fee negotiation procedure is similar
both as sender and receiver of the initial shutdown
message, and this commit also make both sides use the
same code path for handling these messages.
This commit fixes a bug which was covered by the recent server
refactoring wherein the grouting would be stuck on the send over the
message channel in the case that the handshake failed. This blockage
would create a deadlock now that the ConnectToPeer method is full
synchronous.
We fix this issue by ensuring the goroutine properly exits.
In addition to improved synchronization between the client
and server, this commit also moves the channel snapshotting
procedure such that it is handled without submitting a query
to the primary select statement. This is primarily done as a
precaution to ensure that no deadlocks occur, has channel
snapshotting has the potential to block restarts.
This commit ensures that all references within the chanMsgStreams are
all removed and deleted when the readHandler exits. This ensures that
all objects don’t have extra references, and will properly be garbage
collected.
This commit fixes a bug that existed in the prior scheme we used to
synchronize between the funding manager and the peer’s readHandler.
Previously, it was possible for messages to be re-ordered before the
reached the target ChannelLink. This would result in commitment
failures as the state machine assumes a strict in-order message
delivery. This would be manifested due to the goroutine that was
launched in the case of a pending channel funding.
The new approach using the chanMsgStream is much simpler, and easier to
read. It should also be a bit snappier, as we’ll no longer at times
create a goroutine for each message.
This commit modifies the channel close negotiation workflow to instead
take not of the fat that with the new funding workflow, the delivery
scripts are no longer pre-committed to at the start of the funding
workflow. Instead, both sides present their delivery addresses at the
start of the shutdown process, then use those to create the final
cooperative closure transaction.
To accommodate for this new change, we now have an intermediate staging
area where we store the delivery scripts for both sides.
In this commit daemon have been changed to set the proper hooks in the
channel link and switch subsystems so that they could send and receive
encrypted onion errors.
In current commit big shift have been made in direction of unit testable
payments scenarios. Previosly two additional structures have been added
which had been spreaded in the lnd package before, and now we apply
them in the lnd itself:
1. ChannelLink - is an interface which represents the subsystem for
managing the incoming htlc requests, applying the changes to the
channel, and also propagating/forwarding it to htlc switch.
2. Switch - is a central messaging bus for all incoming/outgoing htlc's.
The goal of the switch is forward the incoming/outgoing htlc messages
from one channel to another, and also propagate the settle/fail htlc
messages back to original requester.
With this abtractions the folowing schema becomes nearly complete:
abstraction
^
|
| - - - - - - - - - - - - Lightning - - - - - - - - - - - - -
|
| (Switch) (Switch) (Switch)
| Alice <-- channel link --> Bob <-- channel link --> Carol
|
| - - - - - - - - - - - - - TCP - - - - - - - - - - - - - - -
|
| (Peer) (Peer) (Peer)
| Alice <----- tcp conn --> Bob <---- tcp conn -----> Carol
This commit adds a set of additional comments around the new channel
closure workflow and also includes two minor fixes:
* The error when parsing a signature previously wasn’t checked and is
now.
* As a result, we should only track the new signature iff it parses
correctly and we agree to the details as specified w.r.t to the fee
for the final closing transaction.
Additionally, as set of TODO’s has been added detailing the additional
work that needs to be done before the closing workflow is fully
compliant with the specification.
This commit changes the cooperative channel close workflow to comply
with the latest spec. This adds steps to handle and send shutdown
messages as well as moving responsibility for sending the channel close
message from the initiator to the responder.
This commit fixes an issue that would at times cause the htlcManager
which manages the link that’s the final hop to settle in an HTLC flow.
Previously, a case would arise wherein a set of HTLC’s were settled to,
but not properly committed to in the commitment transaction of the
remote node. This wasn’t an issue with HTLC’s which were added but
uncleared, as that batch was tracked independently.
In order to fix this issue, we now track pending HTLC settles
independently. This is a temporary fix, as has been noted in a TODO
within this commit.
This commit fixed an issue in the htlcManager goroutine which manages
channel state updates. Due to lack of a mutex protecting the two maps
written in the goroutine launched to forward HTLC’s to the switch.
This issue was detected by golang’s runtime which is able to detect
invalid concurrent map writes.
This commit adds the FeeEstimator interface, which can be used for
future fee calculation implementations. Currently, there is only the
StaticFeeEstimator implementation, which returns the same fee rate for
any transaction.
This commit reverts a prior commit
178f26b8d5ef14b437b9d8d1755bd238212b4dec that introduced a scenario
that could cause a state desynchronization and/or a few extraneous
commitment updates. To avoid such cases, the commitment tick timer is
now only started after _receiving_ a commitment update.
This commit fixes a panic bug in the watiForChanToClose method caused
by a logic error leading to the return value of the function at times
being a nil pointer in the case that an error occurred. We now avoid
such an error by _always_ returning from the function if there’s an
error, but conditionally (in a diff if-clause) sending an error over
the error channel.
This commit fixes a bug which could at times cause channels to be
unusable upon connection. The bug would manifest like the following:
two peers would connect, one loads their channels faster than the
other, this would result in the winning peer attempting to extend their
revocation window. However, if the other peer hadn’t yet loaded the
channel, then this would appear to them to be an unknown channel.
We properly fix this issue by ensure all channels are loaded _before_
any of the goroutines needed for the operation of the peer are
launched.
Within this commit the peer will now properly manage the channel close
life cycle within the database. This entails marking the channel as
pending closed either once the closing transaction has been broadcast
or the close request message has been sent to the other side.
Once the closing transaction has been confirmed, the transaction will
be marked as fully closed within the database. A helper function has
been added to factor out “waiting for a transaction to confirm” when
handling moth local and remote cooperative closure flows.
Finally, we no longer delete the channel state within wipeChannel as
this will now be managed distinctly by callers.
The prior methods we employed to handle persistent connections could
result in the following situation: both peers come up, and
_concurrently_ establish connection to each other. With the prior
logic, at this point, both connections would be terminated as each peer
would go to kill the connection of the other peer. In order to resolve
this issue in this commit, we’ve re-written the way we handle
persistent connections.
The eliminate the issue described above, in the case of concurrent peer
connection, we now use a deterministic method to decide _which_
connection should be closed. The following rule governs which
connection should be closed: the connection of the peer with the
“smaller” public key should be closed. With this rule we now avoid the
issue described above.
Additionally, each peer now gains a peerTerminationWatcher which waits
until a peer has been disconnected, and then cleans up all resources
allocated to the peer, notifies relevant sub-systems of its demise, and
finally handles re-connecting to the peer if it's persistent. This
replaces the goroutine that was spawned in the old version of
peer.Disconnect().
This commit adds a new method to the peer struct: WaitForDisconnect().
This method is put in place to be used by wallers to synchronize the
ending of a peer’s lifetime. A follow up commit will utilize this new
method to re-write the way we handle persistent peer connections.
This commit modifies both readMessage and writeMessage to be further
message oriented. This means that message will be read and written _as
a whole_ rather than piece wise. This also fixes two bugs: the
readHandler could be blocked due to an sync read, and the writeHandler
would unnecessarily chunk up wire messages into distinct crypto
messages rather than writing it in one swoop.
Also with these series of changes, we’re now able to properly parse
messages that have been padded out with additional data as is allowed
by the current specification draft.
This commit implements the new ping/pong messages along with their new
behavior. The new set of ping/pong messages allow clients to generate
fake cover traffic as the ping messages tells the pong message how many
bytes to included and can also be padded itself.
This commit fixes a deadlock bug within the readHandler of the peer.
Previously, once a channel was pending opening, _no_ other message
would be processed by the readHandler as it would be blocked waiting
for the channel to open. On testnet this would be manifsted as a node
locking up, until the channel was detected as being open.
We fix this bug by tracking which channel streams are active. If a
channel stream is active, then we can send the update directly to it.
Otherwise, we launch a goroutine that’ll block until the channel is
open, then in a synchronized manner, update the channel stream as being
active and send the update to the channel.
This commit modifies the way the fundingManager tracks pending funding
workflows internally. Rather than using the old auto-incrementing
64-bit pending channel ID’s, we now use a 32-byte pending channel ID
which is generated using a CSPRG. Additionally, once the final funding
message has been sent, we now de-multiplex the FundingLocked message
according to the new Channel ID’s which replace the old ChannelPoint’s
and are exactly 32-bytes long.
This map was added very early on as a possible path to implement proper
retransmission. However, we now have a proper persistent retransmission
sub-system being proposed as a PR, therefore we no longer have any use
for this.
This commit patches a whole in our optimistic channel synchronization
logic by making the logCommitTimer a persistent ticker rather than one
that is activated after receiving a commitment, and disabled once we
send a new commitment ourself. In the setting of batched full-duplex
channel updates, the prior approach could at times result in a benign
state desync caused by one side being one commitment ahead of the other
because one of the nodes failed to, or was unable to provide the other
with a state update during the workflow.
This commit simplifies the channel state update handling by doing away
with the commitmentState.pendingUpdate method all together. The newly
added LightningChannel.FullySynced method replace the prior state and
also replaced all other uses of PendingUpdates.
By moving to using channel.FullySynced() we also eliminate class of
desynchronization error caused by a node failing to provide the other
side with the latest commitment state.
Change the name of fields of messages which are belong to the discovery
subsystem in a such way so they were the same with the names that are
defined in the specification.
Add usage of the 'discovery' package in the lnd, now discovery service
will be handle all lnwire announcement messages and send them to the
remote party.
This commit modifies the logic around the opening p2p handshake to
enforce a strict timeout around the receipt of the responding init
message. Before this commit, it was possible for the daemon and certain
RPC calls to deadlock as if a peer connected, but didn’t respond with
an init msg, then we’d be sitting there waiting for them to respond.
With this commit, we’ll now time out, kill the connection and then
possible attempt to re-connect if the connection was persistent.
This commit fixes a prior bug which would cause the set of HTLC’s on a
node’s commitment to potentially overflow if an HTLC was accepted or
attempted to be forwarded that but the commitment transaction over the
maximum allowed HTLC’s on a commitment transaction. This would cause
the HTLC to silently be rejected or cause a connection disconnect. In
either case, this would cause the two states to be desynchronized any
pending HTLC’s to be ignored.
We fix this issue by introducing the concept of a bounded channel,
which is a channel in which the number of items send and recevied over
the channel must be balanced in order to allow a new send to succeed
w/o blocking. We achieve this by using a chan struct{} as a semaphore
and decrement it each time a packet it sent, increasing the semaphore
one a packet is received. This creates a channel that we can use to
ensure the switch never sends more than N HTLC’s to a link before any
of the HTLC’s have been settled.
With this bug fix, it’s now once again possible to trigger sustained
bursts of payments through lnd nodes.
This commit fixes a bug in the opening handshake between to peers. The
bug would arise when on or many channels were active between a node
establishing a new connection. The htlcManager goroutines will
immediately attempt to extend the revocation window once they’re
active. However, at this time, the init message may not yet have been
sent as the two executions are in distinct goroutines.
We fix this bug by manually writing the init message directly to the
socket, rather than traveling through the queueHandler goroutine. With
this, we ensure that the init message is _always_ sent first.
This commit removes all instances of the fastsha256 library and
replaces it with the sha256 library in the standard library. This
change should see a number of performance improvements as the standard
library has highly optimized assembly instructions with use vectorized
instructions as the platform supports.
This commit modifies the peer struct’s loadActiveChannels method to
ensure that it skips over channels that aren’t actually active. This
fixes a bug that could possibly cause a peer to get stuck in a circular
wait loop and also de-sync a channel’s state machine.
When the funding transaction has been confirmed, the FundingLocked
message is sent by the peers to each other so that the existence of the
newly funded channel can be announced to the network.
This commit also removes the SingleFundingOpenProof message.
Once a channel funding process has advanced to the point of broadcasting
the funding transaction, the state of the channel should be persisted
so that the nodes can disconnect or go down without having to wait for the
funding transaction to be confirmed on the blockchain.
Previously, the finalization of the funding process was handled by a
combination of the funding manager, the peer and the wallet, but if
the remote peer is no longer online or no longer connected, this flow
will no longer work. This commit moves all funding steps following
the transaction broadcast into the funding manager, which is available
as long as the daemon is running.
github.com/lightningnetwork/lnd master ✗
0m ◒
▶ golint
htlcswitch.go:292:4: should replace numUpdates += 1 with numUpdates++
htlcswitch.go:554:6: var onionId should be onionID
htlcswitch.go:629:7: var onionId should be onionID
lnd_test.go:133:1: context.Context should be the first parameter of a
function
lnd_test.go:177:1: context.Context should be the first parameter of a
function
networktest.go:84:2: struct field nodeId should be nodeID
peer.go:1704:16: should omit 2nd value from range; this loop is
equivalent to `for invoice := range ...`
rpcserver.go:57:6: func newRpcServer should be newRPCServer
github.com/lightningnetwork/lnd master ✗
9m ⚑ ◒ ⍉
▶ go vet
features.go:12: github.com/lightningnetwork/lnd/lnwire.Feature
composite literal uses unkeyed fields
fundingmanager.go:380: no formatting directive in Errorf call
exit status 1
Previously, during the channel funding process, peers sent wire
messages using peer.queueMsg. By switching to server.sendToPeer, the
fundingManager is more resilient to network connection issues or system
restarts during the funding process. With server.sendToPeer, if a peer
gets disconnected, the daemon can attempt to reconnect and continue the
process using the peer’s public key ID.
This commit modifies a peer’s htlcManager goroutine in order to
properly implement the new state machine defined by the specification.
The major change to this new state machine is that we can no longer
have a limited number of unrevoked commitment states. As a result, we
no longer need to track how many outsanding changes we have, and only
need to track if we have a pending change or not. This simplifies the
logic a bit.
Additionally, when receive a new signature we FIRST send an
RevokeAndAck, THEN we if we need to send a signature in response or
not. This is the major change to the state machine from the PoV of the
htlcManager. Previously, the order was flipped.
In this commit the support for global and local feature vectors were
added in 'server' and 'peer' structures respectively. Also with commit
additional logic was added and now node waits to receive 'init'
lnwire.Message before sending/responding on any other messages.
This commit patches a bug in the code for handling a remote cooperative
channel closer. Previous if the region node didn’t know of the channel
which was being requested to close, then a panic would occur as the
entry read from the map would be nil.
To fix this bug, we now ensure that the channel exists before we
perform any actions on it. In a later commit which overhauls the
channel opening and closing to match that of the specification, this
logic will be modified to properly send an error message in response to
the failed channel closure.
This commit removes the BlockChainIO interface as a dependency to the
LightningChannel struct as the interface is no longer used within the
operation of the LightningChannel.
This commit refactors the queueHandler slightly to be more aggressive
when attempting to drain the pending message queue.
With this new logic the queueHandler will now attempt to _completely_
drain the pendingMsgs queue by sending it all to the writeHandler
_before_ attempting to accept new messages from the outside
sub-systems. The previous queueHandler was a bit more passive and would
result in messages sitting the the pendingMessage queue longer than
required.
This commit fixes a panic that can occur on 32-bit systems to the
misalignment of a int64/uint64 that’s used atomically using the atomic
package. To fix this issue, we now move all int64/unit64 variables that
are used atomically to the top of the struct in order to ensure
alignment.
Cleaning up some dead-code, satoshisSent/satoshisReceived have been
removed as they aren’t currently used. Instead those values are
accessed directly from the channel themselves.
This commit enhances the peer struct slightly be attempting to measure
the ping time to the remote node based on when we send a ping message
an dhow long it takes for us to receive a pong response.
The current method used to measure RTT is rather rough and could be
make much more accurate via the usage of an EMA/WMA and also via
attempting to measure processing time within the readMessage and
writeMessage functions.
The previous logging message was broken as that target chanpoint was
being overshadowed by the local variable declaration. The new logging
message will properly print the unknown channel point as well as the
peer who sent the message.
This commit adds a new error type to the `lnwire` package:
`UnknownMessage`. With this error we can catch the particular case of a
an error during reading that encounters a new or unknown message. When
we encounter this message in the peer’s readHandler, we can now
gracefully handle it by just skipping to the next message rather than
closing out the section entirely.
This puts us a bit closer to the spec, but not exactly as it has an
additional constraint that we can only ignore a new message if it has
an odd type. In a future release, we’ll modify this code to match the
spec as written.
This commit modifies the login of sent/recv’d wire messages in trace
mode in order utilize the more detailed, and automatically generated
logging statements using pure spew.Sdump.
In order to avoid the spammy messages due to spew printing the
btcec.S256() curve paramter within wire messages with public keys, we
introduce a new logging function to unset the curve paramter to it
isn’t printed in its entirety. To insure we don’t run into any panics
as a result of a nil pointer defense, we now copy the public keys
during the funding process so we don’t run into a panic due to
modifying a pointer to the same object.
This commit modifies the ConnectPeer RPC call and partitions the
behavior of the call into two scenarios: the connection should be
persistent which causes the call to be non-blocking, and the connection
should only attempt to connect once — which causes the call to be
blocking and report any error back to the caller.
As a result, the pendingConnRequest map and the logic around it is no
longer needed.
This commit adds a critical capability to the daemon: proper handling
of error cases that occur during the dispatch and forwarding of
multi-hop payments.
The following errors are now properly handled within the daemon:
* unknown payment hash
* unknown destination
* incorrect HTLC amount
* insufficient link capacity
In response to any of these errors, an lnwire.CanceHTLC message will be
back propagated from the spot of the error back to the source of the
payment. In the case of a locally initiated HTLC payment, the error
will be propagated back to the client which initiated the payment.
At this point, proper encrypted error replies as defined within the
spec are not yet implemented and will be fully implemented within a
follow up PR.
This commit makes a large number of minor changes concerning API usage
within the deamon to match the latest version on the upstream btcsuite
libraries.
The major changes are the switch from wire.ShaHash to chainhash.Hash,
and that wire.NewMsgTx() now takes a paramter indicating the version of
the transaction to be created.
This commit adds a much needed feature to the daemon, namely the
ability to force close a channel while the source daemon doesn’t have
an active connection to the counter party. Previously this wasn’t
possible as ALL channel closures were routed through the htlcSwitch
which is only able to trigger a channel closure if the peer is online.
To remedy this, if the closure type is “force” then, we now handle the
channel closure and related RPC streaming updates from the call handler
site of the RPC itself. As a result, there are now only two htlcSwitch
channel closure types: breach, and regular. The logic that’s now in the
rpcSever should likely be refactored into a distinct sub-system, but
getting the initial functionality in is important.
Finally, the channel breach integration test has been modified to skip
connection the peers before attempting the forceful channel closure of
a revoked state as the remote peer no longer needs to be online.
This commit modifies the interaction between the breachArbiter and the
peer struct such that the breachArbiter _always_ has the latest version
of a contract.
Prior to this commit once we connected out to a peer which we had an
active contract with and the breachArbiter was watching, we’d have two
copies of the channel in memory, instead of just a single one. This was
wasteful and caused some duplicated log messages due to two instances
of the channel being active.
This commit fully integrates the ChannelRouter of the new routing
package into the main lnd daemon.
A number of changes have been made to properly support the new
authenticated gossiping scheme.
Two new messages have been added to the server which allow outside
services to: send a message to all peers possible excluding one, and
send a series of messages to a single peer. These two new capabilities
are used by the ChannelRouter to gossip new accepted announcements and
also to synchronize graph state with a new peer on initial connect.
The switch no longer needs a pointer to the routing state machine as it
no longer needs to report when channels closed since the channel
closures will be detected by the ChannelRouter during graph pruning
when a new block comes in.
Finally, the funding manager now crafts the proper authenticated
announcement to send to the ChannelRouter once a new channel has bene
fully confirmed. As a place holder we have fake signatures everywhere
since we don’t properly store the funding keys and haven’t yet adapted
the Signer interface (or create a new one) that abstracts out the
process of signing a generic interface.
This commit adds some additional measures to ensure that a call to
queueMsg while the peer is shutting down won’t result in a potential
deadlock.
Currently, during shutdown the outgoingQueue channel is attempted to be
cleared by he writeHandler, however adding an additional select
statement serves as a mother layer of defense from nasty dead locks.
This commit revamps the way in bound and outbound connections are
handled within lnd. Instead of manually managing listening goroutines
and also outbound connections, all the duty is now assigned to the
connmgr, a new btcsuite package.
The connmgr now handles accepting inbound (brontide) connections and
communicates with the server to hand off new connections via a
callback. Additionally, any outbound connection attempt is now made
persistent by default, with the assumption that (for right now),
connections are only to be made to peers we wish to make connections
to. Finally, on start-up we now attempt to connection to all/any of our
direct channel counter parties in order to promote the availability of
our channels to the daemon itself and any RPC users.
This commit fixes a htlcSwitch bandwidth update bug that would manifest
when two sending daemons were started with the —debughtlc flag. The
invoice created for the debug HTLC has a value of 1000BTC. As a result
regardless of the amount sent, the switch’s state which be updated to
reflect that the daemon had just received a 1000BTC transfer.
To fix this bug, we now use the value of the HTLC itself for the
update, rather than the value if the invoice as they should match.
This commit introduces a new sub-system into the daemon whose job it is
to vigilantly watch for any potential channel breaches throughout the
up-time of the daemon. The logic which was moved from the utxoNursery
in a prior commit now resides within the breachArbiter.
Upon start-up the breachArbiter will query the database for all active
channels, launching a goroutine for each channel in order to be able to
take action if a channel breach is detected. The breachArbiter is also
responsible for notifying the htlcSwitch about channel breaches in
order to black-list the breached linked during any multi-hop forwarding
decisions.
This commit removes the previously added contract breach retribution
duties from the utxoNursery. Much of the code removed will instead be
moved to a new sub-system which continuously monitors the state of ALL
active contracts for their entire life time.
Use [33]byte for graph vertex representation.
Delete unneeded stuff:
1. DeepEqual for graph comparison
2. EdgePath
3. 2-thread BFS
4. Table transfer messages and neighborhood radius
5. Beacons
Refactor:
1. Change ID to Vertex
2. Test use table driven approach
3. Add comments
4. Make graph internal representation private
5. Use wire.OutPoint as EdgeId
6. Decouple routing messages from routing implementation
7. Delete Async methods
8. Delete unneeded channels and priority buffer from manager
9. Delete unneeded interfaces in internal graph realisation
10. Renamed ID to Vertex
This commit adds the necessary logic a peer’s htlcManger goroutine to
dispatch justice (sweeping all the funds in a channel) after it has
been detected that the counter-party has broadcast a prior revoked
commitment state.
The task to generate the justice transaction is handed off to the
utxoNursery. Once the nursery has finished its duty, the peer launches
a new goroutine which will delete the state of the channel once the
justice transaction has been confirmed within a block.
This commit refactors the peer struct slightly in order to implement
the new ping/pong workflow added in a prior commit. Pings are currently
sent every 30 seconds unconditionally.
This commit modifies both the Sphinx packet generation and processing
for recent updates to the API.
With the version 1 Sphinx specification, the payment hash is now
included in the MACs in order to thwart any potential replay attacks.
As a result, any attempts to replay previous HTLC packets MUST re-use
the same payment hash, meaning that the first-hop node can simply
settle the HTLC immediately, thwarting the attacker.
Additionally, within the Sphinx packet, each hop now gets a per-hop
payload which contains the necessary details (CTLV value, fee, etc) for
the node to successfully forward the payment. This per-hop payload is
protected by a packet-wide MAC.
This commit modifies the existing p2p connection authentication and
encryption scheme to now use the newly designed ‘brontide’
authenticated key agreement scheme for all connections.
Additionally, within the daemon lnwire.NetAddress is now used within
all peers which encapsulates host information, a node’s identity public
key relevant services, and supported bitcoin nets.