In this commit, we modify the way we handle FeeInsufficientErrors to
more aggressively route around nodes that repeatedly return the same
error to us. This will ensure we skip older nodes on the network which
are running a buggier older version of lnd. Eventually most nodes will
upgrade to this new version, making this change less needed.
We also update the existing test to properly use a multi-hop route to
ensure that we route around the offending node.
In this commit, we update the testUpdateChannelPolicy to exercise the
recent set of changes within the switch. If one applies this test to a
fresh branch (without those new changes) it should fail. This is due to
the fact that before, Bob would attempt to apply the constraints of the
incoming link (which we updated) instead of the outgoing link. With the
recent set of changes, the test now properly passes.
In this commit, we fix the TestUpdateForwardingPolicy test case after
the recent modification in the way we handling validating constraints
within the link. After the recent set of changes, Bob will properly use
his outgoing link to validate the set of fee related constraints rather
than the incoming link. As a result, we need to modify the second
channel link, not the first for the test to still be applicable.
In this commit, we simplify the switch's code a bit. Rather than having
a set of channels we use to mutate or query for the set of current
links, we'll instead now just use a mutex to guard a set of link
indexes. This serves to simplify the ode, and also make it such that we
don't need to block forwarding in order to add/remove a link.
In this commit, we fix a very old, lingering bug within the link. When
accepting an HTLC we are meant to validate the fee against the
constraints of the *outgoing* link. This is due to the fact that we're
offering a payment transit service on our outgoing link. Before this
commit, we would use the policies of the *incoming* link. This would at
times lead to odd routing errors as we would go to route, get an error
update and then route again, repeating the process.
With this commit, we'll properly use the incoming link for timelock
related constraints, and the outgoing link for fee related constraints.
We do this by introducing a new HtlcSatisfiesPolicy method in the link.
This method should return a non-nil error if the link can carry the HTLC
as it satisfies its current forwarding policy. We'll use this method now
at *forwarding* time to ensure that we only forward to links that
actually accept the policy. This fixes a number of bugs that existed
before that could result in a link accepting an HTLC that actually
violated its policy. In the case that the policy is violated for *all*
links, we take care to return the error returned by the *target* link so
the caller can update their sending accordingly.
In this commit, we also remove the prior linkControl channel in the
channelLink. Instead, of sending a message to update the internal link
policy, we'll use a mutex in place. This simplifies the code, and also
adds some necessary refactoring in anticipation of the next follow up
commit.
In this commit, we fix an existing deadlock in the
processChanPolicyUpdate method. Before this commit, within
processChanPolicyUpdate, we would directly call updateChannel *within*
the ForEachChannel closure. This would at times result in a deadlock, as
updateChannel will itself attempt to create a write transaction in order
to persist the newly updated channel.
We fix this deadlock by simply performing another loop once we know the
set of channels that we wish to update. This second loop will actually
update the channels on disk.
In this commit, we might a very small change to the way writing messages
works in the peer, which should have large implications w.r.t reducing
memory usage amongst chatty nodes.
When profiling the heap on one of my nodes earlier, I noticed this
fragment:
```
Showing top 20 nodes out of 68
flat flat% sum% cum cum%
0 0% 0% 75.53MB 54.61% main.(*peer).writeHandler
75.53MB 54.61% 54.61% 75.53MB 54.61% main.(*peer).writeMessage
```
Which points to an inefficiency with the way we handle allocations when
writing new messages, drilling down further we see:
```
(pprof) list writeMessage
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeMessage in /root/go/src/github.com/lightningnetwork/lnd/peer.go
75.53MB 75.53MB (flat, cum) 54.61% of Total
. . 1104: p.logWireMessage(msg, false)
. . 1105:
. . 1106: // As the Lightning wire protocol is fully message oriented, we only
. . 1107: // allows one wire message per outer encapsulated crypto message. So
. . 1108: // we'll create a temporary buffer to write the message directly to.
75.53MB 75.53MB 1109: var msgPayload [lnwire.MaxMessagePayload]byte
. . 1110: b := bytes.NewBuffer(msgPayload[0:0:len(msgPayload)])
. . 1111:
. . 1112: // With the temp buffer created and sliced properly (length zero, full
. . 1113: // capacity), we'll now encode the message directly into this buffer.
. . 1114: n, err := lnwire.WriteMessage(b, msg, 0)
(pprof) list writeHandler
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeHandler in /root/go/src/github.com/lightningnetwork/lnd/peer.go
0 75.53MB (flat, cum) 54.61% of Total
. . 1148:
. . 1149: // Write out the message to the socket, closing the
. . 1150: // 'sentChan' if it's non-nil, The 'sentChan' allows
. . 1151: // callers to optionally synchronize sends with the
. . 1152: // writeHandler.
. 75.53MB 1153: err := p.writeMessage(outMsg.msg)
. . 1154: if outMsg.errChan != nil {
. . 1155: outMsg.errChan <- err
. . 1156: }
. . 1157:
. . 1158: if err != nil {
```
Ah hah! We create a _new_ buffer each time we want to write a message
out. This is unnecessary and _very_ wasteful (as seen by the profile).
The fix is simple: re-use a buffer unique to each peer when writing out
messages. Since we know what the max message size is, we just allocate
one of these 65KB buffers for each peer, and keep it around until the
peer is removed.
This commit fixes a bug within the funding manager, where we would use
the wrong min_htlc_value parameter. Instead of attributing the custom
passed value for MinHtlc to the remote's constraints, we would add it to
our own constraints.
This commit adds TestFundingManagerCustomChannelParameters, which checks
that custom channel parameters specified at channel creation is
preserved and recorded correctly on both sides of the channel.
This commit fixes a bug that would cause the local and remote commitment
to be incompatible when using custom remote CSV delay when opening a
channel. This would happen because we wouldn't store the CSV value
before we received the FundingAccept message, and here we would use the
default value.
This commit fixes this by making the csv value part of the
reservationWithCtx struct, such that it can be recorded for use when the
FundingAccept msg comes back.
This commit renames ForceCloseSummary to LocalForceCloseSummary, and
adds a new method NewLocalForceCloseSummary that can be used to derive a
LocalForceCloseSummary if our commitment transaction gets confirmed
in-chain. It is meant to accompany the NewUnilateralCloseSummary method,
which is used for the same purpose in the event of a remote commitment
being seen in-chain.
In this commit, we introduce the ability for the different ChainNotifier
implements to send incremental updates to the subscribers of transaction
confirmations. These incremental updates represent how many
confirmations are left for the transaction to be confirmed. They are
sent to the subscriber at every new height of the chain.
In this commit, we avoid storing extra copies of a transaction when
multiple clients register to be notified for the same transaction. We do
this by using a set, which only stores unique elements.
In this commit, we add a new Updates channel to our ConfirmationEvent
struct. This channel will be used to deliver updates to a subscriber of
a confirmation notification. Updates will be delivered at every
incremental height of the chain with the number of confirmations
remaining for the transaction to be considered confirmed by the
subscriber.
It is better to replace bash shell with potentially long-running
last script command. This way the running command will receive all
potential unix process signals directly.
A concrete example which motivated this change:
Exec of btcd is needed for graceful shutdown of btcd during
`docker-compose down`. Docker Compose properly sends this signal to our
start-btcd.sh bash shell but it is not further signalled to the running
btcd process. Docker Compose then kills whole container forcefully after
some timeout.
An alternative solution would be to trap SIGTERM in our bash script and
forward it to running btcd. Which would be IMO ugly and error prone.
In this commit, we fix an existing bug in the NewBreachRetribution
method. Rather than creating the slice to the proper length, we instead
now create it to the proper _capacity_. As we'll now properly filter out
any dust HTLCs, before this commit, even if no HTLCs were added, then
the slice would still have a full length, meaning callers could actually
interact with _blank_ HtlcRetribution structs.
The fix is simple: create the slice with the proper capacity, and append
to the end of it.
In this commit, we fix an existing within lnd. Before this commit,
within NewBreachRetribution the order of the keys when generating the
sender HTLC script was incorrect. As in this case, the remote party is
the sender, their key should be first. However, the order was swapped,
meaning that at breach time, our transaction would be rejected as it had
the incorrect witness script.
The fix is simple: swap the ordering of the keys. After this commit, the
test extension added in the prior commit now passes.
In this commit, we extend the testRevokedCloseRetributionRemoteHodl so
that the final broadcast revoked transaction has incoming *and* outgoing
HTLC's. As is, this test fails as there's a lingering bug in the way we
generate htlc resolutions. A follow up commit will remedy this issue.
In this commit we add a new error: InvalidHtlcSigError. This error will
be returned when we're unable to validate an HTLC signature sent by the
remote party. This will allow other nodes to more easily debug _why_ the
signature was rejected.
In this commit, we fix a slight bug in lnd. Before this commit, we would
send the error to the remote peer, but in an async manner. As a result,
it was possible for the connections to be closed _before_ the error
actually reached the remote party. The fix is simple: wait for the error
to be returned when sending the message. This ensures that the error
reaches the remote party before we kill the connection.
In this commit, add a new argument to the SendMessage method to allow
callers to request that the method block until the message has been sent
on the socket to the remote peer.