Commit Graph

272 Commits

Author SHA1 Message Date
Yaacov Akiba Slama
17223c1215 peer: add write timeout within writeMessage
In this commit, we add a timeout within the writeMessage method when we go to write to the socket. We do this as otherwise, if the other peer is blocked for some reason, we'll never actually unblock ourselves, which may cause issues in other sub-systems waiting on this write call. For now, we use a value of 10 seconds, and will adjust in the future if we deem this time period too short.
2018-06-26 17:27:22 -07:00
Wilmer Paulino
b1ba83bf2b
peer: prevent processing close msg if channel is not found 2018-06-25 13:13:12 -07:00
Johan T. Halseth
55c8741e42
peer: don't stop nil channel 2018-06-19 12:48:11 +01:00
Wilmer Paulino
90a50bd893
peer: remove no longer needed block epoch 2018-06-13 19:23:44 -07:00
Wilmer Paulino
8198466972
multi: move block epochs dependency from links to switch
In this commit, we move the block height dependency from the links in
the switch to the switch itself. This is possible due to a recent change
on the links no longer depending on the block height to update their
commitment fees.

We'll now only have the switch be alerted of new blocks coming in and
links will retrieve the height from it atomically.
2018-06-13 17:41:21 -07:00
Wilmer Paulino
4cc60493d2
peer+htlcswitch: randomize link commitment fee updates
In this commit, we modify the behavior of links updating their
commitment fees. Rather than attempting to update the commitment fee for
each link every time a new block comes in, we'll use a timer with a
random interval between 10 and 60 minutes for each link to determine
when to update their corresponding commitment fee. This prevents us from
oscillating the fee rate for our various commitment transactions.
2018-06-13 17:41:01 -07:00
Olaoluwa Osuntokun
e60d2b774a
htlcswitch: in event of duplicate link add, prefer newer link 2018-06-12 00:44:30 -07:00
Olaoluwa Osuntokun
e1d8c37708
peer: only add an active link to the channelManager if addPeer succeeds
In this commit we fix an existing bug which could cause internal state
inconsistency between then switch, funding manager, and the peer. Before
this commit, we would _always_ add a new channel to the channelManager.
However, due to recent logic, it may be the case that this isn't the
channel that will ultimately reside in the link. As a result, we would
be unable to process incoming FundingLocked messages properly, as we
would mutate the incorrect channel in memory.

We remedy this by moving the inserting of the new channel into the
activeChannels map until the end of the loadActiveChannels method, where
we know that this will be the link that persists.
2018-06-12 00:44:26 -07:00
Conner Fromknecht
edf08458c1
peer: changes to satisfy lnpeer.Peer
This commit adds:
 - variadic SendMessage method (w/ sync bool reorder)
 - passes peer object directly to ProcessRemoteAnnouncment
 - adds IdentityKey() method for lnpeer.Peer
2018-06-08 13:47:57 -07:00
maurycy
3be08e69cf multi: 64bit aligment of atomic vars on arm/x86-32 2018-06-04 20:02:34 -07:00
Olaoluwa Osuntokun
210e28a483
peer: add messages summaries for the new gossip query messages 2018-05-31 16:30:56 -07:00
Olaoluwa Osuntokun
40bb19eb6e
peer: dispath new gossip query messages to the gossiper 2018-05-31 16:30:55 -07:00
Olaoluwa Osuntokun
bfcdbe2204 peer: when failing link, depend on server wg, not peer
In this commit, we fix a recently introduced bug. The issue is that
while we're failing the link, the peer we're attempting to force close
on may disconnect. As a result, if the peerTerminationWatcher exits
before we can add to the wait group (it's waiting on that), then we'll
run into a panic as we're attempting to increment the wait group while
another goroutine is calling wait.

The fix is to first check that the server isn't shutting down, and then
use the server's wait group rather than the peer to synchronize
goroutines.

Fixes #1285.
2018-05-25 20:36:40 -07:00
Johan T. Halseth
3b2fd32523
peer: populate OnChannelFailure in link config
This commit makes the peer aware of the LinkFailureErrors that can
happen during link operation, and making it start a goroutine to
properly remove the link and force close the channel.
2018-05-25 06:58:23 +02:00
Johan T. Halseth
4836a25c98
peer: move link creation into new method 2018-05-25 06:54:05 +02:00
Johan T. Halseth
bb611e065d
peer: don't pass closeCtx to chanCloser 2018-05-22 12:06:33 +02:00
Olaoluwa Osuntokun
f0050e7ffb
Merge pull request #1201 from lightningnetwork/peer-async-disconnect
server: Peer Async Disconnect w/ Scheduled peerConnected
2018-05-10 17:05:16 -07:00
Olaoluwa Osuntokun
b271ed5ffb
Merge pull request #1199 from cfromknecht/queue-msg-err-chans
peer/server: Remove Broadcast err chans
2018-05-10 16:55:57 -07:00
Conner Fromknecht
a2f7170ff0
peer: make Disconnect async, block on WaitForDisconnect 2018-05-08 16:37:14 -07:00
Conner Fromknecht
0691b21a30
peer: improves disconnect handling
This commit attempts to resolve some potential deadlock
scenarios during a peer disconnect.

Currently, writeMessage returns a nil error when disconnecting.
This should have minimal impact on the writeHanlder, as the
subsequent loop selects on the quit chan, and will cause it to
exit. However, if this happens when sending the init message,
the Start() method will attempt to proceed even though the peer
has been disconnected.

In addition, this commit changes the behavior of synchronous
write errors, by using a non-blocking select. Though unlikely,
this prevents any cases where multiple errors are returned, and
the errors are not being pulled from the other side of the errChan.
This removes any naked sends on the errChan from stalling the peer's
shutdown.
2018-05-08 16:35:49 -07:00
Olaoluwa Osuntokun
72f48b6abe
htlcswitch+server: ensure we always send an update w/ a TempChannelFailure
In this commit, we ensure that any time we send a TempChannelFailure
that's destined for a multi-hop source sender, then we'll always package
the latest channel update along with it.
2018-05-08 13:00:28 -04:00
Olaoluwa Osuntokun
d72f28839d
Merge pull request #1104 from halseth/chainwatcher-handoff-race
Fix chainwatcher handoff race
2018-05-03 17:18:31 -07:00
Olaoluwa Osuntokun
ecfde2e85f
Merge pull request #1149 from cfromknecht/trim-pending-htlc-index
Trim Open Circuits Using HTLC Index of Pending Commitments
2018-05-03 16:45:29 -07:00
Olaoluwa Osuntokun
5f059e74cb
peer: ensure msgConsumer sets the shutdown variable on exit
In this commit, we fix a bug that could at times cause a deadlock when a
peer is attempting to disconnect. The issue was that when a peer goes to
disconnect, it needs to stop any active msgStream instances. The Stop()
method of the msgStream would block until an atomic variable was set to
indicate that the stream had fully exited. However, in the case that we
disconnected lower in the msgConsumer loop, we would never set the
streamShutdown variable, meaning that msgStream.Stop() would never
unblock.

The fix for this is simple: set the streamShutdown variable within the
quit case of the second select statement in the msgConsumer goroutine.
2018-05-03 15:45:22 -07:00
Conner Fromknecht
701d37725c
peer: extract hodl mask, remove htlchodl mode 2018-05-02 00:18:51 -07:00
Johan T. Halseth
08f1a3689d
peer: don't pass bool to SubscribeChannelEvents 2018-05-02 08:43:32 +02:00
Johan T. Halseth
bd4e717971
peer: don't load channels that have had commitment broadcasted 2018-04-25 09:37:24 +02:00
practicalswift
663c396235 multi: fix a-vs-an typos 2018-04-17 19:02:04 -07:00
Olaoluwa Osuntokun
ffabb17ce6
peer: use new fetchLastChanUpdate method to populate the ChannelLinkConfig 2018-04-06 14:52:01 -07:00
Olaoluwa Osuntokun
7cbe78eeee
peer: re-use a static writeBuf within writeMessage optimize memory usage
In this commit, we might a very small change to the way writing messages
works in the peer, which should have large implications w.r.t reducing
memory usage amongst chatty nodes.

When profiling the heap on one of my nodes earlier, I noticed this
fragment:
```
Showing top 20 nodes out of 68
      flat  flat%   sum%        cum   cum%
         0     0%     0%    75.53MB 54.61%  main.(*peer).writeHandler
   75.53MB 54.61% 54.61%    75.53MB 54.61%  main.(*peer).writeMessage
```

Which points to an inefficiency with the way we handle allocations when
writing new messages, drilling down further we see:
```
(pprof) list writeMessage
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeMessage in /root/go/src/github.com/lightningnetwork/lnd/peer.go
   75.53MB    75.53MB (flat, cum) 54.61% of Total
         .          .   1104:   p.logWireMessage(msg, false)
         .          .   1105:
         .          .   1106:   // As the Lightning wire protocol is fully message oriented, we only
         .          .   1107:   // allows one wire message per outer encapsulated crypto message. So
         .          .   1108:   // we'll create a temporary buffer to write the message directly to.
   75.53MB    75.53MB   1109:   var msgPayload [lnwire.MaxMessagePayload]byte
         .          .   1110:   b := bytes.NewBuffer(msgPayload[0:0:len(msgPayload)])
         .          .   1111:
         .          .   1112:   // With the temp buffer created and sliced properly (length zero, full
         .          .   1113:   // capacity), we'll now encode the message directly into this buffer.
         .          .   1114:   n, err := lnwire.WriteMessage(b, msg, 0)
(pprof) list writeHandler
Total: 138.31MB
ROUTINE ======================== main.(*peer).writeHandler in /root/go/src/github.com/lightningnetwork/lnd/peer.go
         0    75.53MB (flat, cum) 54.61% of Total
         .          .   1148:
         .          .   1149:                   // Write out the message to the socket, closing the
         .          .   1150:                   // 'sentChan' if it's non-nil, The 'sentChan' allows
         .          .   1151:                   // callers to optionally synchronize sends with the
         .          .   1152:                   // writeHandler.
         .    75.53MB   1153:                   err := p.writeMessage(outMsg.msg)
         .          .   1154:                   if outMsg.errChan != nil {
         .          .   1155:                           outMsg.errChan <- err
         .          .   1156:                   }
         .          .   1157:
         .          .   1158:                   if err != nil {
```

Ah hah! We create a _new_ buffer each time we want to write a message
out. This is unnecessary and _very_ wasteful (as seen by the profile).
The fix is simple: re-use a buffer unique to each peer when writing out
messages. Since we know what the max message size is, we just allocate
one of these 65KB buffers for each peer, and keep it around until the
peer is removed.
2018-04-06 12:55:17 -07:00
Olaoluwa Osuntokun
ca9174e166
peer: extend SendMessage to allow callers to block until msg is sent 2018-04-04 17:43:57 -07:00
Olaoluwa Osuntokun
447a031435
peer: reject remote closes with active HTLCs
In this commit, we follow up to the prior commit by ensuring we won't
accept a co-op close request for a chennel with active HTLCs. When
creating a chanCloser for the first time, we'll check the set of HTLC's
and reject a request (by sending a wire error) if the target channel
still as active HTLC's.
2018-03-30 13:06:57 -07:00
Olaoluwa Osuntokun
e7d66e1dfd
peer: don't d/c peer if we encounter lnwire.ErrUnknownAddrType
In this commit, we fix a minor deviation in our implementation from the
specification. Before if we encountered an unknown error type, we would
disconnect the peer. Instead, we’ll now just continue along parsing the
remainder of the messages. This was flared up recently by some
c-lightning related incompatibilities that emerged on main net.
2018-03-23 15:49:33 -07:00
Olaoluwa Osuntokun
fda3b871c1
peer: ensure we stop the channel if error happens in loadActiveChannels
In this commit, we fix a goroutine leak that could occur if while we
were loading an error occurred in any of the steps after we created the
channel object, but before it was actually loaded in to the script. If
an error occurs at any step, we ensure that we’ll stop toe channel.
Otherwise, the sigPool goroutines would still be lingering and never be
stopped.
2018-03-19 19:14:55 -07:00
Conner Fromknecht
c1f0c4ffda
peer: DecodeOnionObfuscator -> ExractErrorEncrypter 2018-03-13 16:33:29 -07:00
Johan T. Halseth
a6c2550404
peer: track failed channels
This commit adds a set used to track channels we consider failed. This
is done to ensure we don't end up in a connect/disconnect loop when we
attempt to re-sync the channel state of a failed channel with a peer.
2018-03-13 11:11:17 +01:00
Olaoluwa Osuntokun
53045450ad
Merge pull request #823 from Roasbeef/chan-stream-buf-limit
peer: modify the msgStream to not buffer messages off the wire indefi…
2018-03-12 19:39:48 -07:00
Olaoluwa Osuntokun
c285bb5814 htlcswitch+peer: remove DecodeHopIterator from ChannelLinkConfig
In this commit, we remove the DecodeHopIterator method from the
ChannelLinkConfig struct. We do this as we no longer use this method,
since we only ever use the DecodeHopIterators method now.
2018-03-12 18:58:08 -07:00
Olaoluwa Osuntokun
60c8257c3c
peer: modify the msgStream to not buffer messages off the wire indefinitely
In this commit, we modify the msgStream struct to ensure that it has a
cap at which it’ll continue to buffer messages. Currently we have two
msgStream structs per peer: the first for the discovery messages, and
the second for any messages that modify channel state. Due to
inefficiencies in the current protocol for reconciling graph state upon
connection (just dump the entire damn thing), when a node first starts
up, this can lead to very high memory usage as all peers will
concurrently send their initial message dump which can be in the
thousands of messages on testate.

Our fix is simple: make the message stream into a _bounded_ message
stream. The newMsgStream function now has a new argument: bufSize.
Internally, we’ll take this bufSize and create more or less an internal
semaphore for the producer. Each time the producer gets a new message,
it’ll try and read an item from the channel. If the queue still has
size, then this will succeed immediately. If not, then we’ll block
until the consumer actually finishes processing a message and then
signals by sending a new item into the channel.

We choose an initial value of 1000. This was chosen as there’s already
a max limit of outstanding adds on the commitment, and a value of 1000
should allow any incoming messages to be safely flushed and processed
by the gossiper.
2018-03-12 16:34:59 -07:00
Johan T. Halseth
80ef16e853
peer: hand lnwire.Error to fndgMngr or link depending on chanID 2018-03-11 17:21:23 +01:00
Conner Fromknecht
972d238f04
peer: remove Switch from channel link config 2018-03-09 21:18:15 -08:00
Conner Fromknecht
d468a36d69
peer: pass unsafe-replay to link config 2018-03-09 21:18:15 -08:00
Conner Fromknecht
8d0e3dc467
peer: adds FwdPkgGCTicker to channel configs 2018-03-09 21:18:14 -08:00
Conner Fromknecht
0e4be6a04a
peer: init link with batched sphinx processing 2018-03-09 21:18:14 -08:00
Johan T. Halseth
b9d1eceda3
peer: use EstimateFeePerVSize 2018-02-26 22:42:26 +01:00
MeshCollider
2c2ed3c6a9 multi: Unify use of NodeKey in log messages 2018-02-19 17:48:39 -08:00
MeshCollider
915c4201b9 multi: remove internal peer_id usage 2018-02-19 17:48:39 -08:00
practicalswift
b8e1351cf3 multi: fix some recently introduced typos 2018-02-18 15:27:29 -08:00
Olaoluwa Osuntokun
2e090ee2ab
peer: only return a channel snapshot if the channel can route
In this commit, we fix a slight miscalculation within the GetInfo call.
Before this commit, we would list any channel that the peer knew of as
active, instead of those which are, well, actually *active*. We fix
this by skipping any channels that we don’t have the remote revocation
for.
2018-02-08 19:40:54 -08:00
Olaoluwa Osuntokun
22951cb364
lnd: account for new lnwire.Sig API and channeldb API changes 2018-02-06 20:14:33 -08:00