In this commit, we add a timeout within the writeMessage method when we go to write to the socket. We do this as otherwise, if the other peer is blocked for some reason, we'll never actually unblock ourselves, which may cause issues in other sub-systems waiting on this write call. For now, we use a value of 10 seconds, and will adjust in the future if we deem this time period too short.
In this commit, we add a new package level mutex. Each time we decode a
new set of chan IDs w/ zlib, we also grab this mutex. The purpose here
is to ensure that we only EVER allocate the maxZlibBufSize globally
across all peers. Otherwise, it may be possible for us to allocate up to
64 MB for _each_ peer, exposing an easy OOM attack vector.
In this commit, we implement zlib encoding and decoding for the channel
range queries. Notably, we utilize an io.LimitedReader to ensure that we
can enforce a hard cap on the total number of bytes we'll ever allocate
in a decoding attempt.
In this commit, we modify the removeLink method to be more asynchronous.
Before this commit, we would attempt to block until the peer exits.
However, it may be the case that at times time, then target link is
attempting to forward a batch of packets to the switch (forwardBatch).
Atm, this method doesn't pass in an external context/quit, so we can't
cancel this message/request. As a result, we'll now ensure that
`removeLink` doesn't block, so we can resume the switch's main loop as
soon as possible.
This commit handles a racy condition within the breacharbiter's justice
tx procedure. For backends that have no mempool we would check if an
HTLC output was spent and then try broadcasting the justice tx, but this
would fail since we wouldn't detect the spend before it was in a block.
The result was that we would continuously attempt to broadcast the
transaction, effectively ending up in an endless (until the second-level
tx actually comfirmed) loop.
Instead we now register for spend notifications in case broadcasting the
transaction fails, and then wait for any of the notifications to be
sent before trying again.
This is a necessary step to be able to make lnd work well only with
confimed transactions, and was a better solution than introducing
timeouts within the broadcast loop (which complicates integration
tests).