In this commit, we add a 5 minute idle timer to
the write handler. After catching the write
timeouts, it's been observed that some connections
have trouble reading a message for several hours.
This typically points to a deeper issue w/ the peer
or, e.g. the remote peer switched networks. This now
mirrors the idle timeout used in the read handler,
such that we will disconnect a peer if we are unable
to send or receive a message from the peer after 5
minutes.
We also modify the readHandler to drain its
idleTimer's channel in the even that the timer had
already fired, but we successfully sent the message.
This commit reduces the peer's write timeout to 5s.
Now that the peer catches write timeouts and doesn't
disconnect, this will ensure we spend less time blocking
in the write pool in case others also need to access the
workers concurrently. Slower peers will now only block
for 5s, after every reattempt w/ exponential backoff.
This commit modifies the writeHandler to catch timeout
errors, and retry writes to the socket after a small
backoff, which increases exponentially from 5s to 1m.
With the growing channel graph size, some lower-powered
devices can be slow to pull messages off the wire during
validation. The current behavior will cause us to
disconnect the peer, and resend all of the messages that
the remote peer is slow to validate. Catching the timeout
helps in preventing such expensive reconnection cycles,
especially as the network continues to grow.
This is also a preliminary step to reducing the
write timeout constant. This will allow concurrent usage
of the write pools w/out devoting excessive amounts of
time blocking the pool for slow peers.
This commit modifies the breach arbiter to monitor
all breached inputs for spends and remove them from
the set of inputs to be swept if they are spent to
a terminal state. Prior, we would only watch for
spends on htlcs that may need to transition and
sweep the corresponding second-level htlc.
With these changes, we will no monitor commitment
outputs for spends, as well as spends from the
second level htlcs themselves. If either of these
is detected, we remove them from the set of inputs
to sweep via the justice transaction because there
is nothing the breach arbiter can do.
This functionality will be needed when adding
watchtower support, as the breach arbiter must
detect the case when the tower sweeps on behalf of
the user and stop pursuing the sweep itself. In
addition, this now properly handles the potential
case where somehow the remote party is able to sweep
the their commitment or second-level htlc to their
wallet, and prevent the breach arbiter from trying
to sweep the outputs as it would now.
Note that in the latter event, the internal
accounting may still be incorrect, as it is assumed
that all breached funds return to the victim.
However, these issues will deferred and fixed at a
later date, as the more crucial aspect is that the
breach arbiter doesn't blow up as a result of towers
sweeping channels.
In this commit, we convert all the `Update` calls which are serial, to
use `Batch` calls which are optimistically batched together for
concurrent writers. This should increase performance slightly during the
initial graph sync, and also updates at tip as we can coalesce more of
these individual transactions into a single transaction.
In this commit, we simplify the existing `htlcTImeoutResolver` with some
newly refactored out methods from the `htlcTimeoutContestResolver`. The
resulting logic is easier to follow as it's more linear, and only deals
with spend notifications rather than both spend _and_ confirmation
notifications.
This commit moves the logic querying the available heuristics out of the
autopilot agent and into the autopilot manager. This lets us query the
heuristic without the autopilot agent being active.
If called without the agent being active, the current set of channels
will be considered by the heuristics. If the agent is active also the
pending channels will be considered.