routing: dial back max concurrent block fetches

This commit reduces the number of concurrent validation operations the
router will perform when fully validating the channel graph. Several
users have reported GetInfo hanging for several minutes, which is
believed to be caused by attempting to validate a massive number of
channels in parallel. This commit restores the limit to the value it had
before the batched gossip improvements were added.

We keep the 1000 concurrent validation request limit for
AssumeChannelValid, since we don't fetch blocks in that case. This
allows us to still keep the performance benefits on mobile/low-resource
devices.
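
For illustration only, the limit selection and the effect of a bounded
number of in-flight jobs can be sketched as below. This is not lnd's
ValidationBarrier; validationLimit and runBounded are hypothetical names
invented for this sketch, and the buffered-channel semaphore is a stand-in
technique, assumed here only to show how a cap on concurrency behaves.

```go
package main

import (
	"runtime"
	"sync"
)

// validationLimit is a hypothetical helper mirroring the choice described
// above: a large fixed limit when no blocks are fetched, and a small
// multiple of the core count when fully validating.
func validationLimit(assumeChannelValid bool) int {
	if assumeChannelValid {
		return 1000
	}
	return 4 * runtime.NumCPU()
}

// runBounded runs every job, allowing at most limit jobs in flight at any
// time, and returns the peak concurrency it observed. It uses a buffered
// channel as a counting semaphore (an assumption of this sketch, not
// necessarily how lnd enforces its limit).
func runBounded(limit int, jobs []func()) int {
	sem := make(chan struct{}, limit)

	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		active int
		peak   int
	)

	for _, job := range jobs {
		// Blocks here while limit jobs are already running.
		sem <- struct{}{}
		wg.Add(1)

		go func(job func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot before signalling done

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			job()

			mu.Lock()
			active--
			mu.Unlock()
		}(job)
	}

	wg.Wait()
	return peak
}
```

Calling runBounded(validationLimit(false), jobs) would then cap the number
of simultaneous jobs at four per core, so a slow backend sees a bounded
number of requests regardless of how many jobs are queued.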
Conner Fromknecht 2021-02-17 18:13:29 -08:00
parent db5af6fc14
commit f7c5236bf6
GPG Key ID: E7D737B67FA592C7

@@ -3,6 +3,7 @@ package routing
 import (
 	"bytes"
 	"fmt"
+	"runtime"
 	"sync"
 	"sync/atomic"
 	"time"
@@ -914,7 +915,28 @@ func (r *ChannelRouter) networkHandler() {
 	// We'll use this validation barrier to ensure that we process all jobs
 	// in the proper order during parallel validation.
-	validationBarrier := NewValidationBarrier(1000, r.quit)
+	//
+	// NOTE: For AssumeChannelValid, we bump up the maximum number of
+	// concurrent validation requests since there are no blocks being
+	// fetched. This significantly increases the performance of IGD for
+	// neutrino nodes.
+	//
+	// However, we dial back to use multiple of the number of cores when
+	// fully validating, to avoid fetching up to 1000 blocks from the
+	// backend. On bitcoind, this will empirically cause massive latency
+	// spikes when executing this many concurrent RPC calls. Critical
+	// subsystems or basic rpc calls that rely on calls such as GetBestBlock
+	// will hang due to excessive load.
+	//
+	// See https://github.com/lightningnetwork/lnd/issues/4892.
+	var validationBarrier *ValidationBarrier
+	if r.cfg.AssumeChannelValid {
+		validationBarrier = NewValidationBarrier(1000, r.quit)
+	} else {
+		validationBarrier = NewValidationBarrier(
+			4*runtime.NumCPU(), r.quit,
+		)
+	}
 	for {