11181 Commits

Author SHA1 Message Date
Olaoluwa Osuntokun
66754b6e71
build: update to latest btcd with connmgr bug fix 2019-04-09 19:56:25 -07:00
Olaoluwa Osuntokun
3e5a6f1022
Merge pull request #2905 from cfromknecht/split-chunk-size
discovery: make batch size distinct from chunk size, reduce to 500
2019-04-09 19:27:20 -07:00
Conner Fromknecht
4a755435e6
channeldb/graph: skip unknown edges in FetchChanInfos
This commit modifies FetchChanInfos to skip any channels that are not in
the graph at the time of the call. Currently the entire call will fail
if the edge is not found, which stalls a gossip sync in the following
scenario:

 1. Remote peer queries for a channel range
 2. We return the set of channel ids in that range
 3. A channel from that set is removed from the graph, e.g. via close.
 4. Remote peer queries for removed edge, causing the query to fail.

To remedy this, we will now skip any edges that are not known in the
database at the time of the query. This prevents the syncer state
machines from halting, which otherwise could only be resolved by
disconnecting and reconnecting.
2019-04-09 17:35:58 -07:00
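The skip-on-missing behavior described in 4a755435e6 can be sketched as follows. This is a minimal illustration only: the `graph` type, `errEdgeNotFound`, and both method names are hypothetical stand-ins, not lnd's actual channeldb API.

```go
package main

import (
	"errors"
	"fmt"
)

// errEdgeNotFound is a hypothetical sentinel error for a missing edge.
var errEdgeNotFound = errors.New("edge not found")

// graph is a toy in-memory stand-in for the channel graph database.
type graph map[uint64]string

// fetchEdge looks up a single channel by its short channel ID.
func (g graph) fetchEdge(chanID uint64) (string, error) {
	info, ok := g[chanID]
	if !ok {
		return "", errEdgeNotFound
	}
	return info, nil
}

// fetchChanInfos returns info for every requested channel that is still
// known. Missing edges are skipped instead of failing the whole call.
func (g graph) fetchChanInfos(chanIDs []uint64) ([]string, error) {
	infos := make([]string, 0, len(chanIDs))
	for _, id := range chanIDs {
		info, err := g.fetchEdge(id)
		switch {
		case errors.Is(err, errEdgeNotFound):
			// The channel was removed after the peer learned its ID.
			continue
		case err != nil:
			return nil, err
		}
		infos = append(infos, info)
	}
	return infos, nil
}

func main() {
	g := graph{1: "chan-1", 3: "chan-3"}
	infos, err := g.fetchChanInfos([]uint64{1, 2, 3})
	fmt.Println(infos, err) // [chan-1 chan-3] <nil>
}
```

Only the not-found case is swallowed; any other lookup error still aborts the query, so genuine database failures stay visible.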
Joost Jager
a2aeb646e7
Merge pull request #2908 from joostjager/chan-arb-logging
cnct+invoices: improve logging
2019-04-09 22:34:12 +02:00
Johan T. Halseth
2782baf793
Merge pull request #2910 from halseth/neutrino-chainservice-stoporder
chainregistry: stop Neutrino before closing DB
2019-04-09 09:49:32 +02:00
Olaoluwa Osuntokun
78ffb5bb5f
Merge pull request #2915 from joostjager/hodl-restart-quick-fix
htlcswitch: do not check final cltv for accepted invoices
2019-04-08 13:37:58 -07:00
Joost Jager
038ce342b3
htlcswitch: do not check final cltv for accepted invoices 2019-04-08 18:16:21 +02:00
Joost Jager
33a1904dc9
invoices: unify invoice log statements 2019-04-08 13:10:51 +02:00
Johan T. Halseth
9cc0ea93b2
chainregistry: stop Neutrino before closing DB
To avoid the ChainService still attempting to access the database when
it gets closed, re-order the stop order such that the ChainService gets
stopped before closing the DB.
2019-04-08 12:48:51 +02:00
Johan T. Halseth
9e67f25957
chainregistry: only stop ChainService after successful start 2019-04-08 12:48:42 +02:00
Joost Jager
86eb0a3383
cnct: log go to chain reason
This commit adds logging of the reason to go to chain for a channel.
This can help users find out why a channel was force closed.

To get all go to chain reasons, an optimization to break early is
removed. This optimization was not significant, because the normal flow
already examined all htlcs. In the exceptional case where we need to go
to chain, its benefit does not outweigh logging all go to chain reasons.
2019-04-08 10:34:41 +02:00
Olaoluwa Osuntokun
1fea5b09b2
Merge pull request #2902 from cfromknecht/restore-lazy-gossip-query
discovery/sync_manager: restore lazy gossip sends
2019-04-06 18:05:11 -07:00
Conner Fromknecht
a4b4fe666a
discovery: make batch size distinct from chunk size, reduce to 500
This commit reduces the number of channels a syncer will request from
the remote node in a single QueryShortChanIDs message. The current size
is derived from the chunkSize, which is meant to signal the maximum
number of short chan ids that can fit in a single ReplyChannelRange
message. For EncodingSortedPlain, this number is 8000, and we use the
same number to dictate the size of the batch from the remote peer.

We modify this by introducing a separately configurable batchSize, so
that both can be tuned independently. The value is chosen to reduce the
amount of buffering the remote party will perform, only requiring them
to queue 500 responses, as opposed to 8000. In turn, this reduces large
spikes in allocation on the remote node at the expense of a few extra
round trips for the control messages. However, this cost will be negligible since
the control messages are much smaller than the messages being returned.
2019-04-06 15:27:26 -07:00
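The effect of a separate batch size in a4b4fe666a can be illustrated with a small sketch. The `batchSize` constant and `splitBatches` helper are hypothetical names chosen for this example; only the values 500 and 8000 come from the commit message.

```go
package main

import "fmt"

// batchSize is the number of short channel IDs requested per query.
// The name is an assumption; the value 500 is from the commit message.
const batchSize = 500

// splitBatches cuts ids into consecutive batches of at most size entries.
func splitBatches(ids []uint64, size int) [][]uint64 {
	if size <= 0 {
		return nil
	}
	var batches [][]uint64
	for len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		batches = append(batches, ids[:n])
		ids = ids[n:]
	}
	return batches
}

func main() {
	// A full reply chunk of 8000 IDs is now requested in 16 smaller queries.
	ids := make([]uint64, 8000)
	for i := range ids {
		ids[i] = uint64(i)
	}
	fmt.Println(len(splitBatches(ids, batchSize))) // 16
}
```

Each query now makes the remote peer buffer at most 500 responses, at the cost of 16 round trips instead of one.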
Conner Fromknecht
9df6af237e
discovery/sync_manager: restore lazy gossip sends 2019-04-06 03:32:03 -07:00
Olaoluwa Osuntokun
66189f1a21
lnd: increase neutrino ban duration to 48hrs from 5s
In this commit, we fix an oversight that overrode the default ban
duration for neutrino to 5s from the default 24 hrs. We correct this by
raising the ban duration to 48hrs. In the future in order to ignore
these peers persistently, we'll need to start to persist our ban list.
However that's a change for another time.
2019-04-05 16:07:54 -07:00
Wilmer Paulino
73791e15a1
Merge pull request #2830 from philippgille/feature/update-contribution-checklist
docs: update contribution checklist
2019-04-05 15:46:34 -07:00
Conner Fromknecht
25d2b1b537
Merge pull request #2885 from cfromknecht/stagger-initial-reconnect
server: stagger initial reconnects
2019-04-05 15:46:12 -07:00
Conner Fromknecht
caa0e2f0b8
Merge pull request #2879 from joostjager/outgoing-go-to-chain
htlcswitch: revert forwarding policy block delta requirements
2019-04-05 15:45:30 -07:00
Conner Fromknecht
209c2c6ead
Merge pull request #2841 from philippgille/patch-4
docs: fix broken TOC in Tor docs
2019-04-05 15:35:50 -07:00
Conner Fromknecht
a52f013161
Merge pull request #2893 from cfromknecht/no-want-zombie
channeldb/graph: filter zombie channels in FilterKnownChanIDs
2019-04-05 14:45:11 -07:00
Olaoluwa Osuntokun
46aa8503b2
Merge pull request #2892 from wpaulino/verify-chan-backup
rpc: modify VerifyChanBackup to take either a Single or Multi
2019-04-05 14:31:37 -07:00
Conner Fromknecht
a12b30f620
Merge pull request #2896 from halseth/make-fmt-routerrpc
[trivial] lnrpc/routerrpc: make fmt on router backend test
2019-04-05 14:03:31 -07:00
Conner Fromknecht
e91bacd1bc
channeldb/graph: filter zombie channels in FilterKnownChanIDs
This commit modifies FilterKnownChanIDs to skip edges that
we ourselves have deemed zombies. This prevents us from requesting
the updates from them, as this wastes bandwidth and cpu cycles.
2019-04-05 13:01:08 -07:00
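A rough sketch of the filtering described in e91bacd1bc, assuming the function returns the IDs we should still request from the peer. The signature and the map-based sets are hypothetical; the real implementation reads from the graph database.

```go
package main

import "fmt"

// filterKnownChanIDs returns the IDs worth requesting from a peer:
// those we neither already know nor have marked as zombies.
func filterKnownChanIDs(ids []uint64, known, zombies map[uint64]bool) []uint64 {
	var unknown []uint64
	for _, id := range ids {
		if known[id] || zombies[id] {
			// Already in the graph, or deemed a zombie by us.
			continue
		}
		unknown = append(unknown, id)
	}
	return unknown
}

func main() {
	known := map[uint64]bool{1: true}
	zombies := map[uint64]bool{2: true}
	fmt.Println(filterKnownChanIDs([]uint64{1, 2, 3}, known, zombies)) // [3]
}
```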
Wilmer Paulino
b71bb9400a
rpc: modify VerifyChanBackup to take either a Single or Multi 2019-04-05 12:51:16 -07:00
Johan T. Halseth
f1677e7199
lnrpc/routerrpc: make fmt on router backend test 2019-04-05 17:25:49 +02:00
Joost Jager
206d93d856
htlcswitch/test: test zero value for outbound cltv reject delta 2019-04-05 11:36:18 +02:00
Joost Jager
1b2816006f
htlcswitch/test: align test invoice cltv expiry 2019-04-05 11:36:16 +02:00
Joost Jager
af7d0e5ff5
htlcswitch/test: convert TestChannelLinkSingleHopPayment to two hop network 2019-04-05 11:36:13 +02:00
Joost Jager
037913fd28
link: rewrite height comparisons without subtraction
Prevent the case where a uint32 wrap around could happen.
2019-04-05 11:36:10 +02:00
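The wrap-around that 037913fd28 guards against can be reproduced in a few lines. Both function names and the parameter meanings are illustrative assumptions, not the link's actual code.

```go
package main

import "fmt"

// expiresSoonUnsafe subtracts before comparing. If delta > expiry, the
// uint32 subtraction wraps to a huge value and the check silently fails.
func expiresSoonUnsafe(expiry, delta, height uint32) bool {
	return expiry-delta <= height
}

// expiresSoon moves the delta to the other side of the comparison.
// The sum could only wrap near the uint32 maximum, far beyond any
// realistic block height.
func expiresSoon(expiry, delta, height uint32) bool {
	return expiry <= height+delta
}

func main() {
	// expiry=5, delta=10, height=100: the HTLC is long expired.
	fmt.Println(expiresSoonUnsafe(5, 10, 100)) // false (5-10 wrapped around)
	fmt.Println(expiresSoon(5, 10, 100))       // true
}
```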
Joost Jager
ab4da0f53d
cnct: define separate broadcast delta for outgoing htlcs
This commit exposes the various parameters around going to chain and
accepting htlcs in a clear way.

In addition to this, it reverts those parameters to what they were
before the merge of commit d1076271456bdab1625ea6b52b93ca3e1bd9aed9.
2019-04-05 11:36:07 +02:00
Conner Fromknecht
cf80476e01
server: stagger initial reconnects
This commit adds optional jitter to our initial reconnection to our
persistent peers. Currently we will attempt reconnections to all peers
simultaneously, which results in a large amount of contention as the
number of channels a node has grows.

We resolve this by adding a randomized delay between 0 and 30 seconds
for all persistent peers. This spreads out the load and contention to
resources such as the database, read/write pools, and memory
allocations. On my node, this allows it to start up with about 80% of the
memory burst compared to the all-at-once approach.

This also has a second-order effect in better distributing messages sent
at constant intervals, such as pings. This reduces the concurrent jobs
submitted to the read and write pools at any given time, resulting in
better reuse of read/write buffers and fewer bursty allocation and
garbage collection cycles.
2019-04-05 02:31:58 -07:00
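The jitter from cf80476e01 can be sketched as below. The constant and function names are hypothetical; only the 0 to 30 second range is taken from the commit message. The example prints the planned delays rather than sleeping.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// maxInitReconnectDelay is the upper bound of the random delay.
// The name is an assumption; the 30s value is from the commit message.
const maxInitReconnectDelay = 30 * time.Second

// randomDelay returns a uniformly random duration in [0, max].
func randomDelay(max time.Duration) time.Duration {
	if max <= 0 {
		return 0
	}
	return time.Duration(rand.Int63n(int64(max) + 1))
}

func main() {
	peers := []string{"alice", "bob", "carol"}
	for _, peer := range peers {
		delay := randomDelay(maxInitReconnectDelay)
		// A daemon could schedule the dial with time.AfterFunc(delay, ...).
		fmt.Printf("reconnect to %s after %v\n", peer, delay.Round(time.Millisecond))
	}
}
```

Because each peer draws its own delay, connection attempts spread across the window instead of hitting the database and buffer pools at the same instant.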
Conner Fromknecht
7a718a401e
Merge pull request #2890 from wpaulino/walletunlocker-rest-proxy-dest
lnd+rpcserver: refactor TLS configuration
2019-04-05 00:10:39 -07:00
Conner Fromknecht
12ec69a48b
Merge pull request #2883 from cfromknecht/chan-restore-child-harness
lnd_test: use child harness chan backup restore cases
2019-04-05 00:04:25 -07:00
Conner Fromknecht
4de7d0c561
Merge pull request #2889 from grunch/fix-verifychanbackup-help
Remove equal sign on verifychanbackup cmd argument
2019-04-04 15:49:22 -07:00
Francisco Calderón
4c1bd14c94
Remove equal sign on verifychanbackup cmd on multi_file argument description 2019-04-04 19:06:23 -03:00
Joost Jager
cf42719c45
lnd+rpcserver: refactor TLS configuration
This commit restructures the creation of various TLS-related objects. It
also fixes a bug where wildcard IP addresses were only instantiated for
the main RPC server and not the WalletUnlocker service.
2019-04-04 14:18:18 -07:00
Conner Fromknecht
9e8fc90754
Merge pull request #2888 from grunch/add-space-on-exportchanbackup-description
Add a missing white space on exportchanbackup command description
2019-04-04 13:08:31 -07:00
Francisco Calderón
0e4f5c0e2f
Add a missing white space on exportchanbackup command description 2019-04-04 11:07:06 -03:00
Conner Fromknecht
8f5ddf39ca
lnd_test: use child harness chan backup restore cases
This prevents a panic during test failure due to a child test calling
FailNow on a parent test context. The sub tests now capture the
testing.T object provided in each closure as opposed to ignoring it and
using the parent context.
2019-04-04 02:36:32 -07:00
Conner Fromknecht
d75112ce8d
pool/worker: increase worker timeout to 90s
This commit increases the default worker timeout currently backing the
read and write pools. This allows the read and write pools to sustain
regular bursty traffic such as ping/pong without releasing their buffers
back to the underlying gc queue. In the future, jitter can be added to
our ping and/or gossip messages to reduce the concurrent usage of read
and write pools, which will make this change even more effective.
2019-04-04 01:51:47 -07:00
Olaoluwa Osuntokun
3a19afe46d
Merge pull request #2882 from wpaulino/sync-manager-stale-syncer
discovery: only replace stale active syncer if disconnected
2019-04-03 20:47:12 -07:00
Wilmer Paulino
00338c5ec2
discovery: properly handle SyncManager shutdown signal 2019-04-03 19:32:56 -07:00
Wilmer Paulino
46ceaf8cf6
discovery: only replace stale active syncer if disconnected
In this commit, we address a bug where we'd attempt to replace the
stale active syncer when it transitioned to a passive syncer. This
replacement logic is only intended to happen when the active syncer
disconnects, as rotateActiveSyncerCandidate chooses and queues its own
replacement.
2019-04-03 16:43:31 -07:00
Olaoluwa Osuntokun
2cc6687ff3
build: bump version to 0.6-beta 2019-04-03 15:48:17 -07:00
Olaoluwa Osuntokun
30f2b1ca01
Merge pull request #2740 from wpaulino/gossip-sync-manager
discovery: introduce gossiper syncManager subsystem
2019-04-03 15:46:12 -07:00
Wilmer Paulino
ca01695330
rpc: expose peer's GossipSyncer sync type 2019-04-03 15:44:47 -07:00
Wilmer Paulino
8b6a9bb5d3
discovery: make timestamp range check inclusive within FilterGossipMsgs
As required by the spec:

> SHOULD send all gossip messages whose timestamp is greater or equal to
first_timestamp, and less than first_timestamp plus timestamp_range.
2019-04-03 15:44:46 -07:00
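The range check quoted in 8b6a9bb5d3 amounts to a half-open interval, inclusive at the start. A minimal sketch, with a hypothetical function name:

```go
package main

import "fmt"

// inTimestampRange reports whether ts lies in
// [firstTimestamp, firstTimestamp+timestampRange).
// The end is computed in uint64 so the sum cannot wrap.
func inTimestampRange(ts, firstTimestamp, timestampRange uint32) bool {
	end := uint64(firstTimestamp) + uint64(timestampRange)
	return ts >= firstTimestamp && uint64(ts) < end
}

func main() {
	fmt.Println(inTimestampRange(100, 100, 50)) // true: lower bound is inclusive
	fmt.Println(inTimestampRange(150, 100, 50)) // false: upper bound is exclusive
}
```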
Wilmer Paulino
70be812747
discovery+server: use new gossiper's SyncManager subsystem 2019-04-03 15:44:43 -07:00
Wilmer Paulino
80b84eef9c
config+peer: replace NoChanUpdates flag with NumGraphSyncPeers
In this commit, we replace the NoChanUpdates flag with a flag that
allows us to specify the number of peers we want to actively receive new
graph updates from. This will be required when integrating the new
gossiper SyncManager subsystem with the rest of lnd.
2019-04-03 15:43:51 -07:00
Wilmer Paulino
a188657b2f
discovery: introduce gossiper SyncManager subsystem
In this commit, we introduce a new subsystem for the gossiper: the
SyncManager. This subsystem is a major overhaul of the way the daemon
performs the graph query sync state machine with peers.

Along with this subsystem, we also introduce the concept of an active
syncer. An active syncer is simply a GossipSyncer currently operating
under an ActiveSync sync type. Before this commit, all GossipSyncers
would act as active syncers, which means that we were receiving new
graph updates from all of them. This isn't necessary, as it greatly
increases bandwidth usage as the network grows. The SyncManager changes
this by requiring a specific number of active syncers. Once we reach
this specified number, any future peers will have a GossipSyncer with a
PassiveSync sync type.

The SyncManager is responsible for three main things:

1. Choosing different peers randomly to receive graph updates from to
ensure we don't only receive them from the same set of peers.

2. Choosing different peers to force a historical sync with to ensure we
have as much of the public network as possible. The first syncer
registered with the manager will also attempt a historical sync.

3. Managing an in-order queue of active syncers where the next cannot be
started until the current one has completed its state machine to ensure
they don't overlap and request the same set of channels, which
significantly reduces bandwidth usage and addresses a number of issues.
2019-04-03 15:08:32 -07:00
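The active/passive assignment described in a188657b2f can be sketched as follows. This covers only the "specific number of active syncers" rule; random rotation, historical syncs, and the in-order queue are left out. All type and method names here are simplified, hypothetical stand-ins for the real discovery package.

```go
package main

import "fmt"

// syncType mirrors the two sync types named in the commit message.
type syncType int

const (
	activeSync syncType = iota
	passiveSync
)

func (t syncType) String() string {
	if t == activeSync {
		return "ActiveSync"
	}
	return "PassiveSync"
}

// syncManager tracks which peers are active syncers.
type syncManager struct {
	numActiveSyncers int
	syncers          map[string]syncType
}

func newSyncManager(numActive int) *syncManager {
	return &syncManager{
		numActiveSyncers: numActive,
		syncers:          make(map[string]syncType),
	}
}

// activeCount returns how many registered peers are active syncers.
func (m *syncManager) activeCount() int {
	n := 0
	for _, t := range m.syncers {
		if t == activeSync {
			n++
		}
	}
	return n
}

// register assigns ActiveSync until the target is reached, then PassiveSync.
func (m *syncManager) register(peer string) syncType {
	t := passiveSync
	if m.activeCount() < m.numActiveSyncers {
		t = activeSync
	}
	m.syncers[peer] = t
	return t
}

func main() {
	m := newSyncManager(2)
	for _, peer := range []string{"alice", "bob", "carol"} {
		fmt.Println(peer, m.register(peer))
	}
}
```

With a target of two, the first two peers become active syncers and every later peer is registered as passive, which is what bounds the bandwidth spent on graph updates.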