Commit Graph

139 Commits

Author SHA1 Message Date
Olaoluwa Osuntokun
bbb34cebe0
routing: modify the TestSendPaymentRouteFailureFallback to clear missionControl between attempts
In order to maintain the original essence of the test, we need to clear
the state of missionControl with each attempt, essentially advancing
time between each payment attempt.
2017-10-16 19:07:40 -07:00
Olaoluwa Osuntokun
8ef829ed80
routing: modify SendPayment loop to be lazy, iterative, and use missionControl
In this commit we modify the SendPayment loop to optimize for
time-to-first-payment-success-or-failure. The prior logic would first
attempt to find at least 100 routes to the destination, then
iteratively prune them away as errors were encountered. In this commit,
we modify this approach to instead take a lazy approach: we first find
the current “best” path, attempt to send to that, and if an error
occurs we prune a section of the graph by reporting to missionControl,
then continue.

With this new approach, if the first known path has sufficient
capacity, and is available, then the payment speed is greatly improved
from the PoV of users. Additionally, we avoid the excessive computation
of crawling most of the graph in the k-shortest paths loop. With the
decay on missionControl, all routes will now feed information into the
central knowledge hub, allowing all payments to iteratively find out
the inactive portions of the payment graph.
2017-10-16 19:05:47 -07:00
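The lazy loop described in 8ef829ed80 can be sketched roughly as follows. This is a minimal illustration, not lnd's implementation: the `graph`, `edge`, `findPath`, and `sendPayment` names are hypothetical, breadth-first search stands in for the real weighted path finding, and a plain map of pruned channels stands in for missionControl.

```go
package main

import (
	"errors"
	"fmt"
)

// edge is a hypothetical directed channel leading to the node named `to`.
type edge struct {
	chanID uint64
	to     string
}

// graph maps a node name to its outgoing channels.
type graph map[string][]edge

var errNoRoute = errors.New("no route to destination")

// findPath returns the channel IDs of a fewest-hops path from src to dst,
// skipping every pruned channel. BFS is an assumption made for brevity.
func findPath(g graph, src, dst string, pruned map[uint64]bool) ([]uint64, bool) {
	type state struct {
		node string
		path []uint64
	}
	visited := map[string]bool{src: true}
	queue := []state{{node: src}}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if cur.node == dst {
			return cur.path, true
		}
		for _, e := range g[cur.node] {
			if pruned[e.chanID] || visited[e.to] {
				continue
			}
			visited[e.to] = true
			next := append(append([]uint64{}, cur.path...), e.chanID)
			queue = append(queue, state{node: e.to, path: next})
		}
	}
	return nil, false
}

// sendPayment finds the current best path, tries it, and on failure prunes
// the reported channel before searching again. trySend is expected to
// report a channel that lies on the attempted path.
func sendPayment(g graph, src, dst string,
	trySend func(path []uint64) (failedChan uint64, ok bool)) ([]uint64, error) {

	pruned := make(map[uint64]bool)
	for {
		path, found := findPath(g, src, dst, pruned)
		if !found {
			return nil, errNoRoute
		}
		failedChan, ok := trySend(path)
		if ok {
			return path, nil
		}
		if pruned[failedChan] {
			// Defensive stop: the failure taught us nothing new.
			return nil, errNoRoute
		}
		pruned[failedChan] = true
	}
}

func main() {
	g := graph{
		"A": {{chanID: 1, to: "B"}, {chanID: 3, to: "C"}},
		"B": {{chanID: 2, to: "D"}},
		"C": {{chanID: 4, to: "D"}},
	}
	// Channel 2 always fails in this demo, forcing a fallback via C.
	trySend := func(path []uint64) (uint64, bool) {
		for _, id := range path {
			if id == 2 {
				return 2, false
			}
		}
		return 0, true
	}
	path, err := sendPayment(g, "A", "D", trySend)
	fmt.Println(path, err)
}
```

Only one path is computed per attempt, so a payment whose first path works never pays for a search over the rest of the graph.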
Olaoluwa Osuntokun
276f2e467b
routing: end path finding on an additional set of critical-ish errors 2017-10-16 18:58:35 -07:00
Olaoluwa Osuntokun
e06177e55c
routing: introduce new missionControl system within ChannelRouter
This commit adds a new system within the ChannelRouter: missionControl.
The purpose of this system is to act as a shared memory of sorts
between payment sending attempts, recording which edges/vertexes worked
or didn’t work. Allowing execution attempts to pass on their iterative
knowledge of the graph to later attempts will reduce the number of
failures encountered, and generally lead to a better UX when sending
payments.

The current capabilities of missionControl are rather limited just to
introduce the new abstraction. Later follow up commits will also add
preferential treatment for reliable nodes, knowledge of the impact that
target payments have on unbalancing the payment graph, etc.
2017-10-16 18:57:36 -07:00
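As a rough picture of the "shared memory" idea in e06177e55c, the sketch below records failed channels and nodes and forgets them after a decay window. All names here (`missionControl` fields, `reportChanFailure`, `shouldSkipChan`, `defaultDecay`, `demoStart`) are assumptions made for illustration; the commit does not describe the actual data layout, and the one-hour decay value is arbitrary.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultDecay is an arbitrary, hypothetical decay window.
const defaultDecay = time.Hour

// demoStart is a fixed reference time used only by the demo below.
var demoStart = time.Unix(1508205456, 0)

// missionControl is a simplified model: it remembers when a channel or
// node last failed so that later payment attempts can avoid it.
type missionControl struct {
	mu          sync.Mutex
	decay       time.Duration
	failedChans map[uint64]time.Time
	failedNodes map[string]time.Time
}

func newMissionControl(decay time.Duration) *missionControl {
	return &missionControl{
		decay:       decay,
		failedChans: make(map[uint64]time.Time),
		failedNodes: make(map[string]time.Time),
	}
}

// reportChanFailure records that chanID failed at time now.
func (m *missionControl) reportChanFailure(chanID uint64, now time.Time) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.failedChans[chanID] = now
}

// reportNodeFailure records that node failed at time now.
func (m *missionControl) reportNodeFailure(node string, now time.Time) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.failedNodes[node] = now
}

// shouldSkipChan reports whether chanID failed within the decay window.
// Entries older than the window are dropped.
func (m *missionControl) shouldSkipChan(chanID uint64, now time.Time) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	failedAt, ok := m.failedChans[chanID]
	if !ok {
		return false
	}
	if now.Sub(failedAt) >= m.decay {
		delete(m.failedChans, chanID)
		return false
	}
	return true
}

// shouldSkipNode is the node counterpart of shouldSkipChan.
func (m *missionControl) shouldSkipNode(node string, now time.Time) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	failedAt, ok := m.failedNodes[node]
	if !ok {
		return false
	}
	if now.Sub(failedAt) >= m.decay {
		delete(m.failedNodes, node)
		return false
	}
	return true
}

func main() {
	mc := newMissionControl(defaultDecay)
	mc.reportChanFailure(42, demoStart)
	fmt.Println(mc.shouldSkipChan(42, demoStart.Add(time.Minute)))
	fmt.Println(mc.shouldSkipChan(42, demoStart.Add(2*time.Hour)))
}
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would more likely read the clock itself.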
Olaoluwa Osuntokun
ae6bde2d77
routing: avoid internal bolt db deadlock by reusing transaction in findPath
This commit fixes a bug that could lead to a deadlock inside bolt db
itself. In a recent commit we allowed a db transaction to be passed
directly into findPath, however, the initial call to graph.ForEachNode
instead passed a _nil_ transaction causing the method itself to create
a _new_ transaction, leading to a deadlock.

We fix this issue by instead re-using the transaction pointer.
2017-10-16 18:48:27 -07:00
Olaoluwa Osuntokun
b29a73a0dd
routing: don't prune our own channels during zombie channel collection
This commit is a precautionary commit which ensures that we don’t
attempt to prune our _own_ channels during zombie channel collection.
2017-10-16 18:14:01 -07:00
Olaoluwa Osuntokun
51b072c4b5
routing: return proper error if encounter non ForwardingError in SendPayment 2017-10-10 22:19:28 -07:00
Olaoluwa Osuntokun
646f79f566
routing: perform path finding inside a single DB transaction
This commit modifies the path finding logic such that all path finding
is done inside a _single_ database transaction. With this change, we
ensure that we don’t end up possibly creating hundreds of database
transactions, slowing down the path finding and payment sending process
altogether.
2017-10-10 22:19:27 -07:00
Olaoluwa Osuntokun
eb7b5b342e
routing: add a basic test to exercise route pruning in response to errors 2017-10-10 22:19:25 -07:00
Olaoluwa Osuntokun
ce7179a468
routing: add basic route pruning in response to HTLC onion errors
This commit adds basic route pruning in response to HTLC onion errors.
With this new change, the router will now prune routes in response to
HTLC errors, which will reduce the time to payment success, and also
avoid a bunch of unnecessary network traffic.

We now respond to two errors: lnwire.FailTemporaryChannelFailure and
lnwire.FailUnknownNextPeer. In response to the first error, we’ll prune
all routes that contain the channel which was unable to be routed over.
In response to the second error we’ll prune all routes that contain the
node which couldn’t be found.
2017-10-10 22:19:25 -07:00
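The pruning rule from ce7179a468 (drop every route containing the failed channel, or every route containing the missing node) might look like the sketch below. The `route` type, its set-based indexes, and the `failureKind` constants are hypothetical stand-ins; the real code works with the lnwire failure types named in the commit message.

```go
package main

import "fmt"

// failureKind is a hypothetical stand-in for the two onion errors above.
type failureKind int

const (
	temporaryChannelFailure failureKind = iota
	unknownNextPeer
)

// route keeps set-style indexes so membership checks are O(1).
type route struct {
	nodes map[string]struct{}
	chans map[uint64]struct{}
}

func newRoute(nodes []string, chans []uint64) *route {
	r := &route{
		nodes: make(map[string]struct{}),
		chans: make(map[uint64]struct{}),
	}
	for _, n := range nodes {
		r.nodes[n] = struct{}{}
	}
	for _, c := range chans {
		r.chans[c] = struct{}{}
	}
	return r
}

func (r *route) containsNode(n string) bool {
	_, ok := r.nodes[n]
	return ok
}

func (r *route) containsChannel(c uint64) bool {
	_, ok := r.chans[c]
	return ok
}

// pruneRoutes returns the routes that survive the given failure.
func pruneRoutes(routes []*route, kind failureKind,
	failedChan uint64, failedNode string) []*route {

	kept := make([]*route, 0, len(routes))
	for _, r := range routes {
		switch {
		case kind == temporaryChannelFailure && r.containsChannel(failedChan):
			continue // skip routes over the failed channel
		case kind == unknownNextPeer && r.containsNode(failedNode):
			continue // skip routes through the missing node
		}
		kept = append(kept, r)
	}
	return kept
}

func main() {
	routes := []*route{
		newRoute([]string{"A", "B", "D"}, []uint64{1, 2}),
		newRoute([]string{"A", "C", "D"}, []uint64{3, 4}),
	}
	left := pruneRoutes(routes, temporaryChannelFailure, 1, "")
	fmt.Println(len(left))
}
```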
Olaoluwa Osuntokun
8d7f3943bb
routing: add two new methods to filter routes based on node/channel 2017-10-10 22:19:24 -07:00
Olaoluwa Osuntokun
f6ac31281b
routing: also include the source node in the nextHopMap index
In this commit we modify the newRoute function to also add the source
node to the nextHopMap index. With this addition the indexes will now
allow the router to react based on failures that occur during the
_first_ hop, meaning the channel directly attached to the source node.
2017-10-10 22:19:23 -07:00
Olaoluwa Osuntokun
70e114fa6f
routing: add additional indexes to the Route struct to allow for querying
This commit adds three new indexes to the Route struct. These indexes
allow a caller to check if a channel is in the route, check if a node
is in the route, query the next node after a target node, and query the
next channel after a target node. The combination of these new indexes
will allow the ChannelRouter to prune away routes from the available
set in response to any received errors.
2017-10-10 22:19:22 -07:00
Olaoluwa Osuntokun
7ba6b7fa09
routing: add a String() method to the vertex type 2017-10-10 22:19:22 -07:00
Olaoluwa Osuntokun
3b7855e449
routing: implement 2-week zombie channel pruning
This commit implements 2-week zombie channel pruning. This means that
every GraphPruneInterval (currently set to one hour), we’ll scan the
channel graph, marking any channels which haven’t had *both* edges
updated in 2 weeks as a “zombie”. During the second pass, all “zombie”
channels are removed from the channel graph altogether.

Adding this functionality means we’ll ensure that we maintain a
“healthy” network view, which will cut down on the number of failed
HTLC routing attempts, and also reflect an active portion of the graph.
2017-10-04 20:46:09 -07:00
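A two-pass zombie sweep along the lines of 3b7855e449 could be sketched as below. It assumes a channel counts as a zombie only when both of its edges are stale, which is one reading of the commit message, and it skips our own channels as b29a73a0dd describes. The `channel` type, the Unix-second timestamps, and the function names are hypothetical.

```go
package main

import "fmt"

// chanExpirySecs is two weeks, expressed in seconds.
const chanExpirySecs int64 = 14 * 24 * 60 * 60

// channel is a hypothetical view of one channel: the last update time of
// each directed edge (Unix seconds) and whether the channel is ours.
type channel struct {
	edge1Update int64
	edge2Update int64
	ours        bool
}

// isZombie reports whether neither edge was updated within the window.
func isZombie(c channel, now int64) bool {
	return now-c.edge1Update >= chanExpirySecs &&
		now-c.edge2Update >= chanExpirySecs
}

// pruneZombies marks zombies in a first pass, removes them in a second,
// and returns the IDs it removed (in no particular order).
func pruneZombies(graph map[uint64]channel, now int64) []uint64 {
	var zombies []uint64
	for id, c := range graph {
		if c.ours {
			continue // never prune our own channels
		}
		if isZombie(c, now) {
			zombies = append(zombies, id)
		}
	}
	for _, id := range zombies {
		delete(graph, id)
	}
	return zombies
}

func main() {
	now := int64(1507175169)
	stale := now - chanExpirySecs - 1
	graph := map[uint64]channel{
		1: {edge1Update: stale, edge2Update: stale},
		2: {edge1Update: stale, edge2Update: now},
		3: {edge1Update: stale, edge2Update: stale, ours: true},
	}
	removed := pruneZombies(graph, now)
	fmt.Println(len(removed), len(graph))
}
```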
Olaoluwa Osuntokun
e81689057a
discovery+routing: remove DeleteEdge from ChannelGraphSource interface
This commit removes the recently added DeleteEdge method from the
ChannelGraphSource interface as it’s no longer needed.
2017-10-04 20:46:08 -07:00
Brandon
3907ae65c2 routing+discovery: implement 2-week network view pruning 2017-10-04 20:40:21 -07:00
Laura Cressman
156772d04a channeldb: use binary.Read/Write with bools in channel.go
Use binary.Read/Write in functions to serialize and deserialize
channel close summary and HTLC boolean data, as well as in
methods to put and fetch channel funding info. Remove lnd
implementations of readBool and writeBool as they are no
longer needed. Also fix a few minor typos.
2017-10-02 23:13:47 -07:00
Laura Cressman
29687f49eb routing: replace sort.Sort with sort.Slice in router.go
Use sort.Slice in FindRoutes function in routing/router.go, as part
of the move to use new language features. Remove sortableRoutes type
wrapper for slice of Routes since it is no longer needed to sort routes.
2017-10-02 23:13:47 -07:00
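For readers unfamiliar with the change in 29687f49eb, `sort.Slice` (added in Go 1.8) takes a comparison closure and so removes the need for a wrapper type implementing `sort.Interface`. The `Route` fields and the fee-then-hops ordering below are assumptions for illustration; the commit message does not state the sort key.

```go
package main

import (
	"fmt"
	"sort"
)

// Route is a hypothetical, trimmed-down route.
type Route struct {
	TotalFees int64
	Hops      int
}

// sortRoutes orders routes by total fees, breaking ties by hop count.
// No sortableRoutes-style wrapper type is required.
func sortRoutes(routes []*Route) {
	sort.Slice(routes, func(i, j int) bool {
		if routes[i].TotalFees != routes[j].TotalFees {
			return routes[i].TotalFees < routes[j].TotalFees
		}
		return routes[i].Hops < routes[j].Hops
	})
}

func main() {
	routes := []*Route{
		{TotalFees: 30, Hops: 2},
		{TotalFees: 10, Hops: 3},
		{TotalFees: 10, Hops: 1},
	}
	sortRoutes(routes)
	for _, r := range routes {
		fmt.Println(r.TotalFees, r.Hops)
	}
}
```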
Laura Cressman
8822bb11ce routing: replace sort.Sort with sort.Slice in heap_test.go
Use sort.Slice in TestHeapOrdering function in routing/heap_test.go,
as part of the move to use new language features.
2017-10-02 23:13:47 -07:00
Olaoluwa Osuntokun
7eb0e56406
routing: modify TestSendPaymentRouteFailureFallback to use non-critical error
In this commit we modify the existing
TestSendPaymentRouteFailureFallback to use a non-critical error aside
from FailChannelDisabled. This is necessary as the behavior of the
current error handling can fail due to us sending in a nil error.
2017-10-02 22:14:14 -07:00
Olaoluwa Osuntokun
3ba70fe6ec
routing: add preliminary version of more intelligent payment error handling
This commit modifies the way we currently interpret errors when sending
payments via the SendToSwitch method. We split the errors into two
broad sections: critical errors which cause us to abandon the payment
dispatch altogether, and errors which are transient, meaning we should
continue trying the remainder of the returned routes.

Note that we haven’t yet properly implemented all the necessary
measures such as filtering edges that are detected as being temporarily
inactive, etc.

This change should correct erroneous behavior such as continuing to try
all available routes in the face of an invalid payment hash error and
the like.
2017-10-02 22:14:13 -07:00
Olaoluwa Osuntokun
486b464e1c
routing: move path caching into FindRoutes
This commit modifies the way we do path caching. Rather than only
caching within SendPayment, we now cache routes within FindRoutes. This
is more natural as SendPayment eventually calls FindRoutes. As a result
of this commit, queries to FindRoutes are now properly cached, speeding
up applications which are focused on graph visualization or querying
rather than sending payments.
2017-10-02 22:14:13 -07:00
John Griffith
1057a1a7c3 routing: handle onion errors in ChannelRouter 2017-10-02 22:13:05 -07:00
Jim Posen
d8a2ed27b8 routing/chainview: Fix data race in block disconnected callback. 2017-09-29 13:53:02 -07:00
Olaoluwa Osuntokun
0e626ce42c
routing: add a select+quit case when receiving error to ensure graceful shutdown 2017-09-25 20:55:09 -07:00
Olaoluwa Osuntokun
e5f3ee0fb6
chainntnfs+routing/chainview: reduce neutrino.WaitForMoreCFHeaders value
This commit reduces the neutrino.WaitForMoreCFHeaders parameter when
instantiating a neutrino instance as a lower value will allow the tests
to complete more quickly.
2017-09-13 16:46:11 +02:00
Olaoluwa Osuntokun
b07e7fb7cc
routing: hop-payload for last hop should be the absolute timeout, not delta
This commit fixes an oversight in the path finding code when converting
a path into a route. Currently, for the last hop, we’d emplace the
expiry delta of the last hop within the per-hop payload. This was left
over from a prior version of the specification.

To fix this, we’ll now emplace the _absolute_ final HTLC expiry with
the payload, such that the final hop can verify that the HTLC has not
been tampered with in flight.
2017-09-12 21:27:47 +02:00
Olaoluwa Osuntokun
9f0efddc20
multi: switch from btcrpcclient to rpcclient 2017-08-24 18:54:24 -07:00
Olaoluwa Osuntokun
321cc28cd8
routing: in findPath skip edge if incoming edge isn't advertised 2017-08-22 00:54:15 -07:00
Olaoluwa Osuntokun
f5d221012d
routing: update ChannelGraphSource due to latest API changes 2017-08-22 00:53:36 -07:00
Olaoluwa Osuntokun
bc4ad34190
routing: don't re-validate a channel update's edge if it already exists
By avoiding re-validating the channel edge, we avoid wasted network
bandwidth and queries.
2017-08-22 00:53:33 -07:00
Olaoluwa Osuntokun
5301da790c
routing: fix path finding bug, use the proper policy during path finding
This commit fixes a lingering bug within the path finding logic of the
router. Previously we used the edge policy directly attached to the
outgoing channel of the node we were traversing to calculate the fees
and time lock information. This is incorrect, as we instead should be
using the policy of the *connecting* node as we’ll need to pay for
transit as they dictate.

To remedy this, we now grab the incoming+outgoing edges and use those
accordingly when building the initial path.
2017-08-22 00:53:15 -07:00
Olaoluwa Osuntokun
6467fdd829
routing: update path finding and notifications to use mSAT 2017-08-22 00:53:12 -07:00
Olaoluwa Osuntokun
5ef077e5c8
routing: cap number of yen's algorithm iterations at 100
This commit makes a precautionary change in order to ensure that the
upper bound on the number of iterations within our version of Yen’s
algorithm is fixed.
2017-08-15 19:56:41 -07:00
Olaoluwa Osuntokun
8c3441b30f
routing: update test to account for proper time locks 2017-08-02 21:07:35 -07:00
Olaoluwa Osuntokun
67f17b319a
routing: invalidate routing cache on each new block
This commit makes the routing cache invalidation a bit more aggressive.
We now invalidate the cache on each new block as the routes in the
cache are based on the current block height. Using the cached items may
cause our routes to fail due to them having time locks which have
already expired.
2017-08-02 21:07:06 -07:00
Olaoluwa Osuntokun
f61d977176
routing: obtain current height when creating a new route 2017-08-02 21:02:24 -07:00
Olaoluwa Osuntokun
d331ddd2f4
routing: when creating a route, base time locks off current height
This commit implements some missing functionality. Previously, all
time locks were essentially calculated off of a base height of 0.
That’s incorrect as all time locks within HTLC’s would then be already
expired. We remedy this by requesting the latest height when creating a
route to ensure that our time locks are set properly.
2017-08-02 21:01:54 -07:00
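A small worked example shows why the base height matters in d331ddd2f4. The `hop` type and `incomingExpiries` function are hypothetical, and the model is simplified: each hop's `timeLockDelta` is taken as the gap it needs between the HTLC it receives and the one it forwards, with the last hop's delta acting as the final expiry delta.

```go
package main

import "fmt"

// hop is a hypothetical route hop with its required time lock delta.
type hop struct {
	name          string
	timeLockDelta uint32
}

// incomingExpiries returns, for each hop, the absolute block height at
// which the HTLC arriving at that hop expires. It walks the route from
// the destination back to the first hop, starting at currentHeight.
func incomingExpiries(currentHeight uint32, hops []hop) []uint32 {
	expiries := make([]uint32, len(hops))
	if len(hops) == 0 {
		return expiries
	}
	last := len(hops) - 1
	expiries[last] = currentHeight + hops[last].timeLockDelta
	for i := last - 1; i >= 0; i-- {
		expiries[i] = expiries[i+1] + hops[i].timeLockDelta
	}
	return expiries
}

func main() {
	hops := []hop{
		{name: "bob", timeLockDelta: 144},
		{name: "carol", timeLockDelta: 40},
		{name: "dave", timeLockDelta: 9},
	}
	// With the real chain height, every expiry lies in the future.
	fmt.Println(incomingExpiries(500000, hops))
	// With a base height of 0, every expiry is far below the chain tip.
	fmt.Println(incomingExpiries(0, hops))
}
```

With the chain at height 500000, the expiries computed from a base of 0 (193, 49, and 9) are all long past, which is the bug the commit describes.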
Johan T. Halseth
39a59bbe6f routing: Require adding edge to node before adding node.
This commit introduces the requirement specified in BOLT#7,
where we ignore any node announcements for a specific node
if we haven't yet seen any channel announcements where this
node takes part. This is to prevent someone DoS-ing the
network with cheap node announcements. In the router this
is enforced by requiring a call to AddNode(node_id) to
be preceded by an AddEdge(edge_id) call, where node_id is
one of the nodes in edge_id.
2017-08-02 15:58:58 -07:00
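The ordering requirement from 39a59bbe6f can be illustrated with a toy graph that rejects a node announcement until the node has appeared in some channel. The `channelGraph` type, its fields, and `errNoKnownChannel` are hypothetical; only the AddNode/AddEdge ordering comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoKnownChannel = errors.New("node has no known channels, ignoring announcement")

// channelGraph is a toy graph tracking which nodes appear in any channel.
type channelGraph struct {
	edges      map[uint64][2]string
	nodes      map[string]bool
	seenInEdge map[string]bool
}

func newChannelGraph() *channelGraph {
	return &channelGraph{
		edges:      make(map[uint64][2]string),
		nodes:      make(map[string]bool),
		seenInEdge: make(map[string]bool),
	}
}

// AddEdge records a channel and marks both endpoints as known.
func (g *channelGraph) AddEdge(chanID uint64, node1, node2 string) {
	g.edges[chanID] = [2]string{node1, node2}
	g.seenInEdge[node1] = true
	g.seenInEdge[node2] = true
}

// AddNode accepts a node announcement only if the node already takes
// part in at least one known channel.
func (g *channelGraph) AddNode(node string) error {
	if !g.seenInEdge[node] {
		return errNoKnownChannel
	}
	g.nodes[node] = true
	return nil
}

func main() {
	g := newChannelGraph()
	fmt.Println(g.AddNode("alice")) // rejected: no channel seen yet
	g.AddEdge(1, "alice", "bob")
	fmt.Println(g.AddNode("alice")) // accepted
}
```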
Conner Fromknecht
14a06526b8 routing/notifs: order invariant testing of ntfn delivery (#238)
Modifies the test cases in `TestEdgeUpdateNotification` and
`TestNodeUpdateNotification` to check for the possibility of notifications
being delivered out of order.  This addresses some sporadic failures that
were observed when running the test suite. 

I looked through some of the open issues but didn't see any addressing this
issue in particular, but if someone could point me to any relevant issues
that would be much appreciated!

Issue
-----
Currently the test suite validates notifications received in the order they
are submitted. The check fails because the verification of each
notification is statically linked to the order in which they are delivered,
seen
[here](1be4d67ce4/routing/notifications_test.go (L403))
and
[here](1be4d67ce4/routing/notifications_test.go (L499))
in `routing/notifications_test.go`.  The notifications are typically
delivered in this order, but the test fails whenever they are not.

Proposed Changes
-------------------
Construct an index that maps a public key to its corresponding edges and/or
nodes.  When a notification is received, use its identifying public key and
the index to look up the edge/node to use for validation. Entries are
removed from the index after they are verified to ensure that the same
entry is not validated twice. The logic to dynamically handle the verification
of incoming notifications can be found
[here](https://github.com/cfromknecht/lnd/blob/order-invariant-ntfns/routing/notifications_test.go#L420)
and
[here](https://github.com/cfromknecht/lnd/blob/order-invariant-ntfns/routing/notifications_test.go#L539).

Encountered Errors
--------------------
 * `TestEdgeUpdateNotification`: notifications_test.go:379: min HTLC of
   edge doesn't match: expected 16.7401473 BTC, got 19.4852751 BTC
 * `TestNodeUpdateNotification`: notifications_test.go:485: node identity
   keys don't match: expected
   027b139b2153ac5f3c83c2022e58b3219297d0fb3170739ee6391cddf2e06fe3e7, got
   03921deafb61ee13d18e9d96c3ecd9e572e59c8dbd0bb922b5b6ac609d10fe4ee4


Recreating Failing Behavior
---------------------------
The failures can be somewhat difficult to recreate. I was able to reproduce
them by running the unit tests repeatedly until they showed up.  I used the
following commands to bring them out of hiding:

```
./gotest.sh -i
go test -test.v ./routing && while [ $? -eq 0 ]; do go test -test.v ./routing; done
```

I was unable to recreate these errors, or any others in this package, after
making the proposed changes and leaving the script running continuously for
~30 minutes. Previously, I could consistently generate an error after ~20
seconds had elapsed on the latest commit in master at the time of writing:
78f6caf5d2e570fea0e5c05cc440cb7395a99c1d. Moar stability ftw!
2017-07-31 21:38:03 -07:00
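The index-based check proposed in 14a06526b8 can be sketched outside the test suite as follows. The `nodeUpdate` type and `verifyUnordered` function are hypothetical simplifications; the actual test compares richer notification structs.

```go
package main

import "fmt"

// nodeUpdate is a hypothetical notification keyed by public key.
type nodeUpdate struct {
	pubKey string
	alias  string
}

// verifyUnordered checks that received matches expected regardless of
// delivery order. Each entry is removed from the index once verified, so
// a duplicate delivery is reported instead of being validated twice.
func verifyUnordered(expected, received []nodeUpdate) error {
	pending := make(map[string]nodeUpdate, len(expected))
	for _, e := range expected {
		pending[e.pubKey] = e
	}
	for _, got := range received {
		want, ok := pending[got.pubKey]
		if !ok {
			return fmt.Errorf("unexpected or duplicate notification for %s",
				got.pubKey)
		}
		if got != want {
			return fmt.Errorf("notification for %s doesn't match: "+
				"expected %+v, got %+v", got.pubKey, want, got)
		}
		delete(pending, got.pubKey)
	}
	if len(pending) != 0 {
		return fmt.Errorf("%d expected notifications never arrived",
			len(pending))
	}
	return nil
}

func main() {
	expected := []nodeUpdate{{"027b", "node1"}, {"0392", "node2"}}
	outOfOrder := []nodeUpdate{{"0392", "node2"}, {"027b", "node1"}}
	fmt.Println(verifyUnordered(expected, outOfOrder))
}
```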
Andrey Samokhvalov
2d378b3280 htlcswitch+router: add onion error obfuscation
Within the network, it's important that when an HTLC forwarding failure
occurs, the recipient is notified in a timely manner in order to ensure
that errors are graceful and not unknown. For that reason, in
accordance with BOLT #4, onion failure obfuscation has been added.
2017-07-14 19:08:04 -07:00
Olaoluwa Osuntokun
9daa659bb3
routing/chainview: convert chainview integration tests to use sub-tests 2017-07-04 15:53:58 -07:00
Andrey Samokhvalov
8fa2b95c12 lnd: remove seelog logger
The btclog package has been changed to defining its own logging
interface (rather than seelog's) and provides a default implementation
for callers to use.

There are two primary advantages to the new logger implementation.

First, all log messages are created before the call returns.  Compared
to seelog, this prevents data races when mutable variables are logged.

Second, the new logger does not implement any kind of artificial rate
limiting (what seelog refers to as "adaptive logging").  Log messages
are outputted as soon as possible and the application will appear to
perform much better when watching standard output.

Because log rotation is not a feature of the btclog logging
implementation, it is handled by the main package by importing a file
rotation package that provides an io.Reader interface for creating
output to a rotating file output.  The rotator has been configured
with the same defaults that btcd previously used in the seelog config
(10MB file limits with maximum of 3 rolls) but now compresses newly
created roll files.  Due to the high compressibility of log text, the
compressed files typically reduce to around 15-30% of the original
10MB file.
2017-06-25 14:19:56 +01:00
Olaoluwa Osuntokun
9b9f419e51
routing: fix vet error, ensure wait group in topologyClient isn't copied 2017-06-25 13:48:52 +01:00
Olaoluwa Osuntokun
5c45d52ab6
routing: wait for topology clients to fully exit before closing ntfn chan
This commit fixes a send on closed channel panic by adding additional
synchronization when cancelling the notifications for a particular
topology client. We now ensure that all goroutines belonging to a
particular topology client exit fully before we close the notification
channel in order to avoid a panic.
2017-06-25 13:31:55 +01:00
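The synchronization described in 5c45d52ab6 follows a common Go pattern: signal shutdown, wait for every sender goroutine to exit, and only then close the channel they send on. The sketch below uses a hypothetical `topologyClient` with string notifications. The client is always handled by pointer so that its `sync.WaitGroup` is never copied, which is the vet issue fixed in 9b9f419e51.

```go
package main

import (
	"fmt"
	"sync"
)

// topologyClient is a hypothetical subscriber with its own goroutines.
type topologyClient struct {
	ntfnChan chan string
	quit     chan struct{}
	wg       sync.WaitGroup
}

func newTopologyClient() *topologyClient {
	return &topologyClient{
		ntfnChan: make(chan string),
		quit:     make(chan struct{}),
	}
}

// notify delivers msg from a goroutine that gives up once quit is closed.
// In this sketch it must not be called concurrently with cancel.
func (c *topologyClient) notify(msg string) {
	c.wg.Add(1)
	go func() {
		defer c.wg.Done()
		select {
		case c.ntfnChan <- msg:
		case <-c.quit:
		}
	}()
}

// cancel waits for all sender goroutines to exit before closing ntfnChan,
// so no goroutine can send on a closed channel.
func (c *topologyClient) cancel() {
	close(c.quit)
	c.wg.Wait()
	close(c.ntfnChan)
}

func main() {
	c := newTopologyClient()
	c.notify("edge update")
	fmt.Println(<-c.ntfnChan)

	c.notify("never read") // this sender is still blocked at cancel time
	c.cancel()
	_, open := <-c.ntfnChan
	fmt.Println("channel open:", open)
}
```

Closing `ntfnChan` before `wg.Wait()` returns would reintroduce the panic, since a blocked sender could still be selected after the close.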
Olaoluwa Osuntokun
1be4d67ce4
multi: run all test instances in parallel 2017-06-17 01:00:07 +02:00
Olaoluwa Osuntokun
2452f2ed82
routing: update test to assert correctness of per-hop payloads 2017-06-16 22:45:30 +02:00
Olaoluwa Osuntokun
fca51d6165
routing: use converted hop payloads for sphinx packet when creating onion 2017-06-16 22:43:00 +02:00
Olaoluwa Osuntokun
4d8bb21d9d
routing: add ToHopPayloads method to routing.Route
This commit adds a new method to the routing.Route struct:
ToHopPayloads. This function converts a complete route into the
series of per-hop payloads that is to be encoded within each HTLC using
an opaque Sphinx packet.

We can now use this function when creating the sphinx packet to
properly encode the hop payload for each hop in the route.
2017-06-16 22:37:47 +02:00