Commit Graph

47 Commits

Author SHA1 Message Date
yyforyongyu
cd35981569
routing: refactor update payment state tests
This commit refactors the resumePayment to extract some logics back to
paymentState so that the code is more testable. It also adds unit tests
for paymentState, and breaks the original MPPayment tests into independent tests
so that it's easier to maintain and debug. All the new tests are built
using mock so that the control flow is eaiser to setup and change.
2021-06-23 20:35:29 +08:00
yyforyongyu
e79e46ed21
routing: add mock structs for testing
This commit uses the package mock to create new mock structs, replacing
the old ones for better control when writing tests.
2021-06-23 20:35:29 +08:00
yyforyongyu
289d97fbfb
routing: rename mock structs to make them obsolete
This commit renames the mock structs by appending Old in their names. In
doing so the old tests stay unchanged and new mock structs can be added
in the following commit.
2021-06-23 20:35:28 +08:00
yyforyongyu
f31001e103
routing: make shardHandler aware of payment session
This commit adds payment session to shardHandler to enable private edge
policies being updated in shardHandler. The relevant interface and mock
are updated. From now on, upon seeing a ChannelUpdate message,
shardHandler will first try to find the target policy in additionalEdges
and update it. If nothing found, it will then check the database for
edge policy to update.
2021-06-23 18:13:04 +08:00
Johan T. Halseth
7795353e9f
channeldb: return full payment for inflight payments
We might as well return all info, and we'll need the individual HTLCs
in later commits.
2021-04-27 08:27:32 +02:00
carla
58d95be4dd
multi: change RegisterAttempt error checking order
Move our more generic terminal check forward so that we only
need to handle a single class of expected errors. This change
is mirrored in our mock, and our reproducing tests are updated
to assert that this move catches both classes of errors we get.
2021-04-23 08:39:45 +02:00
carla
12136a97a9
routing/test: add test for stuck payment with in-flight htlcs
Add an additional stuck-payment case, where our payment gets
a terminal error while it has other htlcs in-flight, and a
shard fails with ErrTerminalPayment. This payment also falls in
our class of expected errors, but is not currently handled. The
mock is updated accordingly, using the same ordering as in our
real RegisterAttempt implementation.
2021-04-23 08:39:45 +02:00
carla
80451afe48
routing/test: add test to demonstrate stuck payment, single shard
This commit adds a test which demonstrates that payments can
get stuck if we receive a payment failure while we're pathfinding
for another shard, then try to dispatch a shard after we've
recorded a permanent failure. It also updates our mock to
only consider payments with no in-flight htlcs as in-flight,
to more closely represent our actual RegisterAttempt.
2021-04-23 08:39:44 +02:00
carla
125980afb7
routing/test: block on pathfinding in tests
This commit adds a step to our payment lifecycle test to add
control over when we find a path for our payment, This is
required for testing race conditions around pathfinding
completing and payment failures being reported.
2021-04-23 08:39:43 +02:00
carla
e0c52e4473
routing/test: close payment result channel on shutdown, mimicking switch
This commit updates our mock to more closely follow the behavior of the
switch for mocked calls to GetPaymentResult. As it stands, our tests
send a test-created error from the switch when we want to mock shutdown.
In reality, the switch will close its result channel, so we update this
test to follow that behavior. This matters for the commit that follows,
because we start checking the error our payments return. If we have an
error from the switch, our tests will fail with an error that we do
not encounter in practice.
2021-04-23 08:39:40 +02:00
carla
9a78e9da73
routing/tests: move lock acquisition after state driving channels
In our payment lifecycle tests, we have two goroutines that
compete for the lock in our mock control tower: the resumePayment
loop which tries to call RegisterAttempt, and the collectResult
handler which is launched in a goroutine by collectResultAsync
and is responsible for various settle/fail calls.

The order that the lock is acquired by these goroutines is
arbitrary, and can lead to flakes in our tests if the step
that we do not intend to execute first gets the lock (eg,
we want to fail a payment in collectResult, but RegisterAttempt
gets there first). This commit moves contention for this lock
after our mock's various "state driving" channels, so that the
lock will be acquired in the order that the test intends it.
2021-04-23 08:39:39 +02:00
carla
64dad77e2e
multi: add get and set mission control to routerrpc 2021-01-19 10:57:15 +02:00
Johan T. Halseth
90a59fe70f
router: call CleanStore on startup 2020-10-06 10:46:04 +02:00
Johan T. Halseth
d1a118388d
payment_control: remove unused Attempts field 2020-10-06 10:41:25 +02:00
Joost Jager
cc37485432
routing: return full htlc attempt from shard handler 2020-05-12 19:56:50 +02:00
Joost Jager
af4abe7d58
routing+routerrpc: notify full payment structures
This commit fixes the inconsistency between the payment state as
reported by routerrpc.SendPayment/routerrpc.TrackPayment and the main
rpc ListPayments call.

In addition to that, payment state changes are now sent out for every
state change. This opens the door to user interfaces giving more
feedback to the user about the payment process. This is especially
interesting for multi-part payments.
2020-04-08 09:26:33 +02:00
Johan T. Halseth
5e72a4b77c
routing: exit on unexpected RequestRoute error
We whitelist a set of "expected" errors that can be returned from
RequestRoute, by converting them into a new type noRouteError. For any
other error returned by RequestRoute, we'll now exit immediately.
2020-04-02 19:31:24 +02:00
Johan T. Halseth
95c5a123c8
routing/router_test: add TestSendToRouteMultiShardSend 2020-04-02 19:31:23 +02:00
Johan T. Halseth
0fd71cd596
routing/payment_lifecycle_test: add step for terminal failure
And modify the MissionControl mock to return a non-nil failure reason in
this case.
2020-04-02 19:29:15 +02:00
Johan T. Halseth
2e63b518b7
routing/payment_lifecycle_test+mock: set up listener for FailAttempt
Also rename Success to SettleAttempt in the tests.
2020-04-02 19:29:15 +02:00
Johan T. Halseth
70202be580
channeldb: make database logic MPP compatible
This commit redefines how the control tower handles shard and payment
level settles and failures. We now consider the payment in flight as
long it has active shards, or it has no active shards but has not
reached a terminal condition (settle of one of the shards, or a payment
level failure has been encountered).

We also make it possible to settle/fail shards regardless of the payment
level status (since we must allow late shards recording their status
even though we have already settled/failed the payment).

Finally, we make it possible to Fail the payment when it is already
failed. This is to allow multiple concurrent shards that reach terminal
errors to mark the payment failed, without havinng to synchronize.
2020-04-02 19:29:14 +02:00
Johan T. Halseth
e1f4d89ad9
routing: add FetchPayment method to ControlTower
This method is used to fetch a payment and all HTLC attempt that have
been made for that payment. It will both be used to resume inflight
attempts, and to fetch the final outcome of previous attempts.

We also update the the mock control tower to mimic the real control
tower, by letting it track multiple HTLC attempts for a given payment
hash, laying the groundwork for later enabling it for MPP.
2020-04-02 10:24:34 +02:00
Johan T. Halseth
00903ef9f5
routing/payment_session: make RequestRoute take max amt, fee limit and
active shards

In preparation for doing pathfinding for routes sending a value less
than the total payment amount, we let the payment session take the max
amount to send and the fee limit as arguments to RequestRoute.
2020-04-02 10:24:33 +02:00
Johan T. Halseth
c2301c14b2
routing/payment_session: make NewPaymentSession take payment directly
This commit moves supplying of the information in the LightningPayment
to the initialization of the paymentSession, away from every call to
RequestRoute.

Instead the paymentSession will store this information internally, as it
doesn't change between payment attempts.

This is done to rid the RequestRoute call of the LightingPayment
argument, as for SendToRoute calls, it is not needed to supply the next
route.
2020-04-02 10:24:33 +02:00
Joost Jager
48c0e42c26
channeldb+routing: store all payment htlcs
This commit converts the database structure of a payment so that it can
not just store the last htlc attempt, but all attempts that have been
made. This is a preparation for mpp sending.

In addition to that, we now also persist the fail time of an htlc. In a
later commit, the full failure reason will be added as well.

A key change is made to the control tower interface. Previously the
control tower wasn't aware of individual htlc outcomes. The payment
remained in-flight with the latest attempt recorded, but an outcome was
only set when the payment finished. With this commit, the outcome of
every htlc is expected by the control tower and recorded in the
database.

Co-authored-by: Johan T. Halseth <johanth@gmail.com>
2020-03-09 18:31:33 +01:00
Johan T. Halseth
bee2380441
channeldb: rename PaymentAttemptInfo to HTLCAttemptInfo
To better distinguish payments from HTLCs, we rename the attempt info
struct to HTLCAttemptInfo. We also embed it into the HTLCAttempt struct,
to avoid having to duplicate this information.

The paymentID term is renamed to attemptID.
2020-03-09 11:43:26 +01:00
carla
b5a2d75465
htlcswitch+routing: type check on ClearTextError
Update the type check used for checking local payment
failures to check on the ClearTextError interface rather
than on the ForwardingError type. This change prepares
for splitting payment errors up into Link and Forwarding
errors.
2020-01-14 15:07:42 +02:00
Joost Jager
ff0c5a0d5e
routing: process successes in mission control
This commit modifies paymentLifecycle so that it not only feeds
failures into mission control, but successes as well.
This allows for more accurate probability estimates. Previously,
the success probability for a successful pair and a pair with
no history was equal. There was no force that pushed towards
previously successful routes.
2019-08-23 09:15:41 +02:00
Joost Jager
e7af6a077a
routing: convert to nillable failure reason
This commit converts several functions from returning a bool and a
failure reason to a nillable failure reason as return parameter. This
will take away confusion about the interpretation of the two separate
values.
2019-08-17 10:23:57 +02:00
Joost Jager
f1769c8c8c
routing: convert to node pair based
Previously mission control tracked failures on a per node, per channel basis.
This commit changes this to tracking on the level of directed node pairs. The goal
of moving to this coarser-grained level is to reduce the number of required
payment attempts without compromising payment reliability.
2019-08-13 19:21:37 +02:00
Joost Jager
2b4debf42b
routing/test: remove unused methods from mock 2019-08-13 18:45:02 +02:00
Joost Jager
7e7b620355
routing: persist mission control data 2019-07-31 08:44:00 +02:00
Joost Jager
b4a7665bae
routing: provide payment id to mission control 2019-07-29 09:38:32 +02:00
Joost Jager
334b6a3bfe
routing: move unreadable failure handling into mission control 2019-07-29 09:38:30 +02:00
Joost Jager
934ea8e78d
routing: move failure interpretation into mission control 2019-07-29 09:38:26 +02:00
Joost Jager
dc13da5abb
routing: move second chance logic into mission control
If nodes return a channel policy related failure, they may get a second
chance. Our graph may not be up to date. Previously this logic was
contained in the payment session.

This commit moves that into global mission control and thereby removes
the last mission control state that was kept on the payment level.

Because mission control is not aware of the relation between payment
attempts and payments, the second chance logic is no longer based
tracking second chances given per payment.

Instead a time based approach is used. If a node reports a policy
failure that prevents forwarding to its peer, it will get a second
chance. But it will get it only if the previous second chance was
long enough ago.

Also those second chances are no longer dependent on whether an
associated channel update is valid. It will get the second chance
regardless, to prevent creating a dependency between mission control and
the graph. This would interfer with (future) replay of history, because
the graph may not be the same anymore at that point.
2019-07-13 22:38:23 +02:00
Joost Jager
37e2751695
routing+routerrpc: isolate payment session source from mission control 2019-07-13 22:38:19 +02:00
Johan T. Halseth
2dea790b55
multi: make GetPaymentResult take payment hash
Used for logging in the switch, and when we remove the pending payments,
only the router will have the hash stored across restarts.
2019-06-07 16:53:32 +02:00
Joost Jager
eb2647e8fc
channeldb: add subscription to control tower
Allows other sub-systems to subscribe to payment success and fail
events.
2019-06-05 12:41:49 +02:00
Joost Jager
87d3207baf
channeldb+routing: move control tower interface to routing
This commit creates an empty shall for control tower in the routing
package. It is a preparation for adding event notification.
2019-06-05 12:41:47 +02:00
Joost Jager
7133f37bb8
routing: global probability based mission control
Previously every payment had its own local mission control state which
was in effect only for that payment. In this commit most of the local
state is removed and payments all tap into the global mission control
probability estimator.

Furthermore the decay time of pruned edges and nodes is extended, so
that observations about the network can better benefit future payment
processes.

Last, the probability function is transformed from a binary output to a
gradual curve, allowing for a better trade off between candidate routes.
2019-06-04 10:00:25 +02:00
Johan T. Halseth
1b788904f0
channeldb+router: record payment failure reason in db 2019-05-27 20:18:59 +02:00
Johan T. Halseth
60e2367973
routing/router_test: add TestRouterPaymentStateMachine
TestRouterPaymentStateMachine tests that the router interacts as
expected with the ControlTower during a payment lifecycle, such that it
payment attempts are not sent twice to the switch, and results are
handled after a restart.
2019-05-27 20:18:59 +02:00
Johan T. Halseth
cd02c22977
htlcswitch+router: move deobfuscator creation to GetPaymentResult call
In this commit we move handing the deobfuscator from the router to the
switch from when the payment is initiated, to when the result is
queried.

We do this because only the router can recreate the deobfuscator after a
restart, and we are preparing for being able to handle results across
restarts.

Since the deobfuscator cannot be nil anymore, we can also get rid of
that special case.
2019-05-16 23:56:12 +02:00
Johan T. Halseth
ec087a9f73
htlcswitch+router: define PaymentResult, GetPaymentResult
This lets us distinguish an critical error from a actual payment result
(success or failure). This is important since we know that we can only
attempt another payment when a final result from the previous payment
attempt is received.
2019-05-16 23:56:12 +02:00
Johan T. Halseth
c9e8ff6a34
switch+router+server: move NextPaymentID to router
This commit moves the responsibility of generating a unique payment ID
from the switch to the router. This will make it easier for the router
to keep track of which HTLCs were successfully forwarded onto the
network, as it can query the switch for existing HTLCs as long as the
paymentIDs are kept.

The router is expected to maintain a map from paymentHash->paymentID,
such that they can be replayed on restart. This also lets the router
check the status of a sent payment after a restart, by querying the
switch for the paymentID in question.
2019-05-16 23:56:06 +02:00
Johan T. Halseth
f1cb54f943
routing/router: define PaymentAttemptDispatcher interface
The switch satisfies this interface, and makes it easy to mock the send
method from the router.
2019-05-16 23:53:08 +02:00