Since we will use peer flap rate to determine how we rate limit, we
store this value on disk per peer per channel. This allows us to
restart with memory of our peers past behaviour, so we don't give badly
behaving peers have a fresh start on restart. Last flap timestamp is
stored with our flap count so that we can degrade this all time flap
count over time for peers that have not recently flapped.
When dealing with online events, we actually need to track our events
by peer, not by channel. All we need to track channels is to have a
set of online events for a peer which at least contain those events.
This change refactors chanfitness to track by peer.
We currently query the store for uptime and lifespan individually. As
we add more fields, we will need to add more queries with this design.
This change combines requests into a single channel infor request so
that we do not need to add unnecessary boilerplate going forward.
As we add more elements to the chanfitness subsystem, we will require
more complex testing. The current tests are built around the inability
to mock subscriptions, which is remedied by addition of our own mock.
This context allows us to run the full store in a test, rather than
having to manually spin up the main goroutine. Mocking our subscriptions
is required so that we can block our subscribe updates on consumption,
using the real package provides us with no guarantee that the client
receives the update before shutdown, which produces test flakes.
This change also makes a move towards separating out the testing of our
event store from testing the underlying event logs to prepare for
further refactoring.
Original PR was written with 4 spaces instead of 8, do a once off fix
here rather than fixing bit-by bit in the subsequent commits and
cluttering them for review.
This commit adds an initial peer online event for channels
that have peers that are online when they are created. This
addresses a race between the peer coming online and an
existing channel being added to the event store; if the
peer comes online first, the existing channel will not have
its initial online event added.
This commit addresses a bug in the channel event store
where the opened time of a channel event log was not
set for peers that were offline on startup.
Previously, opened time was set to the time of the first
event in the event log. This worked for online peers,
because the eventlog was created with an initial online
event. However, offline peers had no inital event so had
no open time set.
This commit simplifies the creation of an event log by
removing the initial event and setting open time for all
event logs. This has the effect of potentially introducing
a gap between opened time for a log and the first peer
online event for peers with channels that exist at startup
if a peer takes time to reconnect. However, the cost of this
is less than the benefit of reducing the bug-prone custom
code path that was previously in place.
In this commit, the channelEventStore in the channel
fitness subsystem is changed to identify channels
by their outpoint rather than short channel id. This
change is made made becuase outpoints are the preferred
way to expose references over rpc, and easier to perform
queries within lnd.
This commit adds a channel event store to the channel fitness
package which is used to manage tracking of a node's channels.
It adds tracking for channel open/closed and peer online/offline
events for all channels that a node has open.
Events are consumed from channelNotifier and peerNotifier event
subscriptions. If either of these subscriptions is cancelled,
channel scoring stops, because both subscriptions are expected
to run until node shutdown.
Two functions are exposed to allow external callers to get uptime
information about a channel. GetLifespan returns the period over
which the channel has been monitored. GetUptime returns the channel's
uptime over a specified period. Callers can use these functions to
get the channel's remote peer uptime over its entire lifetime, or
a subset of that period.