To decouple the autopilot heuristic from the constraints, we start by
abstracting them behind an interface to make them easier to mock. We
also rename them HeuristicConstraints->AgentConstraints to make it clear
that they are now constraints the agent must adhere to.
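
As a rough illustration of why hiding the constraints behind an interface helps with mocking, here is a minimal sketch. The method set (MaxPendingOpens, ChanLimits), the mockConstraints type and the canOpen helper are assumptions made for this example; the commit message does not list the real interface's methods.

```go
package main

import "fmt"

// AgentConstraints is a trimmed-down, hypothetical version of the
// interface; the real method set is not given in the commit message.
type AgentConstraints interface {
	// MaxPendingOpens is the maximum number of pending channel opens.
	MaxPendingOpens() uint16
	// ChanLimits returns the minimum and maximum channel size.
	ChanLimits() (minSize, maxSize int64)
}

// mockConstraints is a test double returning fixed values.
type mockConstraints struct {
	maxPending       uint16
	minSize, maxSize int64
}

func (m *mockConstraints) MaxPendingOpens() uint16 { return m.maxPending }

func (m *mockConstraints) ChanLimits() (int64, int64) {
	return m.minSize, m.maxSize
}

// canOpen stands in for agent logic that depends only on the interface,
// so it can be exercised with the mock above.
func canOpen(c AgentConstraints, pending uint16, size int64) bool {
	minSize, maxSize := c.ChanLimits()
	return pending < c.MaxPendingOpens() && size >= minSize && size <= maxSize
}

func main() {
	c := &mockConstraints{maxPending: 2, minSize: 20000, maxSize: 1000000}
	fmt.Println(canOpen(c, 1, 50000)) // below the pending limit
	fmt.Println(canOpen(c, 2, 50000)) // pending limit reached
}
```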
This commit fixes a subtle bug within the autopilot manager that would
cause the active pilot to not be reset in case it wasn't started
successfully.
We also make sure the associated goroutines close over the started
pilot, and not the active pilot.
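
The two fixes can be sketched roughly as follows. This is not the manager's actual code: the pilot and manager types and their fields are hypothetical. The sketch sidesteps the reset problem by recording the active pilot only after a successful start, which is one possible approach, and the goroutine captures the local p rather than reading m.active later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pilot is a hypothetical stand-in for a running autopilot agent.
type pilot struct {
	failStart bool
	quit      chan struct{}
	done      chan struct{}
}

func newPilot(failStart bool) *pilot {
	return &pilot{
		failStart: failStart,
		quit:      make(chan struct{}),
		done:      make(chan struct{}),
	}
}

func (p *pilot) Start() error {
	if p.failStart {
		return errors.New("pilot failed to start")
	}
	return nil
}

// manager tracks at most one active pilot.
type manager struct {
	mu     sync.Mutex
	active *pilot
}

func (m *manager) StartAgent(p *pilot) error {
	m.mu.Lock()
	defer m.mu.Unlock()

	if m.active != nil {
		return errors.New("a pilot is already active")
	}
	// Record the pilot as active only once it has started, so a failed
	// start never leaves a stale active pilot behind.
	if err := p.Start(); err != nil {
		return err
	}
	m.active = p

	// The goroutine closes over p, the pilot that was started. Reading
	// m.active here instead could observe nil or a later pilot.
	go func() {
		defer close(p.done)
		<-p.quit
	}()
	return nil
}

func (m *manager) StopAgent() {
	m.mu.Lock()
	p := m.active
	m.active = nil
	m.mu.Unlock()

	if p == nil {
		return
	}
	close(p.quit)
	<-p.done // wait for the pilot's goroutine to exit
}

func main() {
	m := &manager{}
	fmt.Println(m.StartAgent(newPilot(true)))  // start fails
	fmt.Println(m.StartAgent(newPilot(false))) // a retry is still possible
	m.StopAgent()
}
```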
This commit makes the weightedChoice algorithm take a slice of weights
instead of a map of node scores. This lets us avoid costly map allocation
and iteration.
In addition we make the chooseN algorithm keep track of the remaining
nodes by keeping a slice of weights through its entire run, similarly
avoiding costly map allocation and iteration.
In total this brings the runtime of the TestChooseNSample testcase down
from ~73s to ~3.6s.
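
A minimal sketch of the slice-based approach, assuming non-negative weights. The signatures, the errNoPositive error and the early return when fewer than n nodes are available are assumptions for illustration, not the package's actual code. chooseN keeps a single working slice and zeroes the weight of each picked index instead of rebuilding a map on every iteration.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errNoPositive is a hypothetical error for an all-zero weight slice.
var errNoPositive = errors.New("no positive weights left")

// weightedChoice returns an index picked with probability proportional
// to its weight. Weights are assumed to be non-negative.
func weightedChoice(weights []float64) (int, error) {
	var sum float64
	for _, w := range weights {
		sum += w
	}
	if sum <= 0 {
		return 0, errNoPositive
	}

	r := rand.Float64() * sum
	last := -1
	for i, w := range weights {
		if w <= 0 {
			continue
		}
		last = i
		r -= w
		if r < 0 {
			return i, nil
		}
	}
	// Floating point rounding can leave r at zero or slightly above;
	// fall back to the last index with a positive weight.
	return last, nil
}

// chooseN picks up to n distinct indices. The caller's slice is left
// untouched; a copy is updated in place as indices are picked.
func chooseN(n int, weights []float64) ([]int, error) {
	remaining := make([]float64, len(weights))
	copy(remaining, weights)

	chosen := make([]int, 0, n)
	for len(chosen) < n {
		i, err := weightedChoice(remaining)
		if errors.Is(err, errNoPositive) {
			return chosen, nil // fewer than n candidates available
		}
		if err != nil {
			return nil, err
		}
		chosen = append(chosen, i)
		remaining[i] = 0 // never pick the same index twice
	}
	return chosen, nil
}

func main() {
	picked, err := chooseN(2, []float64{0.1, 0.6, 0, 0.3})
	fmt.Println(picked, err)
}
```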
This addition to the unit tests makes sure nodes that have no channels
in the graph are left out of the scored nodes, implicitly giving them a
score of 0.
This commit makes the autopilot agent use the new NodeScores heuristic
API to select channel candidates, instead of the Select API. The result
will be similar, but instead of selecting a set of nodes to open
channels to, we get score-based results that can later be used
together with other heuristics to choose nodes to open channels to.
This commit also makes the existing autopilot agent tests compatible
with the new NodeScores API.
This commit adds a new method NodeScores to the AttachementHeuristic
interface. Its intended use is to score a set of nodes according to
their preference as channel counterparties.
The PrefAttach heuristic gets a NodeScores method that will score the
nodes according to their number of already existing channels, similar to
what is done already in Select.
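
To illustrate scoring by existing channel count, here is a small sketch. The NodeID and edge types, the nodeScores signature and the normalization (a node's share of all channel endpoints) are assumptions for this example and may differ from what the heuristic actually does. Nodes without channels are simply absent from the result, which gives them an implicit score of 0.

```go
package main

import "fmt"

// NodeID is a hypothetical node identifier.
type NodeID string

// edge is a hypothetical minimal channel between two nodes.
type edge struct{ a, b NodeID }

// nodeScores scores each candidate by its share of channel endpoints in
// the graph. Candidates with no channels are omitted from the result.
func nodeScores(edges []edge, candidates []NodeID) map[NodeID]float64 {
	counts := make(map[NodeID]int)
	total := 0
	for _, e := range edges {
		counts[e.a]++
		counts[e.b]++
		total += 2 // every channel has two endpoints
	}

	scores := make(map[NodeID]float64)
	if total == 0 {
		return scores
	}
	for _, n := range candidates {
		c := counts[n]
		if c == 0 {
			continue // implicit score of 0
		}
		scores[n] = float64(c) / float64(total)
	}
	return scores
}

func main() {
	edges := []edge{{"a", "b"}, {"a", "c"}}
	fmt.Println(nodeScores(edges, []NodeID{"a", "b", "d"}))
}
```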
This commit defines a new struct HeuristicConstraints that will be used
to keep track of the initial constraints the autopilot agent needs to
adhere to. This is currently done in the ConstrainedPrefAttachement
heuristic itself, but this lets us share these constraints and common
methods between several heuristics.
This commit ensures that the mock attachment
directives use unique keys, ensuring that they
aren't skipped due to already having pending
connection requests. The tests fail when
they're all the same since they collide
in the pendingConns map.
This commit modifies the autopilot agent to track
all pending connection requests, and forgo further
attempts if a connection is already present.
Previously, the agent would try and establish
hundreds of requests to a node, especially if the
connections were timing out and not returning.
This resulted in an OOM when cranking up
maxchannels to 200, since there would be close
to 10k pending connections before the program was
terminated. The issue was compounded by periodic
batch timeouts, causing autopilot to try and
process thousands of triggers for failing
connections to the same peer.
With these fixes, autopilot will skip nodes that we
are trying to connect to during heuristic selection.
The CPU and memory utilization have been significantly
reduced as a result.
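
The pending-connection tracking could look roughly like this. It is a simplified sketch: the agent type, the tryConnect helper and the string node key are hypothetical, and only the pendingConns name comes from the commit messages.

```go
package main

import (
	"fmt"
	"sync"
)

// agent is a hypothetical, stripped-down autopilot agent.
type agent struct {
	mu           sync.Mutex
	pendingConns map[string]struct{}
}

func newAgent() *agent {
	return &agent{pendingConns: make(map[string]struct{})}
}

// tryConnect runs connect for nodeID unless a connection attempt to the
// same node is already in flight. It reports whether an attempt was made.
func (a *agent) tryConnect(nodeID string, connect func(string) error) (bool, error) {
	a.mu.Lock()
	if _, ok := a.pendingConns[nodeID]; ok {
		a.mu.Unlock()
		return false, nil // already pending, skip this node
	}
	a.pendingConns[nodeID] = struct{}{}
	a.mu.Unlock()

	// Clear the entry once the attempt returns, whether or not it
	// succeeded, so the node can be retried later.
	defer func() {
		a.mu.Lock()
		delete(a.pendingConns, nodeID)
		a.mu.Unlock()
	}()

	return true, connect(nodeID)
}

func main() {
	a := newAgent()
	attempted, err := a.tryConnect("peer1", func(id string) error {
		// A second request made while the first is in flight is skipped.
		again, _ := a.tryConnect(id, func(string) error { return nil })
		fmt.Println("nested attempt made:", again)
		return nil
	})
	fmt.Println(attempted, err)
}
```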
In this commit, we implement an optimization to the autopilot agent to
ensure that we don't spin and waste CPU when we either have a large
graph, or a high max channel target for the agent. Before this commit,
each time we went to read the state of a channel from disk, we would
decompress the EC point. However, for the case of the initial
ChannelEdge struct to feed to the agent, we only actually need to obtain
the pubkey, and can save the potentially expensive point decompression
for each directional channel in the graph.
We do this to avoid a huge number of goroutines piling up when autopilot
is trying to open many channels, as they will all block trying to send
the update on the stateUpdates channel. Now we instead send them on a
buffered channel, similar to what is done with the nodeUpdates.
We do this to avoid a huge number of goroutines piling up on initial
graph sync, as they will all block trying to send the node update on the
stateUpdates channel. Now we instead make a new buffered channel
nodeUpdates, and just return immediately if there is already a signal in
the channel waiting to be processed.
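
The "return immediately if a signal is already waiting" pattern can be sketched with a buffered channel of size one and a non-blocking send. The agent type and the OnNodeUpdate name are assumptions for this example, as is the buffer size; only the nodeUpdates channel name is taken from the message above.

```go
package main

import "fmt"

// agent is a hypothetical, stripped-down autopilot agent.
type agent struct {
	// nodeUpdates holds at most one pending signal.
	nodeUpdates chan struct{}
}

func newAgent() *agent {
	return &agent{nodeUpdates: make(chan struct{}, 1)}
}

// OnNodeUpdate signals that the graph changed. It never blocks: if a
// signal is already queued the new one is dropped, since one pending
// signal is enough to trigger processing. It reports whether a new
// signal was queued.
func (a *agent) OnNodeUpdate() bool {
	select {
	case a.nodeUpdates <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	a := newAgent()
	fmt.Println(a.OnNodeUpdate()) // queued
	fmt.Println(a.OnNodeUpdate()) // coalesced with the pending signal
	<-a.nodeUpdates               // the agent's main loop consumes it
	fmt.Println(a.OnNodeUpdate()) // queued again
}
```

Because the caller never blocks, no goroutine per update is needed to perform the send.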