This commit adds optional jitter to our initial reconnection to our
persistent peers. Currently we will attempt reconnections to all peers
simultaneously, which results in large amount of contention as the
number of channels a node has grows.
We resolve this by adding a randomized delay between 0 and 30 seconds
for all persistent peers. This spreads out the load and contention to
resources such as the database, read/write pools, and memory
allocations. On my node, this allows to start up with about 80% of the
memory burst compared to the all-at-once approach.
This also has a second-order effect in better distributing messages sent
at constant intervals, such as pings. This reduces the concurrent jobs
submitted to the read and write pools at any given time, resulting in
better reuse of read/write buffers and fewer bursty allocation and
garbage collection cycles.
In this commit, we convert the server's Start/Stop methods to use the
sync.Once. We do this in order to fix concurrency issues that would
allow certain queries to be sent to the server before it has actually
fully start up. Before this commit, we would set started to 1 at the
very top of the method, allowing certain queries to pass before the rest
of the daemon was had started up.
In order to fix this issue, we've converted the server to using a
sync.Once, and two new atomic variables for clients to query to see if
the server has fully started up, or is in the process of stopping.
In this commit, we modify the server to serve the role as the agent
which will carry out the SCB restoration protocol if the Init/Unlock
methods include a set of channels to be recovered.
This commit increase the expiry grace delta to a value above the
broadcast delta. This prevents htlcs from being accepted that would
immediately trigger a channel force close.
A correct delta is generated in server.go where there is access to
the broadcast delta and passed via the peer to the links.
Co-authored-by: Jim Posen <jim.posen@gmail.com>
Previously it was difficult to use the invoice registry in unit tests,
because it used zpay32 to decode the invoice. For that to succeed, a
valid signature is required on the payment request.
This commit injects the decode dependency on a different level so that
it is easier to mock.
Previously a function pointer was passed to chain arbitrator to avoid a
circular dependency. Now that the routetypes package exists, we can pass
the full invoice registry to chain arbitrator.
This is a preparation to be able to use other invoice registry methods
in contract resolvers.
This commit modified the condition to whether drop an existing
connection to a peer when a new connection to this peer is
established.
The previous algorithm used public keys comparison for this decision
which determines that between every two nodes only one of them will
ever drop the connection in such cases.
The problematic case is when a node disconnects and reconnects in a
short interval which is the case of mobile devices.
In such case it takes as much as the "timeout" configured value for
the remote node to detect the "disconnection" (and try to reconnect
if this connection is persistent). In the case this node is also the
one that has the "smaller" public key the reconnect attempts of the
other node will be rejected causing it impossible to fast reconnect.
The solution is to only drop the connection if if we already have a
connected peer that is of the opposite direction from the this new
connection. By doing so the "initiator" will be enabled to replace
the connection and recconnect immediately.
In this commit, we also allow channel updates for our channels to be
sent reliably to our channel counterparty. This is especially crucial
for private channels, since they're not announced, in order to ensure
each party can receive funds from the other side.
Exposes the three parameters that dictate
the behavior of the channel status manager:
* --chan-enable-timeout
* --chan-disable-timeout
* --chan-status-sample-interval
This commit introduces the channel notifier which is a central source
of active, inactive, and closed channel events.
This notifier was originally intended to be used by the `SubscribeChannels`
streaming RPC call, but can be used by any subsystem that needs to be
notified on a channel becoming active, inactive or closed.
It may also be extended in the future to support other types of notifications.
In this commit, we modify the way we attempt to locate the our channel
peer to establish a persistent connection on start up. Before this
commit, we would use the channel policy pointing to the peer to locate
the node as it's embedded in the struct. However, at times it's
currently possible for the channel policy to not be found in the
database if the remote nodes announces before we finalize the process on
our end. This can at times lead to a panic as the pointer isn't checked
before attempting to access it.
To remedy this, we now instead use the channel info which will _always_
be there if the channel is known.
This commit is a step to split the lnwallet package. It puts the Input
interface and implementations in a separate package along with all their
dependencies from lnwallet.
In this commit, we extend the htlcSuccessResolver to settle the invoice,
if any, of the corresponding on-chain HTLC sweep. This ensures that the
invoice state is consistent as when claiming the HTLC "off-chain".
This method is called to convert an EdgePolicy to a ChannelUpdate. We
make sure to carry over the max_htlc value.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
In this commit:
* we partition lnwire.ChanUpdateFlag into two (ChanUpdateChanFlags and
ChanUpdateMsgFlags), from a uint16 to a pair of uint8's
* we rename the ChannelUpdate.Flags to ChannelFlags and add an
additional MessageFlags field, which will be used to indicate the
presence of the optional field HtlcMaximumMsat within the ChannelUpdate.
* we partition ChannelEdgePolicy.Flags into message and channel flags.
This change corresponds to the partitioning of the ChannelUpdate's Flags
field into MessageFlags and ChannelFlags.
Co-authored-by: Johan T. Halseth <johanth@gmail.com>
In this commit, we modify our default local feature bits to require the
Data Loss Protection (DLP) feature to be active. Once full Static
Channel Backups are implemented, if we connect to a peer that doesn't
follow the DLP protocol, then the SCBs are useless, as we may not be
able to recover funds. As a result, in prep for full SCB deployment,
we'll now ensure that any peer we connect to, knows of the DLP bit. This
could be a bit more relaxed and allow _connections_ to non-DLP peers,
but reject channel requests to/from them. However, this implementation is
much simpler.
Previously, nursery generated and published its own sweep txes. It
stored the sweep tx in nursery_store to prevent a new tx with a new
sweep address from being generated on restart.
In this commit, sweep generation and publication is removed from nursery
and delegated to the sweeper. Also the confirmation notification is
received from the sweeper.
In this commit, we remove the per channel `sigPool` within the
`lnwallet.LightningChannel` struct. With this change, we ensure that as
the number of channels grows, the number of gouroutines idling in the
sigPool stays constant. It's the case that currently on the daemon, most
channels are likely inactive, with only a hand full actually
consistently carrying out channel updates. As a result, this change
should reduce the amount of idle CPU usage, as we have less active
goroutines in select loops.
In order to make this change, the `SigPool` itself has been publicly
exported such that outside callers can make a `SigPool` and pass it into
newly created channels. Since the sig pool now lives outside the
channel, we were also able to do away with the Stop() method on the
channel all together.
Finally, the server is the sub-system that is currently responsible for
managing the `SigPool` within lnd.
In this commit, we update the genNodeAnnouncement method to also write
an updated version of the node announcment to disk. Before this commit,
we would update the in memory version, but then never write the new
version to disk. As a result, when connecting to new peers, we would
never propagate the new version of this announcement to the network.
In this commit, we modify the newly introduced UtxoSweeper.CreateSweepTx
to accept the confirmation target as a param of the method rather than a
struct level variable. We do this as this allows each caller to decide
at sweep time, what the fee rate should be, rather than using a global
value that is meant to work in all scenarios. For example, anytime
we're sweeping an output with a CLTV lock that's has a dependant
transaction we need to sweep/cancel, we may require a higher fee rate
than a regular force close with a CSV output.
In this commit, we also check ErrEdgeNotFound when attempting to send an
active/inactive channel update for a channel to the network. We do this
as it's possible that a channel has confirmed, but it still does not
meet the required number of confirmations to be publicly announced.
In this commit IncubateOutputs are given an extra parameter
broadcastHeight, which is passed from the server and used when called
registerPresschoolConf.
Earlier the utxonursery used the bestHeight as height hint in this case,
which would be wrong in the cases where we got outputs for incubation
which was confirmed below the best height.
over Tor
In this commit, we fix a small bug where we would attempt to start the
Tor controller even if we were not requested to automatically create and
onion service in order to listen for inbound connections over Tor.
In this commit, we restrict the persistent connection logic on startup
to only attempt to establish connections to Tor addresses if Tor
outbound support is enabled. Otherwise, we'll continually attempt to
reach the address even though we never will.
In this commit, we remove signaling for initial routing
dumps, which create unnecessary log spam, bandwidth, and
CPU. Now that gossip syncing is in full force, we will
instead opt to use the more efficient querying/set
reconciliation. Other nodes may still request initial
gossip sync from us, and we will respond.
This commit modifies the connection peer backoff
logic such that it will always backoff for "unstable"
peers. Unstable in this context is determined by
connections whose duration is shorter than 10
minutes. If a disconnect happens with a peer
whose connection lasts longer than 10 minutes,
we will scale back our stored backoff for that peer.
This resolves an issue that would result in a tight
connection loop with remote peers. This stemmed
from the connection duration being very short,
and always driving the backoff to the default
backoff of 1 second. Short connections like
this are now caught by the stable connection
threshold.
This also modifies the computation on the
backoff relaxation to subtract the connection
duration after applying randomized exponential
backoff, which offers better stability when
the connection duration and backoff are roughly
equal.
In this commit, we avoid logging an error when the links associated with
a peer are not found within its termination watcher. We do this to
prevent a benign log message as the links have already been removed from
the switch.
This commit fixes a bug that would cause us to fetch our peer's
ChannelUpdate in some cases, where we really wanted to fetch our own.
The reason this happened was that we passed the peer's pubkey to
fetchLastChanUpdate, making us match on their policy. This would lead to
ChannelUpdates being sent during routing which would have no effect on
the attempted path.
We fix this by always use our own pubkey in fetchLastChanUpdate, and
also uses the common methods within the server to be able to extract the
update even when only one policy is known.
This commit adds a goroutine watchChannelStatus to the server, which
will query the switch for the status of all open channels every
InactiveChanTimeout / 4. If a channel's status has remained unchanged
during the last InactiveChanTimeout it'll send out a ChannelUpdate
setting the disabled bit accordingly.
ProcessLocalAnnouncement will attempt to call UpdateEdge with the new
policy. If we call it manually before handing it to the gossiper, that
call will fail with "Outdated" and the announcement won't propagate.
This commit adds asynchronous starting of peers,
in order to avoid potential DOS vectors. Currently,
we block with the server's mutex while peers exchange
Init messages and perform other setup. Thus, a remote
peer that does not reply with an init message will
cause server to block for 15s per attempt.
We also modify the startup behavior to spawn
peerTerminationWatchers before starting the
peer itself, ensuring that a peer is properly
cleaned up if the initialization fails. Currently,
failing to start a peer does not execute the bulk
of the teardown logic, since it is not spawned
until after a successful Start occurs.
In this commit, we fix a small bug where we would increase epochErrors
by one even if connections were successfully established. Due to this,
we would stay stuck inside of the peer bootstrapper loop without
requerying for new peers.
In this commit, we move the initialization of the server into the
funding manager itself. We do this as it's no longer the case that _any_
RPC needs to access the funding manager. In the past, this was the
only reason that the funding manager was instantiated outside of the
server: to be able to respond to queries _before_ the server was
started.
This change also fixes a bug as atm, the funding manager will try to
register for notifications _before_ the ChainNotifier itself has fully
started.
In this commit, we modify the existing message sending functionality
within the fundingmanager. Due to each mesage send requiring to hold the
server's lock to retrieve the peer, we might run into a case where the
lock is held for a larger than usual amount of time and would therefore
block on sending the message within the fundingmanager. We remedy this
by taking a similar approach to some recent changes within the gossiper.
We now keep track of each peer within the internal fundingmanager
messages and send messages directly to them.
In this commit, we extend the server's functionality to prune link nodes
on startup. Since we currently only decide whether to prune a link node
from the database based on a channel close, it's possible that we have
link nodes lingering from before this functionality was added on.
In this commit, we update all the lncfg methods used to properly pass in
a new resolver. This is required in order to ensure that we don't leak
our DNS queries if Tor mode is active.
In this commit, we move the block height dependency from the links in
the switch to the switch itself. This is possible due to a recent change
on the links no longer depending on the block height to update their
commitment fees.
We'll now only have the switch be alerted of new blocks coming in and
links will retrieve the height from it atomically.
In this commit, we address an existing issue with regards to the inital
peer bootstrapping stage. At times, the bootstrappers can be unreliable
by providing addresses for peers that no longer exist/are currently
offline. This would lead to nodes quickly entering an exponential
backoff method used to maintain a minimum target of peers without first
achieving said target.
We address this by separating the peer bootstrapper into two stages: the
initial peer bootstrapping and maintaining a target set of nodes to
maintain an up-to-date view of the network. The initial peer
bootstrapping stage has been made aggressive in order to provide such
view of the network as quickly as possible. Once done, we continue on
with the existing exponential backoff method responsible for maintaining
a target set of nodes.
traversal
In this commit, we allow our node to automatically advertise its
connection's external IPs on the ports it is currently listening on in
order to accept inbound connections. This is only done when specifying
a NAT traversal technique when starting the daemon.
We also include a handy method that watches for dynamic IP changes in
the background. If a new IP is detected, we'll craft a new node
announcement using the new IP and broadcast it to the network.
In this commit, we finish the fix for the inbound/outbound peer bool in
the server. The prior commit forgot to also flip the inbound/output maps
in Inbound/Outbound peer connected. As a result, the checks were
incorrect and could cause lnd to refuse to accept any more inbound
connections in the case of a concurrent connection attempt.