Modifies the syncer.replyChanRangeQuery method to use the
LastBlockHeight method on the query. LastBlockHeight safely calculates
the ending block height and prevents an overflow of
start_block + num_blocks. Prior to this change, query messages whose
start_block + num_blocks overflowed the maximum uint32 value would
return zero results in the reply message.
Tests are added to cover the fix and ensure proper start and end values
are supplied to the channel graph filter.
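As a rough illustration of the overflow-safe calculation, here is a
minimal sketch using stand-in types rather than the exact lnwire
definitions:

```go
package sketch

import "math"

// queryChannelRange is an illustrative stand-in for the wire message;
// the real lnwire type and field names may differ.
type queryChannelRange struct {
	FirstBlockHeight uint32
	NumBlocks        uint32
}

// lastBlockHeight returns the final block height covered by the query,
// clamping at the maximum uint32 value instead of wrapping on overflow.
func (q *queryChannelRange) lastBlockHeight() uint32 {
	// Perform the addition in 64 bits so that
	// start_block + num_blocks cannot wrap around.
	end := uint64(q.FirstBlockHeight) + uint64(q.NumBlocks)
	if end > math.MaxUint32 {
		return math.MaxUint32
	}
	return uint32(end)
}
```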
This reworks the locking behavior of the Gossiper so that a race
condition between channel updates and block notifications doesn't cause
any loss of messages.
This fixes an issue that manifested mostly as flakes in itests during
WaitForNetworkChannelOpen calls.
The previous behavior allowed ChannelUpdates to be missed if they
happened concurrently with block notifications. The
processNetworkAnnouncement call would check the current block height,
then lock the gossiper and add the message to the
prematureAnnouncements list. New blocks would trigger an update to the
current block height, followed by a lock and check of the
aforementioned list.
However, especially during itests, the missing lock before checking the
height could cause a race condition if the following sequence of events
happened:
- A new ChannelUpdate message was received and started processing on a
separate goroutine
- The isPremature() call was made and verified that the ChannelUpdate
was in fact premature
- The goroutine was scheduled out
- A new block started processing in the gossiper. It updated the block
height, asked for and was granted the gossiper's lock and verified
there were zero premature announcements. The lock was released.
- The goroutine processing the ChannelUpdate asked for the gossiper
lock and was granted it. It added the ChannelUpdate to the
prematureAnnouncements list. That update can now never be processed.
The way to fix this behavior is to ensure that both the isPremature
checks done inside processNetworkAnnouncement and the best block
updates are made inside the same critical section (i.e. while holding
the same lock), so that they can't both check and update the
prematureAnnouncements list concurrently.
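The sketch below shows the shape of the fix with illustrative names
rather than lnd's actual gossiper fields: the premature check and the
best-height update share one mutex, so the check-then-queue sequence
can no longer interleave with a block notification.

```go
package sketch

import "sync"

// announcement is an illustrative stand-in for a gossip message
// together with the block height it refers to.
type announcement struct {
	blockHeight uint32
	msg         interface{}
}

// gossiper sketches the relevant state: a single mutex now guards both
// the best known block height and the premature announcement list.
type gossiper struct {
	mu         sync.Mutex
	bestHeight uint32
	premature  []announcement
}

// queueIfPremature checks the announcement against the best known
// height and, if it is ahead of it, queues it, all inside one critical
// section. A concurrent block notification can therefore no longer
// slip in between the height check and the enqueue.
func (g *gossiper) queueIfPremature(a announcement) bool {
	g.mu.Lock()
	defer g.mu.Unlock()

	if a.blockHeight <= g.bestHeight {
		// Not premature; the caller processes it right away.
		return false
	}
	g.premature = append(g.premature, a)
	return true
}

// onNewBlock updates the best height and drains any announcements that
// have now matured, under the same lock used by queueIfPremature.
func (g *gossiper) onNewBlock(height uint32) []announcement {
	g.mu.Lock()
	defer g.mu.Unlock()

	g.bestHeight = height

	var ready, pending []announcement
	for _, a := range g.premature {
		if a.blockHeight <= height {
			ready = append(ready, a)
		} else {
			pending = append(pending, a)
		}
	}
	g.premature = pending
	return ready
}
```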
The linter complains about not checking the return value of WipeChannel
in certain places. Instead of checking it, we simply remove the error
return value, since the in-memory modifications cannot fail.
If the provided ChainHash in a QueryChannelRange message does not match
that of our current chain, then we should send a blank response, rather
than reply with channels for the wrong chain.
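A minimal sketch of that guard, again with stand-in types instead of
lnd's lnwire and syncer definitions, and assuming the blank reply
simply echoes the queried range with no short channel IDs:

```go
package sketch

// The types below are illustrative stand-ins for the wire messages.
type chainHash [32]byte

type queryChannelRange struct {
	ChainHash        chainHash
	FirstBlockHeight uint32
	NumBlocks        uint32
}

type replyChannelRange struct {
	ChainHash        chainHash
	FirstBlockHeight uint32
	NumBlocks        uint32
	Complete         uint8
	ShortChanIDs     []uint64
}

type syncer struct {
	chainHash chainHash
	send      func(msg interface{}) error
}

// replyChanRangeQuery answers a QueryChannelRange. If the peer is
// asking about a different chain than ours, we reply with a single
// empty message instead of returning channels from the wrong chain.
func (s *syncer) replyChanRangeQuery(q *queryChannelRange) error {
	if q.ChainHash != s.chainHash {
		return s.send(&replyChannelRange{
			ChainHash:        q.ChainHash,
			FirstBlockHeight: q.FirstBlockHeight,
			NumBlocks:        q.NumBlocks,
			Complete:         0,
			ShortChanIDs:     nil,
		})
	}

	// ... otherwise gather and reply with the channels in the
	// requested range ...
	return nil
}
```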
We move away from our legacy, incorrect way of interpreting
ReplyChannelRange messages. Previously, we'd rely on the Complete field
of the ReplyChannelRange message to determine when our peer had sent
all of their replies. Now, we properly adhere to the specification by
interpreting the block ranges of these messages as intended.
Due to the large number of nodes deployed with the previous method, we
still maintain it and detect when we are communicating with such nodes,
so that we remain able to sync with them for backwards compatibility.
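The sketch below illustrates both points (completion by covered block
range, plus the legacy fallback) using stand-in types and a simplified
detection heuristic rather than lnd's exact logic:

```go
package sketch

// Illustrative stand-ins for the relevant message fields.
type queryChannelRange struct {
	FirstBlockHeight uint32
	NumBlocks        uint32
}

type replyChannelRange struct {
	FirstBlockHeight uint32
	NumBlocks        uint32
	Complete         uint8
}

// isLegacyReply guesses whether the peer follows the old behavior of
// echoing the query's block range verbatim in every reply. This mirrors
// the compatibility detection described above, but the exact heuristic
// used by lnd may differ.
func isLegacyReply(q *queryChannelRange, r *replyChannelRange) bool {
	return r.FirstBlockHeight == q.FirstBlockHeight &&
		r.NumBlocks == q.NumBlocks
}

// queryComplete reports whether the peer has finished replying. Legacy
// peers signal completion solely via the Complete field, while
// spec-compliant peers are done once the block ranges of their replies
// cover the end of the queried span.
func queryComplete(q *queryChannelRange, r *replyChannelRange) bool {
	if isLegacyReply(q, r) {
		return r.Complete == 1
	}

	replyEnd := uint64(r.FirstBlockHeight) + uint64(r.NumBlocks)
	queryEnd := uint64(q.FirstBlockHeight) + uint64(q.NumBlocks)
	return replyEnd >= queryEnd
}
```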
Once all replies have been sent, it's not possible to send another
reply without another request. The purpose of the check is also covered
by another test, TestGossipSyncerReplyChanRangeQueryNoNewChans, so it
can be removed from here.
In order to properly adhere to the spec, when handling a
QueryChannelRange message, we must reply with a series of
ReplyChannelRange messages that, when taken together, cover the
entirety of the block range requested.
In this commit we fix a bug in `lnd` that could cause other
implementations which implement a strict version of the spec to
disconnect when trying to sync their channel graph using the gossip
query feature. Before this commit, we would echo the original
`QueryChannelRange` request in each response, causing some clients to
reject the responses, as the `FirstBlockHeight` and `NumBlocks` fields
would be identical for each chunk of the response.
In order to remedy this, we now properly set these two fields with each
returned chunk. Note that even after this commit, we keep our existing
behavior surrounding the `Complete` field as is. Otherwise, current
`lnd` clients which rely on this field (rather than the two
aforementioned fields) wouldn't be able to properly detect when a set of
responses to their query was "complete".
Partially fixes #3728.
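A rough sketch of the per-chunk range calculation follows; the helper
and its assumptions are illustrative, not lnd's actual implementation.
Each chunk starts where the previous one ended, and the final chunk
runs to the end of the queried span so that the replies, taken
together, cover it entirely:

```go
package sketch

// chunkRange describes the block span reported by one ReplyChannelRange
// chunk via its FirstBlockHeight and NumBlocks fields.
type chunkRange struct {
	FirstBlockHeight uint32
	NumBlocks        uint32
	Complete         bool
}

// chunkRanges splits the queried span into per-chunk ranges, given the
// block heights of the channels being returned and a maximum number of
// channels per chunk. It assumes the heights are sorted ascending, that
// channels sharing a height are never split across chunks, and that the
// query has already been clamped against uint32 overflow; the real
// implementation has to deal with those edge cases as well.
func chunkRanges(queryFirst, queryNum uint32, heights []uint32,
	chunkSize int) []chunkRange {

	queryEnd := queryFirst + queryNum // Exclusive end of the span.
	next := queryFirst                // Start of the next chunk's span.

	var out []chunkRange
	for i := 0; i < len(heights); i += chunkSize {
		end := i + chunkSize
		last := end >= len(heights)
		if last {
			end = len(heights)
		}

		first := next
		var num uint32
		if last {
			// The final chunk extends to the end of the queried
			// span so that, taken together, the chunks cover it
			// entirely.
			num = queryEnd - first
		} else {
			// Intermediate chunks cover everything up to and
			// including the height of their last channel.
			num = heights[end-1] - first + 1
		}
		next = first + num

		out = append(out, chunkRange{
			FirstBlockHeight: first,
			NumBlocks:        num,
			Complete:         last,
		})
	}
	return out
}
```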
The policy update logic that resided partly in the gossiper and partly
in the rpc server is extracted into its own object.
This prepares for additional validation logic to be added for policy
updates that would otherwise make the gossiper heavier.
It is also a small first step towards separation of our own channel data
from the rest of the graph.
As a preparation for making the gossiper less responsible for validating
and supplementing local channel policy updates, this commit moves the
on-the-fly max htlc migration up the call tree. The plan for a follow up
commit is to move it out of the gossiper completely for local channel
updates, so that we don't need to return a list of final applied policies
anymore.
In this commit, we fix a bug where, if a user updates a forwarding
policy field to zero, the update is applied to the policy correctly
on-disk, but not in-memory.
We solve this issue by having the gossiper return the list of on-disk updated
policies and passing these policies to the switch, so the switch can assume
that zero-valued fields are intentional and not just uninitialized.
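A minimal sketch of the idea, with hypothetical types standing in for
the gossiper's returned policies and the switch's in-memory state:

```go
package sketch

// ForwardingPolicy is an illustrative stand-in for the policy fields
// the switch keeps in memory for a local channel.
type ForwardingPolicy struct {
	BaseFee       int64
	FeeRate       int64
	TimeLockDelta uint32
}

// switchState sketches the in-memory policies indexed by channel ID.
type switchState struct {
	policies map[uint64]ForwardingPolicy
}

// applyPolicyUpdates overwrites the in-memory policy with the policy
// that was actually written to disk, as returned by the gossiper.
// Because the complete applied policy is supplied, a zero BaseFee here
// means "the user set the fee to zero" rather than "this field was left
// uninitialized", so no field needs to be skipped or guessed at.
func (s *switchState) applyPolicyUpdates(applied map[uint64]ForwardingPolicy) {
	if s.policies == nil {
		s.policies = make(map[uint64]ForwardingPolicy)
	}
	for chanID, policy := range applied {
		s.policies[chanID] = policy
	}
}
```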
There's no need to broadcast these as we assume that online nodes have
already received them. For nodes that were offline, they should receive
them as part of their initial graph sync.