44f83731bc
This reworks the locking behavior of the Gossiper so that a race condition on channel updates and block notifications doesn't cause any loss of messages. This fixes an issue that manifested mostly as flakes on itests during WaitForNetworkChannelOpen calls. The previous behavior allowed ChannelUpdates to be missed if they happened concurrently to block notifications. The processNetworkAnnoucement call would check for the current block height, then lock the gossiper and add the msg to the prematureAnnoucements list. New blocks would trigger an update to the current block height then a lock and check of the aforementioned list. However, specially during itests it could happen that the missing lock before checking the height could case a race condition if the following sequence of events happened: - A new ChannelUpdate message was received and started processing on a separate goroutine - The isPremature() call was made and verified that the ChannelUpdate was in fact premature - The goroutine was scheduled out - A new block started processing in the gossiper. It updated the block height, asked and was granted the lock for the gossiper and verified there was zero premature announcements. The lock was released. - The goroutine processing the ChannelUpdate asked for the gossiper lock and was granted it. It added the ChannelUpdate in the prematureAnnoucements list. This can never be processed now. The way to fix this behavior is to ensure that both isPremature checks done inside processNetworkAnnoucement and best block updates are made inside the same critical section (i.e. while holding the same lock) so that they can't both check and update the prematureAnnoucements list concurrently. |
||
---|---|---|
.. | ||
bootstrapper.go | ||
chan_series.go | ||
gossiper_test.go | ||
gossiper.go | ||
log.go | ||
message_store_test.go | ||
message_store.go | ||
mock_test.go | ||
reliable_sender_test.go | ||
reliable_sender.go | ||
sync_manager_test.go | ||
sync_manager.go | ||
syncer_test.go | ||
syncer.go |