-
Notifications
You must be signed in to change notification settings - Fork 995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict Dials From Discovery #14052
Conversation
…bs/geth-sharding into addSubnetDialBackoff
Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com>
func (s *Service) wantedPeerDials() int { | ||
maxPeers := int(s.cfg.MaxPeers) | ||
|
||
activePeers := len(s.Peers().Active()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would'nt be safer to convert activePeers
to uint
instead converting s.cfg.MaxPeers
to int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The end result would be the same as these are all preconfigured values at startup. Any edge cases we hit here would require a user wildly misconfiguring their node.
// Do not excessively perform lookups if we constantly receive non-viable peers. | ||
if lookupCounter > backOffCounter { | ||
lookupCounter = 0 | ||
runtime.Gosched() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this call still useful with latest versions of golang?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why wouldn't it be useful with current versions of golang ?
* Fix Excessive Subnet Dials * Handle backoff in Iterator * Slow Down Lookups * Add Flag To Configure Dials * Preston's Review * Update cmd/beacon-chain/flags/base.go Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com> * Reduce polling period * Manu's Review --------- Co-authored-by: Preston Van Loon <pvanloon@offchainlabs.com> (cherry picked from commit 7a4ecb6)
This reverts commit 7a4ecb6.
hi, i've been working at the beach all summer and just saw this.. |
What type of PR is this?
Bug Fix
What does this PR do? Why is it needed?
There have been longstanding reports of windows users randomly losing peers over the years, this had been chalked down to an improperly configured firewall, timeserver or antivirus. Unfortunately over the years as the network has grown along with more intensive requirements in peer search for nodes, this has continued to happen more. It appears that a recent windows update has exacerbated this problem by quite a bit. This PR tries to fix this once and for all:
For discovery, we have an aggressive peer search algorithm. This was very useful in finding peers quickly for a recently booted node. However windows as an OS doesn't like the aggressive discovery search too much and randomly decides to block outgoing discovery packets from prysm. This PR makes it less aggressive, so it will take longer for nodes to find their desired number of peers
Add a flag to restrict the maximum amount of concurrent dials(
-max-concurrent-dials
). In the event users continue running into this, they can run with this flag at a smaller value(5,10, etc). This should stop triggering the OS at the cost of discovery being a lot slower along with possible reduced validator performance.Which issues(s) does this PR fix?
Fixes #14025 #13936 #13431 #8144
Other notes for review