Attempts to connect to a non-running server spam it with connect attempts #3487

Closed
smurfix opened this issue Feb 17, 2020 · 2 comments · Fixed by #3496

Comments

@smurfix
Contributor

smurfix commented Feb 17, 2020

This error does not make sense, on the face of it – either you time out, implying that there was no answer, or the connection was rejected, implying that the server told you there's no listener.

OSError: Timed out trying to connect to 'tcp://127.0.0.1:60590' after 10 s: in <distributed.comm.tcp.TCPConnector object at 0x7fce383a1ef0>: ConnectionRefusedError: [Errno 111] Connection refused

What actually happens is that distributed spams the server with connect requests for ten seconds, all of which get rejected. This is not a good idea. At all. Please either use a reasonable back-off or simply believe the server when it says ECONNREFUSED.

@mrocklin
Member

Thank you for the issue @smurfix

> Please either use a reasonable back-off

This seems reasonable to me. Is this something that you would be interested in contributing?

> or simply believe the server when it says ECONNREFUSED.

Sometimes the server comes online during the connection process. This happens frequently in Kubernetes situations where the address genuinely may not have existed a short while ago.

smurfix added a commit to smurfix/distributed that referenced this issue Feb 18, 2020
@smurfix
Contributor Author

smurfix commented Feb 18, 2020

Exponential backoff is easy … PR created.
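
For readers following along, here is a minimal sketch of the idea discussed above: keep retrying a refused connection until the overall timeout expires (so a server that comes online late is still reached), but wait longer between attempts instead of hammering the port. This is not the code from the linked PR. The helper name `retry_with_backoff` and its parameters (`initial_delay`, `max_delay`, `factor`) are hypothetical, chosen only for illustration.

```python
import asyncio
import random
import time


async def retry_with_backoff(connect, timeout=10.0, initial_delay=0.01,
                             max_delay=1.0, factor=2.0):
    """Call `connect()` until it succeeds or `timeout` seconds have passed.

    Hypothetical helper for illustration only. `connect` is any
    zero-argument callable returning an awaitable that raises
    ConnectionRefusedError while the server is not listening.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            return await connect()
        except ConnectionRefusedError as exc:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                # Give up, keeping the last refusal as the cause.
                raise OSError(
                    f"Timed out after {timeout} s and {attempts} attempts"
                ) from exc
            # Wait before retrying; the random jitter keeps many clients
            # from retrying in lockstep. Never sleep past the deadline.
            pause = delay * random.uniform(0.5, 1.0)
            await asyncio.sleep(min(pause, remaining))
            # Grow the delay geometrically, up to a cap.
            delay = min(delay * factor, max_delay)
```

It could be used as `await retry_with_backoff(lambda: asyncio.open_connection("127.0.0.1", 60590))`. With these assumed defaults, a server that stays down for the full ten seconds sees on the order of a dozen or two attempts rather than a continuous stream.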
