Issue
Description
A user reported the following error while running a federation locally in separate terminals:
It is a gRPC error, but it is hard to pinpoint exactly where it came from. It is probably not a framework bug, as this is the first reported instance of such an issue; it could come from the execution environment or from parts of the user's code.
Related issues/PRs
N/A
Proposal
Explanation
Setting the gRPC option `grpc.keepalive_permit_without_calls` to 0 in `src/py/flwr/server/fleet/grpc_bidi/grpc_server.py` fixed the issue for the user. When set to 1 (0: false; 1: true), this channel argument allows keepalive pings to be sent even if there are no calls in flight (from https://grpc.github.io/grpc/core/md_doc_keepalive.html). Put another way, it answers the question "Is it permissible to send keepalive pings from the client without any outstanding streams" (from https://grpc.github.io/grpc/core/group__grpc__arg__keys.html#gaf900669f52f137677c4dbb9a7a902c92). In this PR we add this option, set to 0, by default for any gRPC server.
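As a rough illustration of the change described above, the sketch below builds a list of channel arguments with `grpc.keepalive_permit_without_calls` set to 0 and passes it to `grpc.server`. The helper names `build_server_options` and `make_server`, the worker count, and the insecure port are assumptions made for this example; this is not the actual content of `grpc_server.py`.

```python
from concurrent import futures
from typing import Any, List, Optional, Tuple

GrpcOption = Tuple[str, Any]


def build_server_options(extra: Optional[List[GrpcOption]] = None) -> List[GrpcOption]:
    """Return channel arguments that disable keepalive pings without calls.

    Hypothetical helper for illustration only.
    """
    options: List[GrpcOption] = [
        # 0 = false: do not permit keepalive pings when no calls are in flight.
        ("grpc.keepalive_permit_without_calls", 0),
    ]
    # Caller-supplied options are appended after the default.
    options.extend(extra or [])
    return options


def make_server(address: str, max_workers: int = 4):
    """Create (but do not start) a gRPC server that uses the options above."""
    import grpc  # imported lazily so build_server_options works without grpcio

    server = grpc.server(
        futures.ThreadPoolExecutor(max_workers=max_workers),
        options=build_server_options(),
    )
    server.add_insecure_port(address)
    return server
```

Calling `make_server("[::]:8080")` would return a server object on which servicers can be registered before `server.start()`; the address here is only a placeholder.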
Any other comments?
A bit of background on the error and the proposed solution: https://stackoverflow.com/a/65994473
To quote from the link:
This is to safeguard from grpc clients streams who stay connected even when there is no data movements in the stream. When the stream is active and no data movement, the client keeps on pinging server to know whether its alive!! Thes continuous ping requests are basically abusive from servers point-of-view. When such a stale connection is detected, server sends this error and discontinues with client and the client is not able to further communicate or send any request.
But certainly some use cases arise when client want to stay connected for hours even if there is no incoming requests. Basically a stale stream.
In those cases both client and server has to configure themselves to allow such HTTP/2 Pings!
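For the opposite use case described in the quote (a stream that must stay connected while idle), a minimal sketch of matching client and server channel arguments might look like the following. The option keys come from the gRPC keepalive documentation linked above; the function name `long_lived_stream_options`, the interval value, and the halving margin are assumptions chosen for illustration, not settings used by this PR.

```python
from typing import Any, List, Tuple

GrpcOption = Tuple[str, Any]


def long_lived_stream_options(
    ping_interval_ms: int,
) -> Tuple[List[GrpcOption], List[GrpcOption]]:
    """Return (client_options, server_options) that both allow idle pings.

    Hypothetical helper: both sides opt in to pings without active calls.
    """
    client_options: List[GrpcOption] = [
        # Send a keepalive ping every ping_interval_ms milliseconds.
        ("grpc.keepalive_time_ms", ping_interval_ms),
        # 1 = true: ping even when there are no calls in flight.
        ("grpc.keepalive_permit_without_calls", 1),
        # 0 removes the cap on pings sent without data/header frames.
        ("grpc.http2.max_pings_without_data", 0),
    ]
    server_options: List[GrpcOption] = [
        # The server must also accept pings on connections without calls.
        ("grpc.keepalive_permit_without_calls", 1),
        # Minimum interval the server expects between pings without data;
        # kept below the client interval (illustrative margin) so that
        # regular client pings are not counted as bad pings.
        ("grpc.http2.min_ping_interval_without_data_ms", ping_interval_ms // 2),
    ]
    return client_options, server_options
```

The client list would be passed as `options=` to `grpc.insecure_channel` (or `grpc.secure_channel`) and the server list as `options=` to `grpc.server`.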
Another interesting explanation: https://stackoverflow.com/a/76327925