Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empower servers to restrict metadata #262

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Yannic
Copy link
Contributor

@Yannic Yannic commented Jun 13, 2023

Today, clients can send arbitrary strings in RequestMetadata. This can cause issues for servers if, e.g., clients send an invocation ID in a right-to-left language and servers log the invocation ID together with an (English) log message (invocation_id=שלום עולם!: Find missing blobs) or if the invocation ID is used as part of the path for storing metadata on disk.

Today, clients can send arbitrary strings in `RequestMetadata`. This can cause issues for servers if, e.g., clients send an invocation ID in a right-to-left language and servers log the invocation ID together with an (English) log message: `שלום עולם!: Find missing blobs`.
@Yannic Yannic requested a review from bergsieker as a code owner June 13, 2023 06:57
@@ -2051,27 +2051,42 @@ message RequestMetadata {
// An identifier that ties multiple requests to the same action.
// For example, multiple requests to the CAS, Action Cache, and Execution
// API are used in order to compile foo.cc.
//
// Servers *MAY* impose restrictions on values of this field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Restriction" makes it sound like bad values could make the RPC request fail- is that the intention? Or do you just want to make it clear that servers can choose to ignore any or all of these fields, for any reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, servers should be allowed to fail the requests (probably with INVALID_ARGUMENT) when clients send data they don't expect.

Note that there's an implicit case of servers doing that already: the metadata is sent as HTTP2 header, which have a max size. Servers already reject requests with, e.g., unreasonably long target labels.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, this is an interesting case. On the one hand, my experience tells me that it's useful for any given server deployment to require certain metadata, because it's really useful for metrics and debugging. On the other hand, I'm not sure that this restriction belongs in the protocol itself (and particularly not if it comes with increasingly restrictive, per-implementation requirements). Maybe we not some top-level guidance that servers that implement restrictions on metadata presence or format SHOULD provide a way to opt out at the deployment or instance_name level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's not feasible to document any concrete restrictions. Implementations will differ on what they allow.

I'm happy to replace the notes on every field with a general not that servers MAY impose restrictions.

Given that it's sent as gRPC metadata, opting out on instance_name doesn't seem feasible. Clients can send data that gRPC will just reject (e.g., if sending something invalid as that header and parsing fails). There's not much we can do about that, and servers should be able to protect against misbehaving clients even for metadata.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants