Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empower servers to restrict metadata #262

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 16 additions & 1 deletion build/bazel/remote/execution/v2/remote_execution.proto
Original file line number Diff line number Diff line change
Expand Up @@ -2051,27 +2051,42 @@ message RequestMetadata {
// An identifier that ties multiple requests to the same action.
// For example, multiple requests to the CAS, Action Cache, and Execution
// API are used in order to compile foo.cc.
//
// Servers *MAY* impose restrictions on values of this field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Restriction" makes it sound like bad values could make the RPC request fail- is that the intention? Or do you just want to make it clear that servers can choose to ignore any or all of these fields, for any reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, servers should be allowed to fail the requests (probably with INVALID_ARGUMENT) when clients send data they don't expect.

Note that there's an implicit case of servers doing that already: the metadata is sent as HTTP2 header, which have a max size. Servers already reject requests with, e.g., unreasonably long target labels.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, this is an interesting case. On the one hand, my experience tells me that it's useful for any given server deployment to require certain metadata, because it's really useful for metrics and debugging. On the other hand, I'm not sure that this restriction belongs in the protocol itself (and particularly not if it comes with increasingly restrictive, per-implementation requirements). Maybe we not some top-level guidance that servers that implement restrictions on metadata presence or format SHOULD provide a way to opt out at the deployment or instance_name level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's not feasible to document any concrete restrictions. Implementations will differ on what they allow.

I'm happy to replace the notes on every field with a general not that servers MAY impose restrictions.

Given that it's sent as gRPC metadata, opting out on instance_name doesn't seem feasible. Clients can send data that gRPC will just reject (e.g., if sending something invalid as that header and parsing fails). There's not much we can do about that, and servers should be able to protect against misbehaving clients even for metadata.

string action_id = 2;

// An identifier that ties multiple actions together to a final result.
// For example, multiple actions are required to build and run foo_test.
//
// This *SHOULD* generally be an RFC 4122-compliant UUID. Servers *MAY*
// impose additional restrictions on the values of this field.
string tool_invocation_id = 3;

// An identifier to tie multiple tool invocations together. For example,
// runs of foo_test, bar_test and baz_test on a post-submit of a given patch.
//
// This *SHOULD* generally be an RFC 4122-compliant UUID. Servers *MAY*
// impose additional restrictions on the values of this field.
string correlated_invocations_id = 4;

// A brief description of the kind of action, for example, CppCompile or GoLink.
// There is no standard agreed set of values for this, and they are expected to vary between different client tools.
// There is no standard agreed set of values for this, and they are expected to
// vary between different client tools.
//
// Servers *MAY* impose restrictions on values of this field.
string action_mnemonic = 5;

// An identifier for the target which produced this action.
// No guarantees are made around how many actions may relate to a single target.
//
// Servers *MAY* impose restrictions on values of this field.
string target_id = 6;

// An identifier for the configuration in which the target was built,
// e.g. for differentiating building host tools or different target platforms.
// There is no expectation that this value will have any particular structure,
// or equality across invocations, though some client tools may offer these guarantees.
//
// Servers *MAY* impose restrictions on values of this field.
string configuration_id = 7;
}