KEP 2527: Clarify meaning of `status` #2537

thockin · 2021-02-22T22:38:52Z

Since basically the beginning of Kubernetes, we've had this sort of "litmus
test" for status fields: "If I erased the whole status struct, would everything
in it be reconstructed (or at least be reconstructible) from observation?". The
goal was to ensure that the delineation between "what I asked for" and "what it
actually is" was clear and to encourage active reconciliation of state.

Many of our APIs pass this test (sometimes we fudge it and say yes "in
theory"), but not all of them do. This KEP proposes to clarify or remove this
guidance, especially as it pertains to state that is not derived from
observation.

One of the emergent uses of the spec/status split is access control. It is
assumed that, for most resources, users own (can write to) all of spec and
controllers own all of status, and not the reverse. This allows patterns like
Services which set `spec.type: LoadBalancer`, where the controller writes the
LB's IP address to status, and kube-proxy can trust that IP address (because it
came from a controller, not a user). Compare that with Services which use
`spec.externalIPs`. The behavior in kube-proxy is roughly the same, but
because non-trusted users can write to `spec.externalIPs` and that does not
require a trusted controller to ACK, that behavior was declared a CVE.
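
To make the trust distinction above concrete, here is a minimal Go sketch. It is not kube-proxy's actual code, and the types are simplified, hypothetical stand-ins for the real Service API: the only point is that a consumer treats controller-written status as trusted and ignores the user-writable `spec.externalIPs`.

```go
package main

import "fmt"

// ServiceSpec is a simplified, hypothetical stand-in for the real API type.
// Everything here is assumed to be writable by ordinary users.
type ServiceSpec struct {
	Type        string   // e.g. "LoadBalancer"
	ExternalIPs []string // user-supplied, never ACKed by a controller
}

// ServiceStatus is assumed to be writable only by trusted controllers.
type ServiceStatus struct {
	LoadBalancerIngress []string // IPs written by the LB controller
}

type Service struct {
	Spec   ServiceSpec
	Status ServiceStatus
}

// TrustedIPs returns only the addresses a controller has written to status.
// User-supplied spec.ExternalIPs are deliberately not included.
func TrustedIPs(svc Service) []string {
	if svc.Spec.Type != "LoadBalancer" {
		return nil
	}
	out := make([]string, len(svc.Status.LoadBalancerIngress))
	copy(out, svc.Status.LoadBalancerIngress)
	return out
}

func main() {
	svc := Service{
		Spec: ServiceSpec{
			Type:        "LoadBalancer",
			ExternalIPs: []string{"198.51.100.7"}, // set by a user
		},
		Status: ServiceStatus{
			LoadBalancerIngress: []string{"203.0.113.10"}, // set by a controller
		},
	}
	fmt.Println(TrustedIPs(svc)) // prints [203.0.113.10]
}
```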

This KEP further proposes to add guidance for APIs that want to implement an
"allocation" or "binding" pattern which requires trusted ACK.
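
One possible shape for such an "allocation" pattern, sketched in Go under stated assumptions: `Claim`, `Allocate`, and `BoundIP` are hypothetical names invented for illustration, not an existing API. The user asks for something in spec, a trusted controller ACKs it in status, and consumers act only on the ACK.

```go
package main

import (
	"errors"
	"fmt"
)

// ClaimSpec is what the (untrusted) user asks for. Hypothetical type.
type ClaimSpec struct{ RequestedIP string }

// ClaimStatus is what the (trusted) controller has ACKed. Hypothetical type.
type ClaimStatus struct{ AllocatedIP string }

type Claim struct {
	Spec   ClaimSpec
	Status ClaimStatus
}

// Allocate is the controller side: it ACKs the request by writing status.
// It is idempotent, so calling it again on an ACKed claim changes nothing.
func Allocate(c Claim, free map[string]bool) (Claim, error) {
	ip := c.Spec.RequestedIP
	if ip != "" && c.Status.AllocatedIP == ip {
		return c, nil // already ACKed
	}
	if !free[ip] {
		return c, errors.New("requested IP is not available")
	}
	free[ip] = false
	c.Status.AllocatedIP = ip
	return c, nil
}

// BoundIP is the consumer side: it trusts only a status ACK that matches
// the current request, and never the spec value on its own.
func BoundIP(c Claim) (string, bool) {
	if c.Status.AllocatedIP == "" || c.Status.AllocatedIP != c.Spec.RequestedIP {
		return "", false
	}
	return c.Status.AllocatedIP, true
}

func main() {
	free := map[string]bool{"10.0.0.5": true}
	c := Claim{Spec: ClaimSpec{RequestedIP: "10.0.0.5"}}

	_, ok := BoundIP(c)
	fmt.Println("bound before ACK:", ok)

	c, err := Allocate(c, free)
	if err != nil {
		fmt.Println("allocate failed:", err)
		return
	}
	ip, ok := BoundIP(c)
	fmt.Println("bound after ACK:", ip, ok)
}
```

Note that the allocated value in this sketch is a decision, not an observation: if status were erased, nothing observable would tell the controller which IP it had handed out, which is the kind of state the litmus test does not cover.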

KEP: #2527

keps/sig-architecture/2527-clarify-status-observations-vs-rbac/README.md

dims · 2021-03-12T21:50:54Z

cc @palnabarun @nikhita

embano1 · 2021-06-03T15:38:19Z

Since I'm (currently) in the "option 1" camp I was wondering if we could use annotations to track non-observable state there instead of diluting status? Just throwing this out here as I did not see it mentioned in the KEP (but not sure if it was discussed already and I'm just missing context).

thockin · 2021-06-07T23:45:33Z

@embano1 In general we try to avoid enshrining non-primitive APIs as annotations. It just offloads the problem of decoding onto users. In this case, I don't think it helps because annotations are not RBAC'ed individually. A user could set "sensitive" values.

thockin · 2021-06-07T23:51:03Z

Hi all. I made some minor updates to this, but there has not been much feedback so far. Would love to hear more thoughts.

keps/sig-architecture/2527-clarify-status-observations-vs-rbac/README.md

liggitt

I think this makes sense... I don't think this scuttles any future plans to routinely discard status subtrees of objects or anything.

I expect some people who have been writing controllers that operate as one-way "take observed state and update status" because of this guidance will now feel free to branch out into more creative/complex paths (spec → status pre-write → other object → status post-write, etc).

As we relax this, it would be good to give pretty crisp examples/guidance/patterns for writing state to status to help those folks stay re-entrant and robust in the face of conflicts/errors.

thockin · 2021-10-22T00:31:18Z

I fixed the small nit. I think the text you want actually goes in the impl, not the KEP? As such, PTAL at this.

liggitt · 2021-10-22T18:44:44Z

I'm happy with the direction, and the details can fall in the actual docs PR. I do think it's worth at least mentioning in this KEP the risks that working from status as a data source introduces, even if the detailed recommendations / best practices / examples are in the docs PR.

thockin · 2021-10-22T21:59:11Z

Added, stole some of your words.

liggitt · 2021-10-22T22:01:28Z

/lgtm
/approve

thockin · 2021-10-22T22:31:59Z

@derekwaynecarr @johnbelamaric @dims - you all are listed as sig-arch leads - please approve?

dims · 2021-12-09T17:26:51Z

/approve

thanks for doing this! +1 to tune our guidance based on what we are observing in the field.

k8s-ci-robot · 2021-12-09T17:27:16Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, liggitt, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-architecture/OWNERS~~ [dims]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 22, 2021

thockin assigned liggitt Feb 22, 2021

k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 22, 2021

k8s-ci-robot requested review from derekwaynecarr and dims February 22, 2021 22:39

k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Feb 22, 2021

thockin requested review from lavalamp, deads2k, smarterclayton, johnbelamaric and derekwaynecarr and removed request for dims and derekwaynecarr February 22, 2021 22:39

wojtek-t reviewed Feb 23, 2021

keps/sig-architecture/2527-clarify-status-observations-vs-rbac/README.md (outdated, resolved)

johnbelamaric reviewed May 8, 2021

keps/sig-architecture/2527-clarify-status-observations-vs-rbac/README.md (outdated, resolved)

thockin force-pushed the 2527-clarify-status-observations-vs-rbac branch from c34280c to d9eff94 Compare June 7, 2021 23:49

thockin changed the title ~~KEP 2527 draft 1: Clarify meaning of status~~ KEP 2527: Clarify meaning of status Jun 7, 2021

thockin force-pushed the 2527-clarify-status-observations-vs-rbac branch from d9eff94 to 59f9b7f Compare June 7, 2021 23:55