Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport PVC Protection Feature Scheduler Changes #18110

Conversation

@openshift-merge-robot openshift-merge-robot added the vendor-update Touching vendor dir or related files label Jan 15, 2018
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 15, 2018
@pospispa
Copy link
Author

@jsafrane PTAL

@pospispa
Copy link
Author

/test cmd

@pospispa
Copy link
Author

/test unit

@pospispa
Copy link
Author

flake: #17882

@pospispa
Copy link
Author

/test extended_conformance_install

@liggitt
Copy link
Contributor

liggitt commented Jan 16, 2018

this was an alpha-level feature in kube 1.9. we typically do not enable features until they reach beta status

@pospispa
Copy link
Author

@liggitt

this was an alpha-level feature in kube 1.9. we typically do not enable features until they reach beta status

The change in the scheduler is a bugfix because scheduler shouldn't schedule a pod that uses a PVC that is being deleted.

Different people have a different point of view on the fact that a PVC can be deleted even though it's in active use by a pod. Some people view it as a security bug, however, upstream views it as a new feature. So the new feature is actually a bugfix.

Because it's a security bug I've backported it here. And because of the backport I'm also backporting the change in the scheduler and enabling the possibility to switch on the PVC Protection feature.

I know, this is unusual.

@liggitt
Copy link
Contributor

liggitt commented Jan 16, 2018

It's only a security issue if you use the Recycle PV reclaim strategy to let pods reuse each the same PV, which has known issues and is not recommended. That doesn't seem a sufficient reason to backport an alpha feature

@pospispa
Copy link
Author

It's only a security issue if you use the Recycle PV reclaim strategy to let pods reuse each the same PV, which has known issues and is not recommended. That doesn't seem a sufficient reason to backport an alpha feature

@liggitt would you mind to split this discussion into two:

  1. discussion about backporting the PVC Protection alpha feature into a previous version of OpenShift. There is an open PR for this backport so I'd love to discuss the backport in it.
  2. discussion about this PR.

This PR consists of two commits that have different reasons:

Adding policy for the PVC Protection controller

My reasoning for this change is:

  1. K8s 1.9 contains all parts of the PVC Protection alpha feature (i.e. kubectl: Add Terminating state to PVCs kubernetes/kubernetes#55873 and Postpone Deletion of a Persistent Volume Claim in case It Is Used by a Pod kubernetes/kubernetes#55824) except the changes in the scheduler. Therefore, PRs kubectl: Add Terminating state to PVCs kubernetes/kubernetes#55873 and Postpone Deletion of a Persistent Volume Claim in case It Is Used by a Pod kubernetes/kubernetes#55824 are already a part of OpenShift 3.9.
  2. As PRs kubectl: Add Terminating state to PVCs kubernetes/kubernetes#55873 and Postpone Deletion of a Persistent Volume Claim in case It Is Used by a Pod kubernetes/kubernetes#55824 are already a part of OpenShift 3.9 it's necessary to add policy for the PVC Protection controller that is already present in OpenShift 3.9.

@liggitt please, do you agree with adding policy for the PVC Protection controller?

Backporting changes in the scheduler

My reasoning for this change is:
The change in the scheduler is a bugfix because scheduler must not schedule a pod that uses a PVC that is being deleted. The only scenario when this fix is useful is when the PVC Protection feature is enabled. I believe it's worth backporting the scheduler fix even though the fix is useful only in the before mentioned scenario.

@pospispa
Copy link
Author

/test extended_conformance_install

@liggitt
Copy link
Contributor

liggitt commented Jan 16, 2018

@liggitt please, do you agree with adding policy for the PVC Protection controller?

no, use the feature flag to enable the alpha feature if you want it, which automatically picks up the upstream role. In master-config.yaml:

kubernetesMasterConfig:
  ...
  apiServerArguments:
    ...
    feature-gates:
    - PVCProtection=true
  ...
  controllerArguments:
    feature-gates:
    - PVCProtection=true
  ...
  schedulerArguments:
    feature-gates:
    - PVCProtection=true
$ oc get clusterroles.rbac/system:controller:pvc-protection-controller -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    authorization.openshift.io/system-only: "true"
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2018-01-16T18:18:41Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:controller:pvc-protection-controller
  resourceVersion: "171"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Acontroller%3Apvc-protection-controller
  uid: ae45d5e0-fae9-11e7-bbad-acbc32c1ca87
rules:
- apiGroups:
  - ""
  resources:
  - persistentvolumeclaims
  verbs:
  - get
  - list
  - update
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
  - update

I believe it's worth backporting the scheduler fix even though the fix is useful only in the before mentioned scenario.

I'd like to see the cherry-pick to 1.9 proposed upstream first. I'm skeptical about picking things to old releases in support of alpha features.

…leted

PVC Protection feature consists of the below PRs:
- kubernetes/kubernetes#55873
- kubernetes/kubernetes#55824
- kubernetes/kubernetes#55957

The PRs #55873 and #55824 were merged into K8s 1.9, however, the PR #55957 was merged into K8s 1.10. That's why this PR is backported.

This commit modifies scheduler in such a way that it does not schedule a pod that uses a PVC that is being deleted.
@pospispa pospispa force-pushed the 566-postpone-pvc-deletion-if-used-in-a-pod-backport-for-OpenShift3.9-specific-changes-and-backport-of-sheduler-changes branch from 9b671c7 to eeb1dbb Compare January 17, 2018 13:27
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pospispa
We suggest the following additional approver: smarterclayton

Assign the PR to them by writing /assign @smarterclayton in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@pospispa
Copy link
Author

no, use the feature flag to enable the alpha feature if you want it, which automatically picks up the upstream role.

@liggitt I didn't know that the upstream role is automatically picked up in case the alpha feature is enabled. Thank you for letting me know. So I deleted the commit that was "duplicating" the policy.

I'd like to see the cherry-pick to 1.9 proposed upstream first. I'm skeptical about picking things to old releases in support of alpha features.

I prefer to firstly finish discussion about backporting the PVC Protection feature into a previous version of OpenShift in this PR and then continue the discussion here.

@liggitt
Copy link
Contributor

liggitt commented Jan 17, 2018

just the scheduler pick makes more sense to me
/retest

@pospispa
Copy link
Author

just the scheduler pick makes more sense to me

So @liggitt do you agree with the scheduler pick into OpenShift 3.9 that will make the PVC Protection alpha feature "complete" in OpenShift 3.9?

I don't know whether you already reviewed the scheduler pick and if you have any comments.

@pospispa pospispa changed the title Backport PVC Protection Feature Scheduler Changes and Added PVC Protection Controller Policy Backport PVC Protection Feature Scheduler Changes Jan 19, 2018
@pospispa
Copy link
Author

@liggitt @jsafrane we agreed not to backport PVC Protection into previous OpenShift version. So argument for backporting the scheduler changes because of a backport to previous OpenShift version is off the table.

Let me think out loudly about pros for and cons against backporting the scheduler changes.

Pros:

  • the backporting work is done, PR is created.
  • scheduler changes improve user experience of the PVC Protection alpha feature.

Cons:

  • the scheduler changes are already in Kubernetes 1.10 so will naturally become a part of OpenShift 3.10.
  • the backport brings additional code into OpenShift 3.9 that will have to be tested, documented and maintained.
  • there is no request for the backport.

I personally tend to not backporting.

@liggitt @jsafrane please, let me know what do you think.

@jsafrane
Copy link
Contributor

ok, let's not pick fixes that are not in upstream 1.9
/close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L Denotes a PR that changes 100-499 lines, ignoring generated files. vendor-update Touching vendor dir or related files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants