distributed provisioning
By putting external-provisioner onto each node and letting it
provision volumes directly on the node, we can remove the
controller/node communication part in PMEM-CSI. This solves various
issues in that part (race conditions that led to volume leaks) and
simplifies the deployment (no need for two-way TLS certificates
anymore).

The webhooks check for capacity by discovering the PMEM-CSI node pods
and retrieving metrics data from them via the normal metrics support.

The combination of node drivers from 0.8 with a controller from 0.9 is
harmless (no volumes are leaked), but new volumes can no longer be
created. Existing volumes on the nodes remain usable.

Combining a controller from 0.8 with node drivers from 0.9 is more
problematic because the old controller will cause volume leaks when
volumes are deleted (intel#733).
If this is a problem, then the old StatefulSet can be deleted manually
before upgrading.
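
The manual deletion mentioned above could look like the following sketch. The StatefulSet name `pmem-csi-controller` and the namespace `pmem-csi` are assumptions based on the names used in the manifests of this commit; check them against the actual 0.8 deployment before running anything. The script only prints the command unless `APPLY=1` is set.

```shell
#!/bin/sh
# Assumed names, overridable via the environment; verify with
# "kubectl get statefulset --all-namespaces" first.
NAMESPACE="${NAMESPACE:-pmem-csi}"
STATEFULSET="${STATEFULSET:-pmem-csi-controller}"

# Print the command that removes the old controller StatefulSet.
delete_old_controller_cmd() {
    echo "kubectl delete statefulset $STATEFULSET --namespace $NAMESPACE"
}

if [ "${APPLY:-0}" = "1" ]; then
    # Really delete the old controller before upgrading the node drivers.
    kubectl delete statefulset "$STATEFULSET" --namespace "$NAMESPACE"
else
    # Dry run: show what would be executed.
    delete_old_controller_cmd
fi
```

Printing by default is a deliberate choice here, because deleting the wrong StatefulSet is hard to undo.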

The operator and tests will be updated in separate commits.
pohly committed Jan 15, 2021
1 parent d7f67e1 commit 60d8873
Showing 89 changed files with 4,966 additions and 6,306 deletions.
5 changes: 1 addition & 4 deletions Makefile
@@ -184,9 +184,6 @@ KUSTOMIZE += deploy/common/pmem-storageclass-cache.yaml=deploy/kustomize/storage
KUSTOMIZE += deploy/common/pmem-storageclass-late-binding.yaml=deploy/kustomize/storageclass-late-binding
KUSTOMIZE += deploy/operator/pmem-csi-operator.yaml=deploy/kustomize/operator

# Special one-off deployment with device mode = fake.
KUSTOMIZE += deploy/kubernetes-1.19/pmem-csi-fake.yaml=deploy/kustomize/kubernetes-base-fake

KUSTOMIZE_OUTPUT := $(foreach item,$(KUSTOMIZE),$(firstword $(subst =, ,$(item))))

# This function takes the name of a .yaml output file and returns the
@@ -204,7 +201,7 @@ $(KUSTOMIZE_OUTPUT): _work/kustomize $(KUSTOMIZE_INPUT)
mkdir -p ${@D}
$(call KUSTOMIZE_INVOCATION,$<,$@) >$@
if echo "$@" | grep '/pmem-csi-' | grep -qv '\-operator'; then \
dir=$$(echo "$@" | tr - / | sed -e 's;kubernetes/;kubernetes-;' -e 's;/alpha/;-alpha/;' -e 's/.yaml//' -e 's;/pmem/csi/;/;') && \
dir=$$(echo "$@" | tr - / | sed -e 's;kubernetes/;kubernetes-;' -e 's;/alpha/;-alpha/;' -e 's;/distributed/;-distributed/;' -e 's/.yaml//' -e 's;/pmem/csi/;/;') && \
mkdir -p $$dir && \
cp $@ $$dir/pmem-csi.yaml && \
echo 'resources: [ pmem-csi.yaml ]' > $$dir/kustomization.yaml; \
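
To see what the extended `sed` pipeline in this hunk produces, the directory derivation can be tried outside of `make`. This is a minimal sketch: the function name `derive_dir` and both input paths are illustrative assumptions, and `$$` from the Makefile becomes a plain shell expansion here.

```shell
#!/bin/sh
# Same tr/sed steps as the Makefile rule: map a generated .yaml path to the
# directory that receives pmem-csi.yaml and kustomization.yaml.
derive_dir() {
    echo "$1" | tr - / | sed \
        -e 's;kubernetes/;kubernetes-;' \
        -e 's;/alpha/;-alpha/;' \
        -e 's;/distributed/;-distributed/;' \
        -e 's/.yaml//' \
        -e 's;/pmem/csi/;/;'
}

# Hypothetical input paths, chosen to exercise the old and the new rule.
derive_dir deploy/kubernetes-1.17/pmem-csi-direct.yaml
# prints: deploy/kubernetes-1.17/direct
derive_dir deploy/kubernetes-1.19-distributed/pmem-csi-direct.yaml
# prints: deploy/kubernetes-1.19-distributed/direct
```

Without the added `-distributed` substitution, the second path would end up as `deploy/kubernetes-1.19/distributed/direct`, because `tr - /` turns every dash into a directory separator first.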
57 changes: 16 additions & 41 deletions deploy/bindata_generated.go

Large diffs are not rendered by default.

29 changes: 0 additions & 29 deletions deploy/crd/pmem-csi.intel.com_pmemcsideployments.yaml
@@ -55,12 +55,6 @@ spec:
spec:
description: DeploymentSpec defines the desired state of Deployment
properties:
caCert:
description: CACert encoded root certificate of the CA by which the
registry and node controller certificates are signed If not provided
operator uses a self-signed CA certificate
format: byte
type: string
controllerDriverResources:
description: ControllerDriverResources Compute resources required
by driver container running on master node
@@ -118,18 +112,6 @@ spec:
logLevel:
description: LogLevel number for the log verbosity
type: integer
nodeControllerCert:
description: NodeControllerCert encoded certificate signed by a CA
for node controller server authentication If not provided, provisioned
one by the operator using self-signed CA
format: byte
type: string
nodeControllerKey:
description: NodeControllerPrivateKey encoded private key used for
node controller server certificate If not provided, provisioned
one by the operator
format: byte
type: string
nodeDriverResources:
description: NodeDriverResources Compute resources required by driver
container running on worker nodes
@@ -232,17 +214,6 @@ spec:
to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
type: object
type: object
registryCert:
description: RegistryCert encoded certificate signed by a CA for registry
server authentication If not provided, provisioned one by the operator
using self-signed CA
format: byte
type: string
registryKey:
description: RegistryPrivateKey encoded private key used for registry
server certificate If not provided, provisioned one by the operator
format: byte
type: string
type: object
status:
description: DeploymentStatus defines the observed state of Deployment
191 changes: 120 additions & 71 deletions deploy/kubernetes-1.17/direct/pmem-csi.yaml
@@ -8,6 +8,14 @@ metadata:
name: pmem-csi-controller
namespace: pmem-csi
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-webhooks
namespace: pmem-csi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
@@ -64,6 +72,23 @@ rules:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
labels:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-webhooks-cfg
namespace: pmem-csi
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- watch
- list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
@@ -139,6 +164,30 @@ rules:
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumeclaims
verbs:
- get
- list
- watch
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
@@ -155,6 +204,22 @@ subjects:
namespace: pmem-csi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-webhooks-role-cfg
namespace: pmem-csi
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: pmem-csi-webhooks-cfg
subjects:
- kind: ServiceAccount
name: pmem-csi-webhooks
namespace: pmem-csi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
@@ -169,6 +234,21 @@ subjects:
name: pmem-csi-controller
namespace: pmem-csi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-webhooks-role
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: pmem-csi-webhooks-runner
subjects:
- kind: ServiceAccount
name: pmem-csi-webhooks
namespace: pmem-csi
---
apiVersion: v1
kind: Service
metadata:
@@ -223,42 +303,27 @@ spec:
pmem-csi.intel.com/deployment: direct-production
pmem-csi.intel.com/webhook: ignore
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: pmem-csi.intel.com/controller
operator: NotIn
values:
- "no"
- "false"
containers:
- command:
- /usr/local/bin/pmem-csi-driver
- -v=3
- -logging-format=text
- -mode=controller
- -endpoint=unix:///csi/csi-controller.sock
- -registryEndpoint=tcp://0.0.0.0:10000
- -nodeid=$(KUBE_NODE_NAME)
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -metricsListen=:10010
env:
- name: KUBE_NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: TERMINATION_LOG_PATH
value: /tmp/termination-log
value: /dev/termination-log
- name: PMEM_CSI_DRIVER_NAME
value: pmem-csi.intel.com
- name: GODEBUG
value: x509ignoreCN=0
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
image: intel/pmem-csi-driver:canary
imagePullPolicy: IfNotPresent
name: pmem-driver
@@ -267,48 +332,21 @@ spec:
name: metrics
securityContext:
readOnlyRootFilesystem: true
terminationMessagePath: /tmp/termination-log
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /certs
name: registry-cert
- mountPath: /csi
name: plugin-socket-dir
- mountPath: /tmp
name: tmp-dir
- args:
- -v=3
- --csi-address=/csi/csi-controller.sock
- --feature-gates=Topology=true
- --strict-topology=true
- --timeout=5m
- --default-fstype=ext4
- --metrics-address=:10011
image: k8s.gcr.io/sig-storage/csi-provisioner:v2.0.2
imagePullPolicy: IfNotPresent
name: external-provisioner
ports:
- containerPort: 10011
name: metrics
securityContext:
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /csi
name: plugin-socket-dir
name: webhook-cert
securityContext:
runAsNonRoot: true
runAsUser: 1000
serviceAccountName: pmem-csi-controller
serviceAccountName: pmem-csi-webhooks
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
volumes:
- emptyDir: null
name: plugin-socket-dir
- name: registry-cert
- name: webhook-cert
secret:
secretName: pmem-csi-registry-secrets
- emptyDir: {}
name: tmp-dir
---
apiVersion: apps/v1
kind: DaemonSet
@@ -340,11 +378,6 @@ spec:
- -mode=node
- -endpoint=unix:///csi/csi.sock
- -nodeid=$(KUBE_NODE_NAME)
- -controllerEndpoint=tcp://$(KUBE_POD_IP):10001
- -registryEndpoint=tcp://pmem-csi-controller:10000
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -statePath=/var/lib/$(PMEM_CSI_DRIVER_NAME)
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -pmemPercentage=100
@@ -355,17 +388,10 @@ spec:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
- name: KUBE_POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: PMEM_CSI_DRIVER_NAME
value: pmem-csi.intel.com
- name: TERMINATION_LOG_PATH
value: /tmp/termination-log
- name: GODEBUG
value: x509ignoreCN=0
image: intel/pmem-csi-driver:canary
imagePullPolicy: IfNotPresent
name: pmem-driver
@@ -383,8 +409,6 @@ spec:
- mountPath: /var/lib/kubelet/pods
mountPropagation: Bidirectional
name: pods-dir
- mountPath: /certs
name: node-cert
- mountPath: /dev
name: dev-dir
- mountPath: /sys
@@ -411,8 +435,36 @@ spec:
name: socket-dir
- mountPath: /registration
name: registration-dir
- args:
- -v=3
- --csi-address=/csi/csi.sock
- --feature-gates=Topology=true
- --node-deployment=true
- --strict-topology=true
- --immediate-topology=false
- --timeout=5m
- --default-fstype=ext4
- --metrics-address=:10011
env:
- name: NODE_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: spec.nodeName
image: gcr.io/k8s-staging-sig-storage/csi-provisioner:canary
imagePullPolicy: IfNotPresent
name: external-provisioner
ports:
- containerPort: 10011
name: metrics
securityContext:
readOnlyRootFilesystem: true
volumeMounts:
- mountPath: /csi
name: socket-dir
nodeSelector:
storage: pmem
serviceAccountName: pmem-csi-controller
volumes:
- hostPath:
path: /var/lib/kubelet/plugins/pmem-csi.intel.com
@@ -430,9 +482,6 @@ spec:
path: /var/lib/kubelet/pods
type: DirectoryOrCreate
name: pods-dir
- name: node-cert
secret:
secretName: pmem-csi-node-secrets
- hostPath:
path: /var/lib/pmem-csi.intel.com
type: DirectoryOrCreate