Currently pods can be scheduled to master nodes #50

Closed

swade1987 opened this issue Jan 17, 2018 · 13 comments

@swade1987
Collaborator

If the master node(s) have a NoSchedule taint, the current deployment manifest still allows pods to be scheduled on them, because it tolerates that taint.

Assumptions:

  • You have a Kubernetes cluster running

  • The kubelet on the master node(s) is running with register-schedulable=true set.

  • The master node(s) have the following taint applied:

    kubectl taint nodes master1 node-role.kubernetes.io/master="":NoSchedule
    
  • Cordon all other nodes (excluding master nodes) in the cluster:

    kubectl cordon <node name>
    

Result:

coredns-77b5855fb7-f6ng7                 1/1       Running   0          7s        172.16.137.65    master1
coredns-77b5855fb7-pxfjh                 1/1       Running   0          7s        172.16.137.65    master1
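
(For reference, a node-placement listing like the one above can be produced with a wide pod listing, assuming coredns runs in kube-system:)

kubectl get pods -n kube-system -o wide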

Solution:

Edit the deployment and remove the following:

- key: node-role.kubernetes.io/master
  effect: NoSchedule
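
To make the edit concrete: assuming the stock manifest (a Deployment named coredns in the kube-system namespace), it can be edited in place with:

kubectl -n kube-system edit deployment coredns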

Delete all current coredns pods using:

kubectl delete pods -n kube-system -l k8s-app=coredns

Describing one of the replacement pods, which remain Pending, shows:

Warning  FailedScheduling  4s (x6 over 19s)  default-scheduler  0/8 nodes are available: 1 PodToleratesNodeTaints, 7 NodeUnschedulable

If you uncordon the worker nodes, the pods start to be scheduled correctly.
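
For completeness, uncordoning is the mirror image of the cordon command used above:

kubectl uncordon <node name>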

@chrisohaver
Member

The master NoSchedule taint toleration was replicated from the kube-dns deployment manifest in kubeadm.
I don't know the original reason for it, but it may be because, when a cluster is first built, there is initially only a master and no worker nodes, and cluster DNS may be needed for some base operations before nodes are added.

@luxas, do you know the original reason for adding the master taint toleration to the cluster dns service in kubeadm?

@chrisohaver
Member

@bowei, are you familiar with the reasoning behind adding the master taint toleration to kube-dns?

@chrisohaver
Member

chrisohaver commented Jan 25, 2018

@swade1987, in lieu of input from kubeadm or kube-dns, what are the reasons you think coredns should not be able to run on master nodes?

@miekg
Member

miekg commented Jan 25, 2018 via email

@chrisohaver
Member

"almost certainly" is almost certainly an exaggeration... ;)

@chrisohaver
Member

It seems there is some inconsistency on this position within Kubernetes.

For example: kubernetes/kubernetes#54945 is a request to add the master taint toleration to kube-dns... which means it wasn't there before. Though I think the addon directory that the PR modifies is deprecated... so the kube-dns manifests could have been out of date...

@chrisohaver
Member

Yeah - I confirmed the kubernetes/cluster/addons directory is "legacy" per its readme, and "deprecated" per the Kubernetes addons web page.

Anyway, one argument for leaving the toleration in place: in a scenario where coredns cannot, for whatever reason, be scheduled to a worker node, running it on the master is better than not running it at all.

@swade1987
Collaborator Author

I always go with the separation-of-concerns mindset: leave master nodes to be master nodes, acting purely as the Kubernetes control plane.

@miekg
Member

miekg commented Jan 25, 2018 via email

@willvrny

Would it be better to have the default follow the preferred practice (separation of concerns, therefore no master toleration), and then document that people who want to schedule to master nodes can add the toleration back (sketched below)?
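
For illustration, the documented opt-in would just be re-adding the removed block under the pod template's tolerations in the deployment (a minimal sketch):

tolerations:
- key: node-role.kubernetes.io/master
  effect: NoSchedule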

@johnbelamaric
Member

I guess it comes down to whether you consider service discovery a critical service that is part of the control plane, or an add-on extra. Most things won't work without it, so a PreferNoSchedule makes sense to me.

That said, these manifests are not gospel and can be tweaked for specific deployments.
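
For what it's worth, a PreferNoSchedule policy would be expressed as a taint on the node side, e.g. (illustrative node name, mirroring the taint command in the original report):

kubectl taint nodes master1 node-role.kubernetes.io/master="":PreferNoSchedule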

@bowei

bowei commented Jan 25, 2018

The kube-dns manifests in the addons directory are current. It looks like the toleration is only in kubeadm.

Regarding where kube-dns runs, in general, the master node typically does not run things such as kube-proxy or kube-dns. In some deployments, the master node is not actually part of the normal cluster network. This means pods running on the master node will not be network reachable, hence services such as kube-dns won't work with pods scheduled on the master.

Of course for clusters where the master node(s) are not special, this can be tweaked.

@chrisohaver
Member

OK - I merged this. The kubeadm team can leave the taint toleration in if that's what they prefer.

One size isn't going to fit all. And I don't think we can even prescribe a "preferred" deployment manifest. There isn't a "typical" deployment. This deployment is just a suggestion. etc etc
