CrashLoopBackOff pods after 'apply k8s' #2077

Closed
dpaks opened this issue Dec 11, 2018 · 4 comments
Comments


dpaks commented Dec 11, 2018

Client Version: v1.13.0, GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"
Server Version: v1.13.0, GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"
OS: Ubuntu 18
ksonnet version: 0.13.1
jsonnet version: v0.11.2
client-go version: kubernetes-1.10.4
argo: v2.2.1
kubeflow: v0.3.4
Env: A single node cluster using kubeadm

After issuing ${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s, I noticed that a few of the pods are in the CrashLoopBackOff state.

kubeflow ambassador-9f48fcc6c-lfcxg 2/3 CrashLoopBackOff 39 3h21m
kubeflow ambassador-9f48fcc6c-lz7xn 2/3 CrashLoopBackOff 39 3h21m
kubeflow ambassador-9f48fcc6c-xd27x 2/3 CrashLoopBackOff 39 3h21m
kubeflow ml-pipeline-65dbcdc844-jmtjx 0/1 CrashLoopBackOff 30 3h20m
kubeflow ml-pipeline-persistenceagent-69bd5876df-nz9mg 0/1 CrashLoopBackOff 29 3h20m
kubeflow vizier-core-7ccdc5577-w92wk 0/1 CrashLoopBackOff 30 3h19m

There are no logs for them. When I described the ambassador pod, I got the following events:
Events:
  Type     Reason   Age                      From               Message
  Warning  BackOff  116s (x833 over 3h17m)   kubelet, ubuntu-3  Back-off restarting failed container
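
Even though kubectl logs shows nothing for the running container, the output of the previously crashed instance can sometimes still be retrieved. A minimal sketch, assuming the kubeflow namespace and the pod names listed above (CONTAINER is a placeholder for whichever container describe reports as failing):

# Events and per-container state (exit codes, restart counts)
kubectl -n kubeflow describe pod ambassador-9f48fcc6c-lfcxg

# Logs from the last crashed instance of the failing container;
# replace CONTAINER with the container name shown by describe
kubectl -n kubeflow logs ambassador-9f48fcc6c-lfcxg -c CONTAINER --previous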

Many of the solutions I found online suggest adding a command to the container spec in the Docker/Kubernetes YAML. Should I do as they suggest, or can I ignore this issue?


royxue commented Dec 12, 2018

@dpaks +1, same problem here.
ambassador is fine for me,
but the other three keep hitting CrashLoopBackOff.


dpaks commented Dec 12, 2018

> @dpaks +1, same problem here.
> ambassador is fine for me,
> but the other three keep hitting CrashLoopBackOff.

Can you give the issue a +1 so that this question catches the devs' attention?


royxue commented Dec 20, 2018

@dpaks
Finally I have some time to look into this issue; I think it comes down to different causes.
For vizier-core, the problem is probably a StorageClass / PV / PVC issue, which leaves the vizier DB stuck in Pending (a quick check is sketched below the log lines).
For ml-pipeline, hmm, I think the problem might be this:

W1220 10:06:12.815524       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
W1220 10:06:12.816175       1 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
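
A rough way to confirm the storage-side theory for vizier-core (PVC_NAME below is a placeholder; use whatever kubectl actually lists in your cluster):

# Is the vizier DB pod actually stuck in Pending?
kubectl -n kubeflow get pods | grep vizier

# A PVC stuck in Pending usually means no matching PV or no default StorageClass
kubectl -n kubeflow get pvc
kubectl get pv
kubectl get storageclass

# Describe the Pending PVC to see the provisioner's error message
kubectl -n kubeflow describe pvc PVC_NAME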


dpaks commented Dec 20, 2018

@royxue Yes, you're right. This doesn't have anything to do with Kubeflow. In my case, the pods were not able to reach outside the subnet. After resolving that, things worked fine.
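
For anyone landing here later, a rough way to check whether pods can reach outside their subnet (the busybox image and the test endpoint are only examples):

# Throwaway pod with a shell
kubectl -n kubeflow run net-test --rm -it --image=busybox --restart=Never -- sh

# Inside the pod: does cluster DNS resolve?
nslookup kubernetes.default

# Can the pod reach an address outside the cluster/subnet?
wget -q -O /dev/null -T 5 http://www.google.com && echo reachable || echo blocked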

dpaks closed this as completed Dec 20, 2018