
minikube start hangs on second consecutive run with kubernetes v1.12 #3284

Closed
sethp-nr opened this issue Oct 30, 2018 · 6 comments
Labels
ev/hung-start, kind/bug, priority/awaiting-more-evidence, triage/needs-information

Comments

@sethp-nr

Is this a BUG REPORT or FEATURE REQUEST?: Bug report

Environment:

  • Minikube version (use minikube version): v0.30.0
  • OS: macOS 10.13.6
  • VM Driver: virtualbox
  • ISO version: minikube-v0.30.0.iso
  • Others:
    $ minikube config view
    - WantReportError: true
    - bootstrapper: kubeadm
    - kubernetes-version: v1.12.1
    

What happened:

minikube start && minikube start would hang on the "Starting cluster components..." step the second time around.

Curiously, the same behavior does not occur with minikube start && minikube stop && minikube start.

What you expected to happen:

minikube start completes successfully.

How to reproduce it (as minimally and precisely as possible):

minikube start && minikube start

Output of minikube logs (if applicable):

This "operation not supported" seems suspect:

Oct 30 17:13:07 minikube kubelet[2741]: E1030 17:13:07.581295    2741 pod_workers.go:186] Error syncing pod 400930335566057521570dcbaf3dbb0b ("etcd-minikube_kube-system(400930335566057521570dcbaf3dbb0b)"), skipping: failed to "StartContainer" for "etcd" with ErrImagePull: "rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): operation not supported"

Indeed, docker pull appears broken inside my minikube VM:

$ docker pull k8s.gcr.io/etcd:3.2.24
3.2.24: Pulling from etcd
8c5a7da1afbc: Pull complete
0d363128e48e: Extracting [==================================================>]  56.51MB/56.51MB
1ba5e77f0f6e: Download complete
failed to register layer: Error processing tar file(exit status 1): operation not supported

Anything else we need to know:

  • The second minikube start does not restart the VM, but it does restart docker. Upon restart, the backing store for the overlay2 driver has changed:
    $ docker info | grep "Backing"
     Backing Filesystem: extfs
    $ docker info | grep "Backing"
    Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
    $ docker info | grep "Backing"
     Backing Filesystem: ramfs
    
  • We suspected that the "not supported" operation might be a masked "out of space" issue with the ramfs and the default memory size of 2GB. Unfortunately, I see the same result with minikube start --memory 8192 && minikube start, which suggests otherwise. (A sketch of how to check this from inside the VM follows this list.)
  • minikube delete does not always work to reset the state; we've observed a situation where various files owned by root are left dangling in the .minikube directory and cause VirtualBox all manner of trouble. However, I'm unable to reproduce this behavior with the latest version of VirtualBox (5.2.20 r125813), so perhaps it was just an unfortunate coincidence.
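A minimal sketch of those two checks, assuming the default VirtualBox driver and that Docker's data root inside the VM is /var/lib/docker (the exact mount point may differ):

$ minikube ssh
$ df -h /var/lib/docker          # is the ramfs-backed store actually out of space?
$ docker info | grep "Backing"   # which filesystem is backing overlay2 right now?
$ exit

And a heavier-handed reset for when minikube delete leaves root-owned files behind (this wipes all local minikube state, not just the current VM):

$ minikube delete
$ sudo rm -rf ~/.minikube
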
@balopat added the kind/bug label on Oct 30, 2018
@balopat (Contributor) commented Oct 30, 2018

likely a duplicate of #2646

@tstromberg (Contributor)

Running minikube start twice on a running system has no defined behavior. We need to add an error when someone tries to do this - as mentioned in #2646. Is this a fair interpretation of your issue?

@sethp-nr (Author)

A descriptive error message would indeed have been an improvement, though erroring out does make it more challenging to build tooling on top of minikube (like https://github.com/kubernetes-sigs/cluster-api-provider-aws).

The story here is that the tool needs some sort of "seed" cluster to launch the initial resource(s) that it can then pivot into self-management. For that zeroth, disposable cluster (where relatively few external dependencies are desirable), minikube is a nice default choice. Unfortunately, clusterctl create cluster unconditionally runs minikube start, leading to the following comedy of errors:

  1. Hmm, minikube (via clusterctl) seems to be having trouble. Perhaps it's X?
  2. Let's change Y to test X. Ok, with Y changed it looks like minikube start works and leaves me with a happily running cluster.
  3. Now, we retry clusterctl – GOTO 1

An error message from step 1 saying "please stop before trying to run start" or so would've likely short-circuited that loop, but friendlier would have been to treat start as more of an "ensure running" verb. Through the narrow lens of this use-case, even better would be something like "issue an error iff the running cluster was started with different settings".
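Concretely, a minimal sketch of the "ensure running" idea as a wrapper a tool could use today, assuming minikube status exits non-zero when the cluster isn't up (its exit-code behavior may vary across versions):

#!/usr/bin/env bash
# Hypothetical "ensure running" wrapper: only start when not already running.
# Assumes `minikube status` exits non-zero if the VM or cluster is down.
if ! minikube status >/dev/null 2>&1; then
  minikube start
fi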

An alternative route I can imagine clusterctl taking might be to keep the disposable minikube it uses partitioned away from the "system" minikube (with MINIKUBE_HOME, KUBECONFIG, &c.). Since minikube doesn't have any native concept of partitioning, though, I'd worry about getting everything fully separated, especially across the matrix of drivers and future project evolutions.
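Roughly what I have in mind, as a sketch only (MINIKUBE_HOME and KUBECONFIG are real knobs that minikube and kubectl honor, but the directory layout and cleanup here are illustrative, and I haven't checked every driver):

$ export MINIKUBE_HOME="$(mktemp -d)"           # clusterctl's private state dir
$ export KUBECONFIG="$MINIKUBE_HOME/kubeconfig" # keep the system kubeconfig untouched
$ minikube start                                # disposable seed cluster
$ # ... clusterctl pivots the workload cluster out, then ...
$ minikube delete
$ rm -rf "$MINIKUBE_HOME"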

What do you think of those two ideas? Is there a different way to hammer this nail that might be more compatible with these projects' goals?

@tstromberg (Contributor)

@sethp-nr - That sounds reasonable. Based on testing, I believe we may have solved this issue in v0.33, but would like some confirmation that you are no longer seeing this issue.

In the longer term, we should support this workflow. I've opened #3578 to make sure we don't accidentally break this again in the future.
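For anyone checking, a rough verification sequence that should exercise the original repro against the newer release (assuming an upgrade to v0.33 first; these are all standard minikube subcommands):

$ minikube version                  # expect v0.33.x or newer
$ minikube delete                   # start from a clean slate
$ minikube start && minikube start  # the second start used to hang here
$ minikube status                   # should report a running, healthy cluster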

@tstromberg added the priority/awaiting-more-evidence, triage/needs-information, and ev/hung-start labels on Jan 23, 2019
@sethp-nr (Author)

@tstromberg I can confirm that minikube start && minikube start appears to leave me with a working minikube cluster. Thanks for the update!

@tstromberg (Contributor)

Excellent news. Thanks!
