Improving the cri-dockerd repository as a resource for end-users #154

Closed · shu-mutou opened this issue Jan 31, 2023 · 53 comments

Comments

@shu-mutou

sudo systemctl start cri-docker.service always fails the first time.

Starting the service seems to create /etc/systemd/system/cri-docker.service.d/10-cni.conf, but the created file calls /usr/bin/cri-dockerd,

even though the installation guide rewrites that path to /usr/local/bin. 😿

So we need to fix the path to cri-dockerd in /etc/systemd/system/cri-docker.service.d/10-cni.conf after the first attempt, then run sudo systemctl daemon-reload && sudo systemctl restart cri-docker.
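
A minimal sketch of that workaround, assuming the generated drop-in points at /usr/bin/cri-dockerd while the binary actually lives in /usr/local/bin:

# assumes the generated drop-in references /usr/bin/cri-dockerd and the binary was installed to /usr/local/bin
sudo sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service.d/10-cni.conf
sudo systemctl daemon-reload
sudo systemctl restart cri-docker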

@shu-mutou
Author

I don't know when exactly, but 10-cni.conf seems to get restored. 😞
So in the installation guide, install ... should target /usr/bin/cri-dockerd and the sed should not be executed.

@afbjorklund
Contributor

afbjorklund commented Jan 31, 2023

I think this is referring to a known issue with minikube (there is no default CNI configuration bundled with cri-dockerd).

The main problem is that there is no user documentation, so people build from source instead of using the packages?

Then minikube (and kubernetes.io) could use that, instead of the conflicting documentation (like it does today).

💡  Suggestion: 

    The none driver with Kubernetes v1.24+ and the docker container-runtime requires cri-dockerd.
    
    Please install cri-dockerd using these instructions:
    
    https://github.com/Mirantis/cri-dockerd#build-and-install

Probably it should just download the pre-compiled binaries from https://github.com/Mirantis/cri-dockerd/releases
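
For reference, a rough sketch of doing exactly that; the artifact name and archive layout below are assumptions, so check the releases page for the file that matches your version and architecture:

# assumed artifact name and layout; verify against the actual releases page
VERSION="0.3.1"
ARCH="amd64"
curl -fsSL "https://github.com/Mirantis/cri-dockerd/releases/download/v${VERSION}/cri-dockerd-${VERSION}.${ARCH}.tgz" | tar -xz
sudo install -o root -g root -m 0755 cri-dockerd/cri-dockerd /usr/local/bin/cri-dockerd
# then install and adjust the systemd units from packaging/systemd/ as in the README steps quoted later in this thread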

@shu-mutou
Author

Thank you so much for your explanation. I understand the current situation.
But I don't really understand why the installation guide in this repository bothers to rewrite the installation destination, so I can't judge which is better.

@evol262
Contributor

evol262 commented Jan 31, 2023

Packages are always preferable. The README contains a brief installation guide in case packages are not suitable for your distro (Alpine, Arch, whatever).

@afbjorklund what kind of docs do you think would work here? The project is small and focused enough that there's not a lot of meaningful user docs beyond installing, but I have worked on a lot of projects, and I have never seen one before where a releases page is so consistently missed. I'm open to suggestions, including a giant header at the top of the readme.

@afbjorklund
Contributor

afbjorklund commented Jan 31, 2023

But I don't really understand why the installation guide in this repository bothers to rewrite the installation destination,

The only important thing is that the path in the systemd units matches the installation path of the binaries.
Normally the files in /usr/lib/systemd use /usr/bin and the files in /usr/local/lib/systemd use /usr/local/bin

But it becomes a problem when you want to do local configuration (in /etc) - since it applies to both places...

One way out is to not include any path but let the system find the executable, but some people like full paths for security.
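
For illustration, a rough sketch of keeping the unit and a /usr/local install in sync with a local drop-in (the ExecStart flags below are an assumption - copy them from whichever cri-docker.service you actually have installed):

# hypothetical drop-in; adjust the flags to match your installed cri-docker.service
sudo mkdir -p /etc/systemd/system/cri-docker.service.d
sudo tee /etc/systemd/system/cri-docker.service.d/override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/local/bin/cri-dockerd --container-runtime-endpoint fd://
EOF
sudo systemctl daemon-reload
sudo systemctl restart cri-docker.service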

@evol262
Contributor

evol262 commented Jan 31, 2023

Including systemd, as a note: ExecStart=... without an absolute path will raise an error telling you very explicitly that absolute paths are needed.

@afbjorklund
Contributor

afbjorklund commented Jan 31, 2023

@evol262 for me it would have been better if Kubernetes documented the basic CRI and CNI installation/configuration.

But it has been decided to delegate this to "third parties", and thus leave the whole thing as an exercise for the user.
One could compare with the containerd installation (but it doesn't include any packages), or the confusing cri-o docs.

For the minikube configuration, it could probably try some kind of which cri-dockerd and use that in the template...

@afbjorklund
Contributor

afbjorklund commented Jan 31, 2023

I was referring to the fact that both crictl and kubeadm are broken out-of-the-box. They need to be configured. In YAML.
So maybe it wasn't so surprising that the installation documentation wasn't coherent, or that CoreDNS is crash-looping?

But the cri-dockerd installation guide should work as it is, assuming all local configuration is also updated to match.
Like if you move the installation to /usr/local/bin/cri-dockerd, the units need to be edited and the daemon reloaded.

@logopk

logopk commented Feb 3, 2023

I'm getting here as I try to run minikube/kubernetes on VMware Fusion on macOS (Apple M1).

  • I ditched local Docker for Mac a long time ago in favor of VMware, which I use for other purposes
  • driver=vmware does not work as it is amd64 only
  • driver=ssh or driver=none with a Debian 11 (bullseye) VM will need cri-dockerd - which is amd64 only.

minikube with Kubernetes up to 1.23 runs fine in this setup.

Is there a way to get cri-dockerd built and running on Debian?

@evol262
Contributor

evol262 commented Feb 3, 2023

cri-dockerd is not amd64 only. We even include arm64/aarch64 release artifacts. Granted, it is only packaged as amd64 DEB/RPM, but that's not a technical limitation. Just as it's not a technical limitation for Fusion (Apple Silicon can also translate x64, albeit with performance/battery-life implications).

Other than building from source all the way, you can take that release tarball (which is the binary only), the systemd unit files from here, and pretty much follow the same steps as the bottom of the README:

install -o root -g root -m 0755 bin/cri-dockerd /usr/local/bin/cri-dockerd
cp -a packaging/systemd/* /etc/systemd/system
sed -i -e 's,/usr/bin/cri-dockerd,/usr/local/bin/cri-dockerd,' /etc/systemd/system/cri-docker.service
systemctl daemon-reload
systemctl enable cri-docker.service
systemctl enable --now cri-docker.socket

Technically, you don't need to install it in /usr/local/bin. It could just as easily be /usr/bin, and then the unit files would not need to be adjusted, but it's sort-of bad practice to have unmanaged files in /usr/bin, so pick a path appropriate for your environment.
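
A quick sanity check after the steps above (a sketch; assumes the install completed without errors):

# verify the binary and the units are wired up
cri-dockerd --version
sudo systemctl status cri-docker.socket cri-docker.service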

@logopk

logopk commented Feb 3, 2023

Oh, I see. The Debian debs are amd64 only. I'll try the arm64 release artifact instead...

@afbjorklund
Contributor

afbjorklund commented Feb 4, 2023

Both the minikube virtual machine and the minikube container image are building cri-dockerd from source.

You could try opening an issue there, if for some reason minikube was not able to find dockerd and cri-dockerd?

The workaround for the arm64 (with the current /usr/bin systemd template), would be to install the binaries in /usr.

It would be nice to have both amd64 and arm64 debs (and rpms), but that is a separate improvement request...

@evol262
Contributor

evol262 commented Feb 4, 2023

It's not technically hard to do, really, if there's a desire.

The only reason it hasn't been is that it has not been clear that anyone would actually consume it (or ppcle, for instance). Particularly for RPMs, where there's not the kind of userbase there is around Armbian. While Fedora supports some ARM boards, and EL/Fedora support server-grade aarch64 hardware, there haven't been any requests or questions around aarch64 RPM builds, so 🤷

We can add arm64 debs on the next release. No armhf though, sorry. There's no technical reason there either, but the architecture support is being deprecated in upcoming Ubuntu versions, and I'm just as happy to skip it (nobody has asked for armhf builds here, either).

The "current systemd template" is not a template as-such. It is literally the unit file which is included in packages, which is why it is (and will stay) /usr/bin, with a copy+paste sed for users who want it elsewhere.

@afbjorklund
Contributor

The main use cases would be people running VMs on Apple ARM, or running VMs on Cloud ARM (like Graviton)

Skipping arm32 is fine.

@afbjorklund
Contributor

afbjorklund commented Feb 4, 2023

The "current systemd template" is not a template as-such. It is literally the unit file which is included in packages, which is why it is (and will stay) /usr/bin, with a copy+paste sed for users who want it elsewhere.

Sorry, I meant the minikube template mentioned above (it's for an override) - it had /usr/bin/cri-dockerd hardcoded

https://github.com/kubernetes/minikube/blob/v1.29.0/pkg/minikube/cruntime/docker.go#L715

@evol262
Contributor

evol262 commented Feb 4, 2023

Graviton (or other enterprise aarch64 scenarios mostly on RPM-based distros, including Amazon Linux) is basically the "skip until requested case". The hobbyist "embedded board with a debian/ubuntu variant" case is large enough for aarch64 debs. Workloads on server-grade aarch64 are overwhelmingly likely to be using minishift, microk8s, a full-bore k8s solution, or their cloud-vendor's solution. Until there's an ask for it, it's vanishingly small.

Similarly, VMs on Apple silicon tend to be x64 with translation, and the page allocation changes in Apple Silicon from "standard" aarch64 are, generally, a case I'm happier avoiding unless someone explicitly asks. Sure, cri-dockerd (and Go, mostly) doesn't actually manually allocate, but the performance hit from wedging 4k pages in is anywhere from 16%-35% system-wide, and some allocations can just plain cause segfaults.

"VMs on Apple Silicon" is and always will be a "best effort" scenario until when/if the Asahi team and kernel team manage to collaborate on dynamic page sizes in userspace or... something. The request here is for exactly that, but frankly, "debs for aarch64" is not a good story if the underlying platform (Linux on Apple Silicon, virtualized or not, depending on how the kernel for whatever distro was compiled and what the page size is) is not solid.

@logopk

logopk commented Feb 4, 2023

Ok, I've got it running.

Now minikube wrecks it by installing a 10-cni.conf file on every minikube start that points to /usr/bin
I assume that can be remediated temporarily by symlinking to /usr/local/bin/cri-dockerd
But then it calls sudo service cri-docker.socket restart, which fails with status 5:
Failed to restart cri-docker.socket.service: Unit cri-docker.socket.service not found.

I can reproduce this error by calling service cri-docker.socket restart, even though cri-docker.socket and cri-docker.service are up and running!

Anything I can do?

@afbjorklund
Contributor

afbjorklund commented Feb 4, 2023

You need to make up your mind: either you call service (without .socket) or you call systemctl (with .socket).

Not sure that adding socket-activation was worth it, since everyone seems to bypass it anyway and start both...

systemctl enable cri-docker.service
systemctl enable --now cri-docker.socket

The .socket will start the .service (on-demand), so it is not needed as a separate call.

The vendor docker packages do not even have a .socket, but only a .service unit.

https://docs.docker.com/engine/install/linux-postinstall/#configure-docker-to-start-on-boot-with-systemd

 sudo systemctl enable docker.service
 sudo systemctl enable containerd.service

But it would be sad to have to remove it, just because systemd documentation is lacking.
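
To make the socket-activation point concrete, a small sketch (assuming only the socket has been started, crictl is installed, and cri-dockerd is listening on its default unix:///var/run/cri-dockerd.sock):

sudo systemctl enable --now cri-docker.socket   # only the socket needs to be started
systemctl is-active cri-docker.service          # typically "inactive" until something connects
sudo crictl --runtime-endpoint unix:///var/run/cri-dockerd.sock info
systemctl is-active cri-docker.service          # the first connection started it on demand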

@afbjorklund
Contributor

afbjorklund commented Feb 4, 2023

Now minikube wrecks it by installing a 10-cni.conf file on every minikube start that points to /usr/bin
I assume that can be remediated temporarily by symlinking to /usr/local/bin/cri-dockerd

This is a bug / missing feature in minikube.

But then it calls sudo service cri-docker.socket restart, which fails with status 5:
Failed to restart cri-docker.socket.service: Unit cri-docker.socket.service not found.

Was there a minikube bug report about this?

@evol262
Contributor

evol262 commented Feb 4, 2023

You need to make up your mind: either you call service (without .socket) or you call systemctl (with .socket).

Not sure that adding socket-activation was worth it, since everyone seems to bypass it anyway and start both...

systemctl enable cri-docker.service
systemctl enable --now cri-docker.socket

The .socket will start the .service (on-demand), so it is not needed as a separate call.

The vendor docker packages do not even have a .socket, but only a .service unit.

They do, in fact, have a socket, and socket-based activation is not some strange thing -- it's one of the major advantages of systemd that socket-based activation is even possible.

Try systemctl list-units | grep docker and you will very clearly see docker.socket.

https://docs.docker.com/engine/install/linux-postinstall/#configure-docker-to-start-on-boot-with-systemd

 sudo systemctl enable docker.service
 sudo systemctl enable containerd.service

But it would be sad to have to remove it, just because systemd documentation is lacking.

The documentation for systemd is not lacking here. "Start the docker daemon at boot by enabling the service" and "start the docker daemon on demand if something tries to connect to the socket" are not orthogonal

@shu-mutou
Author

@afbjorklund what kind of docs do you think would work here? The project is small and focused enough that there's not a lot of meaningful user docs beyond installing, but I have worked on a lot of projects, and I have never seen one before where a releases page is so consistently missed. I'm open to suggestions, including a giant header at the top of the readme.

I perfectly agree. No matter how much I search the internet, I can't find an article that says it should be installed from a package.

A README section explaining how to install the package, placed before the Build and install section, would clear up any misunderstandings. It should also note that the install path in the Build and install section is customized, IMHO.

And minikube should use the path to cri-dockerd in /etc/systemd/system/cri-docker.service for 10-cni.conf, I think.
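
For illustration, a sketch of one way to do that (read the path out of whichever unit is actually installed instead of hardcoding it):

# read the ExecStart line of the installed cri-docker.service
systemctl show -p ExecStart cri-docker.service
# or, more crudely:
grep -h '^ExecStart=' /etc/systemd/system/cri-docker.service /usr/lib/systemd/system/cri-docker.service 2>/dev/null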

Thanks in advance!

@evol262
Contributor

evol262 commented Feb 5, 2023

There's always a limit. GH Releases are common enough that "look in the right sidebar where there are clearly releases" can only be so obvious. It's not Sourceforge or an FTP server in 2007 with ./configure && make && make install.

Having build instructions in the README for users on Alpine, Arch, or some other distro who need to build (or don't want a prebuilt tarball) is a tradeoff, and "here's how to install from a package" also ends up with "please package for this distro/that distro/etc"

@neersighted
Collaborator

neersighted commented Feb 5, 2023

To be honest, I am mystified by this being a persistent problem. I know I'm speaking from a position of bias given my experience working in and with software, but to me, cri-dockerd is a very simple program. It's a Go module with the dependencies vendored in, so it builds like any other Go program written in the last several years.

Likewise, it is installed and then run via systemd unit like any other system service; even socket activation is a very common paradigm these days.

So I guess what makes me wonder is why, persistently, so many people come to the repo to build/install it from source while having no experience with Go software, systemd, and sometimes advanced features of their distro (e.g. SELinux). Is there anything that can be done to address that?

It seems like maybe a lot of novice engineers/programmers are trying to run (assemble their own Kubernetes) without having first learned to walk (write/use ecosystem-standard programs).

This isn't a commentary directly on this issue, which is more an unfortunate intersection of assumptions made by minikube with the implicit assumption that advanced users will either spot how they need to adapt immediately, or after the first failure.

Rather, there have been a lot of issues on the spectrum of moderate ecosystem knowledge required (this one), to the very basics of building & running Linux software not being understood (many more), and I wonder if anything can be done to make this situation better.

Maybe making the README a bit more novice-first/high-level, and explicitly minimizing/disclaiming the 'build & run' section as advanced/a demo of the basic principles that will not apply to every (or even most, given the diversity of Linux) system (this is currently implicit in my mind, and the mind of many contributors), can help.

@afbjorklund
Contributor

The documentation for systemd is not lacking here. "Start the docker daemon at boot by enabling the service" and "start the docker daemon on demand if something tries to connect to the socket" are not orthogonal

You are right, my bad. Starting both of them at boot (enable) makes it start the service before there is any request for it, which does make sense if it is about to be used - and less sense if it is not, like sometimes in minikube.

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

So I guess what makes me wonder is why, persistently, so many people come to the repo to build/install it from source while having no experience with Go software, systemd, and sometimes advanced features of their distro (e.g. SELinux). Is there anything that can be done to address that?

All documentation was removed from Kubernetes, and deferred to the "third party" vendor. And this developer README is what was being provided for the Docker runtime... (but I do think linking to it in the minikube code was a mistake)

@neersighted
Collaborator

Right, but I still don't understand why end-users are coming here... This is an upstream for several projects & products but has never tried to cater to end-users. Who is sending them here and why?

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

https://kubernetes.io/docs/setup/production-environment/container-runtimes/

https://minikube.sigs.k8s.io/docs/drivers/none/ (it does have some warnings, but)

❗  The 'none' driver is designed for experts who need to integrate with an existing VM

kubernetes/minikube@a7dc443

@neersighted
Collaborator

neersighted commented Feb 5, 2023

(missed the edit)

Okay, so it sounds like minikube has some sort of support (at least to the point of having error messages/links/docs) for dockerd but doesn't actually ship a copy of cri-dockerd? Why was that decision made -- it seems rather user-hostile... The intention of this repo is to be a building block for Kubernetes distros and not to be a distributor of software/support to end-users, as I understand it.

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

I think cri-dockerd needs a website. (Even better would be if it was included in the Docker Engine web site, but anyway)

@afbjorklund
Contributor

Minikube provides two options, one virtual machine and one container image, but the user also has the possibility of using their own VM. And then the "provisioner" is supposed to install the container runtime of their choice on it.

It was working OK with Docker Machine before (using get.docker.com), but has been broken since the introduction of other runtimes like crio/containerd and additional requirements like CRI and CNI - so right now, links are given...

@neersighted
Collaborator

neersighted commented Feb 5, 2023

Hmm, I see. So minikube's no-VM mode uses cri-dockerd, and they're not distributing a copy because they don't want to become a multi-call binary with an embedded daemon or something else similarly obtuse.

Okay, so it seems like there are end-users coming here now, for better or worse, in large part because of the architectural decisions made by minikube. So ultimately some sort of user-focused documentation needs to exist to reduce the number of trivial questions that @evol262 ends up so patiently answering in this issue tracker, and to provide a simple set of directions for said audience.

I do think it's impossible to be all things to all people (as @evol262 points out, the line likely has to be drawn somewhere as Linux is simply too diverse; I think instructions for using packages & the current "how to build" as "advanced example (specifics may vary)" is sufficient), but there's certainly room to do better (and I have a better idea of how to approach it now that I see users are coming here from upstream Kubernetes/minikube docs).

I have a couple thoughts on the website aspect:

  • Maybe Docker Inc. wants to maintain something on docs.docker.com?
  • Maybe Mirantis can provide something on docs.mirantis.com?
  • (wildcard) Maybe cri-dockerd could have a better home in the Moby project and have docs on mobyproject.org?

I'll try to chase down some of the relevant parties for the above and see if there's a consensus on what might be best; I'm uniquely positioned to do so given the fact I have weekly meetings with all of the relevant people above 😆

The upstream Kubernetes/minikube docs are really poor and need to get better here as well; even a high-level overview of how to get cri-dockerd would be much more useful.

It also seems like minikube really should offer some way to vendor or fetch cri-dockerd and smooth this over for users. Granted the lifecycle & number of files might be rather complex given that minikube can't start/manage cri-dockerd given the process model, but they already have to do similar things for the kubelet...

If we had an in-repo build system for cri-dockerd & consistently provided/automatically built binary artifacts, it could possibly be as simple as retrieving & extracting a structured tar (/usr/local/bin/cri-dockerd, /usr/local/lib/systemd/system/cri-dockerd.{socket,service}) from this repo's release artifacts.
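
A rough sketch of what consuming such an artifact could look like; the artifact name and layout below are hypothetical (they follow the proposal above and do not exist today):

# hypothetical structured artifact, per the proposal above
VERSION="0.3.1"
ARCH="amd64"
curl -fsSL "https://github.com/Mirantis/cri-dockerd/releases/download/v${VERSION}/cri-dockerd-${VERSION}.${ARCH}.structured.tar.gz" | sudo tar -C / -xz
sudo systemctl daemon-reload
sudo systemctl enable --now cri-docker.socket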

@evol262
Contributor

evol262 commented Feb 5, 2023

An in-repo build system (whether goreleaser or other) is still both a problem for CGO_ENABLED builds (like downstream) without maintaining two separate build/release systems, and that doesn't really touch on the fact that binary artifacts and packages are already built but not found. Mangling it into a GH workflow is rearranging deckchairs.

It's fine if it ends up in the Moby project, but the Moby project isn't really doing anything with k8s these days, and it's not a good conceptual fit.

I firmly disagree about having a website. Upstream k8s docs (and docs in the k8s ecosystem in general) are not great from any direction, but that's not a new thing, and it's not new to k8s. The same goes for adding/configuring the CRI/CNI for containerd, crio, kata, cilium, longhorn, openebs, or any of the other parts of the ecosystem. This was also true for openvswitch/ovn, neutron, trying to configure a driver for nova (lxc, qemu+kvm), and on and on.

These are/were all components, and the documentation, such as it is, is focused around developers and integrators. Firecracker gets a lot of attention lately. User-facing documentation for how to actually use Firecracker or integrate it into some stack is basically non-existent. I am not saying that we could not/should not make an effort, but cri-dockerd also does not do anything without a k8s environment (or a Nomad CRI driver) talking to it.

Users who are ending up here are running into problems with something, and the question is what/where. minikube is probably some. Maybe kubespray for others. The question becomes "what k8s deployment tool/distribution leaves users to find this repo?" and provide better guidance there.

@neersighted
Collaborator

neersighted commented Feb 5, 2023

Right, so my thought is mostly "minikube is sending end-users, who may not be experienced developers, here" and what we do about that. Ultimately I think that maintaining general instructions on how to get up and running with cri-dockerd as an end user who has to manually retrieve it is easier to maintain here, than it is on the minikube/K8s end.

In my mind, minikube should provide the binary for end-users, but that will likely take time (I just reviewed the none driver code and it seems... rough/difficult to extend right now). It seems pragmatic to offer a landing page here, or to definitively state that this repo is not for end-users, and to migrate any support-related issues to discussions/disclaim that loudly in the README (see what we do in moby/moby).

As the person who has to maintain the CGO_ENABLED=1 builds for Mirantis, I'm personally fine with a simpler build system for this repo. Something minimal to produce consistent binaries for end users to consume seems like it could be in-scope for this repo (and less work for you, ultimately) and as long as we stay buildable with go build (and a script to set the linker constants) it shouldn't be hard to adapt/consume in the downstream.

Basically what I'm saying is I don't think we should use the work that we have to do inside Mirantis to hold back the usability for the end-users who are coming here, however unfortunate that situation is.

Finally regarding Moby, I feel like that's kind-of the point -- this is the only K8s-related bit surrounding Moby at all these days, so finally having all of that code/knowledge consolidated in one place that is close to the engine might be beneficial. The follow-on thought is that if we do generate documentation, a neutral home like docs.mobyproject.org (a dream of mine is splitting the current Docker Docs into Docker Inc docs and Moby docs) would be a very ergonomic fit.

@neersighted
Collaborator

neersighted commented Feb 5, 2023

Also from the components angle: Should we emphasize that Mirantis Container Runtime is "batteries included" with cri-dockerd and add documentation on using MCR's cri-dockerd support to docs.mirantis.com? I think whether or not that is productive is also partially dictated by what kind of minikube user is coming here: if they're all just looking to set up a test cluster without sandboxing it (Why? Installing K8s on your bare system is gross and going to end poorly) that may not be useful...

However, if people are using minikube + none as a way to try to deploy real clusters with more opinions/safeguards than kubeadm, maybe nudging people to MCR (and thus Mirantis docs that could be created + Mirantis support that is already in place) makes sense?

neersighted changed the title from "The installation guide does not work." to "Improving the cri-dockerd repository as a resource for end-users" on Feb 5, 2023
@evol262
Contributor

evol262 commented Feb 5, 2023

I don't think that "nudging" people to a commercial product from a GH page is great practice. minishift/microshift, k0s, microk8s, and a number of other more or less integrated dev solutions exist. It's just that none of them (other than Rancher, and they do bundle cri-dockerd) supports a configurable backend, really.

Additionally, any divergence in MCR makes it something other than a "batteries included" solution unless the release cadence is kept in lockstep. MCR does not, for example, work with k8s 1.26 right now, because the version of cri-dockerd included does not have cri.v1. MKE does, but MKE also explicitly bundles the cri-dockerd we need and leaves the MCR-packaged cri-dockerd alone.

Bluntly, goreleaser/nFPM are not good. It's a real life representation of the XKCD "standards" comic. It does not have a good way to represent complex dependency selection, and even if we discarded that, the difference between "binary releases are on this repository already and done when a new release is cut" and "binary releases are on this repository already and are triggered by GH (assuming the runners don't timeout/crash)" is effectively zero. Without a giant header at the top of the docs pointing users to the page, we're in the same scenario as now, and with that header, changing the build backend is six of one and half a dozen of the other.

This isn't an end-user usability problem. It is a "users who are not familiar with navigating GH and finding releases are ending up here" problem.

It is a question of "do we add a giant header? if so, what does it include?"

@logopk

logopk commented Feb 5, 2023

Minikube provides two options, one virtual machine and one container image, but the user also has the possibility of using their own VM. And then the "provisioner" is supposed to install the container runtime of their choice on it.

It was working OK with Docker Machine before (using get.docker.com), but has been broken since the introduction of other runtimes like crio/containerd and additional requirements like CRI and CNI - so right now, links are given...

So forgive me for asking end-user questions, and I appreciate your support! Maybe it's a question of a "header" and a bit of better documentation. Debian arm64 packages would be helpful too.
But all this more or less has to be answered by the minikube team. Their remediation for the dockershim deprecation does not work in all (edge) cases.

In my case (ssh remote/VM, Debian bullseye on arm64) it breaks after upgrading from Kubernetes 1.23, and at first it is simply unclear why! After initial research and debugging I found the links above to cri-dockerd, and now I'm here…

Luckily I had snapshots of my VM to get back to a working version…

@neersighted
Collaborator

neersighted commented Feb 5, 2023

Re: MCR, the next release does support CRI v1... But things other than cri-dockerd drive the patch cadence of MCR, so perhaps it's not the best solution for "get the latest cri-dockerd," as it will never be /perfect/ given K8s and cri-dockerd move independently of the things MCR cares about. If you mean the MCR release cadence being async from Moby, I have some exciting news about that 😄

When I mention plugging MCR, I don't mean using it as a cop-out, but more along the lines of "If you are using cri-dockerd in development, there are some packages and instructions in this repo that are appropriate for advanced users. If you want an integrated development solution, Docker ships cri-dockerd as part of their Kubernetes support in Docker Desktop. If you're trying to use cri-dockerd in production, you may be interested in the commercial product/support that Mirantis provides."

I also think that minikube should step up to provide cri-dockerd instead of sending end-users here, but I don't think any of these approaches are mutually exclusive. At the end of the day, I don't think digging in and saying another party is the problem/we don't need to do anything is great; I think that everyone trying to get better incrementally is more pragmatic, and we can evolve things toward a cleaner separation of concerns over time.

Regarding automated builds, we don't have to use GHA or nFPM, there are other mechanisms (e.g. automate the current tooling in Jenkins). I think the main ask would be automated builds so that minikube could possibly depend on them. Also, GitHub releases allows for draft releases; CI easily could wait to tag + declare a release until it has confirmed all artifacts are correctly uploaded to the draft, and even perform smoke tests on them.

@logopk

logopk commented Feb 5, 2023

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

Hmm, I see. So minikube's no-VM mode uses cri-dockerd, and they're not distributing a copy because they don't want to become a multi-call binary with an embedded daemon or something else similarly obtuse.

The libmachine "provisioner" is supposed to install the container runtime (including the CRI adaptor/plugin), but it was a bit near-sighted due to assuming that the runtime is Docker Engine and that there is a script.

https://github.com/docker/machine/blob/v0.16.2/libmachine/provision/provisioner.go#L74

https://github.com/docker/machine/blob/v0.16.2/libmachine/provision/utils.go#L27_L35

So in the current minikube fork of the provisioner, it is "assumed" that the VM already has the runtime...

https://github.com/kubernetes/minikube/blob/master/pkg/provision/ubuntu.go

https://github.com/kubernetes/minikube/blob/master/pkg/provision/buildroot.go

@neersighted
Collaborator

Ah, here I thought this applied to the no-VM/none use case only. If the VM driver (with a Docker runtime) is similarly broken, I do think that represents something that needs a lot of improvement in minikube; I feel like this might not be hard as long as we have stable release artifacts on this repo (see #153, #148, #140).

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

The minikube VM does have all the container runtimes pre-installed, which sadly is another problem (of bloat)

It worked fine until HennyPenny, since before that you only needed CRI and CNI for the "other" container runtimes

Now all the Docker users also need to install the other three, and not just Engine.

@neersighted
Collaborator

Okay, so if I follow correctly, the runtime is baked into the VM image and assumed to be there, so this sharp edge is happening to users using none/ssh as that assumption may not hold true given they are providing an arbitrary Linux userland?

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

Okay, so it seems like there are end-users coming here now, for better or worse, in large part because of the architectural decisions made by minikube.

This decision was not made by minikube, it was decided by upstream Kubernetes (the container runtime is no longer included in the box, it is "third party")

Before dockershim was removed, minikube was relying on Docker Machine to do the docker installation (without the CRI tools and without the CNI plugins).


The version of crictl and /opt/cni is now tied more tightly to the Kubernetes release, so they need to be installed with it.
i.e. each release of Kubernetes now needs a specific version of the tools, so they need to be installed with k8s - not with the OS

But we will still rely on the container runtime to install the requirements and the CRI adapter/plugin/whatever. For minikube, we also assume that it has tools* for building and loading images (such as ctr/buildctl/nerdctl, or podman)

* tools already being provided by docker build and docker load

Currently building cri-dockerd from source, but want to use binaries...

https://github.com/kubernetes/minikube/blob/master/hack/update/cri_dockerd_version/update_cri_dockerd_version.sh

https://github.com/kubernetes/minikube/blob/master/deploy/iso/minikube-iso/arch/x86_64/package/cri-dockerd/cri-dockerd.mk

Simple binary tarballs on the GitHub release page are fine for it, however.

@evol262
Contributor

evol262 commented Feb 5, 2023

Re: MCR, the next release does support CRI v1... But things other than cri-dockerd drive the patch cadence of MCR, so perhaps it's not the best solution for "get the latest cri-dockerd," as it will never be /perfect/ given K8s and cri-dockerd move independently of the things MCR cares about. If you mean the MCR release cadence being async from Moby, I have some exciting news about that 😄

I mean that, with 0.3.0, for example, k8s users/whatever need it now for k8s, not later.

When I mention plugging MCR, I don't mean using it as a cop-out, but more along the lines of "If you are using cri-dockerd in development, there are some packages and instructions in this repo that are appropriate for advanced users. If you want an integrated development solution, Docker ships cri-dockerd as part of their Kubernetes support in Docker Desktop. If you're trying to use cri-dockerd in production, you may be interested in the commercial product/support that Mirantis provides."

Upstream doesn't do that. A cri-dockerd.com page could talk about whatever we wanted with a small GH logo in the corner, but the other way doesn't go.

I also think that minikube should step up to provide cri-dockerd instead of sending end-users here, but I don't think any of these approaches are mutually exclusive. At the end of the day, I don't think digging in and saying another party is the problem/we don't need to do anything is great; I think that everyone trying to get better incrementally is more pragmatic, and we can evolve things toward a cleaner separation of concerns over time.

There's no disagreement there. The question is also about expenditure of effort. It's equally possible to do something (CI builds) which fundamentally does nothing to change the scenario. The only party which is meaningfully able to be blamed here is GitHub and the fact that the discoverability/user experience of finding releases is seemingly awful, but there's no way for us to do anything about that without completely retheming it (with a GH pages landing page or whatever).

Regarding automated builds, we don't have to use GHA or nFPM, there are other mechanisms (e.g. automate the current tooling in Jenkins). I think the main ask would be automated builds so that minikube could possibly depend on them. Also, GitHub releases allows for draft releases; CI easily could wait to tag + declare a release until it has confirmed all artifacts are correctly uploaded to the draft, and even perform smoke tests on them.

It's exhausting to repeatedly have this conversation. That is not a solution. The reason why the current builds are not automated is that free GH runners are terrible, unreliable in general, have much worse reliability problems at certain times of the day, etc. It is a guaranteed timeout after 6 hours.

@afbjorklund
Contributor

afbjorklund commented Feb 5, 2023

As far as I know, Mirantis Container Runtime is already mentioned separately in the upstream documentation:

https://kubernetes.io/docs/setup/production-environment/container-runtimes/

I think integrated developer solutions (DD) are out-of-scope for that particular page; maybe CNCF has a summary?

https://landscape.cncf.io/card-mode?category=container-runtime&grouping=category


Some details on how to set it up with Docker Engine would probably be appreciated:

  1. On each of your nodes, install Docker for your Linux distribution as per Install Docker Engine.

  2. Install cri-dockerd, following the instructions in that source code repository.

https://github.com/kubernetes/website/blob/main/content/en/docs/setup/production-environment/container-runtimes.md

The same update could then be cherry-picked into the minikube "advice":

        NotFoundCriDockerd = Kind{
                ID:       "NOT_FOUND_CRI_DOCKERD",
                ExitCode: ExProgramNotFound,
                Advice: translate.T(`The none driver with Kubernetes v1.24+ and the docker container-runtime requires cri-dockerd.
                
                Please install cri-dockerd using these instructions:

                https://github.com/Mirantis/cri-dockerd#build-and-install`),
                Style: style.Docker,
        }
        NotFoundDockerd = Kind{
                ID:       "NOT_FOUND_DOCKERD",
                ExitCode: ExProgramNotFound,
                Advice: translate.T(`The none driver with Kubernetes v1.24+ and the docker container-runtime requires dockerd.
                
                Please install dockerd using these instructions:

                https://docs.docker.com/engine/install/`),
                Style: style.Docker,
        }

https://github.com/kubernetes/minikube/blob/master/pkg/minikube/reason/reason.go

With a longer text in the https://minikube.sigs.k8s.io/docs/drivers/none/ docs

  • a container runtime, such as Docker or CRIO
  • cri-dockerd (if using Kubernetes +v1.24 & docker container-runtime)

https://github.com/kubernetes/minikube/blob/master/site/content/en/docs/drivers/includes/none_usage.inc


So either update the existing: https://github.com/Mirantis/cri-dockerd#build-and-install

Or provide an alternate URL, that has the user documentation for cri-dockerd installation.

@neersighted
Collaborator

neersighted commented Feb 5, 2023

@afbjorklund

This decision was not made by minikube, it was decided by upstream Kubernetes (the container runtime is no longer included in the box, it is "third party")

Before dockershim was removed, minikube was relying on Docker Machine to do the docker installation (without the CRI tools and without the CNI plugins).

Ah, my bad, I'm conflating different K8s SIGs here. The left-hand right-hand dance definitely explains how we got here.


@evol262

I mean that, with 0.3.0, for example, k8s users/whatever need it now for k8s, not later.

Sure, but if you /need/ the latest K8s today, and not in a month, you're probably an advanced user. My point is that for production use there are options out there that solve many of these problems for you.

Upstream doesn't do that. A cri-dockerd.com page could talk about whatever we wanted with a small GH logo in the corner, but the other way doesn't go.

I disagree. I think what we have on the Moby project is (out of date but) quite helpful/tasteful:

The Moby Project is intended for engineers, integrators and enthusiasts looking to modify, hack, fix, experiment, invent and build systems based on containers. It is not for people looking for a commercially supported system, but for people who want to work and learn with open source code.
[...]
The Moby project is not intended as a location for support or feature requests for Docker products, but as a place for contributors to work on open source code, fix bugs, and make the code more useful. The releases are supported by the maintainers, community and users, on a best efforts basis only, and are not intended for customers who want enterprise or commercial support; Docker EE is the appropriate product for these use cases.


There's no disagreement there. The question is also about expenditure of effort. It's equally possible to do something (CI builds) which fundamentally does nothing to change the scenario. The only party which is meaningfully able to be blamed here is GitHub and the fact that the discoverability/user experience of finding releases is seemingly awful, but there's no way for us to do anything about that without completely retheming it (with a GH pages landing page or whatever).

Right, so what I'm proposing initially is just a rework of the README making the target audience for the repo, links to binaries/simple advice for common scenarios, and level of support (read: limited/none) front and center.

It's exhausting to repeatedly have this conversation. That is not a solution. The reason why the current builds are not automated is that free GH runners are terrible, unreliable in general, have much worse reliability problems at certain times of the day, etc. It is a guaranteed timeout after 6 hours.

I'm not sure why we go in circles on this either? Like I said, we can just use Jenkins to automate the current pipelines. The packaging that you cribbed for this repo is pretty difficult/painful to maintain, but I have a lot of experience with it at this point, and GHA or Jenkins, it's not terrible to run.


In any case, I think having CI builds is important because it would enable minikube to fetch cri-dockerd for the end user, if it is known that there are artifacts with stable naming and a stable layout. Also, the issues opened over the last couple months show that even users who are relying on the release artifacts have trouble because of the inconsistency of manual builds.

This isn't just "let's solve a social problem by building artifacts automatically," it's "let's allow for those sending us end-users to avoid sending them to us, by giving them a technical alternative." There are technical and non-technical solutions that can improve the state of the art here.

@shu-mutou
Author

Right, so what I'm proposing initially is just a rework of the README making the target audience for the repo, links to binaries/simple advice for common scenarios, and level of support (read: limited/none) front and center.

That really makes sense!!


@evol262
I fully understand what you're thinking and saying, but it's not written anywhere in this repository. Also, it appears the project was packaged only recently, and the README doesn't mention that change. I mean, no one will notice it.

I would appreciate it if you could write what you wrote in this issue in the README.

I'm not saying you shouldn't care about other components like minikube using cri-dockerd at all. But just write down your recommended installation method and they will respect it and guide users.


I'm really sorry for confusing the discussion....
But thank you all!

@afbjorklund
Contributor

afbjorklund commented Mar 12, 2023

Users are still struggling, but most of them have some weird cloud setup that isn't really supported anyhow...

https://github.com/kubernetes/minikube/issues?q=is%3Aissue+label%3Aco%2Fnone-driver+label%3Akind%2Fsupport

Only "half" have issues with the Docker runtime, the rest are still trying to find out this "new" CRI and CNI thing in 1.24.

The easiest fix would be to align with upstream Kubernetes, and have containerd as the (de facto) default runtime.

EDIT: Just remembered it wouldn't help anyway, since the containerd cri plug-in is disabled in the Docker configuration

And the 1.5 version woes, the cgroups v2 crashing, and the missing build support - so there are problems either way

@satyampsoni

Thank you so much for your explanation. I understand the current situation.
But I don't really understand why the installation guide in this repository bothers to rewrite the installation destination, so I can't judge which is better.

I agree!

@afbjorklund
Contributor

Normally the rule is that if you install it yourself from source or from a tarball, it goes in /usr/local.

But if you install it from the regular package manager, such as deb/apt or rpm/yum, it goes in /usr.

@afbjorklund
Contributor

afbjorklund commented Apr 2, 2023

I removed the #build-and-install anchor from the minikube documentation links, hopefully that helps

Also added a check for crictl, since that is not included with the Minikube installation of k8s.

The deb/rpm for kubeadm will automatically pull it as a dependency, but not without packages*...


* Need to hardcode the version and then pull the tarball for the right arch and sudo install it.

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

CNI_PLUGINS_VERSION="v1.1.1"
ARCH="amd64"
DEST="/opt/cni/bin"
sudo mkdir -p "$DEST"
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGINS_VERSION}/cni-plugins-linux-${ARCH}-${CNI_PLUGINS_VERSION}.tgz" | sudo tar -C "$DEST" -xz
DOWNLOAD_DIR="/usr/local/bin"
sudo mkdir -p "$DOWNLOAD_DIR"
CRICTL_VERSION="v1.25.0"
ARCH="amd64"
curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${ARCH}.tar.gz" | sudo tar -C $DOWNLOAD_DIR -xz

For daemons, there is also the added hassle of installing the matching systemd unit files.

@nwneisen
Collaborator

I've created two PRs which address many of the issues discussed in this ticket.

The first, #214, rearranges the project's README to cover basic usage of the project first and advanced setups second. I would appreciate it if any of you who came to the project as an end user and were confused would give it a read-through and comment your thoughts. That way we can improve things for others in the future.

The second, #215, sets up a static website for the project. It is not hosted yet, so it requires you to run it manually on your machine or read through the files as markdown. This is just the initial setup and is likely missing pages, but again, I appreciate any comments on what could be added to make things easier for everyone.

I also have plans to automate the releases and put them on a regular cadence but there are some other issues with the project that I would like to fix first. Thank you everyone for your feedback on ways that we can improve the project.

nwneisen closed this as completed on Oct 9, 2023