
kubeadm add flag to enable parallel image pulling by default #2780

Closed
pacoxu opened this issue Nov 11, 2022 · 18 comments · Fixed by kubernetes/kubernetes#124442
Labels: kind/api-change · kind/feature · priority/important-soon

Comments

@pacoxu
Member

pacoxu commented Nov 11, 2022

> +0 to have tools such as `kubeadm` manage the value of that flag

Originally posted in kubernetes/kubernetes#108405 (comment)

This is a kubelet flag, and Rancher has already switched to enabling it by default.

There are several ways we could promote parallel image pulling:

  1. Enable it by default without adding a flag.
  2. Add a flag to enable it, defaulting to false for now, and change the default to true once we think it is ready. (Or default to enabled and leave a flag for concerned users to disable it.)
  3. Always use the kubelet's default value here and wait for SIG Node to decide.

/kind feature


edit by neolit123

API change proposal here:
#2780 (comment)

what we can do in kubeadm is to add an API field that controls whether the kubeadm image pulls (via crictl), NOT the kubelet pulls, are parallel.
it can be NodeRegistration.ImagePullSerial for Init|JoinConfiguration (we already have imagePullPolicy there), defaulting to true; no such field will be in ResetConfiguration. kubeadm config images uses InitConfiguration.
UpgradeConfiguration doesn't have NodeRegistration, so there it can be a top-level field.

if people are +1 to the above, we can add this new field to v1beta4.
if you don't like the ImagePullSerial name, please suggest something else.

v1beta4 PRs

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 11, 2022
@neolit123
Member

neolit123 commented Nov 11, 2022

Do you think this is the default all users want?
I think it's OK to add a flag that is true by default, and also add a NodeRegistrationOptions field for API version "next" when we release it.

@SataQiu wdyt?

@pacoxu
Member Author

pacoxu commented Nov 11, 2022

https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/#create-the-config-file

Here is an example of what this file might look like:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: "192.168.0.8"
port: 20250
serializeImagePulls: false
evictionHard:
  memory.available: "200Mi"
```

The serializeImagePulls: false example was added in https://github.com/kubernetes/website/pull/28830/files.

It is already used in the GCE config and the e2e_node test:

https://github.com/kubernetes/kubernetes/blob/cf12a74b18b66efc577ec819c78a0c68f6d49225/cluster/gce/config-test.sh#L232

https://github.com/kubernetes/kubernetes/blob/cf12a74b18b66efc577ec819c78a0c68f6d49225/test/e2e_node/services/kubelet.go#L126

@SataQiu
Member

SataQiu commented Nov 12, 2022

I'm +1 for adding a NodeRegistrationOptions field to control this feature.
Given that users can currently achieve this by setting KubeletExtraArgs, I'm not sure we need to add a flag to kubeadm now.
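
For reference, here is a minimal sketch of the KubeletExtraArgs route, assuming the v1beta3 kubeadm config layout (the key maps to the kubelet's --serialize-image-pulls flag):

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    # rendered as --serialize-image-pulls=false on the kubelet command line,
    # which lets the kubelet pull images in parallel
    serialize-image-pulls: "false"
```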

@pacoxu
Member Author

pacoxu commented Dec 14, 2022

> Do you think this is the default all users want?

My key point here is to enable it by default, like Rancher and some other Kubernetes installers do.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 14, 2023
@chendave
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 15, 2023
@pacoxu
Member Author

pacoxu commented Mar 27, 2023

kubernetes/kubernetes#114552 was an attempt to change "serialize-image-pulls" from true to false.

Then we added a new feature: kubernetes/enhancements#3713.

In the near future, we may be able to switch to parallel image pulling.

@frederiko
Contributor

frederiko commented May 3, 2023

> I'm +1 for adding a NodeRegistrationOptions field to control this feature. Given that users can currently achieve this by setting KubeletExtraArgs, I'm not sure we need to add a flag to kubeadm now.

@pacoxu Is anyone working on this? I'd like to take a look and contribute if that's okay.

@pacoxu
Member Author

pacoxu commented May 6, 2023

@frederiko the final solution is not decided yet.

Any of these is OK for now. I hope option 1 can happen soon. See more at kubernetes/kubernetes#108405 (comment).

  1. Wait for SIG Node's progress on Kubelet limit of Parallel Image Pulls (enhancements#3673), or for parallel image pulls to become the default behavior.
  2. As Lubomir and Sata pointed out, users can achieve this with NodeRegistrationOptions.KubeletExtraArgs. This is the current status and the suggested way for now.
  3. Add a flag to disable it and enable parallel image pulls by default: this is my proposal, but no agreement has been reached.

@neolit123
Member

Since kubeadm shells out to crictl during config images pull, init, join, upgrade, etc., we could add a kubeadm config option and flag to opt out of parallel pulls.

At least that is my understanding? Kubelet pulls are a separate feature and can be controlled with the kubelet config and flags.

@frederiko
Contributor

> @frederiko the final solution is not decided yet.
>
> Any of these is OK for now. I hope option 1 can happen soon. See more at kubernetes/kubernetes#108405 (comment).
>
> 1. Wait for SIG Node's progress on Kubelet limit of Parallel Image Pulls (enhancements#3673), or for parallel image pulls to become the default behavior.
>
> 2. As Lubomir and Sata pointed out, users can achieve this with NodeRegistrationOptions.KubeletExtraArgs. This is the current status and the suggested way for now.
>
> 3. Add a flag to disable it and enable parallel image pulls by default: this is my proposal, but no agreement has been reached.

I see. I may have misread it. I thought a config option would be added and the decision about leaving it on or off by default would be made at a later point, following whichever direction SIG Node goes (option 3 in that case).

@neolit123 neolit123 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label May 24, 2023
@neolit123 neolit123 added this to the v1.28 milestone May 24, 2023
@pacoxu pacoxu modified the milestones: v1.28, v1.29 Jul 19, 2023
@neolit123 neolit123 modified the milestones: v1.29, v1.30 Nov 1, 2023
@neolit123
Member

neolit123 commented Jan 2, 2024

Looking at the history here again:
kubernetes/kubernetes#114552

I think we should not make kubeadm deploy the kubelet with KubeletConfiguration.serializeImagePulls = false. If the kubelet eventually decides to default to serializeImagePulls = false, all kubeadm users will benefit from that new default behavior. Generally, kubeadm tries to deploy all components with their default configuration. There are a few minor exceptions, of course, like enabling the NodeRestriction admission plugin, bootstrap tokens, and some other things.

Changing defaults is not great, because you break an API contract: users with slow connections expect one pull at a time and might not be happy with parallel pulls.


What we can do in kubeadm is add an API field that controls whether the kubeadm image pulls (via crictl), NOT the kubelet pulls, are parallel.
It can be NodeRegistration.ImagePullSerial for Init|JoinConfiguration (we already have imagePullPolicy there), defaulting to true; no such field will be in ResetConfiguration. kubeadm config images uses InitConfiguration.
UpgradeConfiguration doesn't have NodeRegistration, so there it can be a top-level field.

If people are +1 to the above, we can add this new field to v1beta4.
If you don't like the ImagePullSerial name, please suggest something else.
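
To make this concrete, a rough sketch of how the proposed field could look in a v1beta4 InitConfiguration (field name and placement follow the proposal above and are not a released API; it would affect only kubeadm's own crictl pulls, not the kubelet):

```yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  imagePullPolicy: IfNotPresent
  # proposed field, defaulting to true (serial pulls);
  # setting it to false would make kubeadm pull its images in parallel
  imagePullSerial: false
```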

@neolit123 neolit123 added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Jan 2, 2024
@pacoxu
Member Author

pacoxu commented Jan 3, 2024

On the kubelet side, we may wait for that progress, and it may be a long journey. (I checked with Dragonfly, https://github.com/dragonflyoss/Dragonfly2: the p2p image acceleration system leaves 5% of the maximum IO for other system threads. In containerd and other container runtimes, this is not supported yet.)

+1 for the proposal. Then we may rename this issue to something like:

`kubeadm config images pull` will use the new flag to decide whether to pull in parallel.

For parallel image pulling, the biggest concern is that the disk will be under heavy load if we pull many images at a time.

  • However, during kubeadm init and join/upgrade, the node is already in maintenance mode, and heavy IO is acceptable.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 4, 2024
@neolit123
Member

Looks like we forgot to add the ImagePull* controls to UpgradeConfiguration:

https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1beta4/types.go#L261-L271
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1beta4/types.go#L732-L755

cc @calvin0327

Currently we pull images on upgrade during preflight for "node" and "apply", so we should allow the same config as in "init" and "join". We should finish this for 1.31.

@neolit123 neolit123 removed this from the v1.30 milestone Apr 5, 2024
@neolit123 neolit123 added this to the v1.31 milestone Apr 5, 2024
@pacoxu
Member Author

pacoxu commented Apr 7, 2024

Yes. NodeRegistration.imagePullPolicy is only used in init/join configuration.

/priority important-soon
/remove-priority backlog
Can we tag this important-soon since we plan to finish it in v1.31?

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels Apr 7, 2024
@neolit123
Member

neolit123 commented Apr 8, 2024

> Yes. NodeRegistration.imagePullPolicy is only used in init/join configuration.

We shouldn't add NodeRegistration to the upgrade config structs, but the two options for image pull policy and parallel pulls can go either in:

  • the top-level upgrade config
  • only in the "apply" and "node" configuration (I tend to prefer this option; sketched below)
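
A rough sketch of that second option, with hypothetical "apply" and "node" sub-sections in UpgradeConfiguration (names and placement are illustrative, not a released API):

```yaml
apiVersion: kubeadm.k8s.io/v1beta4
kind: UpgradeConfiguration
apply:
  # would control image pulls only for "kubeadm upgrade apply"
  imagePullPolicy: IfNotPresent
  imagePullSerial: false
node:
  # and separately for "kubeadm upgrade node"
  imagePullPolicy: IfNotPresent
  imagePullSerial: false
```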

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 8, 2024
@pacoxu pacoxu removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 8, 2024