
can't use insecure registry when using kubectl apply #9371

Closed
foxracle opened this issue Oct 9, 2022 · 8 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@foxracle

foxracle commented Oct 9, 2022

Environment:

  • Cloud provider or hardware configuration:
    bare-metal

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 5.4.212-1.el7.elrepo.x86_64 x86_64
    CentOS Linux release 7.8.2003 (Core)

  • Version of Ansible (ansible --version):
    ansible [core 2.12.5]

  • Version of Python (python --version):
    Python 3.10.6

  • Kubespray version (commit) (git rev-parse --short HEAD):
    425e202 (release-2.20)

Network plugin used:
calico

Command used to invoke ansible:
ansible-playbook -i inventory/sample/hosts.yaml --become --become-user=root cluster.yml

We deployed Kubernetes with containerd, an offline repo, and an insecure registry. The cluster itself is set up successfully, but when we use kubectl apply to test some simple Kubernetes manifests, image pulls fail.
The config of the offline private container image registry is:
registry_host: "10.25.x.x"

The output of kubectl describe pod is:
Failed to pull image "10.25.x.x/library/centos:centos7": rpc error: code = Unknown desc = failed to pull and unpack image "10.25.x.x/library/centos:centos7": failed to resolve reference "10.25.x.x/library/centos:centos7": failed to do request: Head "https://10.25.x.x/v2/library/centos/manifests/centos7": x509: certificate relies on legacy Common Name field, use SANs instead
Error: ErrImagePull
Error: ImagePullBackOff

The config of the insecure registry is:
containerd_insecure_registries:
"10.25.x.x:80": "http://10.25.x.x:80"

containerd_registry_auth:

  • registry: 10.25.x.x:80
    username: admin
    password: xxxx

When I test the following commands on one of the master nodes, I can see the warning message WARN[0015] skipping verifying HTTPS certs for "10.25.x.x", and everything works as expected: the pull succeeds and login succeeds.
/usr/local/bin/nerdctl pull 10.25.x.x/library/centos:centos7
/usr/local/bin/nerdctl login 10.25.x.x
/usr/local/bin/nerdctl login 10.25.x.x:443

My question is: what is the difference between how kubectl apply pulls images and how the cluster setup process (including nerdctl directly) pulls them, and how can this issue be fixed?

@foxracle added the kind/bug label on Oct 9, 2022
@titaneric
Contributor

Hi, you may provide your own imagePullSecrets for your private registry, as mentioned in the k8s docs. You could also register this newly created imagePullSecret in your serviceAccount (k8s docs).

You may also need to check your private registry's certificate location and validity (harbor registry, docker client).
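
For example, a rough sketch of that approach (the secret name harbor-cred and the credentials below are placeholders, not values from this thread):

kubectl create secret docker-registry harbor-cred \
  --docker-server=10.25.x.x \
  --docker-username=admin \
  --docker-password=<your-password>

# pod manifest referencing the secret
apiVersion: v1
kind: Pod
metadata:
  name: centos-test
spec:
  imagePullSecrets:
  - name: harbor-cred
  containers:
  - name: centos
    image: 10.25.x.x/library/centos:centos7
    command: ["sleep", "infinity"]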

@foxracle
Author


Thank you for the reply.
All the images needed while setting up the k8s cluster with kubespray, and the images we tested with kubectl apply, are in the same harbor registry. They share the same auth and [insecure_skip_verify=true] configuration provided in the yaml via containerd_registry_auth and containerd_insecure_registries. If we have to provide imagePullSecrets for kubectl apply, why does everything work during cluster setup without providing imagePullSecrets?
We think it is a problem with the insecure_skip_verify flag, but we do not know how to fix it. We have already checked this issue; they are not the same.

It seems that the insecure_skip_verify flag works during the cluster setup process and on the command line with nerdctl, but not with kubectl apply. This is very strange.

By the way, with regard to the error [x509: certificate relies on legacy Common Name field, use SANs instead], I have reissued the certificate for harbor's IP; now the error is: [failed to do request: Head "https://10.25.6.13/v2/library/centos/manifests/centos7": x509: certificate signed by unknown authority].

The problem is that with [insecure_skip_verify=true], the images should be pulled with auth, without validating the SSL certificates.

@foxracle
Author

foxracle commented Oct 11, 2022

We fixed the problem. The root cause is that the [insecure_skip_verify=true] config does not take effect.

Here is the config for insecure registries for containerd.

containerd_insecure_registries:
  "10.25.x.y": "https://10.25.x.y"

The rendered /etc/containerd/config.toml contains:

[plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."10.25.x.y"]
          endpoint = ["https://10.25.x.y"]
        [plugins."io.containerd.grpc.v1.cri".registry.configs."https://10.25.x.y".tls]
          insecure_skip_verify = true

but it needs to be:

        [plugins."io.containerd.grpc.v1.cri".registry.configs."10.25.x.y".tls]
          insecure_skip_verify = true

I fixed the problem by modifying the template file roles/container-engine/containerd/templates/config.toml.j2:

diff --git a/roles/container-engine/containerd/templates/config.toml.j2 b/roles/container-engine/containerd/templates/config.toml.j2
index 7ffe3704..7ea61245 100644
--- a/roles/container-engine/containerd/templates/config.toml.j2
+++ b/roles/container-engine/containerd/templates/config.toml.j2
@@ -56,7 +56,7 @@ oom_score = {{ containerd_oom_score }}
           endpoint = ["{{ ([ addr ] | flatten ) | join('","') }}"]
 {% endfor %}
 {% for addr in containerd_insecure_registries.values() | flatten | unique %}
-        [plugins."io.containerd.grpc.v1.cri".registry.configs."{{ addr }}".tls]
+        [plugins."io.containerd.grpc.v1.cri".registry.configs."{{ addr | urlsplit('netloc') }}".tls]
           insecure_skip_verify = true
 {% endfor %}
 {% endif %}

It seems that #9207 does not fix the problem with insecure registries for containerd.
Here is the explanation, according to the official documentation for /etc/containerd/config.toml.

Registry Endpoint
The endpoint is a list that can contain multiple image registry URLs split by commas, so the values of containerd_insecure_registries should begin with http or https.
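
For illustration (the values here are just examples), Ansible's urlsplit filter used in the patch above keeps only the host[:port] portion of such a URL:

{{ "https://10.25.x.y" | urlsplit('netloc') }}    ->  10.25.x.y
{{ "http://10.25.x.x:80" | urlsplit('netloc') }}  ->  10.25.x.x:80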

insecure_skip_verify

# explicitly use v2 config format
version = 2

# The registry host has to be a domain name or IP. Port number is also
# needed if the default HTTPS or HTTP port is not used.
[plugins."io.containerd.grpc.v1.cri".registry.configs."my.custom.registry".tls]
    ca_file   = "ca.pem"
    cert_file = "cert.pem"
    key_file  = "key.pem"

For a registry endpoint located at https://my.custom.registry, the registry.configs key should specify the registry host my.custom.registry, not the whole endpoint https://my.custom.registry:

[plugins."io.containerd.grpc.v1.cri".registry.configs."my.custom.registry".tls]
  insecure_skip_verify = true
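
After re-rendering the config (or editing /etc/containerd/config.toml by hand) and restarting containerd, the pull can be checked directly on a node; a quick sketch, assuming crictl is pointed at the containerd socket and the registry auth is already rendered into config.toml:

systemctl restart containerd
crictl pull 10.25.x.y/library/centos:centos7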

@titaneric
Contributor

Great work! Perhaps you can submit your patch as a PR.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Jan 13, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 12, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) on Mar 14, 2023
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
