Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert kubelet server tls bootstrap #1248

Merged
merged 2 commits into from
Jul 15, 2020
Merged

Revert kubelet server tls bootstrap #1248

merged 2 commits into from
Jul 15, 2020

Conversation

jenting
Copy link

@jenting jenting commented Jul 13, 2020

Why is this PR needed?

The kucero will not officially release at CaaSv4, so remove it from Kubernetes 1.17.

The kucero must existed before kubelet serverTLSBootstrap: true, therefore, we revert the code
to configures serverTLSBootstrap: true.

We'll enable the kubelet serverTLSBootstrap: true in the next coming release.

What does this PR do?

Revert the code for enabling kubelet serverTLSBootstrap: true, related to #1152.

Anything else a reviewer needs to know?

It needs to backport to v5 branch release-caasp-5.0.0.

Info for QA

This is info for QA so that they can validate this. This is mandatory if this PR fixes a bug.
If this is a new feature, a good description in "What does this PR do" may be enough.

Related info

Info that can be relevant for QA:

  • link to other PRs that should be merged together
  • link to packages that should be released together
  • upstream issues

Status BEFORE applying the patch

The kubelet configuration /var/lib/kubelet/config.yaml would have serverTLSBootstrap: true in fresh install or upgraded cluster.

Status AFTER applying the patch

The kubelet configuration /var/lib/kubelet/config.yaml would not have serverTLSBootstrap: true in fresh install or upgraded cluster.

Docs

SUSE/doc-caasp#912

Merge restrictions

(Please do not edit this)

We are in v4-maintenance phase, so we will restrict what can be merged to prevent unexpected surprises:

What can be merged (merge criteria):
    2 approvals:
        1 developer: code is fine
        1 QA: QA is fine
    there is a PR for updating documentation (or a statement that this is not needed)

JenTing Hsiao added 2 commits July 13, 2020 11:09
Signed-off-by: JenTing Hsiao <jenting.hsiao@suse.com>
the kucero must deployed before node upgrade due to there
must have a kubelet server CSR signer which is kucero.

however, due to kucero won't no more release on CaaSPv4,
therefore, the kucero won't available on CaaSPv4 official release.

Disable kubelet serverTLSBootstrap: true, we'll enable it in the next
coming CaaSP release.

Signed-off-by: JenTing Hsiao <jenting.hsiao@suse.com>
@jenting jenting self-assigned this Jul 13, 2020
@jenting jenting requested review from c3y1huang, innobead and cclhsu and removed request for c3y1huang July 13, 2020 05:02
@@ -132,7 +132,6 @@ var (
Dex: &AddonVersion{"2.16.0", 6},
Gangway: &AddonVersion{"3.1.0-rev4", 5},
MetricsServer: &AddonVersion{"0.3.6", 1},
Kucero: &AddonVersion{"1.1.1", 0},
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove from Kubernets version 1.17.

@jenting jenting marked this pull request as ready for review July 13, 2020 05:04
Copy link
Contributor

@innobead innobead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@c3y1huang @cclhsu This is the partial revert about kubelet certificate management. You can check the previous merged PR.

cc @maximenoel8

@jenting
Copy link
Author

jenting commented Jul 14, 2020

@evrardjp Could u please take a look, it need to backport to v5 otherwise the upgrade scenario would have metrics-server temporarily in crash loopback state unless the kucero addon installed later in "skuba addon upgrade apply", thx.

Copy link
Contributor

@evrardjp evrardjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand that we shouldn't have this for 1.17.x, and I indeed agree on the fact we need to remove kucero from 1.17 list, I am not sure to understand what you are saying with serverTLSBootstrap.

If we backport this, serverTLSBootstrap will not be set for 5.0.0, which is fine for upgrades I guess, but I am not sure if it's a good idea for greenfields. Do we intend to have this feature as opt-in ?

What does "We'll enable the kubelet serverTLSBootstrap: true in the next coming release." mean here? Do we have an update/upgrade path?

@jenting
Copy link
Author

jenting commented Jul 14, 2020

If we backport this, serverTLSBootstrap will not be set for 5.0.0, which is fine for upgrades I guess, but I am not sure if it's a good idea for greenfields. Do we intend to have this feature as opt-in ?

What does "We'll enable the kubelet serverTLSBootstrap: true in the next coming release." mean here? Do we have an update/upgrade path?

For greenfield, it's fine to enable the kubelet "serverTLSBootstrap: true" since kucero addon will be deployed when node bootstrap so the kubelet server CSR signer kucero is in cluster.

However, to make it in general behavior for greenfield and brownfield cluster, I'd suggest enable this in the skuba next release (along with kubernetes 1.19).

@evrardjp
Copy link
Contributor

I haven't tested, but it all makes sense. I would love seeing some extra eyes on this.

@dannysauer
Copy link
Contributor

dannysauer commented Jul 14, 2020

For greenfield, it's fine to enable the kubelet "serverTLSBootstrap: true" since kucero addon will be deployed when node bootstrap so the kubelet server CSR signer kucero is in cluster.

I'm not sure if I understand this. It looks like the other parts necessary are in place already; why would we not want to enable TLS boostrapping independent of having kucero manage cert renewal? Seems like that's generally a good idea, and a bootstrap failure due to an expired cluster CA cert would be acceptable (since that cert will need updated anyway due to all the other stuff that will also be failing). Am I missing something important? :)

@jenting
Copy link
Author

jenting commented Jul 15, 2020

For greenfield, it's fine to enable the kubelet "serverTLSBootstrap: true" since kucero addon will be deployed when node bootstrap so the kubelet server CSR signer kucero is in cluster.

I'm not sure if I understand this. It looks like the other parts necessary are in place already; why would we not want to enable TLS boostrapping independent of having kucero manage cert renewal? Seems like that's generally a good idea, and a bootstrap failure due to an expired cluster CA cert would be acceptable (since that cert will need updated anyway due to all the other stuff that will also be failing). Am I missing something important? :)

The problem is when the admin upgrades from 1.17 -> 1.18, the admin would do

  • skuba node upgrade apply (loop all control plane nodes and worker nodes)
  • skuba addon upgrade apply

then at the first control plane node upgrade, usually the admin checks all the pods in the stable state, however, since the first control node sets serverTLSBootstrap: true which means it sends out kubelet server CSR in the cluster but there is no CSR signer right now (kucero installed later in skuba addon upgrade apply).

therefore, the metrics-server would be crash loopback state because the kubelet server does not have it's server certificate right now (it seems like the kubelet does not honor in-disk kubelet.crt which is generated by skuba if serverTLSBootstrap: true.

if the admin is willing to accept the temporarily metrics-server unstable state during cluster upgrade, I'm fine to close this PR; otherwise, this PR should proceed because it solves the kubelet server certificate will expire after 1 year.

Copy link
Contributor

@dannysauer dannysauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation, @jenting. This all looks fine to me based on the other commit then. :)

@jenting jenting merged commit 8439525 into SUSE:master Jul 15, 2020
@jenting jenting deleted the revert-kubelet-serverTLSBootstrap branch July 15, 2020 02:58
innobead pushed a commit that referenced this pull request Jul 16, 2020
when the admin upgrades from 1.17 -> 1.18, the admin would do:
- skuba node upgrade apply (loop all control plane nodes and worker nodes)
- skuba addon upgrade apply

then at the first control plane node upgrade, usually the admin checks all the pods in the stable state, however, since the first control node sets `serverTLSBootstrap: true` which means it sends out kubelet server CSR in the cluster but there is no CSR signer right now (kucero installed later in `skuba addon upgrade apply`).

therefore, the metrics-server would be crash loopback state because the kubelet server does not have it's server certificate right now (it seems like the kubelet does not honor in-disk kubelet.crt which is generated by skuba if `serverTLSBootstrap: true`.

Signed-off-by: JenTing Hsiao <jenting.hsiao@suse.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants