cloud-init boothook logic broken with cloud-init 24.2 #5115

Open
SriRamanujam opened this issue Sep 5, 2024 · 1 comment · May be fixed by #5116
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@SriRamanujam

/kind bug

What steps did you take and what happened:

Ubuntu Noble recently upgraded cloud-init from 24.1.3 to 24.2. For some reason, this has broken systemctl restart cloud-init, as called by the CAPA boothook. The command no longer actually restarts cloud-init, so /etc/secret-userdata.txt never gets picked up and the Kubernetes components never come up.

What did you expect to happen:

Kubernetes nodes should come up with no problems.

Anything else you would like to add:

I've been digging into this for the past couple of days; here are some miscellaneous notes:

  • cloud-init clean --reboot, run post-hoc, works as expected. So there's nothing wrong with /etc/secret-userdata.txt or the cloud-config.txt itself.
  • I can't see an obvious cloud-init change that would result in this changed behavior. It might be a direct regression upstream, but I'm not familiar enough with cloud-init internals to dig further. Plus, we're not really set up to git bisect cloud-init :(
  • This is a separate issue from "Machine with cloud-init 23.3.0 or newer fails to join cluster" (#4745). We are already directly patching features.py, which has been sufficient to keep things working up through 24.1.
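
For anyone triaging nodes, here is a minimal sketch of how the version check and the post-hoc workaround from the notes above could be scripted. The helper names are hypothetical and are not part of CAPA or cloud-init. The 24.2 threshold comes only from this report, and the shape of the `cloud-init --version` output is an assumption based on what Ubuntu prints.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of CAPA or cloud-init.

# Succeeds when version $1 is at least 24.2, the first version reported
# broken in this issue. Relies on GNU sort's -V (version sort).
is_suspect_version() {
    lowest=$(printf '%s\n%s\n' "24.2" "$1" | sort -V | head -n 1)
    [ "$lowest" = "24.2" ]
}

# Extract the bare upstream version from `cloud-init --version` output,
# assumed to look like "/usr/bin/cloud-init 24.2-0ubuntu1~24.04.2".
parse_cloud_init_version() {
    printf '%s\n' "$1" | awk '{print $NF}' | cut -d- -f1
}

# Possible use on a node (the workaround is the command from the notes above):
#   v=$(parse_cloud_init_version "$(cloud-init --version 2>&1)")
#   is_suspect_version "$v" && sudo cloud-init clean --reboot
```

This only identifies nodes running a suspect version and reruns cloud-init from scratch. It does not address why systemctl restart cloud-init stopped working.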

Environment:

  • Cluster-api-provider-aws version: 2.4.2
  • Kubernetes version: (use kubectl version): 1.27, 1.28, 1.29 (unrelated to kube version)
  • OS (e.g. from /etc/os-release): Ubuntu 24.04 Noble
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 5, 2024
@faiq faiq linked a pull request Sep 5, 2024 that will close this issue
@dlipovetsky
Contributor

/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Sep 17, 2024