Timed out after 180.001s. waiting for cluster deletion timed out #11162

Open
adilGhaffarDev opened this issue Sep 10, 2024 · 6 comments
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/flake: Categorizes issue or PR as related to a flaky test.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@adilGhaffarDev
Contributor

adilGhaffarDev commented Sep 10, 2024

Which jobs are flaking?

  • periodic-cluster-api-e2e-mink8s-release-1-8
  • periodic-cluster-api-e2e-main
  • periodic-cluster-api-e2e-latestk8s-main
  • periodic-cluster-api-e2e-release-1-8

Which tests are flaking?

Since when has it been flaking?

There were a few flakes before, but they have become more frequent since 6-9-2024.

Testgrid link

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/periodic-cluster-api-e2e-release-1-8/1833344003316125696

https://storage.googleapis.com/k8s-triage/index.html?job=.*-cluster-api-.*&xjob=.*-provider-.*%7C.*-operator-.*#8899ccb732f9f0e048cb

Reason for failure (if possible)

MachinePool deletion is stuck

Anything else we need to know?

No response

Label(s) to be applied

/kind flake
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot added the kind/flake, needs-priority, and needs-triage labels on Sep 10, 2024
@adilGhaffarDev
Contributor Author

cc @kubernetes-sigs/cluster-api-release-team

@sbueringer
Member

I took an initial look: the cluster deletion is stuck because MachinePools are stuck in deletion.

cc @jackfrancis @willie-yao @Jont828
(feel free to cc other folks who are interested in helping maintain MachinePools)

@jackfrancis
Contributor

@sbueringer thx for triaging

@fabriziopandini added the help wanted, priority/important-soon, and triage/accepted labels on Sep 11, 2024
@k8s-ci-robot removed the needs-priority and needs-triage labels on Sep 11, 2024
@sbueringer
Member

sbueringer commented Sep 12, 2024

Given how often this test fails and how long it has been failing, I think we should consider removing MachinePools from the affected tests.

There's a high chance we are missing other issues in the rest of Cluster API because of this flake.

@willie-yao
Contributor

I can take a look at this if there are no other more urgent MP-related issues @sbueringer

@sbueringer
Member

sbueringer commented Sep 17, 2024

I can't prioritize MP issues for MP maintainers. But from a CI stability perspective this one is really important.

I think we either fix it soon, or we have to disable MPs for all affected tests. I just don't want to keep running the risk that this flake is hiding other issues.


6 participants