Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: delete injector pod immediately #2615

Closed
wants to merge 1 commit into from

Conversation

lucasrod16
Copy link
Contributor

Description

This PR adds back the behavior of deleting the injector pod immediately in StopInjectionMadness with no grace period before deletion. This was removed in 2553 but it has lead to failures in our upgrade test because there are times when the injector pod from the initial zarf init in the test still exists in a terminating state, and it fails when we try to create the injector pod on the second "upgrade" zarf init because the pod already exists.

https://github.com/defenseunicorns/zarf/actions/runs/9470798079/job/26092606648?pr=2612#step:10:2776

@lucasrod16 lucasrod16 requested review from dgershman and a team as code owners June 11, 2024 22:07
Copy link

netlify bot commented Jun 11, 2024

Deploy Preview for zarf-docs canceled.

Name Link
🔨 Latest commit 8f4eb13
🔍 Latest deploy log https://app.netlify.com/sites/zarf-docs/deploys/6668ca8b4c260100080fcb12

@lucasrod16 lucasrod16 marked this pull request as draft June 11, 2024 22:22
@lucasrod16
Copy link
Contributor Author

The error is still occurring in CI. Need to look into this more

https://github.com/defenseunicorns/zarf/actions/runs/9473387226/job/26100983827?pr=2615#step:10:2786

@phillebaba
Copy link
Member

So there was a place where grace period actually matters.... 😞 A bit annoying that the fix was applied to all pod deletes forcing us to guess which one needed it.

The other issue looks like it is related to the health checking that was done when port forwarding was done. This e2e test seems to also rely on the fact that it will wait for Pods to be ready.....

@phillebaba
Copy link
Member

The problem is that the tunnel connection doubles as a health check.

https://github.com/defenseunicorns/zarf/blob/83da6d2d0ab3a8e2e4df11dab3dd37097cb90cff/src/pkg/cluster/injector.go#L141-L144

This was fine before because listing the pods in tunnel connect would also wait for the pods to be ready. Now instead we give three tries to connect to the tunnel and thats it. All that needs to be done here is to use the new wait logic before we test to connect to the tunnel.

@lucasrod16
Copy link
Contributor Author

Closing in favor of #2620

@lucasrod16 lucasrod16 closed this Jun 12, 2024
@lucasrod16 lucasrod16 deleted the fix/injector-pod-deletion branch June 12, 2024 20:34
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants