Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NS Deletion hardening #156

Merged
merged 1 commit into from
Jun 13, 2024
Merged

NS Deletion hardening #156

merged 1 commit into from
Jun 13, 2024

Conversation

philbrookes
Copy link
Collaborator

@philbrookes philbrookes commented Jun 5, 2024

closes: #106

Moved some logic around as we were pulling the managed zone from the API in a few places, instead I've just pulled it in the main reconcile and passed it around. This also meant that I could verify that the MZ and it's secret existed before moving on to setting finalizers and owner references.

In other words, if a DNS Record is created and it's managed zone (or that managed zones secret) has never existed, it will not get the cleanup finalizer, this makes more sense to me, as this record will never have sent anything to a DNS Provider.

I had to modify a test to reflect this more accurately.

New watches:

  • Managed zone events triggers DNS Record reconcile
  • Secret triggers relevant (if any) managed zone reconcile
  • Secret triggers relevant (if any) DNS Record reconcile.

New metric:

  • dns_provider_secret_absent:
    • 1 when absent
    • 0 when present.

internal/controller/dnsrecord_controller.go Outdated Show resolved Hide resolved
internal/controller/dnsrecord_controller.go Outdated Show resolved Hide resolved
internal/controller/dnsrecord_controller.go Outdated Show resolved Hide resolved
@mikenairn
Copy link
Member

Changes here look fine, but does this cover everything that was being asked for from #106?

The change to setting an ownerRef instead of using the watch seems like a good one, but doesn't really change the behaviour in any way, the existing Watch was doing the same thing to ensure the record reconcile was triggered on MZ changes. What about the MZ secrets if they are modified or deleted? Do they not need to be taken into account somewhere?

@philbrookes
Copy link
Collaborator Author

Changes here look fine, but does this cover everything that was being asked for from #106?

The change to setting an ownerRef instead of using the watch seems like a good one, but doesn't really change the behaviour in any way, the existing Watch was doing the same thing to ensure the record reconcile was triggered on MZ changes. What about the MZ secrets if they are modified or deleted? Do they not need to be taken into account somewhere?

hmm, yeah, that seems out of scope for the title of that issue, but it is mentioned in the body alright and I'd overlooked it. I'll make some changes to cover the extra areas.

@philbrookes philbrookes force-pushed the gh-106 branch 3 times, most recently from 58d2e7f to 41ec695 Compare June 10, 2024 12:58
@philbrookes philbrookes force-pushed the gh-106 branch 2 times, most recently from e896ab7 to 5c62c6f Compare June 12, 2024 09:33
internal/controller/dnsrecord_controller.go Outdated Show resolved Hide resolved
internal/controller/managedzone_controller.go Outdated Show resolved Hide resolved
internal/controller/managedzone_controller.go Outdated Show resolved Hide resolved
Copy link
Member

@mikenairn mikenairn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. Not sure we really need the ownership stuff since it didn't work as required, but probably not doing any harm.

}
err = r.Client.Update(ctx, dnsRecord)
return ctrl.Result{}, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably not gaining anything by setting this relationship up since it didn't work in the way that we/i hoped (Doesn't trigger reconciles of children when the owner changes).

@philbrookes philbrookes added this pull request to the merge queue Jun 13, 2024
Merged via the queue into Kuadrant:main with commit 67370a6 Jun 13, 2024
11 checks passed
},
}
err = r.Get(ctx, client.ObjectKeyFromObject(managedZone), managedZone, &client.GetOptions{})
if err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the managed zone is not found when the dns record is being deleted? The finalizer on the dnsrecord will never be removed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the intention, that means that DNS Records have been created in the DNS Provider and have not been cleaned up.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you might be right. I know I do not know. But that is causing an issue anyway. Some tests are failing because the dns record is never gone after having deleted it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philbrookes Is this the intention if a user deletes a ManagedZone first before deleting the DNSRecords ? 🤔 This is how the integration tests are in Kuadrant Operator. In the tests, the ManagedZone has completed their deletion, but DNSRecords remained, and prevented the test NS deletion.

Seems like this introduces a required order of deletion for the user. So maybe the ManagedZone deletion should wait until the assoicated DNSRecords are deleted ? 🤔

If this is the intended behaviour anyway, I've a fix to update the Kuadrant Operator DNSPolicy tests as part of a PR Kuadrant/kuadrant-operator@c70b5e1. Just want to check this is the expected flow being merging

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the intention, if the DNS Record is deleted after the managed zone, the DNS Operator cannot remove the records from the DNS Provider, in order to highlight this to the user/admin the DNS Record never has it's finalizer removed.

See the original issue here: #106

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks, I'll merge the PR containing the changes to the DNS Policy tests to account for this 👍

KevFan added a commit to KevFan/kuadrant-operator that referenced this pull request Jun 13, 2024
…one + timeouts

Since Kuadrant/dns-operator#156 was merged, dnsrecords with finizalers are left behind as ManagedZone was deleted causing the test namespaces to never be fully deleted.

Additionally use SpecContext and add timeouts for context so problematic tests terminate earlier
KevFan added a commit to KevFan/kuadrant-operator that referenced this pull request Jun 13, 2024
…one + timeouts

Since Kuadrant/dns-operator#156 was merged, dnsrecords with finizalers are left behind as ManagedZone was deleted causing the test namespaces to never be fully deleted.

Additionally use SpecContext and add timeouts for context so problematic tests terminate earlier
KevFan added a commit to Kuadrant/kuadrant-operator that referenced this pull request Jun 14, 2024
* make: concurrent tests flags

* refactor: concurrent bare k8s tests

* refactor: concurrent gateway api k8s tests

* refactor: concurrent istio tests

* refactor: test packages for base concurrent integration tests

* refactor: do not wait kuadrant to be ready for integration tests

* fix: sort policy by kind, creation, condition for target status

* refactor: separate out serial tests that can affect other tests

* harden: RLP integration test updates & sort by PolicyByTargetRefKindAndAcceptedStatus tests

* refactor: rename ratelimit test package to ratelimitpolicy

* fix: wait for dnsrecords to finish deleting before deleting managed zone + timeouts

Since Kuadrant/dns-operator#156 was merged, dnsrecords with finizalers are left behind as ManagedZone was deleted causing the test namespaces to never be fully deleted.

Additionally use SpecContext and add timeouts for context so problematic tests terminate earlier
dlaw4608 pushed a commit to dlaw4608/kuadrant-operator that referenced this pull request Jun 17, 2024
* make: concurrent tests flags

* refactor: concurrent bare k8s tests

* refactor: concurrent gateway api k8s tests

* refactor: concurrent istio tests

* refactor: test packages for base concurrent integration tests

* refactor: do not wait kuadrant to be ready for integration tests

* fix: sort policy by kind, creation, condition for target status

* refactor: separate out serial tests that can affect other tests

* harden: RLP integration test updates & sort by PolicyByTargetRefKindAndAcceptedStatus tests

* refactor: rename ratelimit test package to ratelimitpolicy

* fix: wait for dnsrecords to finish deleting before deleting managed zone + timeouts

Since Kuadrant/dns-operator#156 was merged, dnsrecords with finizalers are left behind as ManagedZone was deleted causing the test namespaces to never be fully deleted.

Additionally use SpecContext and add timeouts for context so problematic tests terminate earlier
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

harden DNSRecord clean up when namespaces deleted
4 participants