Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Pod Volume Backup resource/controller to the Kubebuilder V3 framework #4436

Conversation

codegold79
Copy link
Contributor

Summary of change

Convert Pod Volume Backup resource/controller to the Kubebuilder V3 framework. Write Ginkgo tests.

Fixes #3454

Please indicate you've done the following:

@codegold79 codegold79 added the kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes label Dec 9, 2021
@codegold79 codegold79 requested review from zubron, blackpiglet and reasonerjt and removed request for jenting December 9, 2021 01:48
@reasonerjt reasonerjt added this to the v1.8.0 milestone Dec 9, 2021
c.processBackupFunc = c.processBackup

podVolumeBackupInformer.Informer().AddEventHandler(
cache.ResourceEventHandlerFuncs{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In previous version, controller only care about Create and Update event of Pod Volume Backup.
Can the new version controller still keep the event filter function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I don't understand your question, "Can the new version controller still keep the event filter function?"

Did you want me to keep the podVolumeBackupInformer.Informer().AddEventHandler(...) function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Event handler function filters out the Delete event.
But this may be a minor change. Even Delete event enters reconcile loop, it should not do anything harmful.

Copy link
Contributor

@blackpiglet blackpiglet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A common comment is to remove the error log in reconcile function's error handling logic, because a new logger is already added into controller manager to log the return error. You can reference to #4381

@codegold79
Copy link
Contributor Author

codegold79 commented Dec 14, 2021

Based on @blackpiglet's findings in a separate PR, I should look and ensure the new context from kubebuilder is being used and not the old one.

Update: I believe I should follow some of what was done for removing context in this commit: blackpiglet@98444be

@reasonerjt
Copy link
Contributor

As we are very close to FC of v1.8, moving this out of v1.8 milestone.

@reasonerjt reasonerjt modified the milestones: v1.8.0, 1.9.0 Dec 21, 2021
@github-actions github-actions bot added the Dependencies Pull requests that update a dependency file label Jan 12, 2022
@codegold79 codegold79 force-pushed the 3454-convert-pod-volume-backup-controller-to-kubebuilder-v3-framework branch 2 times, most recently from 6a01814 to 6868dba Compare January 15, 2022 00:51
@github-actions github-actions bot removed the Dependencies Pull requests that update a dependency file label Jan 15, 2022
@codegold79 codegold79 force-pushed the 3454-convert-pod-volume-backup-controller-to-kubebuilder-v3-framework branch from 6868dba to 4d2e3bf Compare January 22, 2022 00:57
@codegold79 codegold79 force-pushed the 3454-convert-pod-volume-backup-controller-to-kubebuilder-v3-framework branch 3 times, most recently from 6091e53 to 4d2e3bf Compare January 22, 2022 01:48
@github-actions github-actions bot added the Dependencies Pull requests that update a dependency file label Jan 24, 2022
Signed-off-by: F. Gold <fgold@vmware.com>
Signed-off-by: F. Gold <fgold@vmware.com>
@codegold79 codegold79 force-pushed the 3454-convert-pod-volume-backup-controller-to-kubebuilder-v3-framework branch from 45213bb to 48827d6 Compare January 27, 2022 04:29
@netlify
Copy link

netlify bot commented Jan 27, 2022

✔️ Deploy Preview for fervent-minsky-ed95cc canceled.

🔨 Explore the source changes: ec23f3b

🔍 Inspect the deploy log: https://app.netlify.com/sites/fervent-minsky-ed95cc/deploys/61f4a1b6aea0e50007ba0c17

Signed-off-by: F. Gold <fgold@vmware.com>
Signed-off-by: F. Gold <fgold@vmware.com>
@github-actions github-actions bot removed the Dependencies Pull requests that update a dependency file label Jan 29, 2022
@codegold79
Copy link
Contributor Author

codegold79 commented Jan 29, 2022

Although Ginkgo Basic e2e tests pass (the e2e tests in this PR's checks), when I do a manual test to backup a stateful pod, I get this error:

time="2022-01-29T02:11:45Z" level=error msg="updating PodVolumeBackup resource" backup=velero/j28-c controller=podvolumebackup error="podvolumebackups.velero.io \"j28-c-hkp55\" not found" logSource="pkg/controller/pod_volume_backup_controller.go:104" podvolumebackup=velero/j28-c-hkp55

This is not the first time I've seen this problem. I fixed this error in the past by not using patch helper, but rather r.Client.Update(). However, using r.Client.Update() causes the Ginkgo Basic e2e tests to fail. Because r.Client.Update() caused the Ginkgo Basic e2e tests to fail, I put up this commit. And now the automated tests pass, but manual tests fail.

In one situation (Ginkgo Basic e2e tests), patch helper is able to update PVB status, but unable to update PVB status during manual testing. In another situation (manual backup), r.Client.Update() is able to update PVB status, but unable to during Ginkgo Basic e2e tests. This problem has me extremely stumped.

Despite passing the automated Ginkgo Basic e2e tests, this PR is NOT ready for review due to inability to pass manual e2e testing on a stateful pod.

UPDATE Feb. 7, 2022:
After much searching, I found out why manual tests on my machine are failing. It is because I used Velero client v1.7.1. This version uses an older controller-gen v0.3.0 version than what is used by automated e2e tests. Automated e2e tests use v0.7.0.

My only warning is, when this PR is merged, the client needs to be updated to use the upgraded controller-gen v0.7.0, otherwise backing using Restic will cause Velero backup to hang forever.

But we can't upgrade the client sooner because the older controller-gen v0.3.0 works with the currently main code (PVB not converted to use Kubebuilder). ⚠️ Using a new controller-gen v0.7.0 with old the PVB controller (pre-Kubebuilder framework) will not work. ⚠️

More details are in the Kubernetes Slack channel.

This PR is now ready for review @blackpiglet, @dsu-igeek, and @reasonerjt.

/cc @danfengliu

Signed-off-by: F. Gold <fgold@vmware.com>
@codegold79 codegold79 removed the kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes label Feb 7, 2022
@codegold79 codegold79 requested review from blackpiglet and removed request for zubron February 7, 2022 19:59
c.enqueue(obj)
// PodVolumeBackupReconciler reconciles a PodVolumeBackup object
type PodVolumeBackupReconciler struct {
Scheme *runtime.Scheme
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the Scheme used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Kubebuilder uses it (but I don't know what for) because it automatically adds Scheme as part of scaffolding a basic reconciler struct: https://book.kubebuilder.io/cronjob-tutorial/controller-overview.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert Pod Volume Backup resource/controller to the Kubebuilder framework
5 participants