-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add resource limits to restic init container #1677
Conversation
pkg/restore/restic_restore_action.go
Outdated
@@ -107,6 +108,12 @@ func (a *ResticRestoreAction) Execute(input *velero.RestoreItemActionExecuteInpu | |||
}, | |||
}, | |||
}, | |||
Resources: corev1.ResourceRequirements{ | |||
Limits: corev1.ResourceList{ | |||
corev1.ResourceCPU: resource.MustParse("100m"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider making these optionally configurable via the existing ConfigMap for this plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about that, but seeing how valid quantities in the wrong fields can break with #1678 gives me pause on the failure mode.
For example, if we allow the user to specify the limits and/or requests and they put cpu: 128Mi
, it's a valid quantity string, but not for CPU. MustParse
and ParseQuantity
only seem to care about the string. So far I don't know that there's a way to find out about that until the pod fails to get scheduled, which I think would cause the Velero Restore to get stuck InProgress.
I'll do some more digging in the Kubernetes code to see if there's a helper function that we may have missed.
Added 2 test cases and the necessary helpers for creating initContainers. Let me know if this approach makes sense. |
Approach LGTM. Looking back at the original issue, looks like we need to add requests and limits (see the error message). Is the |
I need to actually run this against a cluster to be sure, was mostly making sure I got compilation working first. Do you have any more thoughts about the failure mode when allowing users to configure the values? |
Let me think about this a little more. |
Per other PR, I think if we (a) validate that the request <= the limit for both CPU and mem, and (b) put some generous upper bounds on each (e.g. 1 CPU, 128Mi mem?), then we can allow user-configuration. If their values don't conform to these validations, we can log an error and proceed with our own defaults |
lgtm so far 👍 |
@nrb should I wait for this before adding flags to install for restic requests/limits (#94), since I think you wanted to move around the |
@prydonius That'll work - I'll get this out of draft today. |
Fixes vmware-tanzu#1201 Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one nit but otherwise LGTM. Have you been able to test manually?
@skriss So I'm trying to test it, but I think We're not adding the snapshot annotations to the pod anymore, but the restore action code still aborts if it's not present. (https://github.com/heptio/velero/blob/master/pkg/restore/restic_restore_action.go#L71-L72). The pod comes back unaltered. |
We should probably document the additional ConfigMap fields here. |
EDIT: sorry, my restic daemonset wasn't running. Running it fixes this. I think this is currently still broken; the init container never finishes, as it doesn't find the Velero directories.
I don't believe my changes caused this, but perhaps the change to the builders may have left out some info. |
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
With the latest code, I'm able to restore into a namespace with a quota.
|
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
@skriss Ok, all documentation should be updated now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
yep - merging |
Fixes #1201
Signed-off-by: Nolan Brubaker brubakern@vmware.com