-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved backup resource selection #15
Comments
Did some brainstorming with @skriss: If you're trying to back up a specific namespace, you probably don't want to include all cluster-scoped resources. A better approach would probably be to specify the namespace and use a label selector. This ensures that the cluster-scoped resources match the selector. Where this doesn't work is with dynamically-provisioned PVs, because they don't assume the labels from the associated PVC. To get around this, I think we should modify the Action() signature to return an optional slice of additional items to back up. In the case of a pod-like resource that references a PVC, we would return that we want to back up the associated PV. We'd need to keep track of a universal list of all resource/name/name items backed up to ensure we don't back up the same PV multiple times. |
@ncdc WDYT about making "cluster" an includeable/excludeable scope, like namespaces? We could change --include-namespaces and --exclude-namespaces to --include/exclude-scopes, and cluster would be a special value. This simplifies excluding cluster-scoped resources (currently you have to either exclude all the resource types or have a label selector that doesn't match them). This would not replace the functionality of backing up the PV's claimed by PVCs -- just seems like a useful feature in thinking through all of the various use cases around resource selection. |
@skriss I've considered finding an easy way to include/exclude cluster-scoped resources. I can't say I love using "scope" as part of the flag name or concept exposed to end users. It would be nice to get some additional feedback on how and what people are backing up and restoring. cc @jbeda @rosskukulinski @djschny |
@ncdc I don't have a ton of signal for you at the moment. Right now, I'm thinking of the backup/restore scenario as two personas:
Haven't thought in depth on this, but I'm liking the namespace + label selector combo. It does require some planning on label definitions, but seems to be the most "kubernetes-native" way to do this. |
We have a use case where we are handling automated backups for specific namespaces (actually archivals and transfers), but we do not have control over what is running in those namespaces and thus there wouldn't be any label we could reliably expect to help us filter out all the cluster resources. Might be a situation that could pop up for other users in situations where backups are handled by a separate group to those using the namespaces to run apps. Would be useful for us to be able to specify just a namespace and omit the cluster objects and at the very least, reduce the storage footprint. (not yet sure the impact having those in there will have, still digging in) |
Some ideas:
|
A bool flag sounds great to me, will hold a little bit and see if I can send a PR for that if no other ideas pop up. |
Another interesting side effect of the cluster objects, if you backup just a namespace with something like: Could we generalize that if the backup specifies namespaces, only snapshot pvc's from those namespaces? |
@ncdc @dgoodwin WDYT about a three-way flag for cluster-scoped resources (this is inline with our
I think this gives the expected behavior by default, with the ability to override it if necessary. I'm happy to implement. |
Sounds ok to me especially if it's matching established precedent. My situation actually boiled down to needing to exclude all but a few cluster level resources, but I think the above would be generally useful just the same. |
SGTM |
@ncdc this will be a breaking change when specifying included namespaces, but I think it's a reasonable break given what it is and where we are - do you agree? |
I think it's reasonable. I also think
This would need to be extensible, pluggable, and configurable, but I think it would be pretty powerful (and fit the OpenShift oauth token use case) |
do you agree that if the flag is set to false, then PV's should not be captured as part of their PVCs? |
Ugh, this is where it gets messy. Maybe it would be easier if we only had 2 modes - you get all cluster-scoped resources based on --include-resources and --exclude-resources, or you only get those associated with what is being backed up? That way, at least you don't get surprised when you say "false" and expect to still have your PVs? Or is that equally confusing? |
I actually think that "false -> no PVs" makes sense and isn't messy: If you're backing up 1 or multiple namespaces:
If you're backing up all namespaces :
|
ok, now that you spelled it out, sgtm |
then over time, per your previous comment, we can improve the behavior of "auto" to be more intelligent about cluster-scoped resource selection |
@skriss do you think we've addressed this and can close it? |
@ncdc yeah I think so. we may make further improvements to the "auto" behavior down the line but we can track separate issues if/when we do that. |
closing. |
…ter-rebase Fixes logger configuration for restore plugins
If you do
ark backup create foo --include-namespaces andy
, that will back up everything in the 'andy' namespace AND all cluster-scoped resources (the following list is from my local v1.7.3, supporting create + list):If you want to back up a subset of resources, you have to specify some combination of --include-resources, --exclude-resources, and/or --selector.
If we decide that one of Ark's core use cases is to allow one or more namespaces to be backed up, we should find a way to limit the resources backed up to just those used by the namespace(s) in question.
The text was updated successfully, but these errors were encountered: