purge-vault filtering doesn't work as expected #106

Open
rayrapetyan opened this issue Jan 25, 2015 · 7 comments

Comments

@rayrapetyan

Probably I'm doing something wrong, but at least things are not working as expected...
I've uploaded a few individual files into the vault like this:

./glacier/mt-aws/mtglacier upload-file --config ./glacier/mt-aws/glacier.cfg --vault qqq --journal ./glacier/mt-aws/qqq.journal --partsize 256 --filename ./foo.txt --set-rel-filename foo.txt

./glacier/mt-aws/mtglacier upload-file --config ./glacier/mt-aws/glacier.cfg --vault qqq --journal ./glacier/mt-aws/qqq.journal --partsize 256 --filename ./test.txt --set-rel-filename test.txt

Now I want to delete 'test.txt' from the vault. The problem is: whatever I put into "--filter" - it always results in deleting ALL files in the vault:

./glacier/mt-aws/mtglacier purge-vault --config ./glacier/mt-aws/glacier.cfg --vault qqq --journal ./glacier/mt-aws/qqq.journal --filter '+test.txt'

PID 57899 Started worker
PID 57900 Started worker
PID 57901 Started worker
PID 57902 Started worker
PID 57901 Deleted test.txt archive_id [mmAc8t54R3GRQOe5L5nxCFmZlVqjdVsPVGMjua63zgn0siJqrRZ1YDZB1GGxYCskLaMDsdwc5E6fswDQ-XBUZaUGp7eqpfw7jJOpQaTn2yvDU-zCo2IPilr0Ow180t9PnfMnGdC4pA]
PID 57899 Deleted foo.txt archive_id [s_bgL356OdJJgCQ8b2Dfsrqiu09RLpeLfxtTaqIaSbrm-mZBFhkZFRit2OiO5oVmRz6d7gcRIjwyLUVUQi5AsvWWFp93BGNHdA_ShyKsyR9AeawzJBm9ySSM5iEt-8PnDkQx-71Ewg]
OK DONE

I've already lost one of my vaults completely lol, and still have no idea how to delete a single file...

@EQXTFL

EQXTFL commented Jan 25, 2015

I use the following options, after removing the file locally and from the journal:
sync --delete-removed --config=glacier.conf --vault=Vault --journal=journal.log --dir=/path/to/dir

@vsespb
Owner

vsespb commented Jan 25, 2015

Hello. See docs: 2) If no rules matched - file is included (default rule is INCLUDE rule).

@vsespb
Owner

vsespb commented Jan 25, 2015

I've already lost one of my vaults completely lol

That's sad. But when making backups the opposite can happen: one can include and exclude several files, and the default rule applies to the others. If the default rule were EXCLUDE, the other files wouldn't be backed up. So at some point I chose to make the default rule INCLUDE.

@vsespb
Owner

vsespb commented Jan 25, 2015

Also, see dry-run; it's good for testing.

@rayrapetyan
Author

Although the tool is primarily designed for syncing a local dir with a vault, I always use it for uploading individual archives from different dirs, so "sync" will not work for me...

IMHO a default INCLUDE rule is much more dangerous than EXCLUDE: the user can fix a "files not backed up" issue at any moment, while an "all files in vault have been deleted" issue is unrecoverable.

I think the basic form of a "delete" operation should be as simple as on a local file system, something like
delete-file filename_pattern, without +/- and other stuff.

And btw - how to delete a single file by name, if default rule is INCLUDE everything?

@vsespb
Owner

vsespb commented Jan 25, 2015

And btw - how to delete a single file by name, if default rule is INCLUDE everything?

9) if PATTERN is empty, it matches anything.

3. PATTERN can be empty (Example: --filter +data/ --filter - excludes everything except any directory with name data; the last pattern is empty)

In your case --filter '+test.txt -'
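
To make the rule order concrete, here is a minimal Python sketch of first-match-wins filtering with a default INCLUDE, as described by the quoted doc rules. It is an illustration under assumptions, not mtglacier's actual code: `parse_filter` and `is_included` are hypothetical names, and Python's `fnmatch` only stands in for the tool's own pattern syntax (which treats `/` and directories differently).

```python
from fnmatch import fnmatchcase


def parse_filter(spec):
    """Split a spec such as '+test.txt -' into (action, pattern) pairs."""
    rules = []
    for token in spec.split():
        action, pattern = token[0], token[1:]
        if action not in ("+", "-"):
            raise ValueError("rule must start with + or -: %r" % token)
        rules.append((action, pattern))
    return rules


def is_included(filename, rules):
    """First matching rule decides; an empty pattern matches anything."""
    for action, pattern in rules:
        if pattern == "" or fnmatchcase(filename, pattern):
            return action == "+"
    # No rule matched: the default rule is INCLUDE.
    return True


# '+test.txt' alone never excludes anything, so foo.txt is still selected.
print(is_included("foo.txt", parse_filter("+test.txt")))    # True
# Adding the empty '-' rule excludes everything not matched earlier.
print(is_included("foo.txt", parse_filter("+test.txt -")))  # False
print(is_included("test.txt", parse_filter("+test.txt -"))) # True
```

Under this model a lone `+` rule can never narrow the selection, which is why the original purge-vault call removed both archives.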

I think base form of "delete" operation must be as simple as with local file-system, something like delete-file filename_pattern, without +/- and other stuff..

Yes, there should be commands to work with a single file (upload-file, delete-file, etc.). Currently not implemented. Things like sync are designed to work with multiple files.

IMHO a default INCLUDE rule is much more dangerous than EXCLUDE: the user can fix a "files not backed up" issue at any moment, while an "all files in vault have been deleted" issue is unrecoverable.

That's questionable. One could say the tool is usually used for backup, and that operation is automated. Deletion is usually not automated but performed manually (except when rotation is implemented, in which case deletion is automated too).

Also, "all files in vault have been deleted" can easily be fixed by re-uploading the files, because that's just a backup, not the original copy.

And missing files in an automated backup usually go unnoticed, which makes them more dangerous.

Anyway, when I designed this I analyzed several tools that work with groups of files and their filtering options, and came to the conclusion that default INCLUDE is better. Maybe I don't remember all the details that led to this decision.

For example duplicity/rsync default is INCLUDE too.

http://duplicity.nongnu.org/duplicity.1.html

Each file selection condition either matches or doesn’t match a given file. A given file is excluded by the file selection system exactly when the first matching file selection condition specifies that the file be excluded; otherwise the file is included. 

http://linux.die.net/man/1/rsync

As the list of files/directories to transfer is built, rsync checks each name to be transferred against the list of include/exclude patterns in turn, and the first matching pattern is acted on: if it is an exclude pattern, then that file is skipped; if it is an include pattern then that filename is not skipped; if no matching pattern is found, then the filename is not skipped. 

Also note that I will be unable to change this without breaking backward compatibility.

@rayrapetyan
Author

Agree with everything, except how it is covered in docs:

  1. If no rules matched - file is included (default rule is INCLUDE rule). (this line is OK, although I would put it at the top of "--filter" option description section).

--filter '+*.jpeg' File file.txt is INCLUDED, as it does not match any rules

That's what led me to make a mistake. Formally the statement is right, but only once you also know that the expression --filter '+*.jpeg' has no effect at all. Instead it looks like someone wants to do something with jpeg files...

I would replace this part of the docs with a group of examples (something like this):

--filter '+' All files will be deleted (default case as when "--filter" param is not specified at all)
--filter '-*.jpeg' All files except *.jpeg will be deleted
--filter '+test.txt -' Delete only test.txt file
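
These proposed examples can be sanity-checked against a small file list with a sketch like the one below. It assumes the first-match-wins, default-INCLUDE behaviour discussed in this thread; `would_delete` is a hypothetical helper, the file names are made up, and `fnmatch` is only an approximation of the tool's real pattern matching.

```python
from fnmatch import fnmatchcase


def would_delete(filter_spec, journal_files):
    """Return the files a purge would select under first-match-wins rules."""
    rules = [(tok[0], tok[1:]) for tok in filter_spec.split()]
    selected = []
    for name in journal_files:
        decision = True  # default rule is INCLUDE when nothing matches
        for action, pattern in rules:
            if pattern == "" or fnmatchcase(name, pattern):
                decision = (action == "+")
                break
        if decision:
            selected.append(name)
    return selected


files = ["foo.txt", "test.txt", "photo.jpeg"]  # illustrative names only
print(would_delete("+", files))            # ['foo.txt', 'test.txt', 'photo.jpeg']
print(would_delete("-*.jpeg", files))      # ['foo.txt', 'test.txt']
print(would_delete("+test.txt -", files))  # ['test.txt']
print(would_delete("+test.txt", files))    # ['foo.txt', 'test.txt', 'photo.jpeg']
```

The last call shows the trap from the original report: a lone include rule selects everything.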

UPD: just realized that the "--filter" part of the docs relates to all commands, not just deletion... Anyway, I think it's worth creating a "Delete files" section and putting the details there.
