Using --files-from and a file with a BOM causes 1st entry to be skipped #1433

Closed
naffit opened this issue Nov 19, 2017 · 2 comments
Labels
type: feature enhancement improving existing features

Comments

@naffit

naffit commented Nov 19, 2017

Output of restic version

restic 0.7.3
compiled with go1.9 on windows/amd64

How did you run restic exactly?

D:\restic>restic_0.7.3_windows_amd64.exe -r test backup --files-from files.txt
D:\restic?abc.txt does not exist, skipping
enter password for repository:
using parent snapshot 5d4da7df
scan [D:\restic\def.txt D:\restic\xyz]
scanned 1 directories, 1 files in 0:00
[0:00] 0B/s 0B / 0B 2 / 2 items 0 errors ETA 0:00
duration: 0:00, 0.00MiB/s
snapshot 0ca43992 saved

D:\restic>dir
Volume in drive D is Data
Volume Serial Number is 5AE5-3725

Directory listing of D:\restic
19-11-2017 21:25 0 abc.txt
19-11-2017 21:25 0 def.txt
19-11-2017 21:30 26 files.txt
20-09-2017 18:53 10.423.808 restic_0.7.3_windows_amd64.exe
17-11-2017 20:19 DIR test
19-11-2017 21:27 DIR xyz

files.txt is a simple UTF-8 encoded textfile with BOM containing:
abc.txt
def.txt
xyz

Saving the file as UTF-8 without a BOM makes everything work. However, that makes it harder to edit the file and be sure it's saved as UTF-8 (and not as ANSI or whatever Notepad or another text editor on Windows uses by default); a BOM is normally used to make sure an editor recognizes the proper file encoding.

What backend/server/service did you use?

None - local repository

Expected behavior

All files/paths listed in the --files-from file are processed, regardless of the presence or absence of a BOM.

Actual behavior

The first line of the file is treated as a nonexistent file or directory and is skipped.

Steps to reproduce the behavior

Please see above

Do you have any idea what may have caused this?

No support for BOM in files?
Perhaps it's also an issue for --password-file? (not tested!)

Do you have an idea how to solve the issue?

Make restic BOM-aware everywhere it reads files.

@fd0
Member

fd0 commented Nov 19, 2017

Thanks for the report. What's a BOM? I guess it's the UTF-8 Byte Order Mark? Can you attach a sample file with such a mark so I can have a look?

I can confirm that there's no code in restic which detects or interprets a byte order mark in the first line of the file; it'll be treated just like every other line (i.e. as a filename). On Linux, using UTF-8 is pretty standard.

For --password-file, the contents are treated as-is, without any interpretation. So if a byte order mark is in there, it's passed to the key derivation function directly.

@fd0 fd0 added type: feature enhancement improving existing features state: need feedback waiting for feedback, e.g. from the submitter labels Nov 19, 2017
@naffit
Author

naffit commented Nov 19, 2017

Yes, it's a Byte Order Mark (it differs from encoding to encoding).

Attached is the files.txt I used for testing, with a UTF-8 BOM as created by some random text editor.
files.txt

Any BOM should be stripped by whatever is opening the file, before the contents are passed onwards. BOMs are 2 to 4 bytes long for the common encodings (3 bytes for UTF-8, 2 for UTF-16, 4 for UTF-32; see https://en.wikipedia.org/wiki/Byte_order_mark), and you'll probably never meet any other than UTF-8, UTF-16 and UTF-32 (in both BE and LE versions) on the OSes that restic runs on.
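As a rough sketch of the detection step, the following Go snippet checks a byte slice against the BOMs of UTF-8, UTF-16 and UTF-32. `detectBOM` and the `boms` table are hypothetical names chosen for this example, not anything from restic. The UTF-32 entries are listed before the UTF-16 ones because the UTF-32 LE mark (FF FE 00 00) starts with the UTF-16 LE mark (FF FE).

```go
package main

import (
	"bytes"
	"fmt"
)

// bom pairs an encoding name with its byte order mark.
type bom struct {
	name string
	mark []byte
}

// boms is ordered so that longer marks are tried before marks that are
// a prefix of them (UTF-32 LE before UTF-16 LE).
var boms = []bom{
	{"UTF-32 BE", []byte{0x00, 0x00, 0xFE, 0xFF}},
	{"UTF-32 LE", []byte{0xFF, 0xFE, 0x00, 0x00}},
	{"UTF-8", []byte{0xEF, 0xBB, 0xBF}},
	{"UTF-16 BE", []byte{0xFE, 0xFF}},
	{"UTF-16 LE", []byte{0xFF, 0xFE}},
}

// detectBOM is a hypothetical helper. It returns the encoding name and the
// length of the BOM found at the start of data, or "" and 0 if there is none.
func detectBOM(data []byte) (string, int) {
	for _, b := range boms {
		if bytes.HasPrefix(data, b.mark) {
			return b.name, len(b.mark)
		}
	}
	return "", 0
}

func main() {
	name, n := detectBOM([]byte("\xEF\xBB\xBFabc.txt"))
	fmt.Println(name, n) // prints: UTF-8 3
}
```

One caveat: a UTF-16 LE file whose first character is U+0000 is indistinguishable from UTF-32 LE by the mark alone, so this kind of detection is a heuristic.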

Regarding --password-file (and indeed all other files read by restic) and passing the BOM to the key encryption logic, I'd hazard a guess and say that it's wrong to do so.

I see 2 scenarios where passing the BOM on could cause problems:

  1. Not being able to use --password-file to specify the password for an already initialized repository (since the file has 2 to 4 bytes of extra data, which causes the password to be wrong), or
  2. If a repository is initialized with a password read via --password-file, then attempting to access the repository by manual key entry gives an error. Given that the password file might have been lost, this would be quite a showstopper in a recovery scenario, where an unnoticed BOM makes a paper backup of the key useless.

Otherwise, a small entry in the manual saying that all files must be in UTF-8, with or without a BOM, and that any other BOM is not supported should lower the amount of needed changes. :)

@fd0 fd0 removed the state: need feedback waiting for feedback, e.g. from the submitter label Dec 3, 2017
@naffit naffit changed the title Using --files-from and a file with a BOM causes 1. entry to be skipped Using --files-from and a file with a BOM causes 1st entry to be skipped Apr 8, 2018
@fd0 fd0 closed this as completed in #1748 May 1, 2018