Full/absolute path used in log files #257

Open
tleppala opened this issue Mar 7, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@tleppala
Collaborator

tleppala commented Mar 7, 2022

Full paths are used in several files, exposing, among other things, the creator's username. If the data is shared with other people, this information is shared along with it.

The paths appear mostly in log files, so there should be no need for this information. Would a path relative to the data root be more appropriate?

Sample of files where found:

Sample_01-sample1/MC_simulation_01-815/widget_energy_spectrum_1.save
Sample_01-sample1/MC_simulation_01-815/default.log
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-01-26_14.56.09.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-02_15.44.01.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-28_09.30.23.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-28_09.42.42.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-03-03_15.36.28.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-03-03_15.30.38.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-28_09.35.35.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-28_09.28.08.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-03-07_08.57.02.bak
Sample_01-sample1/Measurement_01-ToF-E-815/tof_in/tof.in_2022-02-28_09.38.57.bak
Sample_01-sample1/Measurement_01-ToF-E-815/default.log
Default/Detector/points.pkl
request.log

@tpitkanen
Member

I agree that there are cases of unnecessary absolute path usage in Potku, but it's not very simple (or important) to fix.

There are roughly 4 categories of files that use absolute paths:

  1. log files
  2. C program configuration files (e.g. tof.in, -Default, -Default.erd_detector, ...)
  3. Potku configuration files (I don't remember if Potku saves absolute paths in these files)
  4. UI state save files

My view on the issue:

  1. no need for absolute paths, but timestamps and the log entries themselves are also kind of private
  2. this is the most problematic category. Potku's C programs use configuration files for just about everything, and if they refer to other files, that's done using absolute paths. I'd guess it's harder to work with relative than absolute paths in C. The programs would still need to be updated to support this.
  3. removing absolute paths would improve portability
  4. removing absolute paths would improve portability

Also, retrieving relative paths from pathlib.Path.resolve()d paths requires extra work. Currently Potku resolves most paths, so this could get annoying.
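The extra work mentioned here mostly comes down to resolving the request root in the same way before comparing. A minimal sketch, assuming a hypothetical helper named `to_request_relative` and a caller that knows the request root; it is not taken from Potku's code:

```python
from pathlib import Path


def to_request_relative(path: Path, request_root: Path) -> Path:
    """Return `path` relative to `request_root`, or the resolved path if it lies outside.

    Hypothetical helper for illustration only.
    """
    # Resolve both sides so symlinks and ".." segments cannot cause a mismatch.
    resolved_root = request_root.resolve()
    resolved_path = path.resolve()
    try:
        return resolved_path.relative_to(resolved_root)
    except ValueError:
        # Not under the request root (e.g. distribution data): keep it absolute.
        return resolved_path
```

Paths outside the request folder are deliberately left absolute, so the caller still has to decide whether those are safe to write to a file.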

As far as privacy implications go, this is difficult to fix. Even one absolute path slip-up in the request folder would be enough to leak everything.
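One way to catch such slip-ups before sharing a request folder could be a plain byte search for the home directory string. The sketch below assumes a hypothetical function name `find_home_path_leaks` and only detects paths stored as plain text; binary files such as `.pkl` may store paths in a form this search misses.

```python
from __future__ import annotations

from pathlib import Path


def find_home_path_leaks(request_root: Path, needle: str) -> list[Path]:
    """List files under `request_root` whose raw bytes contain `needle`.

    Hypothetical helper; `needle` would typically be str(Path.home()).
    """
    needle_bytes = needle.encode("utf-8")
    leaks = []
    for file in sorted(request_root.rglob("*")):
        # Read each regular file as bytes so logs and binary files are treated alike.
        if file.is_file() and needle_bytes in file.read_bytes():
            leaks.append(file.relative_to(request_root))
    return leaks
```

An empty result does not prove that nothing leaks; it only rules out the exact string that was searched for.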

@tpitkanen tpitkanen added the enhancement New feature or request label Mar 7, 2022
@jaakkojulin
Member

jaakkojulin commented Mar 11, 2022

I have some opinions about these.

For logging purposes one should log the path that is actually used, whether it is relative or absolute. To deal with privacy issues one should (try to) use relative paths when accessing user data (e.g. requests). I agree that slip-ups are inevitable, and I think log files are inherently private. The solution to the privacy issues is a feature that exports a project as a package that does not include log files, generated files, or other files present in the filesystem but not used by Potku. The remaining files must use relative paths.
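Such an export could be sketched as a zip archive that stores every file under a name relative to the request root and skips private or generated files. The function name `export_request` and the excluded suffixes are assumptions for illustration; which files Potku actually regards as generated is not specified in this thread.

```python
from __future__ import annotations

import zipfile
from pathlib import Path

# Assumption: suffixes treated as private or generated; a real list would differ.
EXCLUDED_SUFFIXES = {".log", ".bak"}


def export_request(request_root: Path, archive_path: Path) -> list[str]:
    """Zip `request_root` without excluded files; return the stored names."""
    stored = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(request_root.rglob("*")):
            if not file.is_file() or file.suffix in EXCLUDED_SUFFIXES:
                continue
            if file.resolve() == archive_path.resolve():
                continue  # never pack the archive into itself
            # Store a relative, POSIX-style name so no absolute path enters the archive.
            name = file.relative_to(request_root).as_posix()
            archive.write(file, arcname=name)
            stored.append(name)
    return stored
```

This only controls the names stored in the archive. Absolute paths written inside the remaining files would still need to be made relative separately.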

When it comes to C programs, there is no hard distinction between absolute and relative paths and I expect most programs to function just as well with relative paths as absolute paths. The programmer has to make a special effort to break compatibility with relative paths.

It would be preferable to run the C programs so that their current working directory is changed to the request path. For example, the "Efficiency directory" in the tof.in file can then be given as a relative path. However, the path to distribution data (e.g. masses.dat) might then leak, and it could contain the username. Slip-ups are hard to avoid indeed.
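From the Python side, changing the working directory for a child process can be sketched with the `cwd` argument of `subprocess.run`. The function name `run_in_request_dir` is hypothetical, and the actual program names and arguments Potku uses are not shown here.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def run_in_request_dir(command: list[str], request_root: Path) -> str:
    """Run `command` with its working directory set to `request_root`; return stdout.

    Hypothetical wrapper for illustration only.
    """
    result = subprocess.run(
        command,
        cwd=request_root,     # relative paths in config files now resolve from here
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

The program itself is best given as an absolute path or found via `PATH`, since a relative executable path can be looked up differently once `cwd` is set.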

One possible solution to this particular problem (EDIT: of the Potku installation directory path leaking) is to install Potku to Program files (Windows), /Applications (macOS), or /usr (Linux). The latest versions of the JIBAL library (not yet used by Potku) could also be part of the solution: the location of distribution data can be "autodetected" if an installer is used or if files are present in predictable locations. This too gets complex very quickly... Right now JIBAL is configured using the jibal.conf present in the bin directory, but there are alternatives to that. There is a larger issue of supporting user-supplied data (stopping, straggling, masses...). Right now this is possible if the user removes the jibal.conf supplied by Potku and provides an alternative elsewhere. The alternative can also be created interactively with the help of a bootstrap tool.

@tpitkanen
Member

tpitkanen commented Mar 11, 2022

> When it comes to C programs, there is no hard distinction between absolute and relative paths and I expect most programs to function just as well with relative paths as absolute paths. The programmer has to make a special effort to break compatibility with relative paths.

Good to know, thanks for the clarification.

> One possible solution to this particular problem is to install Potku to Program files (Windows), /Applications (macOS), or /usr (Linux).

This may (or may not) be a good idea in general, but I don't think it fixes this particular issue. It's the request files' locations that are leaked, so moving the binaries themselves wouldn't do anything.

(Windows perspective:) If the request files were saved in Program files too, they wouldn't leak usernames anymore, but other problems would arise:

  • Program files is protected, so users would need admin rights to use Potku. I would estimate that a large portion of Potku's user base uses their employer's computers without admin rights.
  • Potku would need to be run as admin, which means that any bugs in the program could become more dangerous.
  • Unlike the user's home folder, Program files is readable by all users (I think). This means that any user on the computer would be able to view and copy the files.

Now that I think about it, Potku is portable and request folders can be saved anywhere already. If a user finds it important to hide their username, they can already do so. Maybe a notice about this in the documentation/program would suffice?

@jaakkojulin
Member

> This may (or may not) be a good idea in general, but I don't think it fixes this particular issue.

I meant that part as a continuation of the previous point, sorry for the confusion. I meant that operating with relative paths for request data (and logging those) while using absolute paths (to e.g. Program files) for distribution data would be a possible fix.

Requests should of course be stored under the "home" directory from my Mac/Linux perspective. Good point that anonymity may be obtained by changing the request folder.

3 participants