Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a consistent cache directory #268

Conversation

GenevieveBuckley
Copy link
Collaborator

Closes #264

This PR should make the cache directory handling consistent between the model weights and sample data.
By default, we use:

  • default cache dir + 'micro_sam/models'
  • default cache dir + 'micro_sam/sample_data`

From the pooch.os_cache() docs:

Default cache location based on the operating system.

The folder locations are defined by the platformdirs package using the user_cache_dir function. Usually, the locations will be following (see the platformdirs documentation):

  • Mac: ~/Library/Caches/
  • Unix: ~/.cache/ or the value of the XDG_CACHE_HOME environment variable, if defined.
  • Windows: C:\Users<user>\AppData\Local<AppAuthor><AppName>\Cache

Additionally, users can set an environment variable named MICROSAM_CACHEDIR to choose a different cache directory somewhere else (not the default cache dir + 'micro_sam').

The documentation for the environment variable cache dir control is a bit hidden, currently it is only mentioned:

  • in the module docstring for micro_sam/sample_data.py, and
  • in the docstring for get_sam_model function in micro_sam/util.py.

@GenevieveBuckley
Copy link
Collaborator Author

Questions:

  1. Do I need to run os.path.expanduser on the pooch.os_cache return value / environment variable set by the user? Currently I am not doing this, not sure if there's likely to be a situation where that might cause a problem or not.
  2. Is there anywhere else in the documentation where we need to write about the option to set the os environment variable MICROSAM_CACHEDIR?

Interesting thought: maybe we could even make a napari widget for setting the cache directory path to something other than the default value. Then we could rely less on people reading every single page of the API documentation, and have better discoverability.

Copy link

codecov bot commented Nov 8, 2023

Codecov Report

Merging #268 (83bb7b4) into dev (ec7adad) will increase coverage by 0.77%.
Report is 34 commits behind head on dev.
The diff coverage is 45.19%.

@@            Coverage Diff             @@
##              dev     #268      +/-   ##
==========================================
+ Coverage   38.34%   39.12%   +0.77%     
==========================================
  Files          30       32       +2     
  Lines        3987     4118     +131     
==========================================
+ Hits         1529     1611      +82     
- Misses       2458     2507      +49     
Files Coverage Δ
micro_sam/__init__.py 100.00% <ø> (ø)
micro_sam/training/sam_trainer.py 92.10% <100.00%> (ø)
micro_sam/training/util.py 84.48% <100.00%> (-0.27%) ⬇️
micro_sam/util.py 65.42% <100.00%> (+0.19%) ⬆️
micro_sam/instance_segmentation.py 54.79% <85.71%> (+0.29%) ⬆️
micro_sam/precompute_state.py 22.22% <0.00%> (-0.97%) ⬇️
micro_sam/_vendored.py 67.05% <62.50%> (-1.00%) ⬇️
micro_sam/inference.py 90.00% <90.00%> (ø)
micro_sam/sam_annotator/_state.py 84.61% <84.61%> (ø)
micro_sam/sam_annotator/annotator_2d.py 56.15% <28.57%> (-1.11%) ⬇️
... and 5 more

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

@constantinpape
Copy link
Contributor

  1. Do I need to run os.path.expanduser on the pooch.os_cache return value / environment variable set by the user? Currently I am not doing this, not sure if there's likely to be a situation where that might cause a problem or not.

No I don't think so. For me os_cache returns a user expanded path already:

In [4]: import pooch
In [5]: pooch.os_cache("micro_sam")
Out[5]: PosixPath('/home/pape/.cache/micro_sam')

2. Is there anywhere else in the documentation where we need to write about the option to set the os environment variable MICROSAM_CACHEDIR?

There isn't really a place to put it yet. I have made a note in #249 on this.

Interesting thought: maybe we could even make a napari widget for setting the cache directory path to something other than the default value. Then we could rely less on people reading every single page of the API documentation, and have better discoverability.

I like that idea in principle, but we probably need to think a bit about the implementation and try how well it works in practice. I will make an issue out of this so that we can keep track of it. But I think we should only revisit this once we have made some more advances towards the plugin.

@constantinpape constantinpape merged commit 763a2a3 into computational-cell-analytics:dev Nov 8, 2023
3 checks passed
@GenevieveBuckley
Copy link
Collaborator Author

  1. Do I need to run os.path.expanduser on the pooch.os_cache return value / environment variable set by the user? Currently I am not doing this, not sure if there's likely to be a situation where that might cause a problem or not.

No I don't think so. For me os_cache returns a user expanded path already:

In [4]: import pooch
In [5]: pooch.os_cache("micro_sam")
Out[5]: PosixPath('/home/pape/.cache/micro_sam')

That's the behaviour I see too, but the pooch os_cache documentation does say "User directories ('~') are not expanded" so perhaps this may not always happen (or might not be guaranteed in the future).

Anyway, I don't think this is causing problems, but I could do it if we felt we needed to be super safe.

@constantinpape
Copy link
Contributor

Anyway, I don't think this is causing problems, but I could do it if we felt we needed to be super safe.

I see. Then maybe it's a good idea to expand the path since it should not hurt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants