Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mint Zarr assets at start of upload and sync Zarr contents #907

Merged
merged 6 commits into from
Feb 16, 2022
Merged

Conversation

jwodder
Copy link
Member

@jwodder jwodder commented Feb 10, 2022

Closes #884.
Closes #906.

@jwodder jwodder added patch Increment the patch version when merged cmd-upload zarr labels Feb 10, 2022
@codecov
Copy link

codecov bot commented Feb 10, 2022

Codecov Report

Merging #907 (f1b372f) into master (0e6c5ef) will increase coverage by 0.58%.
The diff coverage is 96.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #907      +/-   ##
==========================================
+ Coverage   86.79%   87.38%   +0.58%     
==========================================
  Files          61       61              
  Lines        7424     7453      +29     
==========================================
+ Hits         6444     6513      +69     
+ Misses        980      940      -40     
Flag Coverage Δ
unittests 87.38% <96.66%> (+0.58%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
dandi/dandiapi.py 88.90% <93.10%> (-0.80%) ⬇️
dandi/files.py 82.53% <96.42%> (+2.70%) ⬆️
dandi/consts.py 100.00% <100.00%> (ø)
dandi/tests/test_upload.py 100.00% <100.00%> (ø)
dandi/tests/skip.py 89.06% <0.00%> (-0.17%) ⬇️
dandi/support/pyout.py 90.00% <0.00%> (-0.17%) ⬇️
dandi/organize.py 82.19% <0.00%> (-0.05%) ⬇️
dandi/cli/tests/test_ls.py 100.00% <0.00%> (ø)
dandi/cli/tests/test_download.py 100.00% <0.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0e6c5ef...f1b372f. Read the comment docs.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. seems in need of test tune up to trigger "delete" logic coverage

dandi/files.py Show resolved Hide resolved
@jwodder
Copy link
Member Author

jwodder commented Feb 11, 2022

While testing, I hit a 409 error when trying to upload a file to a path that was previously a directory, even after deleting all of the directory's files:

<Error><Code>XMinioObjectExistsAsDirectory</Code><Message>Object name already exists as a directory.</Message><Key>zarr/b1d62558-c3e9-45c8-a395-0a92a8f61c46/changed-type</Key><BucketName>dandi-dandisets</BucketName><Resource>/dandi-dandisets/zarr/b1d62558-c3e9-45c8-a395-0a92a8f61c46/changed-type</Resource><RequestId>16D2C651D13510B0</RequestId><HostId>0e495f50-1b7d-4ffb-bb32-90394f235a56</HostId></Error>

@dchiquito What is the proper way to delete a directory inside a Zarr on the server?

EDIT: Never mind, my mistake, though I'd still be interested in if there's a better way to delete a directory than deleting all its contents individually.

@jwodder jwodder marked this pull request as draft February 11, 2022 16:19
@jwodder jwodder marked this pull request as ready for review February 11, 2022 16:55
@satra
Copy link
Member

satra commented Feb 11, 2022

I'd still be interested in if there's a better way to delete a directory than deleting all its contents individually.

the only thing i can think of is a dandi api request that launches a task that internally lists all objects with a given prefix, and then deletes all of them. this can take a while though for zarr trees and we have to be careful on what that prefix can be for a request.

@@ -1707,6 +1711,8 @@ def get_digest(self) -> Digest:

:raises NotFoundError: if the path does not exist in the Zarr asset
"""
if self._digest is not None:
return self._digest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the "life cycle" of this ._digest? :

  • I do not see it just being a lazy one (not assigned below in this function)
  • digests of zarr could change on remote/server since zarr files are mutable, so aren't we doomed to query it every time .digest is requested?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, the latter is a wrong question since we are talking here about RemoteZarrEntry not an entire Zarr. But still not certain about what should set ._digest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_digest is set by _get_subpath() when called by iterdir(), in which case the code is going through a Zarr listing with checksums, so storing the entry's checksum saves a potential HTTP request later. _digest can be unset by the cache_clear() method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so is it the get_listing below which would set it? this is all makes it not obvious, and worth at least a comment here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

dandi/files.py Outdated Show resolved Hide resolved
@yarikoptic
Copy link
Member

@jwodder - please resolve freshly came up conflict and let's merge to carry forward

@jwodder
Copy link
Member Author

jwodder commented Feb 16, 2022

@yarikoptic Conflict resolved.

@yarikoptic
Copy link
Member

ok, still green, let's proceed. thanks @jwodder !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cmd-upload patch Increment the patch version when merged zarr
Projects
None yet
Development

Successfully merging this pull request may close these issues.

provide "update zarr" functionality Modify Zarr assets in-place when uploading to an extant Zarr path
3 participants