Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update upload code for changes in API #479

Merged
merged 17 commits into from
Mar 24, 2021
Merged

Update upload code for changes in API #479

merged 17 commits into from
Mar 24, 2021

Conversation

jwodder
Copy link
Member

@jwodder jwodder commented Mar 19, 2021

Closes #478.

TODOs:

@jwodder jwodder added the patch Increment the patch version when merged label Mar 19, 2021
@codecov
Copy link

codecov bot commented Mar 19, 2021

Codecov Report

Merging #479 (bfe168c) into master (b47e06e) will increase coverage by 17.92%.
The diff coverage is 90.99%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #479       +/-   ##
===========================================
+ Coverage   64.83%   82.75%   +17.92%     
===========================================
  Files          59       59               
  Lines        5991     6048       +57     
===========================================
+ Hits         3884     5005     +1121     
+ Misses       2107     1043     -1064     
Flag Coverage Δ
unittests 82.75% <90.99%> (+17.92%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
dandi/upload.py 75.71% <80.00%> (+73.30%) ⬆️
dandi/pynwb_utils.py 85.02% <87.50%> (+3.10%) ⬆️
dandi/dandiapi.py 87.04% <89.47%> (+68.78%) ⬆️
dandi/download.py 86.80% <90.00%> (+7.30%) ⬆️
dandi/core/digests/dandietag.py 96.21% <91.30%> (-1.07%) ⬇️
dandi/support/digests.py 100.00% <100.00%> (ø)
dandi/tests/test_upload.py 92.26% <100.00%> (+72.60%) ⬆️
dandi/models.py 86.23% <0.00%> (+0.28%) ⬆️
dandi/_version.py 45.96% <0.00%> (+1.75%) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b47e06e...bfe168c. Read the comment docs.

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial quick review: I have not spotted anything "incorrect" but we should switch away from sha256 for upload/download; and fix those uuids in variable names.

FWIW: initial upload attempt didn't succeed yet
*$> DANDI_DEVEL=1 dandi upload -i dandi-api --validation=ignore sub-RAT123/sub-RAT123.nwb
2021-03-19 18:40:38,432 [    INFO] Found 1 files to consider
PATH                      SIZE       ERRORS     UPLOAD STATUS                MESSAGE
sub-RAT123/sub-RAT123.nwb 18.8 kB       1              ERROR                 Error 500 while sending PO...
Summary:                  18.8 kB 1 with errors        1 ERROR               1 Error 500 while sending ...
2021-03-19 18:40:39,719 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20210319224036Z-7318.log
(dandi-devel) (git)/mnt/backup/dandi/dandisets/000027:[master]
#27 !1158 [0].....................................:Fri Mar 19 18:40:39:.
drogon:/mnt/backup/dandi/dandisets/000027
$> cat /home/yoh/.cache/dandi-cli/log/20210319224036Z-7318.log
2021-03-19T18:40:36-0400 [INFO    ] dandi 7318:140182937339264 [META] sys.argv = ['/home/yoh/miniconda3/envs/dandi-devel/bin/dandi', 'upload', '-i', 'dandi-api', '--validation=ignore', 'sub-RAT123/sub-RAT123.nwb']
2021-03-19T18:40:36-0400 [INFO    ] dandi 7318:140182937339264 [META] os.getcwd() = /mnt/backup/dandi/dandisets/000027
2021-03-19T18:40:38-0400 [INFO    ] dandi 7318:140182937339264 Found 1 files to consider
2021-03-19T18:40:38-0400 [ERROR   ] dandi 7318:140182342420224 Error 404 while sending POST request to https://api.dandiarchive.org/api/blobs/digest/: {"detail":"Not found."}
2021-03-19T18:40:39-0400 [ERROR   ] dandi 7318:140182342420224 Error 500 while sending POST request to https://api.dandiarchive.org/api/uploads/initialize/: 
<!doctype html>
<html lang="en">
<head>
  <title>Server Error (500)</title>
</head>
<body>
  <h1>Server Error (500)</h1><p></p>
</body>
</html>

2021-03-19T18:40:39-0400 [INFO    ] dandi 7318:140182937339264 Logs saved in /home/yoh/.cache/dandi-cli/log/20210319224036Z-7318.log

dandi/upload.py Show resolved Hide resolved
dandi/dandiapi.py Outdated Show resolved Hide resolved
dandi/download.py Outdated Show resolved Hide resolved
@jwodder jwodder changed the title Update upload code for new API Update upload code for changes in API Mar 22, 2021
@jwodder
Copy link
Member Author

jwodder commented Mar 23, 2021

@yarikoptic I added a test that uploads & downloads an empty file, but the upload fails with:

Error 400 while sending POST request to http://localhost:8000/api/uploads/initialize/: {"contentSize":["Ensure this value is greater than or equal to 1."]}

cc: @dchiquito

@yarikoptic
Copy link
Member

ok, filed a dedicated dandi/dandi-archive#168 . So let's skip (comment out) such a test and proceed to length 1 ;)

@jwodder
Copy link
Member Author

jwodder commented Mar 23, 2021

@yarikoptic I'm now manually testing uploading a zero-byte file and a one-byte file to https://gui-beta-dandiarchive-org.netlify.app/#/dandiset/000029. The zero-byte file is failing for the reason above, and the upload of the one-byte file is failing with a 500 server error (with no further details) on the request to /uploads/initialize/.

@dchiquito Looking at the Papertrail logs, it appears that something on the server is trying to use an invalid UUID as a UUID.

@dchiquito
Copy link
Contributor

@jwodder It looks like AWS returned that bizarrely shaped key instead of a UUID. dandi/dandi-archive#170 should fix the issue.

@jwodder
Copy link
Member Author

jwodder commented Mar 23, 2021

@dchiquito Isn't the response to POSTing to the complete_url supposed to include an ETag header which the client is then supposed to compare against its computed ETag for the file? Because that header isn't present when uploading to the new API. (It's there when using minio for the Dockerized tests, though.)

@dchiquito
Copy link
Contributor

@jwodder the complete_url points directly to the objects store, so it's up to them how the response is shaped. The S3 documentation says ETag is included in the response body (as XML 🤮), but not in the headers? I definitely see an ETag header when using Minio, but apparently that's not how S3 behaves. I'm surprised, I thought Minio was aiming for maximum S3 compliance.

@jwodder
Copy link
Member Author

jwodder commented Mar 23, 2021

@dchiquito After updating for that, uploads to the server are now failing with:

Error 400 while sending POST request to https://api.dandiarchive.org/api/uploads/045ddaa2-f932-428e-b8f6-b1c440264b3f/validate/: ["ETag does not match."]

The logs indicate the ETags do match, though. Am I supposed to surround the ETag in double-quotes when submitting it to the server? This error does not occur when testing against the Dockerized environment.

@dchiquito
Copy link
Contributor

@jwodder I agree everything matches. I'm adding some logging, though I suspect you are right and AWS is wrapping it in "" for some reason.

@dchiquito
Copy link
Contributor

@jwodder just validated that upload through the API and it worked. It was indeed S3 returning double quotes. Everything should be hunky dory now.

@yarikoptic
Copy link
Member

woohoo -- green again! @jwodder please "check" the boxes in the original issue description whenever those are addressed/done

@yarikoptic
Copy link
Member

Upload on 3TB file errored out for me with LooseVersion' object has no attribute 'version'
drogon:/tmp/000029
*$> DANDI_DEVEL=1 dandi upload -i dandi-api --allow-any-path largeuploads/blobs/f5d/aac/79d9c6c1ba2439200d45680f1cb941087750a000e484ea2973232709d2
2021-03-23 19:03:10,248 [ WARNING] A newer version (0.12.0) of dandi/dandi-cli is available. You are using 0.11.0+52.g345866d
2021-03-23 19:03:11,770 [    INFO] Found 1 files to consider
PATH                                                                                  SIZE    ERRORS UPLOAD STATUS           MESSAGE
largeuploads/blobs/f5d/aac/79d9c6c1ba2439200d45680f1cb941087750a000e484ea2973232709d2 3.6 TB                ERROR            'LooseVersion' object has no attribute 'version'
Summary:                                                                              3.6 TB                1 ERROR          1 'LooseVersion' object has no attribute 'version'
2021-03-23 23:30:04,598 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20210323230310Z-12419.log
(dandi-devel)
#52 !1183 [0].....................................:Tue Mar 23 23:30:04:.
drogon:/tmp/000029
$> cat /home/yoh/.cache/dandi-cli/log/20210323230310Z-12419.log
2021-03-23T19:03:10-0400 [INFO    ] dandi 12419:139962955457920 [META] sys.argv = ['/home/yoh/miniconda3/envs/dandi-devel/bin/dandi', 'upload', '-i', 'dandi-api', '--allow-any-path', 'largeuploads/blobs/f5d/aac/79
d9c6c1ba2439200d45680f1cb941087750a000e484ea2973232709d2']
2021-03-23T19:03:10-0400 [INFO    ] dandi 12419:139962955457920 [META] os.getcwd() = /tmp/000029
2021-03-23T19:03:10-0400 [WARNING ] dandi 12419:139962955457920 A newer version (0.12.0) of dandi/dandi-cli is available. You are using 0.11.0+52.g345866d
2021-03-23T19:03:11-0400 [INFO    ] dandi 12419:139962955457920 Found 1 files to consider
2021-03-23T23:30:04-0400 [INFO    ] dandi 12419:139962955457920 Logs saved in /home/yoh/.cache/dandi-cli/log/20210323230310Z-12419.log
(dandi-devel) 
#53 !1184 [0].....................................:Wed Mar 24 08:29:45:.
drogon:/tmp/000029
$> dandi --version
0.11.0+52.g345866d

@jwodder
Copy link
Member Author

jwodder commented Mar 24, 2021

@yarikoptic Could you rerun with --devel-debug? I only see two locations in dandi that use LooseVersion, and neither accesses a version attribute.

@yarikoptic
Copy link
Member

yarikoptic commented Mar 24, 2021

--devel-debug
*$> DANDI_DEVEL=1 dandi upload -i dandi-api --allow-any-path --devel-debug largeuploads/blobs/f5d/aac/79d9c6c1ba2439200d45680f1cb941087750a000e484ea2973232709d2 
2021-03-24 08:48:26,249 [ WARNING] A newer version (0.12.0) of dandi/dandi-cli is available. You are using 0.11.0+52.g345866d
2021-03-24 08:48:27,948 [    INFO] Found 1 files to consider
{'size': 3628888939868}
{'status': 'digesting'}
{'status': 'pre-validating'}
2021-03-24 08:48:28,326 [    INFO] Logs saved in /home/yoh/.cache/dandi-cli/log/20210324124825Z-15856.log
Traceback (most recent call last):
  File "/home/yoh/miniconda3/envs/dandi-devel/bin/dandi", line 33, in <module>
    sys.exit(load_entry_point('dandi', 'console_scripts', 'dandi')())
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "/home/yoh/proj/dandi/dandi-cli/dandi/cli/base.py", line 109, in wrapper
    return f(*args, **kwargs)
  File "/home/yoh/proj/dandi/dandi-cli/dandi/cli/cmd_upload.py", line 98, in upload
    upload(
  File "/home/yoh/proj/dandi/dandi-cli/dandi/upload.py", line 49, in upload
    return _new_upload(
  File "/home/yoh/proj/dandi/dandi-cli/dandi/upload.py", line 897, in _new_upload
    for v in process_path(path, relpath):
  File "/home/yoh/proj/dandi/dandi-cli/dandi/upload.py", line 775, in process_path
    validation_errors = validate_file(path)
  File "/home/yoh/proj/dandi/dandi-cli/dandi/validate.py", line 41, in validate_file
    return pynwb_validate(filepath, devel_debug=devel_debug) + validate_dandi_nwb(
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/fscacher/cache.py", line 116, in fingerprinter
    ret = fingerprinted(path, *args, **kwargs_)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/joblib/memory.py", line 591, in __call__
    return self._cached_call(args, kwargs)[0]
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/joblib/memory.py", line 534, in _cached_call
    out, metadata = self.call(*args, **kwargs)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/joblib/memory.py", line 761, in call
    output = self.func(*args, **kwargs)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/site-packages/fscacher/cache.py", line 84, in fingerprinted
    return f(path, *args, **kwargs)
  File "/home/yoh/proj/dandi/dandi-cli/dandi/pynwb_utils.py", line 268, in validate
    if loosever and loosever < "2.1.0":
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/distutils/version.py", line 52, in __lt__
    c = self._cmp(other)
  File "/home/yoh/miniconda3/envs/dandi-devel/lib/python3.8/distutils/version.py", line 335, in _cmp
    if self.version == other.version:
AttributeError: 'LooseVersion' object has no attribute 'version'
which is odd ... fscacher 0.1.0 and joblib 1.0.1

@yarikoptic
Copy link
Member

yarikoptic commented Mar 24, 2021

damn - I think I was running an outdated version of this PR (and I thought I had checked)... will hard reset and redo now

edit: so we again would need to wait for re-digetsion, but at least it succeeded on some random tiny file so I know that now using correct state of the PR

edit2: I wish I we had not postponed finishin up/merging #465 -- now have no clue how long redigestion would take

@jwodder
Copy link
Member Author

jwodder commented Mar 24, 2021

@yarikoptic I see the problem. get_nwb_version() is returning None for the file, leading to an invalid LooseVersion instance.

EDIT: I've pushed a patch.

@yarikoptic
Copy link
Member

yarikoptic commented Mar 24, 2021

THANKS! I will interrupt, update, redo

edit: also FWIW tested on some other non nwb h5 file /usr/lib/python3/dist-packages/tables/tests/scalar.h5 just in case

@yarikoptic
Copy link
Member

Since CI is happy and basic testing is happy and that huge upload would take awhile, let's take this PR from draft and merge? This should make CI happy for PRs so we do miss us breaking anything unrelated

@jwodder jwodder marked this pull request as ready for review March 24, 2021 16:55
@yarikoptic yarikoptic merged commit 089101e into master Mar 24, 2021
@yarikoptic yarikoptic deleted the gh-478 branch March 24, 2021 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
patch Increment the patch version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RF/BF: adjust dandi-api upload for changes in the API
4 participants