Transfer manifest mismatch between pds-deep-archive and pds-deep-registry-archive #158

Closed
jordanpadams opened this issue Feb 8, 2024 · 8 comments
Labels
B14.1 bug Something isn't working i&t.done s.high

Comments

@jordanpadams
Member

jordanpadams commented Feb 8, 2024

Checked for duplicates

Yes - I've already checked

πŸ› Describe the bug

When I generate AIPs with both pds-deep-archive and pds-deep-registry-archive, the transfer manifests they produce do not have the same line length.

πŸ•΅οΈ Expected behavior

I expected the transfer manifests from both tools to have the same line length.

📜 To Reproduce

$ pds-deep-archive --debug -b https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/ -s PDS_ATM test/data/ladee_test/mission_bundle/LADEE_Bundle_1101.xml
...
 
$ awk '{print length}' ladee_mission_bundle_v1.0_20240208_transfer_manifest_v1.0.tab | uniq -c
  13 511
$ pds-deep-registry-archive -s PDS_GEO urn:nasa:pds:magellan_gxdr::1.0
...

$ awk '{print length}' magellan_gxdr_v1.0_20240208_transfer_manifest_v1.0.tab | uniq -c
 109 512
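
For anyone without awk at hand, the same per-line length check can be sketched in Python. This is only an illustration: the function name is made up here, the sample data is synthetic rather than real manifest content, and it assumes the manifest is plain ASCII with CRLF line endings (so, as with awk's `length`, the carriage return is counted but the line feed is not). Unlike `uniq -c`, it reports totals per length rather than consecutive runs.

```python
from collections import Counter


def line_length_counts(data: bytes) -> Counter:
    """Count how many lines have each length, excluding the trailing LF.

    As with awk's `length`, a carriage return before the LF is still counted.
    """
    lines = data.split(b"\n")
    # A file that ends with LF yields one empty trailing element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return Counter(len(line) for line in lines)


if __name__ == "__main__":
    # Synthetic stand-ins for the two manifests (assumed layout, not real data).
    deep_archive = (b"x" * 510 + b"\r\n") * 13
    registry_archive = (b"x" * 511 + b"\r\n") * 109
    print(line_length_counts(deep_archive))      # Counter({511: 13})
    print(line_length_counts(registry_archive))  # Counter({512: 109})
```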

🖥 Environment Info

macOS
pds-deep-archive.git@ead418a5b2f7ec943cb821dfd2b948fa9db538e7

🩺 Test Data / Additional context

  • deep-registry-archive - urn:nasa:pds:magellan_gxdr::1.0
  • deep-archive - test/data/ladee_test/mission_bundle/

🦄 Related bugs

Tightly coupled with #155

βš™οΈ Engineering Details

When I update the Magellan label from #155 to change record_length to 513 and File Specification Name to 256, it validates successfully, which indicates that the File Specification Name field may be 1 byte too long in the table.

I also tested removing the extra space from each line instead, and the product then validated successfully:

$ sed -i '' 's/ \r/\r/' magellan_gxdr_v1.0_20240208_transfer_manifest_v1.0.tab

Updated label to have <file_size unit="byte">55808</file_size>
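
The 55808 figure is consistent with the numbers above: 109 records of 512 bytes each (511 characters as counted by awk, plus the line feed). A rough Python sketch of the same repair and arithmetic follows. The function name is hypothetical and the record content is synthetic; the CRLF layout and the single padding space before the carriage return are assumptions inferred from the sed command and the awk output, not taken from the tool's source.

```python
def strip_space_before_cr(data: bytes) -> bytes:
    """Remove one space that directly precedes a CR, at most once per line."""
    fixed = []
    for line in data.split(b"\n"):
        # Like sed's `s/ \r/\r/` without the `g` flag: at most one
        # substitution per line.
        fixed.append(line.replace(b" \r", b"\r", 1))
    return b"\n".join(fixed)


if __name__ == "__main__":
    # One synthetic record: 510 data bytes, 1 extra space, CR, LF = 513 bytes.
    record = b"x" * 510 + b" \r\n"
    original = record * 109
    repaired = strip_space_before_cr(original)
    print(len(original))  # 55917 = 109 * 513
    print(len(repaired))  # 55808 = 109 * 512
```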

$ validate -t magellan_gxdr_v1.0_20240208_aip_v1.0.xml
...
Summary:

  1 product(s)
  0 error(s)
  0 warning(s)

  Product Validation Summary:
    1          product(s) passed
    0          product(s) failed
    0          product(s) skipped
    1          product(s) total

  Referential Integrity Check Summary:
    0          check(s) passed
    0          check(s) failed
    0          check(s) skipped
    0          check(s) total
@nutjob4life
Member

@jordanpadams hate to be a pill about this, but could you include the command-line invocations for both pds-deep-archive and pds-deep-registry-archive? Was hoping to see those under "To Reproduce" 🙏

@jordanpadams
Member Author

@nutjob4life updated

@nutjob4life
Member

@jordanpadams fantastic, thanks! Here I thought it was dependent on the --base-url (hence I wanted to see the invocations) but it turns out the issue is just the "column size". This is easy enough.

@tloubrieu-jpl
Member

@gxtchen, you can test by downloading the files in this directory: https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/mission_bundle/

@gxtchen

gxtchen commented Apr 12, 2024

@jordanpadams @tloubrieu-jpl @nutjob4life I tried with v1.1.5 and v1.2.0, and I am still seeing different numbers:
(pds-deeparchive) gxchen@RAYL-C01494 ~/pds/pds4test.build14.1/deep-archive/#158$ awk '{print length}' magellan_gxdr_v1.0_20240412_transfer_manifest_v1.0.tab | uniq -c
109 511
(pds-deeparchive) gxchen@RAYL-C01494 ~/pds/pds4test.build14.1/deep-archive/#158$ awk '{print length}' ladee_mission_bundle_v1.0_20240412_transfer_manifest_v1.0.tab | uniq -c
13 511

@nutjob4life
Member

nutjob4life commented Apr 12, 2024

In the output pasted into the comment above, I'm seeing 511 and 511, which is correct. Are some different numbers expected?

Here are my reproduction steps (let me know if I did something wrong 😬):

$ cd /tmp
$ wget \
    --quiet \
    --execute robots=off \
    --cut-dirs=2 \
    --reject='index.html*' \
    --no-host-directories \
    --mirror \
    --no-parent \
    --relative \
    --timestamping \
    --no-check-certificate \
    --recursive \
    https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/mission_bundle/
$ python3.9 -m venv 1.1.5
$ cd 1.1.5
$ bin/pip install --quiet --upgrade pip pds.deeparchive==1.1.5
$ bin/pds-deep-archive --version
pds-deep-archive 1.1.5
$ bin/pds-deep-archive --quiet --bundle-base-url https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/ --site PDS_ATM ../PDS4/LADEE/mission_bundle/LADEE_Bundle_1101.xml
$ awk '{print length}' ladee_mission_bundle_v1.0_*_transfer_manifest_v1.0.tab | uniq -c
  13 511
$ echo "511 columns is correct"
511 columns is correct 
$ bin/pds-deep-registry-archive --version
pds-deep-reigstry-archive 1.1.5
$ bin/pds-deep-registry-archive --quiet --site PDS_GEO urn:nasa:pds:magellan_gxdr::1.0
$ awk '{print length}' magellan_gxdr_v1.0_*_transfer_manifest_v1.0.tab | uniq -c
  109 511
$ echo "511 columns is correct"
511 columns is correct 
$ cd ..
$ git clone https://github.com/NASA-PDS/deep-archive.git 1.2.0
$ python3.9 -m venv 1.2.0
$ cd 1.2.0
$ bin/pip install --quiet --upgrade pip
$ bin/pip install --quiet --editable .
$ bin/pds-deep-archive --version
pds-deep-archive 1.2.0
$ bin/pds-deep-archive --quiet --bundle-base-url https://atmos.nmsu.edu/PDS/data/PDS4/LADEE/ --site PDS_ATM ../PDS4/LADEE/mission_bundle/LADEE_Bundle_1101.xml
$ awk '{print length}' ladee_mission_bundle_v1.0_*_transfer_manifest_v1.0.tab | uniq -c
  13 511
$ echo "511 columns is correct"
511 columns is correct 
$ bin/pds-deep-registry-archive --version
pds-deep-reigstry-archive 1.2.0
$ bin/pds-deep-registry-archive --quiet --site PDS_GEO urn:nasa:pds:magellan_gxdr::1.0
$ awk '{print length}' magellan_gxdr_v1.0_*_transfer_manifest_v1.0.tab | uniq -c
  109 511
$ echo "511 columns is correct"
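
The repeated awk checks above could also be collapsed into one pass over a working directory. The following is a minimal sketch, not part of deep-archive: both function names are invented for illustration, the glob pattern simply mirrors the file names seen in this thread, and the expected width of 511 is the value reported as correct above.

```python
import tempfile
from pathlib import Path


def manifest_widths(directory: Path,
                    pattern: str = "*_transfer_manifest_v1.0.tab") -> dict:
    """Map each matching file name to the set of its line lengths (LF excluded)."""
    widths = {}
    for path in sorted(directory.glob(pattern)):
        lines = path.read_bytes().split(b"\n")
        # Drop the empty element produced by a trailing LF.
        if lines and lines[-1] == b"":
            lines.pop()
        widths[path.name] = {len(line) for line in lines}
    return widths


def all_uniform(widths: dict, expected: int = 511) -> bool:
    """True when every manifest has exactly one line length, equal to expected."""
    return all(w == {expected} for w in widths.values())


if __name__ == "__main__":
    # Synthetic files standing in for real manifests.
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        (work / "ladee_transfer_manifest_v1.0.tab").write_bytes(
            (b"x" * 510 + b"\r\n") * 13)
        (work / "magellan_transfer_manifest_v1.0.tab").write_bytes(
            (b"x" * 510 + b"\r\n") * 109)
        found = manifest_widths(work)
        print(found)
        print(all_uniform(found))  # True
```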

@gxtchen

gxtchen commented Apr 14, 2024

@nutjob4life you are right. I misread Jordan's original post; he got 511 and 512. All good, thanks.

@nutjob4life
Member

@gxtchen whew! Thanks for confirming. Was worried I was losing my mind for a bit there 🤪

Projects
Status: 🏁 Done
Development

No branches or pull requests

4 participants