Full CAR verification #4

Open
1 of 7 tasks
guanzo opened this issue Sep 6, 2023 · 1 comment

@guanzo
Collaborator

guanzo commented Sep 6, 2023

Tracking all cases that need to be implemented in order for CAR verification to be considered "done".

  • incremental verification
    • i.e. "does actual next CAR block == expected next CAR block"
  • generic DAG traversal
  • entity-bytes
  • dag-scope
  • handle CAR dups=n
    • current logic assumes dups=y
  • support dag-json?
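
As a rough illustration of the incremental check ("does actual next CAR block == expected next CAR block"), here is a minimal sketch using only `node:crypto`. The helpers `sha256Hex` and `verifyInOrder` are hypothetical names, and the sketch assumes sha2-256 and a precomputed list of expected digests; a real verifier would discover the next expected CID lazily while traversing the DAG.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: sha2-256 digest of a block's bytes, hex-encoded.
function sha256Hex(bytes) {
  return createHash('sha256').update(bytes).digest('hex');
}

// Hypothetical verifier. `expectedDigests` is the ordered list of digests the
// traversal says should arrive; `blocks` is what the CAR actually delivered.
// Stops at the first mismatch instead of buffering the whole CAR.
function verifyInOrder(expectedDigests, blocks) {
  let i = 0;
  for (const block of blocks) {
    if (i >= expectedDigests.length) {
      return { ok: false, index: i, reason: 'unexpected extra block' };
    }
    if (sha256Hex(block) !== expectedDigests[i]) {
      return { ok: false, index: i, reason: 'block does not match expected digest' };
    }
    i++;
  }
  if (i < expectedDigests.length) {
    return { ok: false, index: i, reason: 'CAR ended before traversal completed' };
  }
  return { ok: true, index: i };
}

// Made-up block payloads standing in for real CAR sections.
const blocks = [Buffer.from('root'), Buffer.from('leaf-a'), Buffer.from('leaf-b')];
const expected = blocks.map(sha256Hex);

console.log(verifyInOrder(expected, blocks));
console.log(verifyInOrder(expected, [blocks[0], blocks[2], blocks[1]]));
```

The second call swaps two blocks, so the check fails at the position where the order first diverges.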

Tests:

@guanzo guanzo changed the title Full verification Full CAR verification Sep 6, 2023
@rvagg

rvagg commented Sep 6, 2023

A random list of things to consider putting in the mix; it's up to you how far you go down this rabbit hole. (Note that this list is very complete, and some of it is taken for granted in Go code because Go doesn't have the dependency problems that you do in JS: a bunch of stuff comes along for the ride without you even asking for it.) This list represents the kinds of data you're likely to see at some point. At least 95% (likely 99.9% or more for the Saturn case) of what you're going to see is UnixFS with SHA2-256, with a split of CIDv1 and CIDv0.

  • Formats (codecs) to test:
    • UnixFS - you have this as the base, so 👍
    • dag-pb that's not UnixFS (you can just construct some blocks with bytes and links using @ipld/dag-pb; don't have the unixfs utils make the Data section for you, put something else in there)
    • dag-cbor
    • dag-json
    • dag-jose
  • Hashers
    • sha2-384 and 512 - this is fun because it tests assumptions about sizes
    • sha3
    • blake2b (mainly 256, I don't think I've seen anything other than 256 in the wild, but the @multiformats/blake2 package has all of them)
    • blake3 (maybe; I've not seen it in the wild for ipld, but n0 is promoting it, and it came up recently in "error on Blake3 CID", ipfs/ipld-explorer-components#394)
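
To make the "assumptions about sizes" point concrete, here is a small sketch that builds multihash bytes by hand for the three sha2 variants. It assumes the multicodec table codes 0x12 (sha2-256), 0x20 (sha2-384) and 0x13 (sha2-512); `HASHERS`, `varint` and `multihash` are illustrative names, not part of any library.

```javascript
const { createHash } = require('node:crypto');

// Multihash codes (assumed from the multicodec table) mapped to the
// algorithm names that node:crypto understands.
const HASHERS = {
  'sha2-256': { code: 0x12, node: 'sha256' },
  'sha2-384': { code: 0x20, node: 'sha384' },
  'sha2-512': { code: 0x13, node: 'sha512' },
};

// Unsigned LEB128 varint as used by multiformats (fine for small values).
function varint(n) {
  const out = [];
  while (n >= 0x80) {
    out.push((n & 0x7f) | 0x80);
    n >>>= 7;
  }
  out.push(n);
  return Buffer.from(out);
}

// Multihash layout: <varint code><varint digest length><digest bytes>.
function multihash(name, bytes) {
  const { code, node } = HASHERS[name];
  const digest = createHash(node).update(bytes).digest();
  return Buffer.concat([varint(code), varint(digest.length), digest]);
}

for (const name of Object.keys(HASHERS)) {
  console.log(name, multihash(name, Buffer.from('hello')).length);
}
```

Code that hard-codes a 32-byte digest (or a 34-byte multihash) works for sha2-256 but breaks on the 48-byte and 64-byte digests, which is the kind of assumption these test cases are meant to flush out.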

Filecoin DAGs are interesting to play with and you're likely to see them pass through; they're all dag-cbor/blake2b-256.

For duplicates, the usual approach for testing this is to make a file full of zeros (or some other byte) that's long enough to span multiple blocks and ask the unixfs packer to make it into a DAG. You can end up with a huge file packed into ~3 blocks, and dups=y vs dups=n should give you a different CAR for it.
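
A rough sketch of why the zero-filled file works as a duplicates fixture, using only `node:crypto`. It assumes a fixed-size 256 KiB chunker and sha2-256, and it only counts leaf chunks; it does not build the UnixFS parent nodes or write a CAR.

```javascript
const { createHash } = require('node:crypto');

const CHUNK_SIZE = 256 * 1024; // assumed fixed-size chunker
const file = Buffer.alloc(4 * CHUNK_SIZE); // 1 MiB of zero bytes

// Hash every chunk the way a packer would before assigning it a CID.
const leafDigests = [];
for (let offset = 0; offset < file.length; offset += CHUNK_SIZE) {
  const chunk = file.subarray(offset, offset + CHUNK_SIZE);
  leafDigests.push(createHash('sha256').update(chunk).digest('hex'));
}

// dups=y would emit a leaf block every time the traversal visits it;
// dups=n would emit each distinct block only once.
const withDups = leafDigests.length;
const withoutDups = new Set(leafDigests).size;

console.log({ withDups, withoutDups });
```

Every chunk has the same bytes and therefore the same digest, so the two modes should differ in how many leaf sections end up in the CAR.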

Also, identity CIDs: we all forget them, but pay attention to them both being in the DAG and being in the CAR as explicit sections (i.e. they can be referenced in the DAG in the CAR without being recorded as blocks in their own sections, or they can be recorded as blocks). According to the spec, we shouldn't allow or expect them to be represented as separate CAR blocks, but it won't be surprising to see them slip through implementations and show up!
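
For the identity case, a verifier first has to recognise that a CID carries its data inline. The sketch below parses binary CIDv1 bytes by hand, assuming the layout `<version><codec><multihash>` and the identity multihash code 0x00; `readVarint` and `inlineData` are hypothetical helpers, and a real implementation would more likely use the CID class from the `multiformats` package.

```javascript
// Decode one unsigned varint starting at `offset`; returns [value, nextOffset].
function readVarint(bytes, offset) {
  let value = 0;
  let shift = 0;
  for (let i = offset; i < bytes.length; i++) {
    const b = bytes[i];
    value += (b & 0x7f) * 2 ** shift;
    shift += 7;
    if ((b & 0x80) === 0) return [value, i + 1];
  }
  throw new Error('truncated varint');
}

const IDENTITY = 0x00; // assumed multihash code for "identity"

// Returns the inline data if `cidBytes` is a CIDv1 with an identity multihash,
// otherwise null (this includes CIDv0, which is always sha2-256).
function inlineData(cidBytes) {
  const [version, afterVersion] = readVarint(cidBytes, 0);
  if (version !== 1) return null;
  const [, afterCodec] = readVarint(cidBytes, afterVersion);
  const [hashCode, afterHashCode] = readVarint(cidBytes, afterCodec);
  if (hashCode !== IDENTITY) return null;
  const [length, start] = readVarint(cidBytes, afterHashCode);
  if (start + length !== cidBytes.length) {
    throw new Error('identity digest length does not match CID length');
  }
  return cidBytes.subarray(start, start + length);
}

// CIDv1, raw codec (0x55), identity multihash carrying the two bytes "hi".
const identityCid = Buffer.from([0x01, 0x55, 0x00, 0x02, 0x68, 0x69]);
console.log(inlineData(identityCid).toString());
```

When `inlineData` returns bytes, the block content is already known from the CID itself, so the verifier can treat a matching CAR section as optional rather than required.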
