Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore records with IDENTITY CID in IndexSorted #218

Merged
merged 1 commit into from
Sep 8, 2021

Conversation

masih
Copy link
Member

@masih masih commented Sep 2, 2021

Skip IDENTITY CIDs in IndexSorted, as required by CARv2
specification.

Re-generate sample-wrapped-v2.car testdata, since its previous index
contained CIDs with IDENTITY multihash.

Gracefully handle CIDs with IDENTITY multihash code in the
blockstore API. Since Get, GetSize and Has interfaces rely on
the index, decode given keys directly to satisfy calls for such CIDs.

See:

Fixes #215

@masih masih marked this pull request as ready for review September 2, 2021 13:16
@willscott
Copy link
Member

I'm not totally opposed to this as a defensive mechanism.

in the past we've used the pattern of wrapping the blockstore in an idstore https://github.com/ipfs/go-ipfs-blockstore/blob/master/idstore.go when there is the potential for interaction with identity-hashed blocks.
I wonder if we're okay at this layer just panic-ing / being unhappy if they show up and expecting that wrapping layer to take care of them?

@masih
Copy link
Member Author

masih commented Sep 2, 2021

I wonder if we're okay at this layer just panic-ing / being unhappy if they show up and expecting that wrapping layer to take care of them?

Yup that makes sense; will do 🍻

@rvagg
Copy link
Member

rvagg commented Sep 3, 2021

I'm a soft +1 on all our layers behaving like this, but it's a bit tricky since if we don't do it completely then we set up a false expectation for users that they don't need to wrap or be careful with the layers they're interacting with. I don't think there's harm in doing it here, it's quite nice, especially if you just want to use a CARv2 as a raw blockstore, but the user expectation problem will be the thing that gets us, if anything.

@masih
Copy link
Member Author

masih commented Sep 3, 2021

I'm a soft +1 on all our layers behaving like this, but it's a bit tricky since if we don't do it completely then we set up a false expectation for users that they don't need to wrap or be careful with the layers they're interacting with. I don't think there's harm in doing it here, it's quite nice, especially if you just want to use a CARv2 as a raw blockstore, but the user expectation problem will be the thing that gets us, if anything.

Having implemented an explicit error when given IDENTITY CIDs in blockstore here, I am leaning towards being more forgiving in the CARv2 blockstore APIs and "doing the right thing" when given such CIDs.

This is because, in order to explicitly error out the blockstore needs to decode the CID hash, and right there we have all the information to do what IdStore does. Sure; it would be duplicating the logic of IdStore in CARv2 library, but not doing it and forcing IdStore wrapper means:

  1. potential code changes for dependant work streams if CAR files interacted with via the blockstore API often contain IDENTITY CIDs (cc @aarshkshah1992)
  2. the CIDs will get decoded twice: once by the wrapping IdStore and once by the blockstore API here to check that they are indeed not IDENTITY CIDs.

@willscott @rvagg @aarshkshah1992 What are your thoughts?

@rvagg
Copy link
Member

rvagg commented Sep 6, 2021

I think my weakly held position on this is that CARv2 is somewhat likely to be used as a stand-alone blockstore, for IPLD applications that exist outside of IPFS and want a simple immutable block store, so if the information is at hand already (i.e. it's cheap), I think it should transparently handle identity CIDs.

@masih
Copy link
Member Author

masih commented Sep 7, 2021

After much discussion, this PR now gracefully handles IDENTITY CIDs, doing what IdStore wrapper does without the need for wrapping.

Thanks @willscott @rvagg

Copy link
Contributor

@mvdan mvdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this behavior be mentioned in the blockstore package godoc? It seems pretty relevant at a high level.

v2/blockstore/readonly.go Outdated Show resolved Hide resolved
v2/blockstore/readonly.go Outdated Show resolved Hide resolved
@masih masih force-pushed the masih/idx-srtd-ignore-id branch 2 times, most recently from 8b7eb8b to c9d8b61 Compare September 8, 2021 09:25
v2/blockstore/doc.go Outdated Show resolved Hide resolved
v2/blockstore/readonly.go Outdated Show resolved Hide resolved
Skip `IDENTITY` CIDs in `IndexSorted`, as required by CARv2
specification.

Re-generate `sample-wrapped-v2.car` testdata, since its previous index
contained CIDs with `IDENTITY` multihash.

Gracefully handle CIDs with `IDENTITY` multihash code in the
`blockstore` API. Since `Get`, `GetSize` and `Has` interfaces rely on
the index, decode given keys directly to satisfy calls for such CIDs.

See:
- https://ipld.io/specs/transport/car/carv2/#index-format

Fixes #215
@masih masih merged commit acd3ead into master Sep 8, 2021
@masih masih deleted the masih/idx-srtd-ignore-id branch September 8, 2021 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IndexSorted should not load CIDs with multihash code of IDENTITY
4 participants