Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHA hash should happen before conversion from YAML to JSON #3415

Closed
EricWittmann opened this issue Jun 7, 2023 · 3 comments
Closed

SHA hash should happen before conversion from YAML to JSON #3415

EricWittmann opened this issue Jun 7, 2023 · 3 comments
Labels
area/storage type/bug Something isn't working

Comments

@EricWittmann
Copy link
Member

Description

Currently we hash the content (and references) to get an artifact hash (aka "contentHash"). If YAML content is uploaded, we convert it to JSON before storing it and validating it and everything else. Right now we hash the content after converting it from YAML to JSON. But that's not what we should be doing - we should hash it before the conversion. The point of the hash is to have an ID that can be calculated by the user before registration. For that we need to hash the original content.

Registry Version: 2.4.x
Persistence type: All

@EricWittmann EricWittmann added the type/bug Something isn't working label Jun 7, 2023
@andreaTP
Copy link
Member

andreaTP commented Jun 9, 2023

I do agree with this analysis, but the proposed resolution is probably going to make things more inconsistent.
Let me try to explain:

  • upload some yaml content
  • the hash is calculated on the original content but the stored file is something different
  • it becomes impossible to obtain the content back and verify the content hash against it

Possibly, the only way to achieve the expected semantic is to just handle yaml as a first-class citizen and avoid any manipulation of the user content (translations are almost always lossy on edge cases).

@EricWittmann
Copy link
Member Author

I agree that we should handle YAML as a first-class citizen. That's the ideal solution for sure.

@EricWittmann
Copy link
Member Author

The is done in 3.0 - the content type of the content is now required when creating artifacts and versions. If the content is YAML we now keep it as YAML. Any code that parses the content must obey the content type (parse YAML or JSON accordingly). This only works for OpenAPI and AsyncAPI types, because those are the only two types that commonly support both YAML and JSON formats.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/storage type/bug Something isn't working
Projects
Status: Done
Development

No branches or pull requests

2 participants