First of all, thank you very much for the great work and for providing the datacontract-cli!
While using it, we noticed three topics:
We use the datacontract-cli to check the structure and content of large JSON files. Currently, the fastjsonschema.compile process stops at the first error, so if a JSON file contains multiple errors, the datacontract-cli has to be run several times until all of them are fixed. We also considered running data_contract.test() for each object in the JSON file, but the overhead is too large and performance is poor. Would it be possible to adjust the validation call so that all errors are reported in a single run?
Additionally, we noticed that in data_contract.py on line 202, the model name is not included in the list of executed checks.
Lastly, it would be really helpful for debugging if we could identify which JSON object in an array caused an error. Could the validate_json_stream method be extended so that it looks up the primary key of the related model in the data contract and, where available, returns that attribute's value along with the error?
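To make the first topic concrete, here is a minimal sketch of validating every record and collecting the failures instead of stopping at the first one. It is not the datacontract-cli implementation: `collect_errors` and the toy validator `require_id` are hypothetical names. With fastjsonschema, `validate` would presumably be the function returned by `fastjsonschema.compile(schema)` and `error_type` would be `fastjsonschema.JsonSchemaException`. The compiled validator still raises on the first violation within a record, so this yields at most one error per record.

```python
from typing import Any, Callable, Iterable, List, Type


def collect_errors(
    records: Iterable[Any],
    validate: Callable[[Any], Any],
    error_type: Type[Exception] = Exception,
) -> List[str]:
    """Validate every record and return all error messages found."""
    errors: List[str] = []
    for record in records:
        try:
            validate(record)
        except error_type as exc:
            # Remember the failure and keep going with the next record.
            errors.append(str(exc))
    return errors


def require_id(record: dict) -> None:
    # Toy validator used only for this sketch (hypothetical).
    if "id" not in record:
        raise ValueError("data must contain ['id'] properties")


errors = collect_errors([{"id": 1}, {}, {"name": "x"}], require_id, ValueError)
```

Because the schema is compiled once and only the cheap per-record call is repeated, this should avoid the overhead of running data_contract.test() for each object.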
Let me know if I can help to create a pull request for those topics!
Best regards
Niklas
Thanks for your feedback and these improvement ideas.
I understand the idea, and it would certainly be doable. I hesitate a little, though: what happens when you have millions of JSON records that all have the same schema issue? Would you really want to see millions of errors in the response? We would need to implement a maximum number of errors (say, 100) here.
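A cap like the one suggested above could be sketched roughly as follows. This is only an illustration: `collect_errors_capped`, the `max_errors` parameter, and the toy validator `must_be_even` are hypothetical and not part of datacontract-cli. The function stops as soon as one more error than the cap is seen, and returns a flag so the caller can tell the user that the list was truncated.

```python
from typing import Any, Callable, Iterable, List, Tuple


def collect_errors_capped(
    records: Iterable[Any],
    validate: Callable[[Any], Any],
    max_errors: int = 100,
) -> Tuple[List[str], bool]:
    """Return up to max_errors messages plus a flag telling whether more exist."""
    errors: List[str] = []
    for record in records:
        try:
            validate(record)
        except ValueError as exc:
            if len(errors) >= max_errors:
                # One error beyond the cap is enough to know the list is incomplete.
                return errors, True
            errors.append(str(exc))
    return errors, False


def must_be_even(n: int) -> None:
    # Toy validator used only for this sketch (hypothetical).
    if n % 2:
        raise ValueError(f"{n} is not even")


# 500 of these 1000 records are invalid, but only the first 100 are reported.
errors, truncated = collect_errors_capped(range(1000), must_be_even, max_errors=100)
```

Stopping early also means a file with millions of identical failures is not scanned to the end just to fill a report nobody would read.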
Fixed and committed to main.
Agreed. I'm not sure about the primary key, but we could include the index of the failing record in the error message.
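As a rough sketch of what such a message could look like, the helper below prefixes an error with the record's position in the array and, if a key field is given and present, that field's value. `describe_error` and `key_field` are hypothetical names chosen for illustration. How the primary key would be read from the data contract model is left open here.

```python
from typing import Any, Optional


def describe_error(
    index: int,
    record: Any,
    message: str,
    key_field: Optional[str] = None,
) -> str:
    """Build an error message that says which record in the array failed."""
    location = f"record {index}"
    # Add the key value only when the caller named a field and the record has it.
    if key_field is not None and isinstance(record, dict) and key_field in record:
        location += f" ({key_field}={record[key_field]!r})"
    return f"{location}: {message}"


with_key = describe_error(
    2, {"order_id": "A-17"}, "data.amount must be number", key_field="order_id"
)
without_key = describe_error(0, {"amount": "x"}, "data.amount must be number")
```

The index is always available from `enumerate` over the stream, so it works even for models without a primary key, while the key value is an optional extra.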