Failing to fix a json string containing parenthesis ( or ) #126

Open
tybalex opened this issue May 7, 2024 · 5 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments


tybalex commented May 7, 2024

The Problem
When I tried to use the library to fix the following string, it failed:
Input:

{"name": "run", "args": {"param1": ["This is C(2)", "This is F(3)]}}

Output:

Error: Object key expected at position 62

========================
However, when the unterminated string itself contains no '(' or ')' characters, the library produces a correct fix:
Input:

{"name": "run", "args": {"param1": ["This is C(2)", "This is F3]}}

Output:

{"name": "run", "args": {"param1": ["This is C(2)", "This is F3"]}}

Is this behavior expected, or is it a bug?

@tybalex
Author

tybalex commented May 7, 2024

Another example:
Input:

{"name": "run", "args": {"param1": ["This is C(2)", This is F(3)]}}

Output:

{"name": "run", "args": {"param1": ["This is C(2)", 3]}}

@josdejong
Owner

Thanks for your input.

Just curious: did you encounter this broken JSON in a real-world example, or did you make it up?

The limitation originates in the code that finds the end of a string when the closing quote is missing. It currently stops at the next delimiter, and ( and ) count as delimiters. That is needed to recognize MongoDB data types and JSONP notation. To prevent, for example, This is F(3) from being identified as a MongoDB/JSONP function and replaced with 3, we should refine the logic, for example by checking for the absence of spaces in the name and/or checking against a list of known MongoDB data types.
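The refinement suggested above could look roughly like the sketch below. It is not jsonrepair's actual code: `looksLikeFunctionCall`, `CALL_PATTERN`, and `KNOWN_MONGO_TYPES` are hypothetical names, and the list of type names is an illustrative, non-exhaustive assumption.

```typescript
// Illustrative sketch only, not the implementation used by jsonrepair.
// Assumed, non-exhaustive list of MongoDB shell type names.
const KNOWN_MONGO_TYPES: string[] = [
  'ObjectId', 'NumberLong', 'NumberInt', 'NumberDecimal', 'ISODate', 'Timestamp'
]

// Matches `name(args)` where `name` is a single identifier without spaces.
const CALL_PATTERN = /^([A-Za-z_$][A-Za-z0-9_$]*)\((.*)\)$/

// Hypothetical helper: decide whether unquoted text should be treated as a
// MongoDB data type or JSONP call rather than as a string missing its quotes.
function looksLikeFunctionCall(text: string, requireKnownType: boolean = false): boolean {
  const match = CALL_PATTERN.exec(text.trim())
  if (match === null) {
    // The name contains spaces, or the text is not of the form name(...)
    return false
  }
  // Optionally accept only names from the known list
  return requireKnownType ? KNOWN_MONGO_TYPES.indexOf(match[1]) !== -1 : true
}
```

With this sketch, `looksLikeFunctionCall('This is F(3)')` returns `false` because the name contains spaces, while `looksLikeFunctionCall('NumberLong(2)')` returns `true`. The space check alone would still accept a bare `F(3)`; passing `requireKnownType = true` rejects it, at the cost of also rejecting JSONP callbacks with arbitrary names.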

josdejong added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on May 8, 2024
josdejong reopened this on May 8, 2024
@josdejong
Owner

Via 58fe64c I've made the detection of MongoDB/JSONP function calls more robust. This solves the issue of jsonrepair silently changing an unquoted string containing parentheses: the library now consistently throws an exception.

The fix is not yet published.

Repairing an unquoted string containing parentheses would be a next step.

@tybalex
Author

tybalex commented May 8, 2024

Hi @josdejong , thank you for the quick response! I really like this tool.

To answer your question: this is a real-world example. It was produced by an LLM (large language model) we trained, and I was trying to use jsonrepair to fix some of the broken JSON produced by the LLM.
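For this kind of LLM post-processing, one possible pattern is to parse strictly first and only fall back to a repair step when parsing fails, treating a thrown repair error as a recoverable result. The sketch below is an assumption about how a caller might structure this, not part of jsonrepair: `parseModelOutput`, `RepairFn`, and `ParseResult` are hypothetical names, and the repair function is passed in as a parameter so that any repair routine (for example, one that wraps jsonrepair) can be used.

```typescript
// Hypothetical signature for any repair function supplied by the caller.
type RepairFn = (text: string) => string

interface ParseResult {
  ok: boolean
  value?: unknown
  error?: string
}

// Try strict parsing first; only fall back to repairing when that fails.
function parseModelOutput(text: string, repair: RepairFn): ParseResult {
  try {
    return { ok: true, value: JSON.parse(text) }
  } catch (ignored) {
    // Not valid JSON as-is, so fall through to the repair step
  }
  try {
    return { ok: true, value: JSON.parse(repair(text)) }
  } catch (err) {
    // The repair step threw, or its output still was not valid JSON
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}
```

Returning the error message instead of rethrowing lets the caller decide what to do with unrepairable output, such as logging it or asking the model again.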

@josdejong
Owner

Thanks, good to know.

josdejong added the enhancement (New feature or request) label and removed the bug (Something isn't working) label on May 15, 2024
josdejong changed the title from "Error: Object key expected at position: xx when fixing json string contains ( or )" to "Failing to fix a json string containing parenthesis ( or )" on Aug 28, 2024