JSON generated with invalid escape characters #759

Closed
pmbaumgartner opened this issue Mar 19, 2024 · 14 comments · Fixed by #829
Labels
bug, JSON, structured generation

Comments

@pmbaumgartner

pmbaumgartner commented Mar 19, 2024

Describe the issue as clearly as possible:

Occasionally, outlines returns a string that is not valid JSON. This happens most often when it generates an invalid escape sequence.

This is fairly hard to replicate because the frequency of this issue depends on the model and the prompt. The example code I have below generated this error on the 3rd iteration of the loop when I ran it, but now I'm trying to replicate it again and can't get it to happen.

I monkey-patched models/llamacpp.py to print out the offending string when there's a parse error. Here is an example of JSON that fails to parse:

{"answer":[{"statement":"Privacy preferences allow limiting sharing of creditworthiness data with other banks, insurance companies, and service providers.","reason":"The context does not mention anything about privacy preferences, but it does say you cannot limit sharing with other banks/insurance companies/service providers so that you won\\\'t get offers based on the data shared by the bank, but you can limit sharing with them","verdict":1},{"statement":"You cannot limit access to credit reports themselves.","reason":"The context explicitly states: \\"You cannot limit the credit reports themselves\\", which confirms this statement","verdict":1},{"statement":"Check credit reports on websites like annualcreditreport.com.","reason":"The context mentions using \\"http://www.annualcreditreport.com/\\" to look at your credit report","verdict":1},{"statement":"Dispute adverse items found on credit reports.","reason":"The context advises that \\"you always suggest that people dispute everything adverse\\" to put the onus on other parties to prove the adverse item is valid","verdict":1},{"statement":"Consider purchasing credit protection to receive notifications about new credit taken in your name.","reason":"The context says: \\"it does not hurt to ask\\" and \\"get credit protection so you will be notified when new credit is taken in your name\\"","verdict":1}]}

Using the traceback and an online JSON parser, I think the issue occurs with the generation of the substring \\\' starting around character 343.
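The offending characters can also be located without an online parser by catching `json.JSONDecodeError` and reading its `pos` attribute. The following is a minimal sketch using only the standard library; the helper name `show_json_error` and the shortened sample strings are illustrative and not part of outlines.

```python
import json


def show_json_error(raw: str, width: int = 10):
    """Return (position, surrounding text) for the first JSON error, or None if raw parses."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        start = max(err.pos - width, 0)
        # Slice a small window around the reported position for inspection.
        return err.pos, raw[start : err.pos + width]
    return None


# Shortened stand-in for the failing output: a lone backslash before a single quote.
bad = r'{"reason":"you won\'t get offers"}'
# The same text without the stray backslash parses fine.
good = '{"reason":"you won\'t get offers"}'

print(show_json_error(bad))
print(show_json_error(good))
```

The returned window makes it easy to see the `\'` sequence in a long single-line output such as the one above.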

You can replicate this specific example (with the models in the code snippet below) by attempting to parse an object like this:

FaithfulnessStatementInput.parse_raw('{"question":"Would you do anything for love?","answer":"\\"I won\\\'t do that\\""}')

This results in the same validation error I get with the escape character.

ValidationError: 1 validation error for FaithfulnessStatementInput
__root__
  Invalid \escape: line 1 column 64 (char 63) [type=value_error.jsondecode, input_value='{"question":"Would you d...I won\\\'t do that\\""}', input_type=str]

And a valid version, just for reference:

FaithfulnessStatementInput.parse_raw('{"question":"Would you do anything for love?","answer":"\\"I won\'t do that\\""}')

My apologies for the long code below for replicating - obviously not all of it is required to generate this specific issue, but I wanted to include everything I am doing in this instance that's generating invalid JSON.

Steps/code to reproduce the bug:

from typing import List

import outlines
from datasets import load_dataset
from pydantic import BaseModel, validate_call

data = load_dataset("explodinggradients/fiqa", "ragas_eval")
a = data["baseline"].select([16]).to_dict()


class FaithfulnessStatementInput(BaseModel):
    question: str
    answer: str


class FaithfulnessStatementOutput(BaseModel):
    statements: List[str]


class FaithfulnessStatementExample(
    FaithfulnessStatementInput, FaithfulnessStatementOutput
):
    @property
    def statements_json(self):
        return self.model_dump_json(include=["statements"])


STATEMENTS_EXAMPLE_1 = FaithfulnessStatementExample(
    question="Who was Albert Einstein and what is he best known for?",
    answer="He was a German-born theoretical physicist, widely acknowledged to be one of the greatest and most influential physicists of all time. He was best known for developing the theory of relativity, he also made important contributions to the development of the theory of quantum mechanics.",
    statements=[
        "Albert Einstein, a German-born theoretical physicist, is renowned for being one of the most influential physicists in history.",
        "Albert Einstein was best known for his theory of relativity.",
        "Einstein's contributions significantly advanced the field of quantum mechanics",
        "Recognized globally, Einstein's work has profoundly impacted the scientific community",
        "Einstein's groundbreaking theories continue to shape our understanding of physics today.",
    ],
)
STATEMENTS_EXAMPLE_2 = FaithfulnessStatementExample(
    question="Cadmium Chloride is slightly soluble in this chemical, it is also called what?",
    answer="alcohol",
    statements=["Cadmium Chloride is slightly soluble in alcohol."],
)
STATEMENTS_EXAMPLE_3 = FaithfulnessStatementExample(
    question="Were Were Hitler and Benito Mussolini of the same nationality?",
    answer="Sorry, I can't provide answer to that question.",
    statements=[],
)


DEFAULT_STATEMENTS_EXAMPLES = [
    STATEMENTS_EXAMPLE_1,
    STATEMENTS_EXAMPLE_2,
    STATEMENTS_EXAMPLE_3,
]


@outlines.prompt
@validate_call
def faithfulness_statements(
    input: FaithfulnessStatementInput,  # noqa
    examples: List[FaithfulnessStatementExample] = DEFAULT_STATEMENTS_EXAMPLES,  # noqa
):
    # This is a combination of the RAGAS template and the DeepEval template
    """
    Anaylize the provided question and answer pairs and identify one or more statements from each sentence in the given answer. \
    A statement is a claim or informational point that is present in the answer or can be inferred from the answer and question.

    ## Examples:
    {% for example in examples %}
    Question: {{example.question}}
    Answer: {{example.answer}}
    Statements: {{example.statements_json}}

    {% endfor %}

    ## Actual:
    Question: {{input.question}}
    Answer: {{input.answer}}
    Statements:
    """


model_path = "models/openhermes-2.5-neural-chat-v3-3-slerp.Q4_K_M.gguf"

model = outlines.models.llamacpp(model_path, n_ctx=0, max_tokens=0, n_gpu_layers=-1)

generator = outlines.generate.json(model, FaithfulnessStatementOutput)

st = FaithfulnessStatementInput(
    question=a["question"][0],
    answer=a["answer"][0],
)

statements_result = generator(
    faithfulness_statements(input=st, examples=DEFAULT_STATEMENTS_EXAMPLES)
)


class FaithfulnessNLIInput(BaseModel):
    context: List[str]
    statements: List[str]


class FaithfulnessNLIAnswer(BaseModel):
    statement: str
    reason: str
    verdict: int


class FaithfulnessNLIOutput(BaseModel):
    answer: List[FaithfulnessNLIAnswer]


class FaithfulnessNLIExample(FaithfulnessNLIInput, FaithfulnessNLIOutput):
    @property
    def answer_json(self):
        return self.model_dump_json(include=["answer"])


NLI_EXAMPLE_1 = FaithfulnessNLIExample(
    context=[
        "John is a student at XYZ University. He is pursuing a degree in Computer Science. He is enrolled in several courses this semester, including Data Structures, Algorithms, and Database Management. John is a diligent student and spends a significant amount of time studying and completing assignments. He often stays late in the library to work on his projects."
    ],
    statements=[
        "John is majoring in Biology.",
        "John is taking a course on Artificial Intelligence.",
        "John is a dedicated student.",
        "John has a part-time job.",
    ],
    answer=[
        FaithfulnessNLIAnswer(
            statement="John is majoring in Biology.",
            reason="John's major is explicitly mentioned as Computer Science. There is no information suggesting he is majoring in Biology.",
            verdict=0,
        ),
        FaithfulnessNLIAnswer(
            statement="John is taking a course on Artificial Intelligence.",
            reason="The context mentions the courses John is currently enrolled in, and Artificial Intelligence is not mentioned. Therefore, it cannot be deduced that John is taking a course on AI.",
            verdict=0,
        ),
        FaithfulnessNLIAnswer(
            statement="John is a dedicated student.",
            reason="The context states that he spends a significant amount of time studying and completing assignments. Additionally, it mentions that he often stays late in the library to work on his projects, which implies dedication.",
            verdict=1,
        ),
        FaithfulnessNLIAnswer(
            statement="John has a part-time job.",
            reason="There is no information given in the context about John having a part-time job.",
            verdict=0,
        ),
    ],
)
NLI_EXAMPLE_2 = FaithfulnessNLIExample(
    context=[
        "Photosynthesis is a process used by plants, algae, and certain bacteria to convert light energy into chemical energy."
    ],
    statements=["Albert Einstein was a genius."],
    answer=[
        FaithfulnessNLIAnswer(
            statement="Albert Einstein was a genius.",
            reason="The context and statement are unrelated",
            verdict=0,
        ),
    ],
)
NLI_EXAMPLE_3 = FaithfulnessNLIExample(
    context=[
        "Albert Einstein was a German-born theoretical physicist who is widely held to be one of the greatest and most influential scientists of all time."
    ],
    statements=[],
    answer=[
        FaithfulnessNLIAnswer(
            statement="",
            reason="No statements were provided",
            verdict=-1,
        ),
    ],
)

DEFAULT_NLI_EXAMPLES = [NLI_EXAMPLE_1, NLI_EXAMPLE_2, NLI_EXAMPLE_3]


@outlines.prompt
def faithfulness_nli(
    input: FaithfulnessNLIInput,  # noqa: ARG001
    examples: List[FaithfulnessNLIExample] = DEFAULT_NLI_EXAMPLES,  # noqa: ARG001
):
    """
    You are an expert in natural language inference. \
    The goal is to determine if a given statement can be logically inferred or deduced from the provided context.
    Use only 'Yes' (1), 'No' (0) and 'Null' (-1) as verdict.

    'Yes' (1) means the statement can be inferred from the context with certainty.
    'No' (0) means the statement cannot be inferred from the context or contradicts the information provided.
    'Null' (-1) means there is insufficient information in the context to determine if the statement is true or false.

    ## Examples:
    {% for example in examples %}
    Context: {{example.context}}
    Statements: {{example.statements}}
    Answer: {{example.answer_json}}

    {% endfor %}

    ## Actual:
    Context: {{input.context}}
    Statements: {{input.statements}}
    Answer:
    """


nli_task = FaithfulnessNLIInput(
    context=a["contexts"][0], statements=statements_result.statements
)

generator_nli = outlines.generate.json(model, FaithfulnessNLIOutput)

requests_batched = [
    faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES) for _ in range(100)
]
# Run until ValidationError is generated
nli_results_batched = generator_nli(requests_batched)

Expected result:

Valid JSON with no invalid escapes that can be parsed back into the pydantic model.

Error message:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/pydantic/main.py:1097, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
   1096 try:
-> 1097     obj = parse.load_str_bytes(
   1098         b,
   1099         proto=proto,
   1100         content_type=content_type,
   1101         encoding=encoding,
   1102         allow_pickle=allow_pickle,
   1103     )
   1104 except (ValueError, TypeError) as exc:

File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/pydantic/deprecated/parse.py:49, in load_str_bytes(b, content_type, encoding, proto, allow_pickle, json_loads)
     48         b = b.decode(encoding)
---> 49     return json_loads(b)  # type: ignore
     50 elif proto == Protocol.pickle:

File ~/.pyenv/versions/3.10.10/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File ~/.pyenv/versions/3.10.10/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File ~/.pyenv/versions/3.10.10/lib/python3.10/json/decoder.py:353, in JSONDecoder.raw_decode(self, s, idx)
    352 try:
--> 353     obj, end = self.scan_once(s, idx)
    354 except StopIteration as err:

JSONDecodeError: Invalid \escape: line 1 column 343 (char 342)

During handling of the above exception, another exception occurred:

ValidationError                           Traceback (most recent call last)
File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/outlines/models/llamacpp.py:78, in LlamaSequenceGenerator.__call__(self, prompts, max_tokens, stop_at, rng, **model_kwargs)
     77 try:
---> 78     formatted = [self.format_sequence(sequence) for sequence in results]
     79 except Exception as e:

File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/outlines/models/llamacpp.py:78, in <listcomp>(.0)
     77 try:
---> 78     formatted = [self.format_sequence(sequence) for sequence in results]
     79 except Exception as e:

File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/outlines/generate/json.py:50, in json.<locals>.<lambda>(x)
     49     generator = regex(model, regex_str, sampler)
---> 50     generator.format_sequence = lambda x: schema_object.parse_raw(x)
     51 elif callable(schema_object):

File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/pydantic/main.py:1124, in BaseModel.parse_raw(cls, b, content_type, encoding, proto, allow_pickle)
   1118     error: pydantic_core.InitErrorDetails = {
   1119         # The type: ignore on the next line is to ignore the requirement of LiteralString
   1120         'type': pydantic_core.PydanticCustomError(type_str, str(exc)),  # type: ignore
   1121         'loc': ('__root__',),
   1122         'input': b,
   1123     }
-> 1124     raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
   1125 return cls.model_validate(obj)

ValidationError: 1 validation error for FaithfulnessNLIOutput
__root__
  Invalid \escape: line 1 column 343 (char 342) [type=value_error.jsondecode, input_value='{"answer":[{"statement":...name\\"","verdict":1}]}', input_type=str]

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[8], line 332
    326 generator_nli = outlines.generate.json(model, FaithfulnessNLIOutput)
    327 nli_result = generator_nli(
    328     faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES)
    329 )
--> 332 nli_results_2 = [
    333     generator_nli(faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES))
    334     for _ in range(3)
    335 ]
    337 requests_batched = [
    338     faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES) for _ in range(100)
    339 ]
    340 nli_results_batched = generator_nli(requests_batched)

Cell In[8], line 333, in <listcomp>(.0)
    326 generator_nli = outlines.generate.json(model, FaithfulnessNLIOutput)
    327 nli_result = generator_nli(
    328     faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES)
    329 )
    332 nli_results_2 = [
--> 333     generator_nli(faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES))
    334     for _ in range(3)
    335 ]
    337 requests_batched = [
    338     faithfulness_nli(input=nli_task, examples=DEFAULT_NLI_EXAMPLES) for _ in range(100)
    339 ]
    340 nli_results_batched = generator_nli(requests_batched)

File ~/.pyenv/versions/3.10.10/envs/my-env/lib/python3.10/site-packages/outlines/models/llamacpp.py:81, in LlamaSequenceGenerator.__call__(self, prompts, max_tokens, stop_at, rng, **model_kwargs)
     79 except Exception as e:
     80     print(results)
---> 81     raise ValueError(f"Error formatting sequences: {e}")
     83 return formatted if len(formatted) > 1 else formatted[0]

ValueError: Error formatting sequences: 1 validation error for FaithfulnessNLIOutput
__root__
  Invalid \escape: line 1 column 343 (char 342) [type=value_error.jsondecode, input_value='{"answer":[{"statement":...name\\"","verdict":1}]}', input_type=str]

Outlines/Python version information:

python -c "import sys; print('Python', sys.version)"
pip freeze
0.0.36
Python 3.10.10 (main, Jun 19 2023, 11:34:34) [Clang 14.0.0 (clang-1400.0.29.202)]
accelerate==0.28.0
aiohttp==3.9.3
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
anyio==4.3.0
appdirs==1.4.4
appnope==0.1.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.2.0
Babel==2.14.0
beautifulsoup4==4.12.3
bleach==6.1.0
boto3==1.34.63
botocore==1.34.63
bpemb==0.3.4
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==3.0.0
comm==0.2.2
conllu==4.5.3
contourpy==1.2.0
cycler==0.12.1
dataclasses-json==0.6.4
datasets==2.18.0
debugpy==1.8.1
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
docstring-parser==0.15
exceptiongroup==1.2.0
executing==2.0.1
fastapi==0.110.0
fastjsonschema==2.19.1
filelock==3.13.1
flair==0.13.1
fonttools==4.49.0
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.2.0
ftfy==6.1.3
gdown==5.1.0
gensim==4.3.2
h11==0.14.0
httpcore==1.0.4
httpx==0.27.0
huggingface-hub==0.21.4
idna==3.6
instructor==0.6.4
interegular==0.3.3
ipykernel==6.29.3
ipython==8.22.2
ipywidgets==8.1.2
isoduration==20.11.0
Janome==0.5.0
jedi==0.19.1
Jinja2==3.1.3
jmespath==1.0.1
joblib==1.3.2
json5==0.9.24
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.1
jupyter-lsp==2.2.4
jupyter_client==8.6.1
jupyter_core==5.7.2
jupyter_server==2.13.0
jupyter_server_terminals==0.5.3
jupyterlab==4.1.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.4
jupyterlab_widgets==3.0.10
kiwisolver==1.4.5
langchain==0.1.12
langchain-community==0.0.28
langchain-core==0.1.32
langchain-openai==0.0.8
langchain-text-splitters==0.0.1
langdetect==1.0.9
langsmith==0.1.27
lark==1.1.9
llama_cpp_python==0.2.56
llvmlite==0.42.0
lxml==5.1.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.1
matplotlib==3.8.3
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.2
more-itertools==10.2.0
mpld3==0.5.10
mpmath==1.3.0
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nbclient==0.10.0
nbconvert==7.16.2
nbformat==5.10.3
nest-asyncio==1.6.0
networkx==3.2.1
notebook==7.1.2
notebook_shim==0.2.4
numba==0.59.0
numpy==1.26.4
openai==1.13.3
orjson==3.9.15
outlines==0.0.36
overrides==7.7.0
packaging==23.2
pandas==2.2.1
pandocfilters==1.5.1
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
platformdirs==4.2.0
pptree==3.1
prometheus_client==0.20.0
prompt-toolkit==3.0.43
protobuf==5.26.0
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==15.0.1
pyarrow-hotfix==0.6
pycparser==2.21
pydantic==2.6.3
pydantic-settings==2.2.1
pydantic_core==2.16.3
Pygments==2.17.2
pyparsing==3.1.2
pysbd==0.3.4
PySocks==1.7.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-json-logger==2.0.7
pytorch_revgrad==0.2.0
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
qtconsole==5.5.1
QtPy==2.4.1
ragas==0.1.4
referencing==0.33.0
regex==2023.12.25
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rpds-py==0.18.0
ruff==0.3.2
s3transfer==0.10.1
safetensors==0.4.2
scikit-learn==1.4.1.post1
scipy==1.12.0
segtok==1.5.11
semver==3.0.2
Send2Trash==1.8.2
sentence-transformers==2.5.1
sentencepiece==0.1.99
seqeval==1.2.2
six==1.16.0
smart-open==7.0.1
sniffio==1.3.1
soupsieve==2.5
SQLAlchemy==2.0.28
sqlitedict==2.1.0
sse-starlette==2.0.0
stack-data==0.6.3
starlette==0.36.3
starlette-context==0.3.6
sympy==1.12
tabulate==0.9.0
tenacity==8.2.3
terminado==0.18.1
threadpoolctl==3.3.0
tiktoken==0.6.0
tinycss2==1.2.1
tokenizers==0.15.2
tomli==2.0.1
toolz==0.12.1
torch==2.2.1
tornado==6.4
tqdm==4.66.2
traitlets==5.14.1
transformer-smaller-training-vocab==0.3.3
transformers==4.38.2
typer==0.9.0
types-python-dateutil==2.9.0.20240316
typing-inspect==0.9.0
typing_extensions==4.10.0
tzdata==2024.1
uri-template==1.3.0
urllib3==1.26.18
uvicorn==0.28.0
wcwidth==0.2.13
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
widgetsnbextension==4.0.10
Wikipedia-API==0.6.0
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4

Context for the issue:

No response

@pmbaumgartner
Author

One more piece of context: the characters won\'t are included in the prompt through the template. The phrase that gets injected is: While you can limit the sharing with other banks/insurance companies/service providers so that you won\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves. I'm guessing the LLM replicates this sequence of characters and then something fails at the JSON generation step.

In case it helps, here is the output of json.dumps on this phrase:

In [27]: print(json.dumps(r"While you can limit the sharing with other banks/insurance companies/service providers so that you won\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves."))
"While you can limit the sharing with other banks/insurance companies/service providers so that you won\\'t get offers from them based on the data shared by the bank, you cannot limit the credit reports themselves."
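The json.dumps output shows that a literal backslash has to be doubled to be valid inside a JSON string. A small standard-library check of that point follows; the phrase is shortened for illustration, and the "verbatim" case is only an assumption about what the model emits.

```python
import json

phrase = r"you won\'t get offers"  # contains a literal backslash followed by '

encoded = json.dumps(phrase)          # the backslash is doubled in the JSON text
assert json.loads(encoded) == phrase  # and the value round-trips unchanged

# Wrapping the phrase in quotes without escaping it, which is what the failing
# generations look like, produces text that the JSON parser rejects.
verbatim = '"' + phrase + '"'
try:
    json.loads(verbatim)
    verbatim_is_valid = True
except json.JSONDecodeError:
    verbatim_is_valid = False

print(encoded)
print(verbatim_is_valid)
```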

@gautierdag

I'm also seeing the same issue (mostly with Mistral models). I think this is a hard one to solve, but ideally JSON decoding should prevent incorrect use of escape characters.

@pmbaumgartner
Author

I also see this the most with Mistral models. Others I've evaluated are Hermes-2-Pro-Mistral-7B, alphamonarch-7b, openhermes-2.5-neural-chat-v3-3-slerp, and mistral-7b-instruct-v0.2, with the last one having this issue most frequently.

@rlouf
Member

rlouf commented Mar 25, 2024

Have you tried different white space patterns?

@pmbaumgartner
Author

I haven't with this specific problem, but I will give it a shot. Though I have to say it's not clear to me how it would help with this specific issue, since that would modify the whitespace but not prevent it from generating JSON with invalid escape characters - unless I'm missing something.

@pmbaumgartner
Author

pmbaumgartner commented Mar 26, 2024

Here is a smaller replicable example.

import outlines
from pydantic import BaseModel


class Input(BaseModel):
    value: str


kwargs = {"n_ctx": 0, "max_tokens": 0, "n_gpu_layers": -1, "verbose": False}
model = outlines.models.llamacpp(
    "models/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    **kwargs,
)

generator = outlines.generate.json(model, Input)

prompt = r"""You are a helpful assistant. Your task is to return a given input word in JSON format.

Return the following value in JSON:

{"value": "won\\'t"}
"""
for _ in range(20):
    result = generator(prompt)

Should fail with the following exception:

ValueError: Error formatting sequences: 1 validation error for Input
__root__
  Invalid \escape: line 2 column 16 (char 17) [type=value_error.jsondecode, input_value='{\n  "value": "won\\\'t"\n}', input_type=str]

@rlouf
Member

rlouf commented Apr 10, 2024

This is a more general problem with the regexes we use I think.

rlouf added the structured generation and JSON labels on Apr 10, 2024
rlouf changed the title from "Occasional JSON Parsing Error / Invalid JSON Escape Sequence" to "JSON generated with invalid escape characters" on Apr 10, 2024

@AndreasGiersch

Is there any update on this yet? I've also encountered this problem with structured generation using pydantic. All the models we use ("mistral-7b-instruct-v0.2", "mistralai/Mixtral-8x7B-v0.1" quantized and llama-2-7Bf, llama-2-13Bf, llama-2-70Bf quantized) are affected. So far, I have not been able to track down any input that reliably leads to a faulty output.

@umbe95

umbe95 commented Apr 18, 2024

Also for me, same error.

@psykhi

psykhi commented Apr 19, 2024

Same error for me. Any non-trivial generation is likely to fail with Mistral 7B Instruct v0.2.

Here's an example that failed for me after 5 retries.

Using outlines via vLLM openAI server.

@rlouf
Member

rlouf commented Apr 20, 2024

The regex used to describe valid characters allows the generation of an odd number of escape characters (backslashes). This should be fixed by #829.
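To illustrate the kind of problem described in this comment, the sketch below contrasts a permissive string pattern with a stricter one in which a backslash is only accepted as the start of a complete JSON escape. Both patterns are hypothetical and written for this illustration; they are not the regexes used by outlines, nor the change made in #829.

```python
import re

# Hypothetical permissive pattern: any character except a double quote, or an
# escaped double quote. A lone backslash matches the first alternative, so the
# invalid sequence \' is accepted.
PERMISSIVE = re.compile(r'"([^"]|\\")*"')

# Hypothetical stricter pattern: a backslash must begin one of the escapes that
# JSON defines (\" \\ \/ \b \f \n \r \t or \uXXXX); raw control characters and
# bare backslashes are excluded from the plain-character class.
STRICT = re.compile(r'"([^"\\\x00-\x1f]|\\["\\/bfnrt]|\\u[0-9a-fA-F]{4})*"')


def accepts(pattern: re.Pattern, text: str) -> bool:
    """True if the whole text is a string literal according to the pattern."""
    return pattern.fullmatch(text) is not None


bad = '"won\\\'t"'    # the JSON text "won\'t" (lone backslash before ')
ok = '"won\\\\\'t"'   # the JSON text "won\\'t" (escaped backslash)

print(accepts(PERMISSIVE, bad), accepts(STRICT, bad))
print(accepts(PERMISSIVE, ok), accepts(STRICT, ok))
```

With a pattern like the stricter one guiding generation, a backslash can only be followed by a character that completes a valid escape, so a sequence such as `\'` could not be produced.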

@psykhi

psykhi commented Apr 20, 2024

Thanks a lot @rlouf! Do you know when you'll release this? Want to open a PR in vLLM to update the deps.

@rlouf
Member

rlouf commented Apr 20, 2024

Did you try the code in main?

@psykhi

psykhi commented Apr 20, 2024

No, I haven't; we use outlines via the vLLM OpenAI server. I can set up a repro script with outlines directly, but that might have to wait until next week. I can report back once I have done that.


6 participants