Dolphin Mistral Prompt blows up on assertion #581
Comments
Looks like a related issue was previously closed, but another person was still reporting the issue in v0.1.10. Also, I'm running from main as of two days ago.
Seems to work fine with neural-chat, which is Mistral-based but uses a different system prompt format: "### System:\n{system_prompt}\n\n### User:\n{prompt}\n\n### Assistant:\n"
I'm facing a similar issue: I get an assertion error when using the ChatML format. Is there a fix for that?
I also added a discussion topic, hoping someone will chime in with a workaround.
I'm facing a similar issue, but it works if I use the LlamaCppChat model instead of the HuggingFaceChat model. Also, it's specifically to do with the
Very interesting, thanks for the tip!
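To make the difference concrete, here is a small illustration (plain Python, no model required) of the same content rendered with the ChatML template that dolphin-mistral expects versus the neural-chat format quoted above:

```python
# ChatML template used by dolphin-mistral (special <|im_start|>/<|im_end|> tokens).
chatml = (
    "<|im_start|>system\n{system_prompt}<|im_end|>\n"
    "<|im_start|>user\n{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
# neural-chat's format uses plain "### Role:" headers instead of special tokens.
neural_chat = "### System:\n{system_prompt}\n\n### User:\n{prompt}\n\n### Assistant:\n"

filled_chatml = chatml.format(system_prompt="You are a helpful AI", prompt="Hi")
filled_neural = neural_chat.format(system_prompt="You are a helpful AI", prompt="Hi")
print(filled_chatml)
print(filled_neural)
```

The special tokens in the ChatML variant are exactly the part that seems to interact badly with the tokenizer here, while the neural-chat headers tokenize as ordinary text.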
I'm trying to apply dolphin-mistral's prompt template format:

```
<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{user_prompt}<|im_end|>
<|im_start|>assistant
```
I've tried this a couple of different ways:
```python
from guidance import models, gen

quant_path = "TheBloke/dolphin-2.6-mistral-7B-AWQ"
lm = models.Transformers(quant_path, device_map="auto")
stop_char = '"'
prompt_template = (
    '<|im_start|>system\n{system_prompt}<|im_end|>\n'
    '<|im_start|>user\n{prompt}<|im_end|>\n'
    '<|im_start|>assistant\n'
)
lm2 = lm + (
    prompt_template.format(system_prompt="You are a helpful AI",
                           prompt="What is the distance to mars?")
    + f'The distance to mars is "{gen("answer", max_tokens=500, stop=stop_char, temperature=0.7)}"'
)
```
And by using TransformersChat:
```python
from guidance import models, gen, system, user, assistant

quant_path = "TheBloke/dolphin-2.6-mistral-7B-AWQ"
lm = models.TransformersChat(quant_path, device_map="auto")
stop_char = '"'

with system():
    lm2 = lm + "You are a helpful AI"
with user():
    lm2 += "What is the distance to mars?"
with assistant():
    lm2 += 'The distance to mars is "' + gen("answer", max_tokens=500, stop=stop_char, temperature=0.8)
```
Both methods produce the same error: an assertion error is thrown in `_cleanup_tokens` in `_model.py`.
```
Traceback (most recent call last):
  File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 309, in __add__
    out = lm + partial_grammar
          ~~~^~~~~~~~~~~~~~~~~
  File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 317, in __add__
    out = lm._run_stateless(value)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 482, in _run_stateless
    for new_bytes, is_generated, new_bytes_prob, capture_groups, capture_group_log_probs, new_token_count in gen_obj:
  File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 798, in __call__
    token_ids, token_byte_positions = self._cleanup_tokens(token_ids, token_byte_positions)
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/pypoetry/virtualenvs/llm-proficiency-testing-hKJXaDzo-py3.11/lib64/python3.11/site-packages/guidance/models/_model.py", line 628, in _cleanup_tokens
    assert token_byte_positions[-1] == last_pos
```
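For context, here is a rough sketch of the invariant the failing assertion enforces. This is an illustration only, not Guidance's actual `_cleanup_tokens` implementation: after retokenizing the prompt, the byte position recorded for the last token must land exactly at the end of the prompt bytes. Special tokens like `<|im_start|>` can retokenize to a byte span of a different length, so the recorded positions drift and the assertion fires.

```python
# Hypothetical sketch of the invariant (not Guidance's code): the last
# recorded token byte position must equal the prompt's byte length.
def cleanup_tokens(token_ids, token_byte_positions, last_pos):
    assert token_byte_positions[-1] == last_pos
    return token_ids, token_byte_positions

# Consistent positions pass silently:
cleanup_tokens([1, 2, 3], [4, 9, 15], 15)

# If retokenization maps a special token to a different byte span,
# the final position no longer matches and the assertion fires:
try:
    cleanup_tokens([1, 2, 3], [4, 9, 14], 15)
    raised = False
except AssertionError:
    raised = True
```

Under that reading, any workaround that avoids retokenizing the special tokens differently (e.g. the LlamaCppChat route mentioned above) would sidestep the mismatch.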
Please let me know if I should provide more information.