Format prompts Using Chat Templates in SequenceGeneratorAdapter #987

Open
lapp0 opened this issue Jun 19, 2024 · 0 comments
Labels: enhancement, text (Linked to text generation), tokenization

Comments


lapp0 commented Jun 19, 2024

Related: #756

What behavior of the library made you think about the improvement?

Currently, when using outlines.generate, chat templates aren't applied by default, so it's awkward and unintuitive to have to structure your prompts as chat templates by hand. For example, a well-structured input for a Llama-3 model might look like

generator = outlines.generate.json(...)

my_prompt = """<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nProvide me JSON Data<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"""
generator(my_prompt)

I'd prefer

generator("Provide me JSON Data")
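For reference, the special-token scaffolding in the first snippet is what the model's chat template expands to. A rough sketch of producing it automatically with transformers' apply_chat_template (the model name is only illustrative; this assumes the tokenizer ships a chat template):

from transformers import AutoTokenizer

# Illustrative model; any instruct model with a chat template works the same way.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Provide me JSON Data"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # append the assistant header so the model answers
)
# `prompt` now contains the <|begin_of_text|>/<|start_header_id|>... scaffolding
# from the example above, without the user writing it by hand.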

Why We Should Apply Chat Templates by Default

Without a chat template applied, the model treats the prompt as text to continue, as if completing a monologue, whereas the chat template format follows a query-response structure.

No Chat Template
>>> output = model.generate(**tokenizer("What is 1 + 1?", return_tensors="pt"), max_length=32)                                                                                                                    
>>> tokenizer.decode(output[0])
"<s> What is 1 + 1?\n\nThis question has been asked by many people, but I don't understand the answer.\n\nCould"
>>> output = model.generate(**tokenizer("Give me a random color:", return_tensors="pt"), max_length=32)
>>> tokenizer.decode(output[0])
'<s> Give me a random color:\n\n- Response: A random color can be represented in hexadecimal format as #RRGGBB,'
With Chat Template
>>> output = model.generate(**tokenizer('<s><|user|> What is 1 + 1?<|end|><|assistant|>', return_tensors="pt"), max_length=32)
>>> tokenizer.decode(output[0])
'<s><s><|user|> What is 1 + 1?<|end|><|assistant|> 1 + 1 equals 2. This is a basic arithmetic addition problem. When you'
>>> output = model.generate(**tokenizer('<s><|user|> Give me a random color:<|end|><|assistant|>', return_tensors="pt"), max_length=32)
>>> tokenizer.decode(output[0])
"<s><s><|user|> Give me a random color:<|end|><|assistant|> The random color I'll describe for you is a vibrant shade of teal, with"

How would you like it to behave?

By default, generator(prompt) applies the chat template.

The current behavior should remain available via generator(prompt, raw=True).

Alternatively, it might make sense to put the raw argument in the generator-constructing function instead (e.g. outlines.generate.text(model, raw=True)). A sketch of how the default could work is below.
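A minimal sketch of what the default-on, opt-out behavior could look like inside the adapter; format_prompt and the raw keyword are hypothetical names here, not existing outlines API:

def format_prompt(tokenizer, prompt: str, raw: bool = False) -> str:
    # Pass the prompt through untouched (today's behavior) when asked to,
    # or when the tokenizer has no chat template to apply.
    if raw or getattr(tokenizer, "chat_template", None) is None:
        return prompt
    return tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
    )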

@lapp0 lapp0 added enhancement text Linked to text generation tokenization labels Jun 19, 2024
@rlouf rlouf changed the title Format prompts Using Chat Templates in SequenceGeneratorAdapter Aug 21, 2024