
stop_at argument of outlines.generate.text results in incorrect output #626

Closed

lapp0 opened this issue Feb 9, 2024 · 5 comments

lapp0 (Collaborator) commented Feb 9, 2024

Describe the issue as clearly as possible:

I'm using the beam search example from https://github.com/outlines-dev/outlines/blob/main/docs/reference/samplers.md, modified to pass stop_at="\n".

Steps/code to reproduce the bug:

from outlines import models, generate, samplers

model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
sampler = samplers.beam_search(beams=5)

# Generation should stop at the first newline.
generator = generate.text(model, sampler=sampler, stop_at="\n")
answer = generator("What is 2+2?")

print(answer)

Expected result:

>>> answer
'The answer to this question is  4'

Error message:

>>> answer
['\n', '\n', '\n', '\n', '\n']
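
For reference, here is a minimal sketch of the semantics I would expect from stop_at (truncate_at_stop is a hypothetical helper for illustration, not part of Outlines): everything up to, but not including, the first occurrence of the stop string should be returned.

# Hypothetical illustration of the expected stop_at behaviour, not Outlines code:
# the generation should be truncated at the first occurrence of the stop string,
# not collapse to the stop string itself.
def truncate_at_stop(text: str, stop: str) -> str:
    """Return the prefix of `text` before the first occurrence of `stop`."""
    head, _, _ = text.partition(stop)
    return head

full_generation = "The answer to this question is  4\nWhat is the capital city of France?"
print(truncate_at_stop(full_generation, "\n"))  # The answer to this question is  4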

Outlines/Python version information:

Version information

0.1.dev497+g0ea5a98
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
annotated-types==0.6.0
attrs==23.2.0
blinker==1.4
certifi==2024.2.2
charset-normalizer==3.3.2
cloudpickle==3.0.0
cryptography==3.4.8
dbus-python==1.2.18
diskcache==5.6.3
distro==1.7.0
filelock==3.13.1
fsspec==2024.2.0
httplib2==0.20.2
huggingface-hub==0.20.3
idna==3.6
importlib-metadata==4.6.4
interegular==0.3.3
jeepney==0.7.1
Jinja2==3.1.3
joblib==1.3.2
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
keyring==23.5.0
lark==1.1.9
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
llvmlite==0.42.0
MarkupSafe==2.1.5
more-itertools==8.10.0
mpmath==1.3.0
nest-asyncio==1.6.0
networkx==3.2.1
numba==0.59.0
numpy==1.26.4
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu12==12.1.105
oauthlib==3.2.0
outlines==0.0.27
packaging==23.2
pydantic==2.6.1
pydantic_core==2.16.2
PyGObject==3.42.1
PyJWT==2.3.0
pyparsing==2.4.7
python-apt==2.4.0+ubuntu2
PyYAML==6.0.1
referencing==0.33.0
regex==2023.12.25
requests==2.31.0
rpds-py==0.17.1
safetensors==0.4.2
scipy==1.12.0
SecretStorage==3.3.1
six==1.16.0
sympy==1.12
tokenizers==0.15.1
torch==2.2.0
tqdm==4.66.1
transformers==4.37.2
triton==2.2.0
typing_extensions==4.9.0
UNKNOWN @ file:///root/outlines
urllib3==2.2.0
wadllib==1.3.6
zipp==1.0.0

Context for the issue:

Trying to improve generation quality (#616 (comment))

lapp0 added the bug label Feb 9, 2024
rlouf (Member) commented Feb 9, 2024

What's the result without stop_at?

lapp0 (Collaborator, Author) commented Feb 9, 2024

> What's the result without stop_at?

Generation that never ends.

Per #616 (comment), it starts going on about France, then other random stuff.

Generation: ['', 'What', 'is', '', '2', '+', '2', '?', '\n', '\n', '2', '+', '2', 'is', 'the', 'sum', 'of', 'the', 'numbers', '', '2', 'and', '', '2', '.', 'The', 'answer', 'is', '', '4', '.', '\n', '\n', 'What', 'is', 'the', 'capital', 'city', 'of', 'France', '?', '\n', '\n', 'The', 'capital', 'city', 'of', 'France', 'is', 'Paris', '.']  # it doesn't stop here, there's more which was truncated

(To be clear, this appears to be a model (or generation) issue, not a beam search issue.)

rlouf (Member) commented Feb 9, 2024

I don't understand how that's an issue with Outlines, rather than a model quirk. Are the beams giving different answers?

lapp0 (Collaborator, Author) commented Feb 9, 2024

The issue with Outlines is that, when stop_at is set, it just returns the stop sequence itself.
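
As a stopgap, a workaround sketch (assuming the generator accepts a call-time max_tokens argument and that beam search returns one string per beam, as in the output above) is to generate without stop_at and cut each beam at the first newline manually:

from outlines import models, generate, samplers

model = models.transformers("mistralai/Mistral-7B-Instruct-v0.2")
sampler = samplers.beam_search(beams=5)

# Workaround sketch: generate without stop_at, then cut each beam at the first newline.
generator = generate.text(model, sampler=sampler)
answers = generator("What is 2+2?", max_tokens=64)
truncated = [answer.split("\n", 1)[0] for answer in answers]
print(truncated)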

rlouf (Member) commented Feb 10, 2024

If you specify stop_at="\n" and the model returns \n, I would expect the generation to stop there.

dottxt-ai locked and limited conversation to collaborators Feb 10, 2024
rlouf converted this issue into discussion #629 Feb 10, 2024
