Refactor the sequence generation #366

Merged on Dec 8, 2023 (25 commits)

@rlouf commented Nov 15, 2023

In the current design, the FSM that drives the generation process and the next-token sampling are coupled, which makes adding new FSMs cumbersome. There is also no way to stream tokens. In this PR I introduce a few changes to the core of the library:

  • The logic contained in the Sequence class now happens in a generate_next_token Python generator, which yields sampled tokens one at a time. This generator is responsible for calling the model, applying logit biases and sampling the next token.
  • The FSM logic, currently contained in Regex, is factored out into a new process generator. This generator is responsible for creating the logit biases, calling the models and concatenating the new token to the running sequence.
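
A minimal sketch of this split, under loud assumptions: the signatures, the toy model (a function from token ids to a list of logits), and the FSM encoded as a plain `transitions` dict are all hypothetical and chosen for illustration; only the names generate_next_token and process come from the description above. The point is that the sampling generator never sees the FSM, and the FSM-driving generator never touches the logits.

```python
import math
import random
from typing import Callable, Dict, Generator, Iterator, List, Set, Tuple

Logits = List[float]
Model = Callable[[List[int]], Logits]  # toy stand-in: token ids -> next-token logits


def generate_next_token(
    model: Model, rng: random.Random
) -> Generator[int, Tuple[List[int], Set[int]], None]:
    """Call the model, apply the logit biases, sample. Knows nothing about FSMs."""
    token = -1  # placeholder, yielded once when the generator is primed
    while True:
        token_ids, allowed = yield token
        logits = model(token_ids)
        # Bias: tokens outside `allowed` get -inf, i.e. probability zero.
        biased = [x if i in allowed else -math.inf for i, x in enumerate(logits)]
        top = max(biased)
        weights = [math.exp(x - top) for x in biased]
        token = rng.choices(range(len(weights)), weights=weights)[0]


def process(
    model: Model,
    transitions: Dict[int, Dict[int, int]],  # state -> {token id -> next state}
    prompt_ids: List[int],
    rng: random.Random,
    max_tokens: int = 16,
) -> Iterator[int]:
    """Walk the FSM, build the biases, extend the running sequence, stream tokens."""
    sampler = generate_next_token(model, rng)
    next(sampler)  # prime the generator so it can receive values via send()
    token_ids = list(prompt_ids)
    state = 0  # assumption: the initial state is labelled 0
    for _ in range(max_tokens):
        allowed = set(transitions.get(state, {}))
        if not allowed:  # no outgoing transition: treat the state as final
            break
        token = sampler.send((token_ids, allowed))
        token_ids.append(token)
        state = transitions[state][token]
        yield token  # streaming: the caller sees each token as it is sampled
    sampler.close()


# Toy usage: a 4-token vocabulary with uniform logits and a 3-state FSM.
toy_model: Model = lambda token_ids: [0.0, 0.0, 0.0, 0.0]
toy_fsm = {0: {1: 1, 2: 1}, 1: {3: 2}, 2: {}}
for token in process(toy_model, toy_fsm, [0], random.Random(0)):
    print(token)
```

With this toy FSM the first streamed token is either 1 or 2 and the second is always 3, after which generation stops. Swapping in a different FSM only changes the `transitions` argument, which is the decoupling the description is after.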

After these changes are made there is a clear path to #335, #163, #155, #53.

Closes #317. Closes #340. Closes #266. Closes #247. Closes #216. Closes #185.

TODO

  • Move prompts.py to the parent module
  • Rename sample.py to sampler(s).py
  • Move the logic contained in text one level down so everything can be called via outlines.generate
  • Handle exceptions when the number of tokens in token_ids exceeds the context length
  • Append the new tokens to the running sequence
  • Update the masks
  • Pass token_ids, masks and KV cache to the model via dataclass or NamedTuple
  • Update and return the current sequence, last generated token ids, last logits and total sequence logprob in a single object
  • Ensure that generator can work with batches of inputs
  • Make sure processor works with batches of sequences
  • Implement the basic FSM, which stops once the EOS token has been generated. Implement the outlines.generate.sequence and outlines.stream.sequence functions that build the generators.
  • Create a regex FSM
  • Return pad tokens once an EOS token has been found
  • Create outlines.generate.text
  • Create outlines.generate.regex
  • Create outlines.generate.format
  • Create outlines.generate.json
  • Remove old logic
  • Return logits + sequence logprob
  • Add aliases for the old interface with deprecation warning
  • Test streaming in integration
  • Convert JSON output to dict or pydantic model
  • Add stop_at kwarg to generate.text for feature parity
  • Interface with Index should be ints and List[int]
  • index.py -> fsm.py, /index -> /fsm and fsm.py -> regex.py
  • Test batched guided generation
  • Update the paths in documentation
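
Two of the items above interact: stopping on EOS and returning pad tokens once an EOS token has been found, when sequences are generated in batches. The sketch below illustrates one way that could look. Everything in it is hypothetical (the `generate_batch` name, the `sample_step` callback standing in for the model plus sampler, the token id values); it is not the code merged in this PR.

```python
from typing import Callable, List

SampleStep = Callable[[List[List[int]]], List[int]]  # batch of sequences -> one token each


def generate_batch(
    sample_step: SampleStep,
    prompts: List[List[int]],
    eos_token_id: int,
    pad_token_id: int,
    max_tokens: int,
) -> List[List[int]]:
    """Extend every sequence in lockstep; finished sequences only receive padding."""
    sequences = [list(prompt) for prompt in prompts]
    finished = [False] * len(sequences)
    for _ in range(max_tokens):
        if all(finished):
            break
        sampled = sample_step(sequences)
        for i, token in enumerate(sampled):
            if finished[i]:
                token = pad_token_id  # EOS already seen: ignore the sample, pad instead
            elif token == eos_token_id:
                finished[i] = True  # keep the EOS itself, pad from the next step on
            sequences[i].append(token)
    return sequences


# Scripted stand-in for model + sampler: sequence i emits EOS (id 2) once it
# holds at least 2 + i tokens, and token 7 before that.
def scripted_step(sequences: List[List[int]]) -> List[int]:
    return [2 if len(seq) >= 2 + i else 7 for i, seq in enumerate(sequences)]


result = generate_batch(scripted_step, [[5], [5]], eos_token_id=2, pad_token_id=0, max_tokens=10)
print(result)  # [[5, 7, 2, 0], [5, 7, 7, 2]]
```

Padding keeps all sequences in the batch the same length, so the batch stays rectangular while the slower sequences finish.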

Notes

  • We currently assume the initial state of the FSM is labelled "0". We might want to return the initial state during init.
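
One possible shape for this, as a sketch only: the FSM object carries its own `initial_state`, so callers never hard-code 0. The `ToyFSM` class, its field names and its methods are assumptions made for illustration and do not describe the interface in this PR.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ToyFSM:
    """Hypothetical FSM that reports its own initial state."""

    initial_state: int
    transitions: Dict[int, Dict[int, int]]  # state -> {token id -> next state}

    def allowed_token_ids(self, state: int) -> List[int]:
        return sorted(self.transitions.get(state, {}))

    def next_state(self, state: int, token_id: int) -> int:
        return self.transitions[state][token_id]


def walk(fsm: ToyFSM, token_ids: List[int]) -> int:
    """Follow the transitions from the FSM's declared initial state."""
    state = fsm.initial_state  # no assumption that the label is 0
    for token_id in token_ids:
        state = fsm.next_state(state, token_id)
    return state


fsm = ToyFSM(initial_state=10, transitions={10: {1: 11}, 11: {2: 12}})
print(walk(fsm, [1, 2]))  # 12
```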

@rlouf added the text, enhancement, structured generation, and JSON labels on Nov 15, 2023
@rlouf force-pushed the refactor-generator branch 15 times, most recently from 1e08ea5 to 18cecff, on November 20, 2023 07:12
Resolved review threads on outdated diffs: outlines/generate/generator.py (11), outlines/index/index.py (2)
@rlouf merged commit 3bb295e into outlines-dev:main on Dec 8, 2023
5 checks passed
@rlouf deleted the refactor-generator branch on December 8, 2023 08:28
Labels: enhancement, JSON, structured generation, text
2 participants