Add a Constrained generation method #155

Closed
rlouf opened this issue Jun 21, 2023 · 0 comments
Labels
enhancement, structured generation, text

rlouf commented Jun 21, 2023

Using Sequential Monte Carlo (SMC) sampling we can generate sequences that follow arbitrary constraints. All we need is a function that takes the previously generated text and a candidate completion, and returns a boolean indicating whether the completion is allowed. For instance, this example from LLaMPPL constrains the generated sequence so that it contains no words longer than 5 letters:

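import llamppl as llp  # assumed import alias for LLaMPPL; needed for llp.Token

def can_follow(str_so_far, s):
    """Return True if completion `s` may follow `str_so_far` without creating a word longer than 5 letters."""
    if isinstance(s, llp.Token):
        s = str(s)
    # A token with more than 5 non-whitespace characters can never fit
    if len(s.strip()) > 5:
        return False
    # Whitespace-only tokens are always allowed
    if len(s.strip()) == 0:
        return True
    # A token that starts with a non-letter (space, punctuation, digit) begins a new word
    if not s[0].isalpha():
        return True
    if len(str_so_far) == 0:
        return True  # First token, can be alphanumeric
    # Otherwise the token extends the last word: check the combined length
    words = str_so_far.split()
    return len(words) >= 1 and len(words[-1]) + len(s) <= 5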
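
To illustrate how a predicate of this shape could be used to filter candidate tokens, here is a minimal, self-contained sketch. It does not depend on LLaMPPL: `max_five_letters` is a hypothetical plain-string simplification of the predicate above (it only counts alphabetic characters per word), and `allowed_completions` and the toy `vocabulary` are illustrative names, not part of any library.

```python
from typing import Callable, List


def max_five_letters(str_so_far: str, s: str) -> bool:
    """Hypothetical simplified constraint: after appending `s`, no word may
    contain more than 5 alphabetic characters."""
    words = (str_so_far + s).split()
    return all(sum(ch.isalpha() for ch in word) <= 5 for word in words)


def allowed_completions(
    str_so_far: str,
    vocabulary: List[str],
    constraint: Callable[[str, str], bool],
) -> List[str]:
    """Keep only the candidate tokens that the constraint accepts."""
    return [tok for tok in vocabulary if constraint(str_so_far, tok)]


# Toy vocabulary (an assumption for illustration; a real tokenizer has
# tens of thousands of entries).
vocabulary = [" cat", " house", "s", "keeper", " elephant", "."]

print(allowed_completions("The", vocabulary, max_five_letters))
# prints [' cat', ' house', 's', '.']
```

Tokens that start with a space begin a new word, while "s" and "keeper" extend the current one, which is why "keeper" is rejected after "The".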

I propose adding a Constrained subclass of Sequence:

import outlines.models as models
import outlines.text as text

model = models.transformers("gpt2")
state = text.constrained(model, can_follow)("Prompt")

We will need to add create_proposal and reweigh methods to Sequence.
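
As a rough idea of what these two steps might do, here is a self-contained sketch of constrained SMC on a toy model. Everything in it is an assumption made for illustration: `TOY_MODEL` is a fixed next-token distribution standing in for a real language model, `no_long_words` is a simplified plain-string constraint, and `create_proposal`, `reweigh` and `smc` are free functions here, not the actual Outlines API. The proposal restricts the model distribution to allowed tokens, and the incremental weight is the probability mass of the allowed tokens.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

# Toy stand-in for a language model: the same next-token distribution at every step.
TOY_MODEL: Dict[str, float] = {" cat": 0.3, " sat": 0.2, " quietly": 0.3, "s": 0.1, ".": 0.1}

Constraint = Callable[[str, str], bool]


def no_long_words(str_so_far: str, s: str) -> bool:
    """Simplified constraint: no word may have more than 5 alphabetic characters."""
    return all(sum(ch.isalpha() for ch in w) <= 5 for w in (str_so_far + s).split())


def create_proposal(str_so_far: str, constraint: Constraint) -> Tuple[Dict[str, float], float]:
    """Restrict the model distribution to allowed tokens and renormalize.

    Returns the proposal distribution and the model mass of the allowed tokens."""
    allowed = {tok: p for tok, p in TOY_MODEL.items() if constraint(str_so_far, tok)}
    mass = sum(allowed.values())
    if mass == 0.0:
        raise ValueError("no token satisfies the constraint")
    return {tok: p / mass for tok, p in allowed.items()}, mass


def reweigh(log_weight: float, allowed_mass: float) -> float:
    """Incremental importance weight: target / proposal = mass of allowed tokens."""
    return log_weight + math.log(allowed_mass)


def smc(prompt: str, constraint: Constraint, n_particles: int = 8,
        n_steps: int = 4, seed: int = 0) -> List[str]:
    rng = random.Random(seed)
    particles = [prompt] * n_particles
    log_weights = [0.0] * n_particles
    for _ in range(n_steps):
        for i, text in enumerate(particles):
            proposal, mass = create_proposal(text, constraint)
            tokens = list(proposal)
            token = rng.choices(tokens, weights=[proposal[t] for t in tokens])[0]
            particles[i] = text + token
            log_weights[i] = reweigh(log_weights[i], mass)
        # Multinomial resampling at every step (for simplicity), then reset weights.
        max_lw = max(log_weights)
        weights = [math.exp(lw - max_lw) for lw in log_weights]
        particles = rng.choices(particles, weights=weights, k=n_particles)
        log_weights = [0.0] * n_particles
    return particles


for sequence in smc("The", no_long_words):
    print(repr(sequence))
```

Resampling after every token keeps the sketch short; a real implementation would more likely resample only when the effective sample size drops, and would get its token probabilities from the model's logits.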

@rlouf rlouf added text Linked to text generation enhancement labels Jun 21, 2023
@rlouf rlouf modified the milestone: 0.1 Jul 13, 2023
@rlouf rlouf added the structured generation Linked to structured generation label Jul 15, 2023
@rlouf rlouf closed this as completed Feb 10, 2024