outlines.generate.cfg: AttributeError: 'CFGFSM' object has no attribute 'regex_fsm' #685

Closed
lapp0 opened this issue Feb 20, 2024 · 8 comments · Fixed by #865

@lapp0
Collaborator

lapp0 commented Feb 20, 2024

Describe the issue as clearly as possible:

Discord user pepp discovered a bug in the implementation of CFGFSM:

CFGFSM.regex_fsm isn't initialized until CFGFSM.allowed_token_ids() is called. If CFGFSM.next_state() is called first, it raises an AttributeError.
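
The failure mode can be illustrated without outlines at all. Below is a minimal sketch, assuming a hypothetical `LazyFSM` class (a stand-in, not the real CFGFSM) whose `regex_fsm` attribute is only assigned inside `allowed_token_ids()`. The dictionary used as the inner FSM and the transition rule are made up for illustration.

```python
class LazyFSM:
    """Hypothetical stand-in for CFGFSM; not the outlines implementation."""

    def __init__(self):
        # regex_fsm is deliberately NOT assigned here, mirroring the bug.
        self.first_state = 0

    def allowed_token_ids(self, state):
        # The attribute only comes into existence on the first call.
        self.regex_fsm = {0: [1, 2]}
        return self.regex_fsm.get(state, [])

    def next_state(self, state, token_id):
        # Fails with AttributeError if allowed_token_ids() has never run.
        allowed = self.regex_fsm.get(state, [])
        return state + 1 if token_id in allowed else state


def call_next_state_first():
    """Call next_state() on a fresh instance and return the error text."""
    fsm = LazyFSM()
    try:
        fsm.next_state(0, 1)
    except AttributeError as exc:
        return str(exc)
    return None


error_message = call_next_state_first()
print(error_message)
```

Calling `allowed_token_ids()` once before `next_state()` makes the error go away in this toy version, which is why the problem only shows up for certain call orders.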

Steps/code to reproduce the bug:

from outlines.models.transformers import transformers
from outlines.generate.cfg import cfg

arithmetic_grammar = """
    ?start: sum

    ?sum: product
        | sum "+" product   -> add
        | sum "-" product   -> sub

    ?product: atom
        | product "*" atom  -> mul
        | product "/" atom  -> div

    ?atom: NUMBER           -> number
         | "-" atom         -> neg
         | "(" sum ")"

    %import common.NUMBER
    %import common.WS_INLINE

    %ignore WS_INLINE
"""

model = transformers("gpt2")
generator = cfg(model, arithmetic_grammar)

result = generator("Question: How can you write 5*5 using addition?\nAnswer:")
print(result)

Expected result:

No Error

Error message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[2], line 28
     25 model = transformers("gpt2")
     26 generator = cfg(model, arithmetic_grammar)
---> 28 result = generator("Question: How can you write 5*5 using addition?\nAnswer:")
     29 print(result)
     30 # 5+5+5+5+5

File ~/guidance-pydantic/.venv/lib/python3.11/site-packages/outlines/generate/api.py:200, in SequenceGenerator.__call__(self, prompts, max_tokens, stop_at, rng)
    198 while True:
    199     try:
--> 200         last_state = next(states)
    201         if max_tokens or stop_sequences:
    202             token_ids = last_state.token_ids

File ~/guidance-pydantic/.venv/lib/python3.11/site-packages/outlines/generate/generator.py:89, in sequence_generator(model, sampler, fsms, token_ids, sequence_weights, attention_masks, fsm_states, rng)
     86 fsms = reorder_fsms(fsms, ancestors)
     87 fsm_states = reorder_fsm_states(fsm_states, ancestors)
---> 89 fsm_states = get_next_fsm_states(fsms, fsm_states, next_token_ids)
     90 is_finished = is_generation_finished(fsms, fsm_states)
     92 if is_finished:

File ~/guidance-pydantic/.venv/lib/python3.11/site-packages/outlines/generate/generator.py:128, in get_next_fsm_states(fsms, fsm_states, next_token_ids)
    111 def get_next_fsm_states(
    112     fsms: List["FSM"], fsm_states: List[FSMState], next_token_ids: torch.Tensor
    113 ) -> List[FSMState]:
    114     """
    115 
    116     Parameters
   (...)
    126 
    127     """
--> 128     return [
    129         fsm.next_state(fsm_state, int(token_id[0]))
    130         for fsm, fsm_state, token_id in zip(fsms, fsm_states, next_token_ids)
    131     ]

File ~/guidance-pydantic/.venv/lib/python3.11/site-packages/outlines/generate/generator.py:129, in <listcomp>(.0)
    111 def get_next_fsm_states(
    112     fsms: List["FSM"], fsm_states: List[FSMState], next_token_ids: torch.Tensor
    113 ) -> List[FSMState]:
    114     """
    115 
    116     Parameters
   (...)
    126 
    127     """
    128     return [
--> 129         fsm.next_state(fsm_state, int(token_id[0]))
    130         for fsm, fsm_state, token_id in zip(fsms, fsm_states, next_token_ids)
    131     ]

File ~/guidance-pydantic/.venv/lib/python3.11/site-packages/outlines/fsm/fsm.py:331, in CFGFSM.next_state(self, state, token_id)
    328     self.reset_state = False
    329     state = self.first_state
--> 331 return self.regex_fsm.next_state(state, token_id)

AttributeError: 'CFGFSM' object has no attribute 'regex_fsm'

Outlines/Python version information:

outlines==0.0.32

Context for the issue:

We need to fix CFGFSM so that it initializes all instance variables in __init__.
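
One possible shape for such a fix, sketched on a hypothetical `EagerFSM` class rather than the real CFGFSM: declare the attribute in `__init__` and route every access through a small helper that builds it on demand. The helper name `_ensure_regex_fsm` and the dictionary standing in for the inner FSM are assumptions for illustration only, and this is not necessarily how the issue was resolved upstream.

```python
from typing import Dict, List, Optional


class EagerFSM:
    """Hypothetical stand-in for CFGFSM; not the outlines implementation."""

    def __init__(self) -> None:
        self.first_state = 0
        # Declare every attribute up front so it always exists.
        self.regex_fsm: Optional[Dict[int, List[int]]] = None

    def _ensure_regex_fsm(self) -> Dict[int, List[int]]:
        # Build the inner FSM on demand, whichever method is called first.
        if self.regex_fsm is None:
            self.regex_fsm = {0: [1, 2]}
        return self.regex_fsm

    def allowed_token_ids(self, state: int) -> List[int]:
        return self._ensure_regex_fsm().get(state, [])

    def next_state(self, state: int, token_id: int) -> int:
        allowed = self._ensure_regex_fsm().get(state, [])
        return state + 1 if token_id in allowed else state


fsm = EagerFSM()
state_after = fsm.next_state(0, 1)  # safe even before allowed_token_ids()
print(state_after)
```

With this layout the call order no longer matters, and a type checker can see the attribute because it is declared in `__init__`.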

@lapp0 lapp0 added the bug label Feb 20, 2024
@silverriver
Contributor

I have encountered the same issue. I cannot even run the example provided in the README with the gpt2 model:

import outlines
arithmetic_grammar = """
    ?start: expression

    ?expression: term (("+" | "-") term)*

    ?term: factor (("*" | "/") factor)*

    ?factor: NUMBER
           | "-" factor
           | "(" expression ")"

    %import common.NUMBER
"""

model = outlines.models.transformers("openai-community/gpt2")
generator = outlines.generate.cfg(model, arithmetic_grammar)
sequence = generator("Alice had 4 apples and Bob ate 2. Write an expression for Alice's apples:")

print(sequence)

@Reichenbachian

I'm also running into this.

@tarsur909

Hi,

Has anyone had a chance to fix this issue? I am unable to run the example provided in the README and get this same AttributeError.

@maxtheman

maxtheman commented Mar 27, 2024

I'm also running into this in CFGGuide.

...
File .venv/lib/python3.11/site-packages/outlines/fsm/guide.py:424
    421     self.reset_state = False
    422     state = self.start_state
--> 424 return self.regex_fsm.get_next_state(state, token_id)

AttributeError: 'CFGGuide' object has no attribute 'regex_fsm'

I believe this is a regression that was introduced in 0.26; going back to that version resolves this error (but the output is nonsensical).

@Brentably

Also having this error.

@kajogo777

I'm running into the same bug, getting

AttributeError: 'CFGGuide' object has no attribute 'regex_fsm'

@jjdvd234

Hey,

I'm experiencing the same issue too. I decided to use SynCode for grammar-guided generation. It's very efficient and works with any Lark grammar. I've linked the repo here:
https://github.com/uiuc-focal-lab/syncode/tree/main

@RevanthRameshkumar

@jjdvd234, does that still work for you? I tried it just now and am getting nonsense outputs with the provided example code (no errors, but clearly it has parsing errors). I'll try again with a llama model... maybe the phi model is the issue?

@rlouf rlouf closed this as completed in #865 May 6, 2024