Generalize parse state cloning #172

Merged


Contributor

@brandonwillard brandonwillard commented Jul 6, 2023

This PR makes some simple generalizations to our custom lark parse state cloning. It also adds SQL-guided generation as an extension to the existing LLM example.

N.B. Don't expect the example model (i.e. codegen-350M-mono) to actually produce meaningful/interesting SQL; it was trained primarily/exclusively on Python code. Also, the example SQL grammar could have issues of its own.
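For readers unfamiliar with why parse state cloning matters in guided generation, here is a minimal, self-contained sketch. It is a toy and not the code in this PR or lark's API: `ParseState`, `feed`, `clone`, and `accepted_tokens` are hypothetical names, and the "grammar" is just balanced brackets. The point it illustrates is that each candidate token is tried on a copy of the state, so the real state is only advanced once a token has actually been chosen.

```python
from dataclasses import dataclass, field

# Closing bracket -> the opening bracket it must match.
PAIRS = {")": "(", "]": "["}


@dataclass
class ParseState:
    """Toy pushdown state for balanced brackets (hypothetical, not lark's)."""

    stack: list = field(default_factory=list)

    def feed(self, token: str) -> None:
        """Advance the state in place; raise ValueError if the token is invalid."""
        if token in PAIRS.values():
            self.stack.append(token)
        elif token in PAIRS:
            if not self.stack or self.stack[-1] != PAIRS[token]:
                raise ValueError(f"unexpected token {token!r}")
            self.stack.pop()
        else:
            raise ValueError(f"unknown token {token!r}")

    def clone(self) -> "ParseState":
        """Return an independent copy so trial feeds cannot affect the original."""
        return ParseState(stack=list(self.stack))


def accepted_tokens(state: ParseState, vocabulary: list) -> list:
    """Return the vocabulary tokens the parser would accept next."""
    accepted = []
    for token in vocabulary:
        trial = state.clone()  # try the token on a copy, never on the real state
        try:
            trial.feed(token)
        except ValueError:
            continue
        accepted.append(token)
    return accepted


state = ParseState()
state.feed("(")
state.feed("[")
allowed = accepted_tokens(state, ["(", ")", "[", "]"])
```

In a guided-generation loop, a list like `allowed` would be used to mask the model's logits before sampling. A real parser state holds more than a single stack, which is presumably why the cloning logic needs to be general.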

@brandonwillard brandonwillard force-pushed the generalize-parse-state-cloning branch from 5224979 to 7f3fae3 on July 7, 2023 16:26
@@ -26,23 +27,33 @@
checkpoint, trust_remote_code=True, revision=revision
).to(device)

input_text = "def "
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
sql_grammar_url = "https://github.com/zbrookle/sql_to_ibis/raw/0e9226da42065940ce21439d490f9fcacadc7f92/sql_to_ibis/grammar/sql.lark"
Member

What do you think about maintaining some of the grammars we need?

Contributor Author

Yeah, we can add them with the appropriate licenses, of course.

@rlouf rlouf added the structured generation label Jul 17, 2023
@rlouf rlouf merged commit bb96179 into outlines-dev:main Jul 19, 2023
4 checks passed
Labels
enhancement, structured generation

2 participants