When integrating Outlines with vLLM I ran into the following issues, which are fixed in this PR:

1. When `vllm.LLM.generate` is called, vLLM internally makes a `copy.deepcopy` of the vLLM `SamplingParams`, which includes the logits processor from Outlines (`RegexLogitsProcessor`, say). This requires everything to be pickleable, and `RegexLogitsProcessor.fsm.vocabulary` is a `dict_values` object, which isn't. The fix is easy: just convert it to a list. This doesn't affect how the `vocabulary` variable is used in the code.
2. `RegexLogitsProcessor` takes an `llm` argument, which the docstring states should be a `vllm.LLM` object, but then attempts to extract the underlying tokenizer via `llm.tokenizer.tokenizer`. The tokenizer of `vllm.LLM` currently lives in the `vllm.LLM.llm_engine.tokenizer.tokenizer` attribute, but this is messy and isn't backwards compatible with previous vLLM versions. Instead, vLLM provides a convenience method, `vllm.LLM.get_tokenizer`, which fetches the tokenizer. To remain backwards compatible in case people have passed `vllm.LLM.llm_engine` directly into `RegexLogitsProcessor`, the processor falls back to a `tokenizer` or `tokenizer.tokenizer` attribute.

I also updated the vLLM example script, as it was outdated as well (it used the previous `_patched_apply_logits_processors`).

Closes #704
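
For reviewers, here is a rough standalone sketch of the two fixes described above. It is not the code in this PR: `FakeFSM`, `token_to_id`, and `resolve_tokenizer` are hypothetical names made up for illustration, and the fallback order is only my reading of the description. It uses nothing but the standard library, so it runs without vLLM or Outlines installed.

```python
import copy
from types import SimpleNamespace


class FakeFSM:
    """Hypothetical stand-in for the FSM held by a logits processor."""

    def __init__(self, vocabulary):
        self.vocabulary = vocabulary


token_to_id = {"a": 0, "b": 1}  # made-up vocabulary

# A dict_values view cannot be pickled, so deepcopy raises TypeError.
try:
    copy.deepcopy(FakeFSM(token_to_id.values()))
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

# Converting the view to a list makes the object copyable.
copied = copy.deepcopy(FakeFSM(list(token_to_id.values())))


def resolve_tokenizer(llm):
    """Prefer get_tokenizer(), then fall back to attribute lookups."""
    if hasattr(llm, "get_tokenizer"):
        return llm.get_tokenizer()
    tokenizer = getattr(llm, "tokenizer", None)
    if tokenizer is None:
        raise ValueError("Could not find a tokenizer on the supplied object.")
    # Some wrappers nest the real tokenizer one level deeper.
    return getattr(tokenizer, "tokenizer", tokenizer)


# Usage with hypothetical objects mimicking the three supported shapes.
with_method = SimpleNamespace(get_tokenizer=lambda: "from-method")
nested = SimpleNamespace(tokenizer=SimpleNamespace(tokenizer="nested"))
flat = SimpleNamespace(tokenizer="flat")
resolved = [resolve_tokenizer(obj) for obj in (with_method, nested, flat)]
```

Checking for `get_tokenizer` first keeps the common `vllm.LLM` case on the public method, while the attribute lookups cover objects that only expose `tokenizer` or `tokenizer.tokenizer`.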