
Add integration for ExllamaV2 #462

Merged 3 commits into outlines-dev:main on Dec 22, 2023

Conversation

kimjaewon96 (Contributor)

library: https://github.com/turboderp/exllamav2
To install exllamav2, you have to install the build that matches your Python and CUDA versions from the releases page.

With proper caching, it is 3-4x faster than GPTQ.

rlouf (Member) commented Dec 21, 2023

Thank you! Did you try guided generation with this integration?

kimjaewon96 (Contributor, Author)

import outlines

# `model` is an ExLlamaV2 model loaded from a local folder; the loading code
# was omitted in the original comment (see the note below).
prompt = "What is the IP address of the Google DNS servers? "
unguided = outlines.generate.text(model, max_tokens=30)(prompt)
guided = outlines.generate.regex(
    model,
    r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
    max_tokens=30,
)(prompt)
print(unguided)
#\n\n Google Public DNS is a free, global, open recursive DNS service from Google. There are two IP addresses for Google Public D
print(guided)
#0.0.0.0

Yes, it works.

One thing I forgot to mention: exl2 does not automatically download models from Hugging Face, so you have to pass the path of the local folder that contains the model.
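
A minimal loading sketch, assuming the integration is exposed as outlines.models.exl2 and takes the path of a local model directory; the loader name, argument, and path below are illustrative, not verbatim from this thread.

from outlines import models

# Assumed loader added by this PR: point it at a local directory that already
# contains the ExLlamaV2-quantized weights; exl2 will not fetch them from the
# Hugging Face Hub for you.
model = models.exl2("/path/to/local/exl2-model")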

rlouf merged commit 6084f4c into outlines-dev:main on Dec 22, 2023
5 checks passed
rlouf (Member) commented Dec 22, 2023

Thank you for contributing! We will need to add some documentation for that in the near future.

benlipkin pushed a commit to benlipkin/outlines that referenced this pull request Jan 5, 2024

dnhkng (Contributor) commented Jan 25, 2024

Can someone provide a quick example for loading the exllama model?

rlouf linked an issue on Feb 10, 2024 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Exllama support