Add llama.cpp integration
dtiarks authored and rlouf committed Jan 8, 2024
1 parent 417a2ca commit 03b749a
Showing 12 changed files with 449 additions and 5 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -3,3 +3,4 @@ __pycache__
*_version.py
docs/build
.coverage
.idea/
2 changes: 1 addition & 1 deletion docs/cookbook/index.md
@@ -1,6 +1,6 @@
# Examples

- [Classification](classification): Classify customer requests.
- [Classification](classification.md): Classify customer requests.
- [Named Entity Extraction](extraction.md): Extract information from pizza orders.
- [Dating Profile](dating_profiles.md): Build dating profiles from descriptions using prompt templating and JSON-guided generation.
- [Chain Of Density](chain_of_density.md): Summarize documents using chain of density prompting and JSON-guided generation.
15 changes: 15 additions & 0 deletions docs/reference/models/llamacpp.md
@@ -0,0 +1,15 @@
# Llama.cpp

!!! Installation

You need to install the `llama-cpp-python` library to be able to use these models in Outlines.

Outlines provides an integration with [Llama.cpp](https://github.com/ggerganov/llama.cpp) using the [llama-cpp-python library](https://github.com/abetlen/llama-cpp-python). Llama.cpp makes it possible to run quantized models on machines with limited compute.

Assuming [Phi-2's weights](https://huggingface.co/TheBloke/phi-2-GGUF) are in the current directory:

```python
from outlines import models, generate

model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")
```
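Once loaded, the model can be passed to any Outlines generator. A minimal sketch of plain-text generation, following the same pattern as the JSON example elsewhere in this commit (the prompt and `max_tokens` value are illustrative, and the weights above are assumed to be downloaded):

```python
from outlines import models, generate

# Load the quantized weights on CPU, as above.
model = models.llamacpp("./phi-2.Q4_K_M.gguf", device="cpu")

# Build a plain-text generator; structured generators such as
# generate.json accept the model the same way.
generator = generate.text(model, max_tokens=16)

print(generator("Question: What is the capital of France?\nAnswer:"))
```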
@@ -1,5 +1,9 @@
# Generate text with the OpenAI API

!!! Installation

You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.

Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:

```python
@@ -12,6 +16,7 @@ print(type(model))
# OpenAI
```


It is possible to pass a system message to the model when initializing it:

```python
4 changes: 1 addition & 3 deletions docs/reference/vllm.md
@@ -49,9 +49,7 @@ curl http://127.0.0.1:8000/generate \

Instead of `curl`, you can also use the [requests][requests]{:target="_blank"} library from another Python program.

Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters.

You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.
Please consult the [vLLM documentation][vllm]{:target="_blank"} for details on additional request parameters. You can also [read the code](https://github.com/outlines-dev/outlines/blob/main/outlines/serve/serve.py) in case you need to customize the solution to your needs.

[requests]: https://requests.readthedocs.io/en/latest/
[vllm]: https://docs.vllm.ai/en/latest/index.html
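The `requests` variant can be sketched as follows. This is a minimal sketch, not part of the original guide: the `prompt` and `schema` values are illustrative placeholders, and the actual POST assumes the server above is listening on port 8000, so it is left commented out.

```python
import json

# Request body mirroring the curl example; values are illustrative.
payload = {
    "prompt": "What is the capital of France?",
    "schema": {"type": "string", "maxLength": 5},
}

# With the server running, the request would be sent like this:
#   import requests
#   response = requests.post("http://127.0.0.1:8000/generate", json=payload)
#   print(response.json())

print(json.dumps(payload, indent=2))
```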
46 changes: 46 additions & 0 deletions examples/llamacpp_example.py
@@ -0,0 +1,46 @@
from enum import Enum

import torch
from pydantic import BaseModel, constr

import outlines


class Weapon(str, Enum):
sword = "sword"
axe = "axe"
mace = "mace"
spear = "spear"
bow = "bow"
crossbow = "crossbow"


class Armor(str, Enum):
leather = "leather"
chainmail = "chainmail"
plate = "plate"


class Character(BaseModel):
name: constr(max_length=10)
age: int
armor: Armor
weapon: Weapon
strength: int


if __name__ == "__main__":
# Download model from https://huggingface.co/TheBloke/phi-2-GGUF
model = outlines.models.llamacpp("./phi-2.Q3_K_M.gguf", device="cpu")

# Construct guided sequence generator
generator = outlines.generate.json(model, Character, max_tokens=512)

# Draw a sample
rng = torch.Generator(device="cpu")
rng.manual_seed(789005)

prompt = "Instruct: You are a leading role play gamer. You have seen thousands of different characters and their attributes.\nPlease return a JSON object with common attributes of an RPG character. Give me a character description\nOutput:"

sequence = generator(prompt, rng=rng)
print(sequence)
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -126,7 +126,8 @@ nav:
- Prompt templating: reference/prompting.md
- Outlines functions: reference/functions.md
- Models:
- OpenAI: reference/openai_text_generation.md
- OpenAI: reference/models/openai.md
- Llama.cpp: reference/models/llamacpp.md

- API Reference:
- api/index.md
1 change: 1 addition & 0 deletions outlines/__init__.py
@@ -11,6 +11,7 @@
"clear_cache",
"disable_cache",
"get_cache",
"Function",
"prompt",
"vectorize",
]
1 change: 1 addition & 0 deletions outlines/models/__init__.py
@@ -8,6 +8,7 @@
from .awq import awq
from .exllamav2 import exl2
from .gptq import gptq
from .llamacpp import LlamaCpp, llamacpp
from .mamba import Mamba, mamba
from .openai import OpenAI, openai
from .transformers import Transformer, transformers
