Update the README #113 (merged, 3 commits, May 25, 2023)

Build _reliable_ workflows based on interactions with generative models.

[Prompting](#prompting) •
[Controlled generation](#controlled-generation) •
[Agents](#agents-example) •
[Sampling](#sampling-uncertainty-simulation-based-inference) •
[Examples](#examples)

</div>

**Outlines** allows you to control and diagnose interactions with LLMs more effectively. Modern language models are powerful and versatile, but the way they interface with existing systems [can be very brittle](https://github.com/Significant-Gravitas/Auto-GPT/labels/invalid_json), their outputs [can be unreliable](https://arxiv.org/abs/2302.04023), and complex workflows (agents) can introduce a lot of error-prone code duplication. Outlines provides robust prompting primitives that separate the prompting from the execution logic and lead to simple implementations of few-shot generations, ReAct, meta-prompting, agents, etc. Outlines helps developers control text generation and produce predictable outputs that make the interaction with user code more robust. Its sampling-first approach allows one to diagnose issues with model-generated output more easily, and implement more robust generation methods such as [self-consistency](https://arxiv.org/abs/2203.11171) or [DiVeRSe](https://arxiv.org/abs/2206.02336).

**Outlines** is designed as a library that integrates well with the broader Python environment. Generation can be interleaved with control flow or custom function calls, and prompts can be imported from other modules and libraries.

## Features

- [x] Simple and powerful prompting primitives based on the [Jinja templating engine](https://jinja.palletsprojects.com/)
- [x] Interleave completions with loops, conditionals, and custom Python functions
- [x] Caching of generations (see the sketch after this list)
- [x] Integration with OpenAI and HuggingFace models
- [x] Controlled generation, including multiple choice, type constraints and dynamic stopping
- [x] Sampling of multiple sequences
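
As a quick sketch of what the caching feature means in practice (the prompt is illustrative, and the behavior described in the comments is our reading of the feature above, not a guarantee):

``` python
import outlines.models as models

model = models.text_completion.openai("text-davinci-003")

# Generations are cached: repeating an identical call can be answered
# from the cache instead of triggering a new request to the provider.
first = model("What is the capital of France?")
second = model("What is the capital of France?")  # same inputs, served from cache
```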

## Installation

**Outlines** is available on PyPI:

"""
``` bash
pip install outlines
```

## Prompting

Writing prompts by concatenating strings in pure Python quickly becomes
cumbersome: the prompt-building logic gets entangled with the rest of the
program, and the structure of the rendered prompt is obfuscated. **Outlines**
makes it easier to write and manage prompts by encapsulating templates inside
"template functions".

These functions make it possible to neatly separate the prompt logic from the
general program logic; they can be imported from other modules and libraries.

Template functions require no superfluous abstraction; they use the Jinja2
templating engine to help build complex prompts in a concise manner:

``` python
import outlines.text as text
import outlines.models as models


examples = [
    ("The food was disgusting", "Negative"),
    ("We had a fantastic night", "Positive"),
    ("Recommended", "Positive"),
    ("The waiter was rude", "Negative")
]

@text.prompt
def labelling(to_label, examples):
    """You are a sentiment-labelling assistant.

    {% for example in examples %}
    {{ example[0] }} // {{ example[1] }}
    {% endfor %}
    {{ to_label }} //
    """

model = models.text_completion.openai("text-davinci-003")
prompt = labelling("Just awesome", examples)
answer = model(prompt)
```
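
To make the templating concrete, here is roughly what the call above renders to; the exact whitespace handling is an assumption of this sketch:

``` python
print(prompt)
# You are a sentiment-labelling assistant.
#
# The food was disgusting // Negative
# We had a fantastic night // Positive
# Recommended // Positive
# The waiter was rude // Negative
# Just awesome //
```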

## Chaining with loops and conditionals ([example](https://github.com/normal-computing/outlines/blob/readme/examples/react.py))

**Outlines** comes with very few abstractions, and is designed to blend into existing code and integrate with the rest of the ecosystem.

``` python
reviews = ["Just awesome", "Avoid", "Will come back"]

def send_notification(review):
    """This function sends a notification with the review's content."""
    ...

for review in reviews:
    prompt = labelling(review, examples)
    answer = model(prompt)
    if answer == "Positive":
        send_notification(review)
```

## Agents ([example](https://github.com/normal-computing/outlines/blob/readme/examples/babyagi.py))

**Outlines** makes building agents like [AutoGPT](https://github.com/Significant-Gravitas/Auto-GPT), [BabyAGI](https://github.com/yoheinakajima/babyagi), [ViperGPT](https://viper.cs.columbia.edu/) or [Transformers Agent](https://huggingface.co/docs/transformers/transformers_agents) easier by removing boilerplate prompting code.

### Tools

We can teach language models to call external functions to get additional information or perform tasks by encoding the functions' descriptions in the prompt. To avoid duplicating information between the function definition and the description passed to the prompt, we define custom Jinja filters that can extract the function's name, description, signature and source:


``` python
def wikipedia_search(query: str):
    ...


@text.prompt
def agent(tools: List[Callable]):
    """AVAILABLE COMMANDS:

    {% for tool in tools %}
    ...
    {% endfor %}
    """
```
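
The template body is collapsed in the diff above; as a minimal sketch, it might apply the filters like this (the filter names `name`, `description` and `source` are assumptions based on the description above):

``` python
from typing import Callable, List

import outlines.text as text


def wikipedia_search(query: str):
    """Search Wikipedia and return the first result."""
    ...


@text.prompt
def agent(tools: List[Callable]):
    """AVAILABLE COMMANDS:

    {% for tool in tools %}
    TOOL
    {{ tool | name }}: {{ tool | description }}
    {{ tool | source }}
    {% endfor %}
    """


prompt = agent([wikipedia_search])
```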

## Controlled generation

The first step towards reliability of systems that include large language models is to ensure that there is a well-defined interface between their output and user-defined code. **Outlines** provides ways to control the generation of language models to make their output more predictable.

You can stop the generation after a given sequence has been found:

``` python
answer = model("Tell me a one-sentence joke.", stop_at=["."])
```

You can reduce the completion to a choice between multiple possibilities:

``` python
prompt = labelling("Just awesome", examples)
answer = model(prompt, is_in=["Positive", "Negative"])
```

You can require the generated sequence to be an int or a float:

``` python
import outlines.models as models


model = models.text_completion.hf("sshleifer/tiny-gpt2")
answer = model("2 + 2 = ", type="int")
print(answer)
# 4

model = models.text_completion.hf("sshleifer/tiny-gpt2")
answer = model("1.7 + 3.2 = ", type="float")
print(answer)
# 4.9
```

## Sampling ([uncertainty](https://github.com/normal-computing/outlines/blob/readme/examples/sampling.ipynb), [simulation-based inference](https://github.com/normal-computing/outlines/blob/readme/examples/simulation_based_inference.ipynb))

**Outlines** is strictly sampling-based, and focuses on using methods such as [self-consistency](https://arxiv.org/abs/2203.11171), [adaptive consistency](https://arxiv.org/abs/2305.11860), [DiVeRSe](https://arxiv.org/abs/2206.02336), [Tree of Thoughts](https://arxiv.org/abs/2305.10601), [lattice sampling](https://arxiv.org/abs/2112.07660), etc. Several samples can be obtained using the `num_samples` keyword argument:

``` python
import outlines.models as models


model = models.text_completion.hf("sshleifer/tiny-gpt2")
answer = model("2 + 2 = ", num_samples=5)
print(answer)
# [4, 5, 4, 4, 4]
```

The focus on sampling allows us to explore different ideas, such as [using the diversity of answers to evaluate the model's uncertainty](https://github.com/normal-computing/outlines/blob/readme/examples/sampling.ipynb), or [simulation-based inference to optimize the prompt](https://github.com/normal-computing/outlines/blob/readme/examples/simulation_based_inference.ipynb).
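
As an illustration of what this enables, here is a minimal self-consistency sketch built on `num_samples`; the majority-vote logic below is ours, not an Outlines API:

``` python
from collections import Counter

import outlines.models as models

model = models.text_completion.hf("sshleifer/tiny-gpt2")

# Draw several samples and keep the most frequent answer (self-consistency).
answers = model("2 + 2 = ", num_samples=5)
answer, count = Counter(answers).most_common(1)[0]
print(answer, count / len(answers))
# 4 0.8
```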

## Contributing

### What contributions?

We currently only accept bug fixes and documentation contributions. If you have a feature request, please open a new [discussion](https://github.com/normal-computing/outlines/discussions). The issue tracker is only intended for actionable items.

### How to contribute?

Run `pip install -e .[test]` or `conda env create -f environment.yml`. To build the documentation you will also need to run `pip install -r requirements-doc.txt`.

Before pushing your code to the repository, please run `pre-commit run --all-files` and `pytest` to make sure that the code is formatted correctly and that the tests pass.
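
Putting the commands above together, a typical development loop might look like:

``` bash
pip install -e .[test]                # install Outlines with test dependencies
pip install -r requirements-doc.txt   # only needed to build the documentation
pre-commit run --all-files            # check formatting
pytest                                # run the test suite
```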

Do not hesitate to open a draft PR before your contribution is ready, especially if you have questions and/or need feedback.

## Examples

- [Pick the odd one out](https://github.com/normal-computing/outlines/blob/main/examples/pick_odd_one_out.py)
- [Meta prompting](https://github.com/normal-computing/outlines/blob/main/examples/meta_prompting.py)
- [ReAct](https://github.com/normal-computing/outlines/blob/main/examples/react.py)
- [Generate code to solve math problems](https://github.com/normal-computing/outlines/blob/main/examples/dust/math-generate-code.py)
- [BabyAGI](https://github.com/normal-computing/outlines/blob/main/examples/babyagi.py)
- [Uncertainty](https://github.com/normal-computing/outlines/blob/readme/examples/sampling.ipynb)
- [Simulation-based inference](https://github.com/normal-computing/outlines/blob/readme/examples/simulation_based_inference.ipynb)