Add phone number and zip code custom types
rlouf committed Apr 30, 2024
1 parent 078f822 commit 2a074fe
Showing 8 changed files with 101 additions and 16 deletions.
22 changes: 22 additions & 0 deletions docs/reference/format.md
@@ -0,0 +1,22 @@
# Type constraints

We can restrict completions to valid Python types:

```python
from outlines import models, generate

model = models.transformers("mistralai/Mistral-7B-v0.1")
generator = generate.format(model, int)
answer = generator("When I was 6 my sister was half my age. Now I’m 70 how old is my sister?")
print(answer)
# 67
```

The following types are currently available:

- int
- float
- bool
- datetime.date
- datetime.time
- datetime.datetime
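
One way to picture a type constraint is as a regular expression that the whole completion must match. The sketch below uses only the standard library; the patterns and the `matches_type` helper are assumptions made for illustration and are not taken from Outlines.

```python
import datetime
import re

# Illustrative patterns only (assumed for this sketch, not Outlines' own).
# The date pattern checks the shape YYYY-MM-DD, not calendar validity.
TYPE_PATTERNS = {
    int: r"[+-]?(0|[1-9][0-9]*)",
    float: r"[+-]?(0|[1-9][0-9]*)\.[0-9]+",
    bool: r"(True|False)",
    datetime.date: r"[0-9]{4}-[0-9]{2}-[0-9]{2}",
}


def matches_type(text: str, python_type: type) -> bool:
    """Return True if the whole string matches the pattern for `python_type`."""
    return re.fullmatch(TYPE_PATTERNS[python_type], text) is not None


for text, python_type in [("67", int), ("67.0", int), ("3.14", float)]:
    print(text, python_type.__name__, matches_type(text, python_type))
```

A completion constrained this way can then be converted with the type's own constructor, for example `int("67")`.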
39 changes: 24 additions & 15 deletions docs/reference/types.md
@@ -1,22 +1,31 @@
# Type constraints
# Custom types

We can ask completions to be restricted to valid python types:
Outlines provides custom Pydantic types so you can focus on your use case rather than on writing regular expressions:

- Using `outlines.types.ZipCode` will generate valid US Zip(+4) codes.
- Using `outlines.types.PhoneNumber` will generate valid US phone numbers.

You can use these types in Pydantic schemas for JSON-structured generation:

```python
from outlines import models, generate
from pydantic import BaseModel

from outlines import models, generate, types


class Client(BaseModel):
    name: str
    phone_number: types.PhoneNumber
    zip_code: types.ZipCode


model = models.transformers("mistralai/Mistral-7B-v0.1")
generator = generate.format(model, int)
answer = generator("When I was 6 my sister was half my age. Now I’m 70 how old is my sister?")
print(answer)
# 67
generator = generate.json(model, Client)
result = generator(
    "Create a client profile with the fields name, phone_number and zip_code"
)
print(result)
# e.g. Client(name='John Doe', phone_number='(555) 123-4567', zip_code='12345')
```

The following types are currently available:

- int
- float
- bool
- datetime.date
- datetime.time
- datetime.datetime
We plan on adding many more custom types. If you have found yourself writing regular expressions to generate fields of a given type, or if you could benefit from more specific types, don't hesitate to [submit a PR](https://github.com/outlines-dev/outlines/pulls) or [open an issue](https://github.com/outlines-dev/outlines/issues/new/choose).
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -109,13 +109,14 @@ nav:
- Structured generation:
  - Classification: reference/choices.md
  - Regex: reference/regex.md
  - Type constraints: reference/types.md
  - Type constraints: reference/format.md
  - JSON (function calling): reference/json.md
  - JSON mode: reference/json_mode.md
  - Grammar: reference/cfg.md
  - Custom FSM operations: reference/custom_fsm_ops.md
- Utilities:
  - Serve with vLLM: reference/serve/vllm.md
  - Custom types: reference/types.md
  - Prompt templating: reference/prompting.md
  - Outlines functions: reference/functions.md
- Models:
1 change: 1 addition & 0 deletions outlines/__init__.py
@@ -2,6 +2,7 @@
import outlines.generate
import outlines.grammars
import outlines.models
import outlines.types
from outlines.base import vectorize
from outlines.caching import clear_cache, disable_cache, get_cache
from outlines.function import Function
Expand Down
2 changes: 2 additions & 0 deletions outlines/types/__init__.py
@@ -0,0 +1,2 @@
from .phone_numbers import PhoneNumber
from .zip_codes import ZipCode
16 changes: 16 additions & 0 deletions outlines/types/phone_numbers.py
@@ -0,0 +1,16 @@
"""Phone number types.
We currently only support US phone numbers. We can however imagine having custom types
for each country, for instance leveraging the `phonenumbers` library.
"""
from pydantic import WithJsonSchema
from typing_extensions import Annotated

US_PHONE_NUMBER = r"(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}"


PhoneNumber = Annotated[
    str,
    WithJsonSchema({"type": "string", "regex": US_PHONE_NUMBER}),
]
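
To see which strings the `US_PHONE_NUMBER` pattern accepts, it can be checked directly with the standard `re` module. This is a minimal sketch: the `is_us_phone_number` helper and the sample values are hypothetical and not part of the commit.

```python
import re

# Pattern copied from outlines/types/phone_numbers.py in this commit.
US_PHONE_NUMBER = r"(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}"


def is_us_phone_number(text: str) -> bool:
    """Return True if the whole string matches the US phone number pattern."""
    return re.fullmatch(US_PHONE_NUMBER, text) is not None


# Hypothetical sample values, chosen for illustration only.
for sample in ["(555) 123-4567", "555-123-4567", "5551234567", "+1 555-123-4567"]:
    print(sample, is_us_phone_number(sample))
```

The pattern accepts the `(NNN) NNN-NNNN` and `NNN-NNN-NNNN` layouts; bare digit strings and numbers with a country prefix do not match.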
13 changes: 13 additions & 0 deletions outlines/types/zip_codes.py
@@ -0,0 +1,13 @@
"""Zip code types.
We currently only support US Zip Codes.
"""
from pydantic import WithJsonSchema
from typing_extensions import Annotated

# This matches Zip and Zip+4 codes
US_ZIP_CODE = r"\b\d{5}(?:-\d{4})?\b"


ZipCode = Annotated[str, WithJsonSchema({"type": "string", "regex": US_ZIP_CODE})]
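
The Zip code pattern can be exercised the same way with the standard `re` module. In this sketch the pattern is held as a plain string, and the `is_us_zip_code` helper and sample values are hypothetical, added only for illustration.

```python
import re

# The pattern string used for US_ZIP_CODE in outlines/types/zip_codes.py.
ZIP_PATTERN = r"\b\d{5}(?:-\d{4})?\b"


def is_us_zip_code(text: str) -> bool:
    """Return True if the whole string is a Zip or Zip+4 code."""
    return re.fullmatch(ZIP_PATTERN, text) is not None


# Hypothetical sample values, chosen for illustration only.
for sample in ["12345", "12345-6789", "1234", "12345-678", "123456"]:
    print(sample, is_us_zip_code(sample))
```

Note that `re.fullmatch` needs a string (or compiled pattern) as its first argument, so the pattern has to be a `str` rather than a one-element tuple.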
21 changes: 21 additions & 0 deletions tests/types.py
@@ -0,0 +1,21 @@
from pydantic import BaseModel

from outlines import types


def test_phone_number():
    class User(BaseModel):
        number: types.PhoneNumber

    schema = User.model_json_schema()
    assert "regex" in schema["properties"]["number"]
    assert schema["properties"]["number"]["type"] == "string"


def test_zip_code():
    class User(BaseModel):
        zip_code: types.ZipCode

    schema = User.model_json_schema()
    assert "regex" in schema["properties"]["zip_code"]
    assert schema["properties"]["zip_code"]["type"] == "string"
