Merge pull request #597 from guardrails-ai/list-support
Top Level List Support
zsimjee authored Mar 5, 2024
2 parents b0b233b + e05040a commit 5ce1d29
Showing 22 changed files with 549 additions and 56 deletions.
219 changes: 219 additions & 0 deletions docs/how_to_guides/structured_data_with_guardrails.mdx
@@ -0,0 +1,219 @@
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Generate structured data with Guardrails AI

Guardrails AI can generate structured data from a variety of LLMs. This guide contains
the following:
1. General instructions for generating structured data with Guardrails using `Pydantic` or Markup (i.e. `RAIL`), and
2. Examples of generating structured data using `Pydantic` or Markup.

## Syntax for generating structured data

There are two ways to generate structured data with Guardrails AI: using `Pydantic` or Markup (i.e. `RAIL`).

1. **Pydantic**: In order to generate structured data with Pydantic models, create a Pydantic model with the desired fields and types, then create a `Guard` object that uses the Pydantic model to generate structured data, and finally call the LLM of your choice with the `guard` object to generate structured data.
2. **RAIL**: In order to generate structured data with RAIL specs, create a RAIL spec with the desired fields and types, then create a `Guard` object that uses the RAIL spec to generate structured data, and finally call the LLM of your choice with the `guard` object to generate structured data.

Below is the syntax for generating structured data with Guardrails AI using `Pydantic` or Markup (i.e. `RAIL`).

<Tabs>
<TabItem value="pydantic" label="Pydantic" default>
In order to generate structured data, first create a Pydantic model with the desired fields and types.
```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    is_employed: bool
```

Then, create a `Guard` object that uses the Pydantic model to generate structured data.
```python
from guardrails import Guard

guard = Guard.from_pydantic(Person)
```

Finally, call the LLM of your choice with the `guard` object to generate structured data.
```python
import openai

res = guard(
    openai.chat.completions.create,
    model="gpt-3.5-turbo",
)
```
</TabItem>
<TabItem value="rail" label="RAIL">
In order to generate structured data, first create a RAIL spec with the desired fields and types.
```xml
<rail version="0.1">
<output>
    <string name="name" />
    <integer name="age" />
    <boolean name="is_employed" />
</output>
</rail>
```

Then, create a `Guard` object that uses the RAIL spec to generate structured data.
```python
from guardrails import Guard

guard = Guard.from_rail_string("""
<rail version="0.1">
<output>
    <string name="name" />
    <integer name="age" />
    <boolean name="is_employed" />
</output>
</rail>
""")
```

Finally, call the LLM of your choice with the `guard` object to generate structured data.
```python
import openai

res = guard(
    openai.chat.completions.create,
    model="gpt-3.5-turbo",
)
```
</TabItem>
</Tabs>
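
The three steps above (define a schema, build a guard, call an LLM) can be sketched without any dependencies. The snippet below only illustrates the flow and is not how Guardrails is implemented: `SCHEMA`, `fake_llm`, and `generate_structured` are hypothetical names, and the stub LLM returns a canned JSON string instead of calling a real API.

```python
import json
from typing import Any, Callable, Dict

# Step 1: describe the desired fields and their types (stand-in for the Person model).
SCHEMA: Dict[str, type] = {"name": str, "age": int, "is_employed": bool}


def fake_llm(prompt: str) -> str:
    """Hypothetical stub that plays the role of the LLM API call."""
    return '{"name": "John Doe", "age": 30, "is_employed": true}'


def generate_structured(llm: Callable[[str], str], prompt: str) -> Dict[str, Any]:
    """Steps 2 and 3: call the LLM, parse its output, and check it against SCHEMA."""
    data = json.loads(llm(prompt))
    for field, expected in SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject booleans for int fields.
        if expected is int and isinstance(data[field], bool):
            raise TypeError(f"{field} must be an int, got a bool")
        if not isinstance(data[field], expected):
            raise TypeError(f"{field} must be {expected.__name__}")
    return data


person = generate_structured(fake_llm, "Describe a person as JSON.")
```

A real guard does considerably more (prompt construction, re-asking, validators); the sketch only shows where the schema check sits relative to the LLM call.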

## Generate a JSON object with simple types

<Tabs>
<TabItem value="json" label="JSON" default>
```json
{
    "name": "John Doe",
    "age": 30,
    "is_employed": true
}
```
</TabItem>
<TabItem value="pydantic" label="Pydantic">
```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    is_employed: bool
```
</TabItem>
<TabItem value="rail" label="Markup">
```xml
<rail version="0.1">
<output>
    <string name="name" />
    <integer name="age" />
    <boolean name="is_employed" />
</output>
</rail>
```
</TabItem>
</Tabs>


## Generate a dictionary of nested types

<Tabs>
<TabItem value="json" label="JSON" default>
```json
{
    "name": "John Doe",
    "age": 30,
    "is_employed": true,
    "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "zip": "12345"
    }
}
```
</TabItem>
<TabItem value="pydantic" label="Pydantic">
```python
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    zip: str

class Person(BaseModel):
    name: str
    age: int
    is_employed: bool
    address: Address
```
</TabItem>
<TabItem value="rail" label="Markup">
```xml
<rail version="0.1">
<output>
    <string name="name" />
    <integer name="age" />
    <boolean name="is_employed" />
    <object name="address">
        <string name="street" />
        <string name="city" />
        <string name="zip" />
    </object>
</output>
</rail>
```
</TabItem>
</Tabs>
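
To make the nested shape concrete, here is a minimal sketch that loads the JSON above into typed records. It uses standard-library dataclasses as a stand-in for the Pydantic models, so it performs no type validation; `person_from_json` is a hypothetical helper, not part of Guardrails.

```python
import json
from dataclasses import dataclass


@dataclass
class Address:
    street: str
    city: str
    zip: str


@dataclass
class Person:
    name: str
    age: int
    is_employed: bool
    address: Address


def person_from_json(raw: str) -> Person:
    """Build a Person from a JSON string, converting the nested address dict."""
    data = json.loads(raw)
    # The nested "address" object becomes its own typed record.
    address = Address(**data.pop("address"))
    return Person(address=address, **data)


raw = """
{
  "name": "John Doe",
  "age": 30,
  "is_employed": true,
  "address": {"street": "123 Main St", "city": "Anytown", "zip": "12345"}
}
"""
person = person_from_json(raw)
```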


## Generate a list of types

<Tabs>
<TabItem value="json" label="JSON" default>
```json
[
    {
        "name": "John Doe",
        "age": 30,
        "is_employed": true
    },
    {
        "name": "Jane Smith",
        "age": 25,
        "is_employed": false
    }
]
```
</TabItem>
<TabItem value="pydantic" label="Pydantic">
```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    is_employed: bool

people = list[Person]
```
</TabItem>
<TabItem value="rail" label="Markup">
```xml
<rail version="0.1">
<output type="list">
    <object>
        <string name="name" />
        <integer name="age" />
        <boolean name="is_employed" />
    </object>
</output>
</rail>
```
</TabItem>
</Tabs>
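
A top-level list differs from the earlier examples in that the outermost JSON value is an array rather than an object. The sketch below, which assumes standard-library dataclasses in place of Pydantic and a hypothetical `people_from_json` helper, shows the extra check that implies.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    name: str
    age: int
    is_employed: bool


def people_from_json(raw: str) -> List[Person]:
    """Parse a JSON array whose elements are all Person-shaped objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise TypeError("expected a top-level JSON array")
    return [Person(**item) for item in data]


raw = """
[
  {"name": "John Doe", "age": 30, "is_employed": true},
  {"name": "Jane Smith", "age": 25, "is_employed": false}
]
"""
people = people_from_json(raw)
```

Judging by the `from_pydantic` signature change in this commit, the list type itself appears to be what is passed to `Guard.from_pydantic`; the guide does not show that call, so treat this as an inference.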
2 changes: 1 addition & 1 deletion docusaurus/sidebars.js
@@ -44,7 +44,7 @@ const sidebars = {
       type: "category",
       label: "How-to Guides",
       collapsed: true,
-      items: ["how_to_guides/logs", "how_to_guides/streaming", "how_to_guides/llm_api_wrappers", "how_to_guides/rail", "how_to_guides/envvars" ],
+      items: ["how_to_guides/logs", "how_to_guides/streaming", "how_to_guides/llm_api_wrappers", "how_to_guides/rail", "how_to_guides/envvars", "how_to_guides/structured_data_with_guardrails" ],
     },
     "the_guard",
     {
3 changes: 3 additions & 0 deletions guardrails/classes/history/call.py
@@ -366,3 +366,6 @@ def tree(self) -> Tree:
         )

         return tree
+
+    def __str__(self) -> str:
+        return pretty_repr(self)
3 changes: 3 additions & 0 deletions guardrails/classes/history/iteration.py
@@ -189,3 +189,6 @@ def create_msg_history_table(
                 style="on #F0FFF0",
             ),
         )
+
+    def __str__(self) -> str:
+        return pretty_repr(self)
4 changes: 2 additions & 2 deletions guardrails/classes/output_type.py
@@ -1,3 +1,3 @@
-from typing import Dict, TypeVar
+from typing import Dict, List, TypeVar

-OT = TypeVar("OT", str, Dict)
+OT = TypeVar("OT", str, List, Dict)
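
Note on the change above: adding `List` to the constraints of `OT` lets generic classes be parameterized with a list output type. A minimal sketch of the effect follows; `Outcome` is a hypothetical, simplified container and not the real `ValidationOutcome`.

```python
from typing import Dict, Generic, List, Optional, TypeVar

# Constrained type variable mirroring the new definition in output_type.py.
OT = TypeVar("OT", str, List, Dict)


class Outcome(Generic[OT]):
    """Hypothetical, simplified container parameterized by the output type."""

    def __init__(self, validated_output: Optional[OT] = None) -> None:
        self.validated_output = validated_output


text_outcome: Outcome[str] = Outcome("hello")
# Only expressible for a type checker once List is one of OT's constraints.
list_outcome: Outcome[List] = Outcome([{"name": "John Doe"}])
dict_outcome: Outcome[Dict] = Outcome({"name": "John Doe"})
```

The constraints matter only to static type checkers; nothing is enforced at runtime.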
6 changes: 5 additions & 1 deletion guardrails/classes/validation_outcome.py
@@ -1,6 +1,7 @@
 from typing import Generic, Iterator, Optional, Tuple, Union, cast

 from pydantic import Field
+from rich.pretty import pretty_repr

 from guardrails.classes.history import Call, Iteration
 from guardrails.classes.output_type import OT
@@ -9,7 +10,7 @@
 from guardrails.utils.reask_utils import ReAsk


-class ValidationOutcome(Generic[OT], ArbitraryModel):
+class ValidationOutcome(ArbitraryModel, Generic[OT]):
     raw_llm_output: Optional[str] = Field(
         description="The raw, unchanged output from the LLM call.", default=None
     )
@@ -83,3 +84,6 @@ def __iter__(
     def __getitem__(self, keys):
         """Get a subset of the ValidationOutcome's fields."""
         return iter(getattr(self, k) for k in keys)
+
+    def __str__(self) -> str:
+        return pretty_repr(self)
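
Note on the `__str__` additions in this commit: each one delegates to a pretty-printer so that `print(obj)` yields a readable multi-line dump. The sketch below shows the same pattern with the standard library's `pprint.pformat` standing in for rich's `pretty_repr`; the `Outcome` class is hypothetical.

```python
from pprint import pformat


class Outcome:
    """Hypothetical stand-in for a class that gains a readable __str__."""

    def __init__(self, raw_llm_output, validated_output):
        self.raw_llm_output = raw_llm_output
        self.validated_output = validated_output

    def __str__(self) -> str:
        # The commit delegates to rich's pretty_repr; pformat stands in here.
        return pformat(vars(self))


outcome = Outcome('["a", "b"]', ["a", "b"])
text = str(outcome)
```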
4 changes: 2 additions & 2 deletions guardrails/cli/validate.py
@@ -1,13 +1,13 @@
 import json
-from typing import Dict, Union
+from typing import Dict, List, Union

 import typer

 from guardrails import Guard
 from guardrails.cli.guardrails import guardrails


-def validate_llm_output(rail: str, llm_output: str) -> Union[str, Dict, None]:
+def validate_llm_output(rail: str, llm_output: str) -> Union[str, Dict, List, None]:
     """Validate guardrails.yml file."""
     guard = Guard.from_rail(rail)
     result = guard.parse(llm_output)
18 changes: 16 additions & 2 deletions guardrails/guard.py
@@ -80,7 +80,9 @@ def __init__(
         self,
         rail: Optional[Rail] = None,
         num_reasks: Optional[int] = None,
-        base_model: Optional[Type[BaseModel]] = None,
+        base_model: Optional[
+            Union[Type[BaseModel], Type[List[Type[BaseModel]]]]
+        ] = None,
         tracer: Optional[Tracer] = None,
     ):
         """Initialize the Guard with optional Rail instance, num_reasks, and
@@ -220,6 +222,10 @@ def from_rail(
             return cast(
                 Guard[str], cls(rail=rail, num_reasks=num_reasks, tracer=tracer)
             )
+        elif rail.output_type == "list":
+            return cast(
+                Guard[List], cls(rail=rail, num_reasks=num_reasks, tracer=tracer)
+            )
         return cast(Guard[Dict], cls(rail=rail, num_reasks=num_reasks, tracer=tracer))

     @classmethod
@@ -247,12 +253,16 @@ def from_rail_string(
             return cast(
                 Guard[str], cls(rail=rail, num_reasks=num_reasks, tracer=tracer)
             )
+        elif rail.output_type == "list":
+            return cast(
+                Guard[List], cls(rail=rail, num_reasks=num_reasks, tracer=tracer)
+            )
         return cast(Guard[Dict], cls(rail=rail, num_reasks=num_reasks, tracer=tracer))

     @classmethod
     def from_pydantic(
         cls,
-        output_class: Type[BaseModel],
+        output_class: Union[Type[BaseModel], Type[List[Type[BaseModel]]]],
         prompt: Optional[str] = None,
         instructions: Optional[str] = None,
         num_reasks: Optional[int] = None,
@@ -272,6 +282,10 @@ def from_pydantic(
             reask_prompt=reask_prompt,
             reask_instructions=reask_instructions,
         )
+        if rail.output_type == "list":
+            return cast(
+                Guard[List], cls(rail, num_reasks=num_reasks, base_model=output_class)
+            )
         return cast(
             Guard[Dict],
             cls(rail, num_reasks=num_reasks, base_model=output_class, tracer=tracer),