
fix function calling schema for pydantic v2 #616

Merged: 15 commits merged into main on Mar 14, 2024

Conversation

CalebCourier (Collaborator) commented Mar 7, 2024

Pass the model instead of type(model) when reducing for function calling.

The previous implementation yielded an empty dictionary for the annotation, which some LLMs might take to mean they should return an empty object.
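A minimal plain-Python sketch (no pydantic; names made up) of why reducing `type(model)` lost the schema: the value being passed is already a class, so `type()` returns its metaclass, which carries none of the field annotations.

```python
# Sketch only: `Customer` stands in for a pydantic model class.
class Customer:
    first: str
    last: str

# The class itself knows its fields...
assert Customer.__annotations__ == {"first": str, "last": str}

# ...but type(Customer) is the metaclass (plain `type` here), which has no
# field information, hence the empty schema described above.
assert type(Customer) is type
```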

ShreyaR (Collaborator) commented Mar 7, 2024

@CalebCourier can you add a test to repro the original failure?

CalebCourier (Collaborator, Author) commented Mar 7, 2024

Another thing I still need to fix: the types used in llm_providers and the pydantic utils are wrong. The base_model parameter we're passing around there is currently typed as a BaseModel, but it's actually a Type[BaseModel], or potentially a Type[List[Type[BaseModel]]] with the new top-level array support.

The obvious task here is to update those types, but in addition, if the type is a List[BaseModel] we'll need to wrap the BaseModel's schema in a JSON schema that specifies an array parent.
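A rough sketch of that wrapping, assuming we already have the model's JSON schema as a dict (the helper name here is hypothetical):

```python
def wrap_in_array_schema(item_schema: dict) -> dict:
    # Make the top-level JSON value an array whose items follow the
    # model's schema.
    return {"type": "array", "items": item_schema}

customer_schema = {"type": "object", "properties": {"first": {"type": "string"}}}
wrapped = wrap_in_array_schema(customer_schema)
# wrapped == {"type": "array", "items": customer_schema}
```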

CalebCourier (Collaborator, Author)

> @CalebCourier can you add a test to repro the original failure?

@ShreyaR I can show that the schema included in the functions kwarg was empty before, whereas now it contains the schema of the model. The closest I got to repro'ing the original failure was that the LLM would return less information when the empty schema was passed.

This PR is also still in draft because I'm waiting to hear back on whether this solved the issue the end user was experiencing. If it doesn't, then there might be more to the problem.
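For concreteness, the entries in the `functions` kwarg sent to OpenAI are shaped roughly like the sketch below (field values illustrative, not captured from a real run); before this fix the `parameters` schema reduced to an empty dict, and afterwards it carries the model's full JSON schema:

```python
# Illustrative payloads only.
before_fix = {"name": "Customer", "parameters": {}}  # empty annotation
after_fix = {
    "name": "Customer",
    "parameters": {
        "type": "object",
        "properties": {"first": {"type": "string", "description": "First name"}},
    },
}
```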

CalebCourier (Collaborator, Author)

> @CalebCourier can you add a test to repro the original failure?

@ShreyaR I added tests that show the current behaviour as well as the empty schema issue I mention in the PR description.

CalebCourier (Collaborator, Author)

Ok, I was able to replicate the LLM returning an empty object when these changes are not in place (i.e. 0.4.1) if I remove the complete_json_suffix from the prompt. See below:

import guardrails as gd
import openai

from rich import print
from typing import List, Optional
from pydantic import BaseModel, Field
from guardrails.validators import ValidLength


openai.api_type = ""
openai.api_version = ""
openai.api_key = ""
openai.azure_endpoint = ""

class Founder(BaseModel):
    first: str = Field(
        description="First name of the founder"
    )
    last: str = Field(
        description="last name of the founder"
    )

class Customer(BaseModel):
    customer_persona: Optional[str] = Field(
        description="who is the customer that this organization is targeting (e.g. 'dental-practices', 'consumers', 'health plans')"
    )
    targets_healthcare_orgs: Optional[str] = Field(
        description="Does this organization target healthcare providers or healthcare practices? Respond with 'True', 'False', or 'Unknown'"
    )
    targets_healthcare_reasoning: Optional[str] = Field(
        description="Please provide a brief explanation of why you believe this organization does or does not target healthcare providers"
    )
    description: Optional[str] = Field(
        description="Provide a brief description of the organization in 80 characters or less",
        validators=[ValidLength(min=0, max=10000)]
    )
    founders: Optional[List[Founder]] = Field(
        description="List of all the founders of the organization"
    )
    organization_type: Optional[str] = Field(
        description="What type of organization is this? (e.g. 'startup', 'consultancy', 'non-profit')"
    )

prompt = """
I will provide you with some raw source data. I want you to read my instructions below and extract the information requested below.

**INSTRUCTIONS**

I am going to provide you with some information about an organization called Harbor Health.
I want you to read the information and fill out the responses to the best of your ability.

**SOURCE DATA**

Description: Harbor Health is a multi-specialy clinic group that providers smarter health care using technology.  The company co-creates a health path that is dedicated to knowing individuals health goals with the guidance of specialist when needed.  Harbor Health was foudned in 2022 and is based in Austin Texas.
Founders:
"""
# ${gr.complete_json_suffix}

instructions = """
You are a helpful assistant, able to express yourself purely through JSON, strictly and precisely adhering to the provided XML schemas.
When you return your response, do not wrap your answer in a code block or any other formatting such as '''json or '''
If you do not know the answer, return an empty dictionary, '{}'
"""
guard = gd.Guard.from_pydantic(
    output_class=Customer,
    instructions=instructions,
    prompt=prompt,
)

res = None

try:
    res = guard(
        openai.chat.completions.create,
        # prompt_params=params,
        # model=model.value,
        model="gpt-3.5-turbo",
        temperature=0,
    )
except Exception as e:
    # todo
    print(e)
    raise e

if guard.history.last is not None:
    print(guard.history.last.tree)

@CalebCourier CalebCourier marked this pull request as ready for review March 8, 2024 18:24
edisontim commented Mar 9, 2024

Hey, I encountered the same issue on my end where OpenAI was returning an empty dict; I can provide an example with the complete_json_suffix if that helps. Wondering how much time this will take to be merged and shipped, as it's a big blocking point for us :)

Also getting this error when trying with this branch:
guardrails.llm_providers.PromptCallableException: The callable `fn` passed to `Guard(fn, ...)` failed with the following error: `Unable to serialize unknown type: <class 'guardrails.validators.valid_range.ValidRange'>`. Make sure that `fn` can be called as a function that takes in a single prompt string and returns a string.
I think it's because I'm using validators on the Pydantic models.
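A stand-in sketch of that failure mode (the class below mimics the validator named in the traceback; this is not guardrails' code): `json.dumps` raises a TypeError like this whenever an arbitrary Python object is left inside the schema dict.

```python
import json

class ValidRange:  # stand-in for guardrails.validators.valid_range.ValidRange
    def __init__(self, min, max):
        self.min, self.max = min, max

schema = {"properties": {"age": {"type": "integer",
                                 "validators": ValidRange(15, 65)}}}
try:
    json.dumps(schema)
except TypeError as err:
    print(err)  # Object of type ValidRange is not JSON serializable
```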

CalebCourier (Collaborator, Author)

@edisontim if you could share your example where OpenAI returns an empty dict that includes the complete_json_suffix, that would be very helpful, thank you!

I'll try to track down the serializing error with validators on pydantic models. For that, which version of pydantic are you using?

ShreyaR previously approved these changes Mar 12, 2024

ShreyaR (Collaborator) left a comment:

lgtm! :shipit:

Review comment on guardrails/utils/pydantic_utils/v1.py (outdated, resolved)
edisontim commented Mar 12, 2024

> @edisontim if you could share your example where OpenAI returns an empty dict that includes the complete_json_suffix, that would be very helpful, thank you!
>
> I'll try to track down the serializing error with validators on pydantic models. For that, which version of pydantic are you using?

Hey! Sure, here are the models I'm using with pydantic v2.6.3:

class Characteristics(BaseModel):
    age: int = Field(description="Age of the character", validators=ValidRange(min=15, max=65))
    role: int = Field(
        description=f"Job of the NPC. {roles_str}",
        validators=[ValidChoices(choices=[index for index, role in enumerate(ROLES)], on_fail="reask")],
    )
    sex: int = Field(
        description="Sex of the NPC. (0 for male, 1 for female)",
        validators=[ValidChoices(choices=[0, 1], on_fail="reask")],
    )


class Npc(BaseModel):
    character_trait: str = Field(
        description="Trait of character that defines the NPC. One word max.",
        validators=[
            ValidLength(min=5, max=31, on_fail="fix"),
        ],
    )
    full_name: str = Field(
        description='First name and last name of the NPC. Don\'t use words in the name such as "Wood"',
        validators=[ValidLength(min=5, max=31), TwoWords(on_fail="reask")],
    )

    description: str = Field(
        description="Description of the NPC",
    )
    characteristics: Characteristics = Field(description="Various characteristics")

and using ${gr.complete_json_suffix_v2} in my prompt. I init the guard like this:

npc_profile_guard: Guard = Guard.from_pydantic(
    output_class=Npc,
    instructions="Blabla prompt. ${gr.complete_json_suffix_v2}",
    num_reasks=2,
)

and calling it like so:

_raw_llm_response, validated_response, *_rest = self.npc_profile_guard(
    openai.chat.completions.create,
    model=ChatCompletionModel.GPT_3_5_TURBO.value,
    temperature=1,
    prompt="Generate an NPC",
)

CalebCourier (Collaborator, Author)

@edisontim I don't see any validators in your model. Would you mind sharing where the ValidRange validator is applied that caused the serialization error?

edisontim
> @edisontim I don't see any validators in your model. Would you mind sharing where the ValidRange validator is applied that caused the serialization error?

My bad, updated in the original comment

CalebCourier (Collaborator, Author)

Ok, I was able to replicate the serialization issue with your example, thanks @edisontim. I think I have a path forward for a resolution.

CalebCourier (Collaborator, Author)

@edisontim I just pushed a commit that should fix the serialization issue for pydantic 2.x. Note, though, that pydantic 1.x is still broken.

@CalebCourier CalebCourier merged commit 2aeb9dd into main Mar 14, 2024
36 checks passed
@ShreyaR ShreyaR deleted the fix-v2-func-call branch March 29, 2024 01:18
4 participants