diff --git a/docs/reference/models/mlxlm.md b/docs/reference/models/mlxlm.md
index 539e03851..06beb0036 100644
--- a/docs/reference/models/mlxlm.md
+++ b/docs/reference/models/mlxlm.md
@@ -16,7 +16,7 @@ model = models.mlxlm("mlx-community/mlx-community/Meta-Llama-3-8B-Instruct-8bit"
 With the loaded model, you can generate text or perform structured generation, e.g.
 
-```python3
+```python
 from outlines import models, generate
 
 model = models.mlxlm("mlx-community/Meta-Llama-3-8B-Instruct-8bit")
@@ -28,5 +28,3 @@ model_output = generator("What's Jennys Number?\n")
 print(model_output)
 # '8675309'
 ```
-
-For more examples, see the [cookbook](cookbook/index.md).
diff --git a/docs/reference/models/openai.md b/docs/reference/models/openai.md
index 07357a360..3e2f717e5 100644
--- a/docs/reference/models/openai.md
+++ b/docs/reference/models/openai.md
@@ -1,69 +1,137 @@
-# Generate text with the OpenAI and compatible APIs
+# OpenAI and compatible APIs
 
 !!! Installation
 
     You need to install the `openai` and `tiktoken` libraries to be able to use the OpenAI API in Outlines.
 
-Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. The following models can be used with Outlines:
+## OpenAI models
+
+Outlines supports models available via the OpenAI Chat API, e.g. ChatGPT and GPT-4. You can initialize the model by passing the model name to `outlines.models.openai`:
 
 ```python
 from outlines import models
+
 model = models.openai("gpt-3.5-turbo")
-model = models.openai("gpt-4")
+model = models.openai("gpt-4-turbo")
+model = models.openai("gpt-4o")
+```
+
+Check the [OpenAI documentation](https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4) for an up-to-date list of available models.
+You can pass any parameter you would pass to `openai.AsyncOpenAI` as keyword arguments:
+
+```python
+import os
+from outlines import models
+
-print(type(model))
-# OpenAI
+model = models.openai(
+    "gpt-3.5-turbo",
+    api_key=os.environ["OPENAI_API_KEY"]
+)
 ```
-Outlines also supports Azure OpenAI models:
+The following table enumerates the possible parameters. Refer to the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/_client.py) for an up-to-date list.
+
+**Parameters:**
+
+| **Parameters** | **Type** | **Description** | **Default** |
+|----------------|:---------|:----------------|:------------|
+| `api_key` | `str` | OpenAI API key. Inferred from `OPENAI_API_KEY` if not specified | `None` |
+| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified | `None` |
+| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified | `None` |
+| `base_url` | `str \| httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified | `None` |
+| `timeout` | `float` | Request timeout | `NOT_GIVEN` |
+| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
+| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
+| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
+| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
+
+## Azure OpenAI models
+
+Outlines also supports Azure OpenAI models:
 
 ```python
 from outlines import models
+
 model = models.azure_openai(
+    "azure-deployment-name",
+    "gpt-3.5-turbo",
     api_version="2023-07-01-preview",
     azure_endpoint="https://example-endpoint.openai.azure.com",
 )
 ```
-More generally, you can use any API client compatible with the OpenAI interface by passing an instance of the client, a configuration, and optionally the corresponding tokenizer (if you want to be able to use `outlines.generate.choice`):
+!!! Question "Why do I need to specify model and deployment name?"
-```python
-from openai import AsyncOpenAI
-import tiktoken
+
+    The model name is needed to load the correct tokenizer for the model. The tokenizer is necessary for structured generation.
-from outlines.models.openai import OpenAI, OpenAIConfig
-config = OpenAIConfig(model="gpt-4")
-client = AsyncOpenAI()
-tokenizer = tiktoken.encoding_for_model("gpt-4")
+
+You can pass any parameter you would pass to `openai.AsyncAzureOpenAI`. You can consult the [OpenAI SDK's code](https://github.com/openai/openai-python/blob/54a5911f5215148a0bdeb10e2bcfb84f635a75b9/src/openai/lib/azure.py) for an up-to-date list.
-model = OpenAI(client, config, tokenizer)
-```
+
+**Parameters:**
-## Monitoring API use
+
+| **Parameters** | **Type** | **Description** | **Default** |
+|----------------|:---------|:----------------|:------------|
+| `azure_endpoint` | `str` | Azure endpoint, including the resource. Inferred from `AZURE_OPENAI_ENDPOINT` if not specified | `None` |
+| `api_version` | `str` | API version. Inferred from `OPENAI_API_VERSION` if not specified | `None` |
+| `api_key` | `str` | OpenAI API key. Inferred from `AZURE_OPENAI_API_KEY` if not specified | `None` |
+| `azure_ad_token` | `str` | Azure Active Directory token. Inferred from `AZURE_OPENAI_AD_TOKEN` if not specified | `None` |
+| `azure_ad_token_provider` | `AzureADTokenProvider` | A function that returns an Azure Active Directory token | `None` |
+| `organization` | `str` | OpenAI organization id. Inferred from `OPENAI_ORG_ID` if not specified | `None` |
+| `project` | `str` | OpenAI project id. Inferred from `OPENAI_PROJECT_ID` if not specified | `None` |
+| `base_url` | `str \| httpx.URL` | Base URL for the endpoint. Inferred from `OPENAI_BASE_URL` if not specified | `None` |
+| `timeout` | `float` | Request timeout | `NOT_GIVEN` |
+| `max_retries` | `int` | Maximum number of retries for failing requests | `2` |
+| `default_headers` | `Mapping[str, str]` | Default HTTP headers | `None` |
+| `default_query` | `Mapping[str, str]` | Custom parameters added to the HTTP queries | `None` |
+| `http_client` | `httpx.AsyncClient` | User-specified `httpx` client | `None` |
-It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
+
+## Models that follow the OpenAI standard
-```python
-import outlines.models
+
+Outlines supports models that follow the OpenAI standard.
+You will need to initialize a properly configured OpenAI client and pass it to `outlines.models.openai`:
+
+```python
+import os
+from openai import AsyncOpenAI
+from outlines import models
+from outlines.models.openai import OpenAIConfig
-model = models.openai("gpt-4")
+
+client = AsyncOpenAI(
+    api_key=os.environ.get("PROVIDER_KEY"),
+    base_url="http://other.provider.server.com"
+)
+config = OpenAIConfig("model_name")
+model = models.openai(client, config)
-print(model.prompt_tokens)
-# 0
-print(model.completion_tokens)
-# 0
+```
-These numbers are updated every time you call the model.
+
+!!! Warning
+
+    You need to pass the async client to be able to do batch inference.
+
+## Advanced configuration
+
+For more advanced configuration options, such as proxy support, please consult the [OpenAI SDK's documentation](https://github.com/openai/openai-python):
+
+```python
+import httpx
+from openai import AsyncOpenAI, DefaultHttpxClient
+from outlines import models
+from outlines.models.openai import OpenAIConfig
-## Advanced usage
+
+client = AsyncOpenAI(
+    base_url="http://my.test.server.example.com:8083",
+    http_client=DefaultHttpxClient(
+        proxies="http://my.test.proxy.example.com",
+        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
+    ),
+)
+config = OpenAIConfig("model_name")
+model = models.openai(client, config)
+```
 It is possible to specify the values for `seed`, `presence_penalty`, `frequency_penalty`, `top_p` by passing an instance of `OpenAIConfig` when initializing the model:
@@ -71,11 +139,32 @@
 ```python
 from outlines.models.openai import OpenAIConfig
 from outlines import models
+
 config = OpenAIConfig(
     presence_penalty=1.,
-    frequence_penalty=1.,
+    frequency_penalty=1.,
     top_p=.95,
     seed=0,
 )
-model = models.openai("gpt-4", config=config)
+model = models.openai("gpt-3.5-turbo", config)
+```
+
+## Monitoring API use
+
+It is important to be able to track your API usage when working with OpenAI's API. The number of prompt tokens and completion tokens is directly accessible via the model instance:
+
+```python
+from outlines import models
+
+
+model = models.openai("gpt-4")
+
+print(model.prompt_tokens)
+# 0
+
+print(model.completion_tokens)
+# 0
+```
+
+These numbers are updated every time you call the model.
diff --git a/mkdocs.yml b/mkdocs.yml
index df60430c2..7429ad6a0 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -122,15 +122,16 @@ nav:
   - Prompt templating: reference/prompting.md
   - Outlines functions: reference/functions.md
   - Models:
-    - vLLM: reference/models/vllm.md
-    - Llama.cpp: reference/models/llamacpp.md
-    - Transformers: reference/models/transformers.md
-    - MLX: reference/models/mlxlm.md
-    - ExllamaV2: reference/models/exllamav2.md
-    - Mamba: reference/models/mamba.md
-    - OpenAI: reference/models/openai.md
-    - TGI: reference/models/tgi.md
-
+    - Open source:
+      - Transformers: reference/models/transformers.md
+      - Llama.cpp: reference/models/llamacpp.md
+      - vLLM: reference/models/vllm.md
+      - TGI: reference/models/tgi.md
+      - ExllamaV2: reference/models/exllamav2.md
+      - MLX: reference/models/mlxlm.md
+      - Mamba: reference/models/mamba.md
+    - API:
+      - OpenAI: reference/models/openai.md
 - API Reference:
   - api/index.md
   - api/models.md
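Reviewer note on the `## Monitoring API use` section of `openai.md`: the docs state that `prompt_tokens` and `completion_tokens` are cumulative and "updated every time you call the model". A minimal, self-contained sketch of that contract (a hypothetical mock, not the actual `outlines` implementation — the whitespace "tokenizer" and the counts it produces are made up purely for illustration):

```python
# Illustrative mock of the documented counter behaviour: the model object
# keeps running totals that grow with every call and are never reset.
# The whitespace split below is a stand-in tokenizer, not what OpenAI uses.
class MockOpenAIModel:
    def __init__(self):
        self.prompt_tokens = 0
        self.completion_tokens = 0

    def __call__(self, prompt: str) -> str:
        completion = "8675309"
        # Accumulate usage as described in the docs.
        self.prompt_tokens += len(prompt.split())
        self.completion_tokens += len(completion.split())
        return completion


model = MockOpenAIModel()
print(model.prompt_tokens, model.completion_tokens)  # 0 0

model("What's Jenny's number?")
model("What's Jenny's number?")
print(model.prompt_tokens, model.completion_tokens)  # 6 2
```

Pinning the semantics down against a mock like this is a cheap way to check code that budgets API usage without hitting the real endpoint.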
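A related footnote on the examples in `openai.md` that read credentials from the environment: `os.environ` is a mapping, so a key must be looked up by indexing or with `.get`, never by calling it like a function. A quick demonstration (the `DEMO_*` variable names are hypothetical, set only for this sketch):

```python
import os

# Hypothetical variable, set only for this demonstration.
os.environ["DEMO_OPENAI_KEY"] = "sk-demo"

raised = False
try:
    os.environ("DEMO_OPENAI_KEY")  # calling the mapping raises TypeError
except TypeError:
    raised = True

print(raised)                              # True
print(os.environ["DEMO_OPENAI_KEY"])       # sk-demo
print(os.environ.get("DEMO_MISSING_KEY"))  # None
```

`os.environ.get` additionally returns `None` instead of raising when the variable is unset, which is why the compatible-API example reads `PROVIDER_KEY` with `.get`.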