Add support for AzureOpenAI (#131)
**Objective**

Added a new TextVectorizer for the AzureOpenAI API, which is based on
the AzureOpenAI and AsyncAzureOpenAI classes from openai>=1.0.0.

**Reason**

Compatibility with Azure OpenAI is an important feature for developers
who are building enterprise AI applications on Azure, particularly for
use cases where privacy is a concern and data must stay within their
cloud tenant.

My team particularly wants to integrate Semantic Cache (with Azure Cache
for Redis Enterprise). Since we rely on the Azure OpenAI API, I created
this new vectorizer to add that functionality.
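
As a rough illustration of the three settings Azure OpenAI needs (API key, API version, endpoint), here is a minimal sketch of resolving them from an explicit config with environment-variable fallbacks. `resolve_azure_config` and `ENV_NAMES` are hypothetical names used only for this example and are not part of redisvl; `OPENAI_API_VERSION` and `AZURE_OPENAI_ENDPOINT` come from the fixtures in this commit, while `AZURE_OPENAI_API_KEY` is an assumed variable name for the key.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable consulted for each setting. The version and endpoint
# names appear in this commit's conftest.py; the key name is an assumption.
ENV_NAMES = {
    "api_key": "AZURE_OPENAI_API_KEY",
    "api_version": "OPENAI_API_VERSION",
    "azure_endpoint": "AZURE_OPENAI_ENDPOINT",
}


def resolve_azure_config(
    api_config: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: explicit settings win, environment fills the gaps."""
    api_config = api_config or {}
    env = os.environ if env is None else env
    resolved: Dict[str, str] = {}
    missing = []
    for key, env_name in ENV_NAMES.items():
        value = api_config.get(key) or env.get(env_name)
        if value:
            resolved[key] = value
        else:
            missing.append(f"{key} (or ${env_name})")
    if missing:
        # Fail early with every missing setting listed at once.
        raise ValueError("Missing Azure OpenAI settings: " + ", ".join(missing))
    return resolved
```

The resulting dictionary has the same three keys that the notebook example passes as `api_config`, so a check like this could run before constructing the vectorizer to surface a missing setting with a clear message.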

---------

Co-authored-by: Anibal <a8065384@banorte.com>
Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
3 people authored and justin-cechmanek committed May 6, 2024
1 parent 4e432a4 commit 998d4ec
Showing 5 changed files with 422 additions and 2 deletions.
8 changes: 8 additions & 0 deletions conftest.py
@@ -21,6 +21,14 @@ def client(redis_url):
def openai_key():
    return os.getenv("OPENAI_API_KEY")

@pytest.fixture
def openai_version():
    return os.getenv("OPENAI_API_VERSION")

@pytest.fixture
def azure_endpoint():
    return os.getenv("AZURE_OPENAI_ENDPOINT")

@pytest.fixture
def cohere_key():
    return os.getenv("COHERE_API_KEY")
113 changes: 111 additions & 2 deletions docs/user_guide/vectorizers_04.ipynb
@@ -64,7 +64,7 @@
},
{
"cell_type": "code",
-"execution_count": 2,
+"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -176,6 +176,115 @@
"print(\"Number of Embeddings:\", len(embeddings))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Azure OpenAI\n",
"\n",
"The ``AzureOpenAITextVectorizer`` is a variation of the OpenAI vectorizer that calls OpenAI models within Azure. If you've already installed ``openai``, then you're ready to use Azure OpenAI.\n",
"\n",
"The only practical difference between OpenAI and Azure OpenAI is the set of variables required to call the API: in addition to the API key, Azure OpenAI needs an API version and an endpoint."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# in addition to the API key, set up the API endpoint and version\n",
"api_version = os.environ.get(\"OPENAI_API_VERSION\") or getpass.getpass(\"Enter your AzureOpenAI API version: \")\n",
"azure_endpoint = os.environ.get(\"AZURE_OPENAI_ENDPOINT\") or getpass.getpass(\"Enter your AzureOpenAI API endpoint: \")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Vector dimensions: 1536\n"
]
},
{
"data": {
"text/plain": [
"[-0.0010088568087667227,\n",
" -0.003142790636047721,\n",
" 0.0024922797456383705,\n",
" -0.004522906616330147,\n",
" -0.010369433090090752,\n",
" 0.012739036232233047,\n",
" -0.005365503951907158,\n",
" -0.0029668458737432957,\n",
" -0.007141091860830784,\n",
" -0.03383301943540573]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from redisvl.utils.vectorize import AzureOpenAITextVectorizer\n",
"\n",
"# create a vectorizer\n",
"az_oai = AzureOpenAITextVectorizer(\n",
"    model=\"text-embedding-ada-002\", # Must be your custom deployment name\n",
"    api_config={\n",
"        \"api_key\": api_key,\n",
"        \"api_version\": api_version,\n",
"        \"azure_endpoint\": azure_endpoint\n",
"    },\n",
")\n",
"\n",
"test = az_oai.embed(\"This is a test sentence.\")\n",
"print(\"Vector dimensions: \", len(test))\n",
"test[:10]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[-0.017460526898503304,\n",
" -6.895032856846228e-05,\n",
" 0.0013909287517890334,\n",
" -0.025688467547297478,\n",
" -0.019813183695077896,\n",
" 0.016087085008621216,\n",
" -0.003729278687387705,\n",
" 0.0009211922879330814,\n",
" 0.006606514099985361,\n",
" -0.025128915905952454]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Just like OpenAI, AzureOpenAI supports batching embeddings and asynchronous requests.\n",
"sentences = [\n",
"    \"That is a happy dog\",\n",
"    \"That is a happy person\",\n",
"    \"Today is a sunny day\"\n",
"]\n",
"\n",
"embeddings = await az_oai.aembed_many(sentences)\n",
"embeddings[0][:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -547,7 +656,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.12"
+"version": "3.11.5"
},
"orig_nbformat": 4,
"vscode": {
2 changes: 2 additions & 0 deletions redisvl/utils/vectorize/__init__.py
@@ -1,4 +1,5 @@
from redisvl.utils.vectorize.base import BaseVectorizer
from redisvl.utils.vectorize.text.azureopenai import AzureOpenAITextVectorizer
from redisvl.utils.vectorize.text.cohere import CohereTextVectorizer
from redisvl.utils.vectorize.text.huggingface import HFTextVectorizer
from redisvl.utils.vectorize.text.openai import OpenAITextVectorizer
@@ -10,4 +11,5 @@
"HFTextVectorizer",
"OpenAITextVectorizer",
"VertexAITextVectorizer",
"AzureOpenAITextVectorizer",
]
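
The notebook cell in this commit calls `await az_oai.aembed_many(sentences)` at top level, which works in Jupyter but not in a plain Python script. A minimal sketch of the same call driven by `asyncio.run` follows. `FakeVectorizer` is a hypothetical offline stand-in that returns dummy vectors so the example runs without credentials; with real credentials, an `AzureOpenAITextVectorizer` instance would presumably be passed in its place.

```python
import asyncio
from typing import List


class FakeVectorizer:
    """Hypothetical stand-in exposing only `aembed_many`; not a redisvl class."""

    async def aembed_many(self, texts: List[str]) -> List[List[float]]:
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        # Dummy 2-dimensional vectors: text length, then a constant.
        return [[float(len(text)), 0.0] for text in texts]


async def embed_sentences(vectorizer, sentences: List[str]) -> List[List[float]]:
    # Same call as in the notebook cell, wrapped in a coroutine.
    return await vectorizer.aembed_many(sentences)


def run_demo() -> List[List[float]]:
    sentences = [
        "That is a happy dog",
        "That is a happy person",
        "Today is a sunny day",
    ]
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(embed_sentences(FakeVectorizer(), sentences))


if __name__ == "__main__":
    print("Number of Embeddings:", len(run_demo()))
```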