community[patch]: VoyageAI embedding with input_type parameter (#5493)
* added input_type

* input_type set to "none" by default , then choose between document or query

* Updated input_type for Voyage embeddings

* Update voyage.ts

* Update voyageai.mdx

* Update voyageai.mdx

* Update voyage.ts

* Update voyageai.mdx

¯\_(ツ)_/¯

* Update voyage.int.test.ts

With document to make sure it works

* Format

* Update voyageai.mdx

is it the space ? (sorry...)

* Format

* Fix types

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
nicolas-geysse and jacoblee93 committed May 21, 2024
1 parent a4e3bc4 commit 9cbb841
Showing 3 changed files with 38 additions and 6 deletions.
9 changes: 8 additions & 1 deletion docs/core_docs/docs/integrations/text_embedding/voyageai.mdx
@@ -2,10 +2,17 @@

The `VoyageEmbeddings` class uses the Voyage AI REST API to generate embeddings for a given text.

The `inputType` parameter allows you to specify the type of input text for better embedding results. You can set it to `query`, `document`, or leave it undefined (which is equivalent to `None`).

- `query`: Use this for search or retrieval queries. Voyage AI will prepend a prompt to optimize the embeddings for query use cases.
- `document`: Use this for documents or content that you want to be retrievable. Voyage AI will prepend a prompt to optimize the embeddings for document use cases.
- `None` (default): The input text will be directly encoded without any additional prompt.

```typescript
import { VoyageEmbeddings } from "langchain/embeddings/voyage";
import { VoyageEmbeddings } from "@langchain/community/embeddings/voyage";

const embeddings = new VoyageEmbeddings({
apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.VOYAGEAI_API_KEY
inputType: "document", // Optional: specify input type as 'query', 'document', or omit for None / Undefined / Null
});
```
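The documentation above lists two named values (`query`, `document`) plus the unset default, while the parameter itself is typed as a plain `string`. As a minimal sketch of how a caller might guard against typos before constructing the embeddings object, the snippet below narrows an arbitrary string to those values. The `VoyageInputType` alias and the `parseInputType` helper are hypothetical names introduced for illustration; they are not part of `@langchain/community`.

```typescript
// Hypothetical helper for illustration only; not part of @langchain/community.
// The two literal values are the ones named in the documentation above.
type VoyageInputType = "query" | "document";

function parseInputType(value?: string | null): VoyageInputType | undefined {
  // undefined or null means "no input type", the default described in the docs.
  if (value === undefined || value === null) {
    return undefined;
  }
  // The literal comparisons narrow `value` to the union type.
  if (value === "query" || value === "document") {
    return value;
  }
  throw new Error(
    `Unsupported inputType "${value}"; expected "query" or "document".`
  );
}

// Usage: validate a setting before passing it on as `inputType`.
const fromConfig = parseInputType("document");
const unset = parseInputType(undefined);
console.log(fromConfig, unset);
```

A value checked this way can then be passed as the `inputType` field shown in the example above; an unrecognized string fails early in the caller instead of being forwarded to the API.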
19 changes: 14 additions & 5 deletions libs/langchain-community/src/embeddings/tests/voyage.int.test.ts
@@ -1,25 +1,26 @@
import { test, expect } from "@jest/globals";
import { VoyageEmbeddings } from "../voyage.js";

test.skip("Test VoyageEmbeddings.embedQuery", async () => {
const embeddings = new VoyageEmbeddings();
test.skip("Test VoyageEmbeddings.embedQuery with input_type", async () => {
const embeddings = new VoyageEmbeddings({ inputType: "document" });
const res = await embeddings.embedQuery("Hello world");

expect(typeof res[0]).toBe("number");
});

test.skip("Test VoyageEmbeddings.embedDocuments", async () => {
const embeddings = new VoyageEmbeddings();
test.skip("Test VoyageEmbeddings.embedDocuments with input_type", async () => {
const embeddings = new VoyageEmbeddings({ inputType: "document" });
const res = await embeddings.embedDocuments(["Hello world", "Bye bye"]);
expect(res).toHaveLength(2);
expect(typeof res[0][0]).toBe("number");
expect(typeof res[1][0]).toBe("number");
});

test.skip("Test VoyageEmbeddings concurrency", async () => {
test.skip("Test VoyageEmbeddings concurrency with input_type", async () => {
const embeddings = new VoyageEmbeddings({
batchSize: 1,
maxConcurrency: 2,
inputType: "document",
});
const res = await embeddings.embedDocuments([
"Hello world",
@@ -34,3 +35,11 @@ test.skip("Test VoyageEmbeddings concurrency", async () => {
undefined
);
});

test.skip("Test VoyageEmbeddings without input_type", async () => {
const embeddings = new VoyageEmbeddings();
const res = await embeddings.embedDocuments(["Hello world", "Bye bye"]);
expect(res).toHaveLength(2);
expect(typeof res[0][0]).toBe("number");
expect(typeof res[1][0]).toBe("number");
});
16 changes: 16 additions & 0 deletions libs/langchain-community/src/embeddings/voyage.ts
@@ -14,6 +14,11 @@ export interface VoyageEmbeddingsParams extends EmbeddingsParams {
* limited by the Voyage AI API to a maximum of 8.
*/
batchSize?: number;

/**
* Input type for the embeddings request.
*/
inputType?: string;
}

/**
@@ -32,6 +37,11 @@ export interface CreateVoyageEmbeddingRequest {
* @memberof CreateVoyageEmbeddingRequest
*/
input: string | string[];

/**
* Input type for the embeddings request.
*/
input_type?: string;
}

/**
@@ -53,6 +63,8 @@ export class VoyageEmbeddings

headers?: Record<string, string>;

inputType?: string;

/**
* Constructor for the VoyageEmbeddings class.
* @param fields - An optional object with properties to configure the instance.
@@ -61,6 +73,7 @@ export class VoyageEmbeddings
fields?: Partial<VoyageEmbeddingsParams> & {
verbose?: boolean;
apiKey?: string;
inputType?: string; // Make inputType optional
}
) {
const fieldsWithDefaults = { ...fields };
@@ -78,6 +91,7 @@ export class VoyageEmbeddings
this.batchSize = fieldsWithDefaults?.batchSize ?? this.batchSize;
this.apiKey = apiKey;
this.apiUrl = `${this.basePath}/embeddings`;
this.inputType = fieldsWithDefaults?.inputType;
}

/**
@@ -92,6 +106,7 @@ export class VoyageEmbeddings
this.embeddingWithRetry({
model: this.modelName,
input: batch,
input_type: this.inputType,
})
);

@@ -119,6 +134,7 @@ export class VoyageEmbeddings
const { data } = await this.embeddingWithRetry({
model: this.modelName,
input: text,
input_type: this.inputType,
});

return data[0].embedding;
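
Both call sites above pass `input_type: this.inputType` unconditionally, so the field is `undefined` whenever `inputType` was not configured. The sketch below shows what that means on the wire, assuming the request body is serialized with `JSON.stringify` (the serialization step is not part of this diff). The `EmbeddingRequestBody` interface and `serializeRequest` function are hypothetical stand-ins, and `"voyage-01"` is only a placeholder model name.

```typescript
// Hypothetical stand-in for the request shape shown in the diff.
interface EmbeddingRequestBody {
  model: string;
  input: string | string[];
  input_type?: string | undefined;
}

// Assumption: the body is serialized with JSON.stringify before sending.
function serializeRequest(body: EmbeddingRequestBody): string {
  return JSON.stringify(body);
}

const withType = serializeRequest({
  model: "voyage-01", // placeholder model name
  input: "Hello world",
  input_type: "document",
});

const withoutType = serializeRequest({
  model: "voyage-01",
  input: "Hello world",
  input_type: undefined, // mirrors an unset this.inputType
});

console.log(withType);
// {"model":"voyage-01","input":"Hello world","input_type":"document"}
console.log(withoutType);
// {"model":"voyage-01","input":"Hello world"}
```

`JSON.stringify` drops properties whose value is `undefined`, so under this assumption an unset `inputType` produces the same payload as before the change.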