docs[patch]: Update storage docs #6280

Merged
merged 1 commit into from
Jul 30, 2024
23 changes: 23 additions & 0 deletions docs/core_docs/docs/concepts.mdx
@@ -513,6 +513,29 @@ Retrievers accept a string query as input and return an array of `Document`s as

For specifics on how to use retrievers, see the [relevant how-to guides here](/docs/how_to/#retrievers).

### Key-value stores

For some techniques, such as [indexing and retrieval with multiple vectors per document](/docs/how_to/multi_vector/), having some sort of key-value (KV) storage is helpful.

LangChain includes a [`BaseStore`](https://api.js.langchain.com/classes/langchain_core_stores.BaseStore.html) interface,
which allows for storage of arbitrary data. However, LangChain components that require KV storage accept a
more specific `BaseStore<string, Uint8Array>` instance that stores binary data (referred to as a `ByteStore`), and they internally take care of
encoding and decoding data for their specific needs.

This means that as a user, you only need to think about one type of store rather than different ones for different types of data.
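
To make the encoding step concrete, here is a minimal sketch of the kind of conversion a component might perform before writing to a `ByteStore`. It is not LangChain's actual implementation: the helper names `encodeValue` and `decodeValue` are hypothetical, and the sketch assumes the values are JSON-serializable.

```typescript
// Hypothetical helpers (not part of LangChain): convert JSON-serializable
// values to and from the Uint8Array form that a byte store holds.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

function encodeValue(value: unknown): Uint8Array {
  // Serialize to a JSON string, then to UTF-8 bytes.
  return encoder.encode(JSON.stringify(value));
}

function decodeValue<T>(bytes: Uint8Array): T {
  // Reverse the steps: UTF-8 bytes to string, then parse the JSON.
  return JSON.parse(decoder.decode(bytes)) as T;
}

// Round trip: the same data comes back after passing through bytes.
const original = { pageContent: "hello", metadata: { source: "a.txt" } };
const restored = decodeValue<typeof original>(encodeValue(original));
console.log(restored.metadata.source); // prints "a.txt"
```

Because the store only ever sees `Uint8Array` values, the same store instance can back components that serialize their data in different ways.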

#### Interface

All [`BaseStores`](https://api.js.langchain.com/classes/langchain_core_stores.BaseStore.html) support the following interface, shown here for the `BaseStore<string, Uint8Array>` case. Note that the interface allows
for modifying **multiple** key-value pairs at once:

- `mget(keys: string[]): Promise<(undefined | Uint8Array)[]>`: get the contents of multiple keys, returning `undefined` for any key that does not exist
- `mset(keyValuePairs: [string, Uint8Array][]): Promise<void>`: set the contents of multiple keys
- `mdelete(keys: string[]): Promise<void>`: delete multiple keys
- `yieldKeys(prefix?: string): AsyncGenerator<string>`: yield all keys in the store, optionally filtering by a prefix

For key-value store implementations, see [this section](/docs/integrations/stores/).
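
As a rough illustration of the four methods, the sketch below implements them over a plain `Map`. The `MapByteStore` class and the `demo` function are hypothetical names used only for this example; the class does not extend LangChain's `BaseStore`, and real implementations may differ in details such as key ordering.

```typescript
// Hypothetical in-memory byte store, written only to illustrate the four
// methods listed above. It does not extend LangChain's BaseStore class.
class MapByteStore {
  private data = new Map<string, Uint8Array>();

  // Look up several keys at once; missing keys come back as undefined.
  async mget(keys: string[]): Promise<(Uint8Array | undefined)[]> {
    return keys.map((key) => this.data.get(key));
  }

  // Write several key-value pairs at once, overwriting existing keys.
  async mset(keyValuePairs: [string, Uint8Array][]): Promise<void> {
    for (const [key, value] of keyValuePairs) {
      this.data.set(key, value);
    }
  }

  // Remove several keys at once; unknown keys are ignored.
  async mdelete(keys: string[]): Promise<void> {
    for (const key of keys) {
      this.data.delete(key);
    }
  }

  // Yield every key, or only the keys that start with the given prefix.
  async *yieldKeys(prefix?: string): AsyncGenerator<string> {
    for (const key of this.data.keys()) {
      if (prefix === undefined || key.startsWith(prefix)) {
        yield key;
      }
    }
  }
}

async function demo(): Promise<string[]> {
  const store = new MapByteStore();
  const encoder = new TextEncoder();
  await store.mset([
    ["doc:1", encoder.encode("first")],
    ["doc:2", encoder.encode("second")],
    ["tmp:1", encoder.encode("scratch")],
  ]);
  await store.mdelete(["doc:2"]);

  // Collect only the keys under the "doc:" prefix.
  const keys: string[] = [];
  for await (const key of store.yieldKeys("doc:")) {
    keys.push(key);
  }
  return keys; // ["doc:1"]
}
```

Batching reads and writes in this way lets a store backed by a network service handle many keys in a single round trip, which is one plausible reason for the `m`-prefixed design.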

### Tools

<span data-heading-keywords="tool,tools"></span>
181 changes: 4 additions & 177 deletions docs/core_docs/docs/integrations/stores/index.mdx
@@ -2,183 +2,10 @@
sidebar_class_name: hidden
---

# Stores
# Key-value stores

Storing data in key value format is quick and efficient, and can be a powerful tool for LLM applications. The `BaseStore` class provides a simple interface for getting, setting, deleting and iterating over lists of key value pairs.
[Key-value stores](/docs/concepts/#key-value-stores) are used by other LangChain components to store and retrieve data.

The public API of `BaseStore` in LangChain JS offers four main methods:
import DocCardList from "@theme/DocCardList";

```typescript
abstract mget(keys: K[]): Promise<(V | undefined)[]>;

abstract mset(keyValuePairs: [K, V][]): Promise<void>;

abstract mdelete(keys: K[]): Promise<void>;

abstract yieldKeys(prefix?: string): AsyncGenerator<K | string>;
```

The `m` prefix stands for multiple, and indicates that these methods can be used to get, set and delete multiple key value pairs at once.
The `yieldKeys` method is a generator function that can be used to iterate over all keys in the store, or all keys with a given prefix.

It's that simple!

So far LangChain.js has two base integrations for `BaseStore`:

- [`InMemoryStore`](/docs/integrations/stores/in_memory)
- [`LocalFileStore`](/docs/integrations/stores/file_system) (Node.js only)

## Use Cases

### Chat history

If you're building web apps with chat, the `BaseStore` family of integrations can come in very handy for storing and retrieving chat history.

### Caching

The `BaseStore` family can be a useful alternative to our other caching integrations.
For example, the [`LocalFileStore`](/docs/integrations/stores/file_system) persists data through the file system. Because reads and writes go to the local disk, cached data can be accessed without a network round trip.

See the individual sections for deeper dives on specific storage providers.

## Reading Data

### In Memory

Reading data is simple with KV stores. Below is an example using the [`InMemoryStore`](/docs/integrations/stores/in_memory) and the `.mget()` method.
We'll also set our generic value type to `string` so we can have type safety setting our strings.

Import the [`InMemoryStore`](/docs/integrations/stores/in_memory) class.

```typescript
import { InMemoryStore } from "langchain/storage/in_memory";
```

Instantiate a new instance and pass `string` as our generic for the value type.

```typescript
const store = new InMemoryStore<string>();
```

Next we can call `.mset()` to write multiple values at once.

```typescript
const data: [string, string][] = [
  ["key1", "value1"],
  ["key2", "value2"],
];

await store.mset(data);
```

Finally, call the `.mget()` method to retrieve the values from our store.

```typescript
// Named `values` so it does not clash with the `data` array written above.
const values = await store.mget(["key1", "key2"]);

console.log(values);
/**
* ["value1", "value2"]
*/
```

### File System

When using the file system integration, we need to instantiate the store via the `fromPath` method. This is required because it needs to perform checks to ensure the directory exists and is readable and writable.
You must also point [`LocalFileStore`](/docs/integrations/stores/file_system) at a directory, because each entry is stored as a separate file in that directory.

```typescript
import { LocalFileStore } from "langchain/storage/file_system";
```

```typescript
const pathToStore = "./my-store-directory";
const store = await LocalFileStore.fromPath(pathToStore);
```

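Because the file system integration only supports binary data, values must be converted to `Uint8Array`. To do this we can define an encoder for initially setting our data, and a decoder for when we retrieve data.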

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
```

```typescript
const data: [string, Uint8Array][] = [
  ["key1", encoder.encode(new Date().toDateString())],
  ["key2", encoder.encode(new Date().toDateString())],
];

await store.mset(data);
```

```typescript
// Named `values` so it does not clash with the `data` array written above.
const values = await store.mget(["key1", "key2"]);

console.log(values.map((v) => decoder.decode(v)));
/**
* [ 'Wed Jan 03 2024', 'Wed Jan 03 2024' ]
*/
```

## Writing Data

### In Memory

Writing data is simple with KV stores. Below is an example using the [`InMemoryStore`](/docs/integrations/stores/in_memory) and the `.mset()` method.
We'll also set our generic value type to `Date` so we can have type safety setting our dates.

Import the [`InMemoryStore`](/docs/integrations/stores/in_memory) class.

```typescript
import { InMemoryStore } from "langchain/storage/in_memory";
```

Instantiate a new instance and pass `Date` as our generic for the value type.

```typescript
const store = new InMemoryStore<Date>();
```

Finally we can call `.mset()` to write multiple values at once.

```typescript
const data: [string, Date][] = [
  ["date1", new Date()],
  ["date2", new Date()],
];

await store.mset(data);
```

### File System

When using the file system integration, we need to instantiate the store via the `fromPath` method. This is required because it needs to perform checks to ensure the directory exists and is readable and writable.
You must also point [`LocalFileStore`](/docs/integrations/stores/file_system) at a directory, because each entry is stored as a separate file in that directory.

```typescript
import { LocalFileStore } from "langchain/storage/file_system";
```

```typescript
const pathToStore = "./my-store-directory";
const store = await LocalFileStore.fromPath(pathToStore);
```

When defining our data we must convert the values to `Uint8Array` because the file system integration only supports binary data.

To do this we can define an encoder for initially setting our data, and a decoder for when we retrieve data.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
```

```typescript
const data: [string, Uint8Array][] = [
  ["key1", encoder.encode(new Date().toDateString())],
  ["key2", encoder.encode(new Date().toDateString())],
];

await store.mset(data);
```
<DocCardList />
2 changes: 1 addition & 1 deletion docs/core_docs/sidebars.js
@@ -332,7 +332,7 @@ module.exports = {
},
{
type: "category",
label: "Stores",
label: "Key-value stores",
collapsed: true,
items: [
{