Commit
updated langchain issues
bracesproul committed Aug 12, 2024
1 parent c790746 commit 23f5218
Showing 29 changed files with 60 additions and 60 deletions.
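
Every hunk shown below makes the same substitution: in API reference URLs, the `langchain_` prefix of the entrypoint name becomes `langchain.`, while `langchain_core.` links are left alone. A minimal sketch of that rewrite follows; the helper name `rewriteApiLinks` and the regular expression are illustrative assumptions, and the commit does not say whether it was produced by a script.

```typescript
// Hypothetical helper, not part of the commit: rewrites API reference links
// from `.../langchain_<entrypoint>` to `.../langchain.<entrypoint>`.
// `langchain_core.` links are skipped, matching the hunks in this diff.
const API_LINK =
  /(api\.js\.langchain\.com\/(?:classes|modules|functions)\/langchain)_(?!core\.)/g;

function rewriteApiLinks(text: string): string {
  // "$1" keeps everything up to and including "langchain"; only "_" is swapped.
  return text.replace(API_LINK, "$1.");
}

const oldLink =
  "https://v02.api.js.langchain.com/classes/langchain_retrievers_multi_query.MultiQueryRetriever.html";
console.log(rewriteApiLinks(oldLink));
```

The pattern also matches the `v02.api.js.langchain.com` host because it anchors on the `api.js.langchain.com` suffix.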
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/document_loader_markdown.ipynb
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@
"- Basic usage;\n",
"- Parsing of Markdown into elements such as titles, list items, and text.\n",
"\n",
"LangChain implements an [UnstructuredLoader](https://v02.api.js.langchain.com/classes/langchain_document_loaders_fs_unstructured.UnstructuredLoader.html) class.\n",
"LangChain implements an [UnstructuredLoader](https://v02.api.js.langchain.com/classes/langchain.document_loaders_fs_unstructured.UnstructuredLoader.html) class.\n",
"\n",
":::info Prerequisites\n",
"\n",
4 changes: 2 additions & 2 deletions docs/core_docs/docs/how_to/ensemble_retriever.mdx
Original file line number Diff line number Diff line change
@@ -9,13 +9,13 @@ This guide assumes familiarity with the following concepts:

:::

The [EnsembleRetriever](https://api.js.langchain.com/classes/langchain_retrievers_ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://api.js.langchain.com/classes/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.
The [EnsembleRetriever](https://api.js.langchain.com/classes/langchain.retrievers_ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://api.js.langchain.com/classes/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.

By leveraging the strengths of different algorithms, the `EnsembleRetriever` can achieve better performance than any single algorithm.

One useful pattern is to combine a keyword matching retriever with a dense retriever (like embedding similarity), because their strengths are complementary. This can be considered a form of "hybrid search". The sparse retriever is good at finding relevant documents based on keywords, while the dense retriever is good at finding relevant documents based on semantic similarity.

Below we demonstrate ensembling of a [simple custom retriever](/docs/how_to/custom_retriever/) that simply returns documents that directly contain the input query with a retriever derived from a [demo, in-memory, vector store](https://api.js.langchain.com/classes/langchain_vectorstores_memory.MemoryVectorStore.html).
Below we demonstrate ensembling of a [simple custom retriever](/docs/how_to/custom_retriever/) that simply returns documents that directly contain the input query with a retriever derived from a [demo, in-memory, vector store](https://api.js.langchain.com/classes/langchain.vectorstores_memory.MemoryVectorStore.html).

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/retrievers/ensemble_retriever.ts";
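
The Reciprocal Rank Fusion step described in this hunk can be sketched as follows. This is an illustrative sketch only, not the `EnsembleRetriever` source: the function name, the per-retriever `weights` parameter, and the use of document ids instead of `Document` objects are assumptions. The constant `c = 60` is the value used in the linked paper.

```typescript
// Sketch of Reciprocal Rank Fusion: each ranking contributes
// weight / (c + rank) to a document's score, with rank starting at 1.
function reciprocalRankFusion(
  rankings: string[][], // one list of document ids per retriever, best first
  weights: number[] = rankings.map(() => 1), // assumed: equal weights by default
  c = 60,
): string[] {
  const scores = new Map<string, number>();
  rankings.forEach((ranking, i) => {
    ranking.forEach((id, position) => {
      const rank = position + 1;
      scores.set(id, (scores.get(id) ?? 0) + weights[i] / (c + rank));
    });
  });
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const keywordHits = ["a", "b", "c"]; // e.g. from a keyword retriever
const vectorHits = ["c", "a", "d"]; // e.g. from a dense retriever
console.log(reciprocalRankFusion([keywordHits, vectorHits]));
// → [ 'a', 'c', 'b', 'd' ]
```

Documents that appear near the top of both lists (`a`, `c`) outrank documents that only one retriever returned, which is the complementary behavior the text describes.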
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/multi_vector.mdx
Original file line number Diff line number Diff line change
@@ -11,7 +11,7 @@ This guide assumes familiarity with the following concepts:
:::

Embedding different representations of an original document, then returning the original document when any of the representations result in a search hit, can allow you to
tune and improve your retrieval performance. LangChain has a base [`MultiVectorRetriever`](https://v02.api.js.langchain.com/classes/langchain_retrievers_multi_vector.MultiVectorRetriever.html) designed to do just this!
tune and improve your retrieval performance. LangChain has a base [`MultiVectorRetriever`](https://v02.api.js.langchain.com/classes/langchain.retrievers_multi_vector.MultiVectorRetriever.html) designed to do just this!

A lot of the complexity lies in how to create the multiple vectors per document.
This guide covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/multiple_queries.ipynb
Original file line number Diff line number Diff line change
@@ -20,7 +20,7 @@
"But retrieval may produce different results with subtle changes in query wording or if the embeddings do not capture the semantics of the data well.\n",
"Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n",
"\n",
"The [`MultiQueryRetriever`](https://v02.api.js.langchain.com/classes/langchain_retrievers_multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query.\n",
"The [`MultiQueryRetriever`](https://v02.api.js.langchain.com/classes/langchain.retrievers_multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query.\n",
"For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents.\n",
"By generating multiple perspectives on the same question, the `MultiQueryRetriever` can help overcome some of the limitations of the distance-based retrieval and get a richer set of results.\n",
"\n",
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/parent_document_retriever.mdx
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ When splitting documents for retrieval, there are often conflicting desires:
1. You may want to have small documents, so that their embeddings can most accurately reflect their meaning. If documents are too long, then the embeddings can lose meaning.
2. You want to have long enough documents that the context of each chunk is retained.

The [`ParentDocumentRetriever`](https://v02.api.js.langchain.com/classes/langchain_retrievers_parent_document.ParentDocumentRetriever.html) strikes that balance by splitting and storing small chunks of data. During retrieval, it first fetches the small chunks but then looks up the parent ids for those chunks and returns those larger documents.
The [`ParentDocumentRetriever`](https://v02.api.js.langchain.com/classes/langchain.retrievers_parent_document.ParentDocumentRetriever.html) strikes that balance by splitting and storing small chunks of data. During retrieval, it first fetches the small chunks but then looks up the parent ids for those chunks and returns those larger documents.

Note that "parent document" refers to the document that a small chunk originated from. This can either be the whole raw document OR a larger chunk.
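
The two-stage lookup described above (match small chunks, then return their parents) can be sketched as follows. The types and the `parentsForHits` helper are hypothetical and stand in for the vector store and document store that the real `ParentDocumentRetriever` uses.

```typescript
// A small chunk that remembers which larger document it came from.
interface Chunk {
  text: string;
  parentId: string;
}

// Given the chunks that matched a query, return each parent document once,
// in the order its first matching chunk appeared.
function parentsForHits(hits: Chunk[], parents: Map<string, string>): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const hit of hits) {
    if (seen.has(hit.parentId)) continue; // parent already returned
    seen.add(hit.parentId);
    const parent = parents.get(hit.parentId);
    if (parent !== undefined) result.push(parent);
  }
  return result;
}

const parentStore = new Map([
  ["doc1", "full text of document one"],
  ["doc2", "full text of document two"],
]);
const chunkHits: Chunk[] = [
  { text: "document one, part 2", parentId: "doc1" },
  { text: "document two, part 1", parentId: "doc2" },
  { text: "document one, part 5", parentId: "doc1" },
];
console.log(parentsForHits(chunkHits, parentStore));
// → [ 'full text of document one', 'full text of document two' ]
```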

2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/reduce_retrieval_latency.mdx
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ This guide assumes familiarity with the following concepts:
:::

One way to reduce retrieval latency is through a technique called "Adaptive Retrieval".
The [`MatryoshkaRetriever`](https://v02.api.js.langchain.com/classes/langchain_retrievers_matryoshka_retriever.MatryoshkaRetriever.html) uses the
The [`MatryoshkaRetriever`](https://v02.api.js.langchain.com/classes/langchain.retrievers_matryoshka_retriever.MatryoshkaRetriever.html) uses the
Matryoshka Representation Learning (MRL) technique to retrieve documents for a given query in two steps:

- **First-pass**: Uses a lower dimensional sub-vector from the MRL embedding for an initial, fast,
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/sql_prompting.mdx
Original file line number Diff line number Diff line change
@@ -46,7 +46,7 @@ import DbCheck from "@examples/use_cases/sql/db_check.ts";
## Dialect-specific prompting

One of the simplest things we can do is make our prompt specific to the SQL dialect we're using.
When using the built-in [`createSqlQueryChain`](https://v02.api.js.langchain.com/functions/langchain_chains_sql_db.createSqlQueryChain.html) and [`SqlDatabase`](https://v02.api.js.langchain.com/classes/langchain_sql_db.SqlDatabase.html), this is handled for you for any of the following dialects:
When using the built-in [`createSqlQueryChain`](https://v02.api.js.langchain.com/functions/langchain.chains_sql_db.createSqlQueryChain.html) and [`SqlDatabase`](https://v02.api.js.langchain.com/classes/langchain.sql_db.SqlDatabase.html), this is handled for you for any of the following dialects:

import DialectExample from "@examples/use_cases/sql/prompting/list_dialects.ts";

2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/time_weighted_vectorstore.mdx
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ This guide assumes familiarity with the following concepts:

:::

This guide covers the [`TimeWeightedVectorStoreRetriever`](https://v02.api.js.langchain.com/classes/langchain_retrievers_time_weighted.TimeWeightedVectorStoreRetriever.html),
This guide covers the [`TimeWeightedVectorStoreRetriever`](https://v02.api.js.langchain.com/classes/langchain.retrievers_time_weighted.TimeWeightedVectorStoreRetriever.html),
which uses a combination of semantic similarity and a time decay.

The algorithm for scoring them is:
2 changes: 1 addition & 1 deletion docs/core_docs/docs/how_to/vectorstores.mdx
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ vectors, and then at query time to embed the unstructured query and retrieve the
'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search
for you.

This walkthrough uses a basic, unoptimized implementation called [`MemoryVectorStore`](https://v02.api.js.langchain.com/classes/langchain_vectorstores_memory.MemoryVectorStore.html) that stores embeddings in-memory and does an exact, linear search for the most similar embeddings.
This walkthrough uses a basic, unoptimized implementation called [`MemoryVectorStore`](https://v02.api.js.langchain.com/classes/langchain.vectorstores_memory.MemoryVectorStore.html) that stores embeddings in-memory and does an exact, linear search for the most similar embeddings.
LangChain contains many built-in integrations - see [this section](/docs/how_to/vectorstores/#which-one-to-pick) for more, or the [full list of integrations](/docs/integrations/vectorstores/).
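
An "exact, linear search" over in-memory embeddings can be sketched as below. This is not the `MemoryVectorStore` source: the `StoredEmbedding` type and `linearSearch` function are hypothetical, and cosine similarity is assumed as the similarity measure.

```typescript
interface StoredEmbedding {
  id: string;
  vector: number[];
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Score every stored embedding against the query (O(n) per search)
// and return the ids of the k most similar ones.
function linearSearch(query: number[], stored: StoredEmbedding[], k: number): string[] {
  return stored
    .map((entry) => ({ id: entry.id, score: cosine(query, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.id);
}

const index: StoredEmbedding[] = [
  { id: "x", vector: [1, 0] },
  { id: "y", vector: [0, 1] },
  { id: "z", vector: [1, 1] },
];
console.log(linearSearch([1, 0.1], index, 2));
// → [ 'x', 'z' ]
```

Because every stored vector is compared on every query, results are exact, but cost grows linearly with the number of embeddings, which is why the text calls this implementation unoptimized.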

## Creating a new index
Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@
"\n",
"```\n",
"\n",
"This notebook provides a quick overview for getting started with `DirectoryLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `DirectoryLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_document_loaders_fs_directory.DirectoryLoader.html).\n",
"This notebook provides a quick overview for getting started with `DirectoryLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `DirectoryLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.document_loaders_fs_directory.DirectoryLoader.html).\n",
"\n",
"This example goes over how to load data from folders with multiple files. The second argument is a map of file extensions to loader factories. Each file will be passed to the matching loader, and the resulting documents will be concatenated together.\n",
"\n",
@@ -45,7 +45,7 @@
"\n",
"| Class | Package | Compatibility | Local | PY support | \n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [DirectoryLoader](https://api.js.langchain.com/classes/langchain_document_loaders_fs_directory.DirectoryLoader.html) | [langchain](https://api.js.langchain.com/modules/langchain_document_loaders_fs_directory.html) | Node-only | βœ… | βœ… |\n",
"| [DirectoryLoader](https://api.js.langchain.com/classes/langchain.document_loaders_fs_directory.DirectoryLoader.html) | [langchain](https://api.js.langchain.com/modules/langchain.document_loaders_fs_directory.html) | Node-only | βœ… | βœ… |\n",
"\n",
"## Setup\n",
"\n",
@@ -160,7 +160,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all DirectoryLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_document_loaders_fs_directory.DirectoryLoader.html"
"For detailed documentation of all DirectoryLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain.document_loaders_fs_directory.DirectoryLoader.html"
]
},
{
Original file line number Diff line number Diff line change
@@ -26,14 +26,14 @@
"\n",
"```\n",
"\n",
"This notebook provides a quick overview for getting started with `TextLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `TextLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_document_loaders_fs_text.TextLoader.html).\n",
"This notebook provides a quick overview for getting started with `TextLoader` [document loaders](/docs/concepts/#document-loaders). For detailed documentation of all `TextLoader` features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.document_loaders_fs_text.TextLoader.html).\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"| Class | Package | Compatibility | Local | PY support | \n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [TextLoader](https://api.js.langchain.com/classes/langchain_document_loaders_fs_text.TextLoader.html) | [langchain](https://api.js.langchain.com/modules/langchain_document_loaders_fs_text.html) | Node-only | βœ… | ❌ |\n",
"| [TextLoader](https://api.js.langchain.com/classes/langchain.document_loaders_fs_text.TextLoader.html) | [langchain](https://api.js.langchain.com/modules/langchain.document_loaders_fs_text.html) | Node-only | βœ… | ❌ |\n",
"\n",
"## Setup\n",
"\n",
@@ -132,7 +132,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all TextLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_document_loaders_fs_text.TextLoader.html"
"For detailed documentation of all TextLoader features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain.document_loaders_fs_text.TextLoader.html"
]
},
{
Original file line number Diff line number Diff line change
@@ -158,7 +158,7 @@
"source": [
"## Directories\n",
"\n",
"You can also load all of the files in the directory using [`UnstructuredDirectoryLoader`](https://v02.api.js.langchain.com/classes/langchain_document_loaders_fs_unstructured.UnstructuredDirectoryLoader.html), which inherits from [`DirectoryLoader`](/docs/integrations/document_loaders/file_loaders/directory):\n"
"You can also load all of the files in the directory using [`UnstructuredDirectoryLoader`](https://v02.api.js.langchain.com/classes/langchain.document_loaders_fs_unstructured.UnstructuredDirectoryLoader.html), which inherits from [`DirectoryLoader`](/docs/integrations/document_loaders/file_loaders/directory):\n"
]
},
{
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Sitemap Loader

This notebook goes over how to use the [`SitemapLoader`](https://v02.api.js.langchain.com/classes/langchain_document_loaders_web_sitemap.SitemapLoader.html) class to load sitemaps into `Document`s.
This notebook goes over how to use the [`SitemapLoader`](https://v02.api.js.langchain.com/classes/langchain.document_loaders_web_sitemap.SitemapLoader.html) class to load sitemaps into `Document`s.

## Setup

Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@
"source": [
"# Chroma\n",
"\n",
"This guide will help you getting started with such a retriever backed by a [Chroma vector store](/docs/integrations/vectorstores/chroma). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html).\n",
"This guide will help you getting started with such a retriever backed by a [Chroma vector store](/docs/integrations/vectorstores/chroma). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html).\n",
"\n",
"## Overview\n",
"\n",
@@ -384,7 +384,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all Chroma self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html)."
"For detailed documentation of all Chroma self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html)."
]
}
],
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@
"source": [
"# HNSWLib\n",
"\n",
"This guide will help you getting started with such a retriever backed by a [HNSWLib vector store](/docs/integrations/vectorstores/hnswlib). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html).\n",
"This guide will help you getting started with such a retriever backed by a [HNSWLib vector store](/docs/integrations/vectorstores/hnswlib). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html).\n",
"\n",
"## Overview\n",
"\n",
@@ -378,7 +378,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all HNSWLib self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html)."
"For detailed documentation of all HNSWLib self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html)."
]
}
],
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@
"source": [
"# In-memory\n",
"\n",
"This guide will help you getting started with such a retriever backed by an [in-memory vector store](/docs/integrations/vectorstores/memory). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html).\n",
"This guide will help you getting started with such a retriever backed by an [in-memory vector store](/docs/integrations/vectorstores/memory). For detailed documentation of all features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html).\n",
"\n",
"## Overview\n",
"\n",
@@ -33,7 +33,7 @@
"\n",
"| Backing vector store | Self-host | Cloud offering | Package | Py support |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"[`MemoryVectorStore`](https://api.js.langchain.com/classes/langchain_vectorstores_memory.MemoryVectorStore.html) | βœ… | ❌ | [`langchain`](https://www.npmjs.com/package/langchain) | ❌ |\n",
"[`MemoryVectorStore`](https://api.js.langchain.com/classes/langchain.vectorstores_memory.MemoryVectorStore.html) | βœ… | ❌ | [`langchain`](https://www.npmjs.com/package/langchain) | ❌ |\n",
"\n",
"## Setup\n",
"\n",
@@ -378,7 +378,7 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all in-memory self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain_retrievers_self_query.SelfQueryRetriever.html)."
"For detailed documentation of all in-memory self-query retriever features and configurations head to the [API reference](https://api.js.langchain.com/classes/langchain.retrievers_self_query.SelfQueryRetriever.html)."
]
}
],