But if I instantiate my existing BigQueryVectorStore without adding texts in the same environment, I cannot retrieve documents with filters.
Example:
Create a BigQueryVectorStore and use add_texts to add documents.
In a separate notebook, use:
from langchain_google_vertexai import VertexAIEmbeddings
from langchain_google_community import BigQueryVectorStore

embedding = VertexAIEmbeddings(
    model_name="textembedding-gecko@latest", project=PROJECT_ID
)
store = BigQueryVectorStore(
    project_id=PROJECT_ID,
    dataset_name=DATASET,
    table_name=TABLE,
    location=REGION,
    embedding=embedding,
)
docs = store.similarity_search_by_vector(query_vector, filter={"len": 6})
print(docs)
I get the error:
File c:\Users\geoff\OneDrive\Documents\GitHub\pcd-data-eu-genai-diorastra-orch\.venv\Lib\site-packages\langchain_google_community\bq_storage_vectorstores\_base.py:387, in BaseBigQueryVectorStore.similarity_search_by_vectors(self, embeddings, filter, k, with_scores, with_embeddings, **kwargs)
...
File C:/Users/XXXX/OneDrive/Documents/GitHub/XXXX/.venv/Lib/site-packages/langchain_google_community/bq_storage_vectorstores/bigquery.py:240
--> 240     if self.table_schema[column] in ["INTEGER", "FLOAT"]:  # type: ignore[index]
    241         filter_expressions.append(f"base.{column} = {value}")
    242     else:
TypeError: 'NoneType' object is not subscriptable
It looks as if calling add_texts populates the store's table_schema attribute with the BigQuery schema; if you do not add any texts, the schema stays None.
When loading an existing BQ vector store that already contains embedded documents, the table_schema variable is None.
But add_texts updates table_schema with the schema of the loaded documents.
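To make the failure mode concrete, here is a minimal sketch of the filter-building step that the traceback points at. The function name build_filter_expressions is hypothetical, the explicit None guard is an illustration and not the library's behavior, and the else branch (quoting non-numeric values) is an assumption, since the traceback cuts off before it. Without the guard, indexing a None schema raises the TypeError shown above.

```python
from typing import Any, Dict, List, Optional


def build_filter_expressions(
    table_schema: Optional[Dict[str, str]], filter: Dict[str, Any]
) -> List[str]:
    """Hypothetical stand-in for the filter step seen in the traceback."""
    if table_schema is None:
        # Illustrative guard: without it, table_schema[column] below raises
        # "TypeError: 'NoneType' object is not subscriptable".
        raise ValueError(
            "table_schema is not loaded; set it before searching with filters"
        )
    expressions: List[str] = []
    for column, value in filter.items():
        if table_schema[column] in ["INTEGER", "FLOAT"]:
            # Numeric columns are compared without quotes, as in the traceback.
            expressions.append(f"base.{column} = {value}")
        else:
            # Assumption: other column types are compared as quoted strings.
            expressions.append(f"base.{column} = '{value}'")
    return expressions
```

With a populated schema such as {"len": "INTEGER"}, the filter {"len": 6} yields a single numeric comparison; with a None schema the sketch fails early with a message that names the cause.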
Shouldn't BigQueryVectorStore fetch the schema of the current table when it is instantiated?
A workaround is to get the schema and set it manually: store.table_schema = {'doc_id': 'STRING', 'content': 'STRING', 'embedding': 'FLOAT', 'len': 'INTEGER'}
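Instead of typing the schema by hand, it could be read from the table itself. The sketch below is one possible way to do that with the google-cloud-bigquery client. The helper names schema_to_dict and load_table_schema are hypothetical, and it assumes that a flat column-name-to-type dictionary (as in the workaround) is the shape table_schema expects.

```python
from typing import Any, Dict, Iterable


def schema_to_dict(fields: Iterable[Any]) -> Dict[str, str]:
    """Map column name -> type name for objects exposing .name and .field_type."""
    return {field.name: field.field_type for field in fields}


def load_table_schema(project_id: str, dataset: str, table: str) -> Dict[str, str]:
    """Fetch the live schema of an existing table.

    Needs the google-cloud-bigquery package and valid credentials.
    """
    # Imported lazily so schema_to_dict stays usable without the dependency.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    bq_table = client.get_table(f"{project_id}.{dataset}.{table}")
    # bq_table.schema is a list of SchemaField objects.
    return schema_to_dict(bq_table.schema)
```

Usage would then be store.table_schema = load_table_schema(PROJECT_ID, DATASET, TABLE), run once after creating the store and before the first filtered search.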
Freezaa9 changed the title from "BigQueryVectorStore error when retrieving docs with filters" to "BigQueryVectorStore error when retrieving docs with filters (PR proposal)" on Aug 7, 2024
Hi,
When I follow this I can retrieve my documents with filters.
https://python.langchain.com/v0.2/docs/integrations/vectorstores/google_vertex_ai_vector_search/
But if I instantiate my existing BigQueryVectorStore without adding texts in the same environment, I cannot retrieve documents with filters.
Example:
Create a BigQueryVectorStore and use add_texts to add documents.
In a separate notebook, use:
I get the error:
It looks as if calling add_texts populates the store's table_schema attribute with the BigQuery schema; if you do not add any texts, the schema stays None.
The same behavior occurs when using the retriever.
Thanks for your help
UPDATE:
When loading an existing BQ vector store that already contains embedded documents, the table_schema variable is None.
But add_texts updates table_schema with the schema of the loaded documents.
Shouldn't BigQueryVectorStore fetch the schema of the current table when it is instantiated?
A workaround is to get the schema and set it manually:
store.table_schema = {'doc_id': 'STRING', 'content': 'STRING', 'embedding': 'FLOAT', 'len': 'INTEGER'}
Thanks again
Update:
PR: #429