[FEAT]: Implement metadata-based filtering of documents for RAG #1858

Open
chewjh1234 opened this issue Jul 12, 2024 · 0 comments
Labels: enhancement (New feature or request), feature request


What would you like to see?

Currently, AnythingLLM's RAG system retrieves documents based solely on content similarity and offers no way to filter results by document metadata.
To improve retrieval relevance and support use cases that involve time-sensitive documents or specific metadata attributes, I propose adding metadata-based filtering to the RAG system.

Key benefits:

  • Enable time-based filtering for up-to-date information retrieval
  • Improve relevance by considering document attributes beyond content
  • Support use cases like summarization of specific document types or date ranges

Proposed functionality:

  1. Allow users to specify metadata filters when querying the RAG system
  2. Integrate metadata filtering with vector search to combine content and attribute-based relevance
  3. Support filtering on common metadata fields like date, document type, author, etc.
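
To make the proposal concrete, here is a minimal sketch of what a filtered query could look like. It is not AnythingLLM code: the `Chunk` class, the `filters` shape (plain value for equality, `gte`/`lte` for ranges), and `filtered_search` are all hypothetical names chosen for illustration, and a brute-force cosine scan stands in for a real vector database.

```python
from dataclasses import dataclass, field
from datetime import date
import math


@dataclass
class Chunk:
    # Hypothetical record: text, its embedding, and free-form metadata.
    text: str
    embedding: list
    metadata: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches(metadata, filters):
    """Return True when every filter condition holds for this metadata.

    A condition is either a plain value (equality) or a dict with optional
    "gte" / "lte" keys (inclusive range, e.g. for dates).
    """
    for key, cond in filters.items():
        if key not in metadata:
            return False
        value = metadata[key]
        if isinstance(cond, dict):
            if "gte" in cond and value < cond["gte"]:
                return False
            if "lte" in cond and value > cond["lte"]:
                return False
        elif value != cond:
            return False
    return True


def filtered_search(chunks, query_embedding, filters, top_k=3):
    """Drop chunks that fail the metadata filters, then rank by similarity."""
    candidates = [c for c in chunks if matches(c.metadata, filters)]
    ranked = sorted(
        candidates,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


chunks = [
    Chunk("Q1 2023 report", [1.0, 0.0], {"type": "report", "date": date(2023, 3, 31)}),
    Chunk("Q1 2024 report", [0.9, 0.1], {"type": "report", "date": date(2024, 3, 31)}),
    Chunk("2024 meeting memo", [1.0, 0.0], {"type": "memo", "date": date(2024, 5, 2)}),
]

results = filtered_search(
    chunks,
    query_embedding=[1.0, 0.0],
    filters={"type": "report", "date": {"gte": date(2024, 1, 1)}},
)
print([c.text for c in results])  # ['Q1 2024 report']
```

Note that the 2023 report and the memo are both closer to the query vector than the 2024 report, yet neither is returned. That is the behaviour I am asking for: the filter decides eligibility, and similarity only ranks what remains.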

I learnt that Weaviate leverages an inverted index alongside the vector index to create an allow-list of eligible candidates before performing the vector search. Pinecone also allows for metadata filtering to be applied before vector searching, which can improve query performance.
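
As a rough illustration of the allow-list idea, the sketch below builds an inverted index over metadata values, intersects the matching id sets, and only then scores the surviving ids. This is my own simplified reading of the approach, not Weaviate's or Pinecone's actual implementation: the function names are hypothetical, only equality filters are handled, and a linear scan replaces the real vector index.

```python
from collections import defaultdict
import math


def build_inverted_index(metadata_by_id):
    """Map each (field, value) pair to the set of document ids that carry it."""
    index = defaultdict(set)
    for doc_id, metadata in metadata_by_id.items():
        for field_name, value in metadata.items():
            index[(field_name, value)].add(doc_id)
    return index


def allow_list(index, filters):
    """Intersect the id sets of all equality filters; None means 'no filter'."""
    id_sets = [index.get((name, value), set()) for name, value in filters.items()]
    return set.intersection(*id_sets) if id_sets else None


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(vectors, index, query, filters, top_k=2):
    """Score only the ids on the allow-list (brute force stands in for ANN)."""
    allowed = allow_list(index, filters)
    ids = vectors.keys() if allowed is None else allowed
    ranked = sorted(ids, key=lambda i: cosine(vectors[i], query), reverse=True)
    return ranked[:top_k]


vectors = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
metadata_by_id = {
    "a": {"author": "alice", "type": "memo"},
    "b": {"author": "bob", "type": "report"},
    "c": {"author": "bob", "type": "report"},
}

index = build_inverted_index(metadata_by_id)
hits = search(vectors, index, query=[1.0, 0.0],
              filters={"author": "bob", "type": "report"})
print(hits)  # ['b', 'c']
```

Document "a" is the best vector match but never gets scored, because it is not on the allow-list. With a selective filter this also means fewer similarity computations, which is presumably where the performance benefit comes from.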

I would really appreciate this feature as it would significantly enhance AnythingLLM's RAG capabilities, allowing users to more effectively narrow down relevant information and improve the accuracy of AI-generated responses.
