[Feature Request]: Metadata Filters: having common tags or not #2487

Open
anyangml opened this issue Jul 10, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@anyangml

Describe the problem

I am trying to build a database for similarity search on inorganic materials. It would be very useful if there were a filter that lets users filter on elemental composition. For example, suppose vector A represents a material containing ["Si", "O", "Al"] and vector B represents a material containing ["Mg", "Ni", "O"]. The filter should support contains/excludes logic, so that a query could require "O" and exclude "Mg", matching A but not B.

This could also be useful more generally, e.g. for filtering text by topic.

Describe the proposed solution

Add contains/excludes logic to metadata filters.
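
To make the requested semantics concrete, here is a minimal sketch in plain Python. It is not an existing Chroma API: the function name `matches_elements` and its parameters are hypothetical, and it assumes "contains" means every listed element is present while "excludes" means none of the listed elements is present.

```python
from typing import Iterable


def matches_elements(
    elements: Iterable[str],
    contains: Iterable[str] = (),
    excludes: Iterable[str] = (),
) -> bool:
    """Hypothetical helper illustrating contains/excludes on a tag list.

    Assumption: `contains` requires ALL listed tags to be present;
    `excludes` requires NONE of the listed tags to be present.
    """
    present = set(elements)
    return set(contains) <= present and present.isdisjoint(excludes)


# The two materials from the problem description.
vector_a = ["Si", "O", "Al"]
vector_b = ["Mg", "Ni", "O"]

print(matches_elements(vector_a, contains=["O"], excludes=["Mg"]))  # True
print(matches_elements(vector_b, contains=["O"], excludes=["Mg"]))  # False
```

If "having common tags" should instead mean any overlap, the subset check could be swapped for `not present.isdisjoint(contains)`.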

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

@anyangml anyangml added the enhancement New feature or request label Jul 10, 2024
@HammadB
Collaborator

HammadB commented Jul 10, 2024

Can you clarify the shape of your metadata here? We do support contains via the where_document clause - https://docs.trychroma.com/guides#filtering-by-document-contents - does that work for your needs?

@anyangml
Author

> Can you clarify the shape of your metadata here? We do support contains via the where_document clause - https://docs.trychroma.com/guides#filtering-by-document-contents - does that work for your needs?

I might be wrong, but my understanding is that the document filter where_document only works for text. In my case, however, the vector comes from an encoder that converts a 3D structure into a 1D vector; there is no actual document to search within. Therefore, I don't think I can treat the 3D structure as "the document" in the case of a RAG task. A field in the metadata might look like this:

metadata = {
    "elements" : ["Mg", "Al", "O"], # List[str]
    "natoms": 16, # int
}
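
Until list-valued metadata can be filtered directly, one possible workaround is to flatten the list into one boolean field per element and combine equality checks. The sketch below only builds plain dictionaries. The `has_<element>` key scheme, the `vocabulary` argument, and both function names are hypothetical. It assumes metadata values are limited to scalars and that the `where` filter accepts the `$and` and `$eq` operators, and it has not been tested against a specific Chroma version.

```python
from typing import Any, Dict, Iterable, Optional


def flatten_elements(
    elements: Iterable[str], vocabulary: Iterable[str], natoms: int
) -> Dict[str, Any]:
    """Turn a list of elements into scalar boolean flags (hypothetical scheme).

    Every element in `vocabulary` gets an explicit True/False flag so that
    exclusion can be expressed as an equality check on False.
    """
    present = set(elements)
    flags: Dict[str, Any] = {f"has_{el}": (el in present) for el in vocabulary}
    flags["natoms"] = natoms
    return flags


def build_where(
    contains: Iterable[str] = (), excludes: Iterable[str] = ()
) -> Optional[Dict[str, Any]]:
    """Build a where-style filter dict from contains/excludes lists."""
    clauses = [{f"has_{el}": {"$eq": True}} for el in contains]
    clauses += [{f"has_{el}": {"$eq": False}} for el in excludes]
    if not clauses:
        return None  # no filtering requested
    if len(clauses) == 1:
        return clauses[0]  # avoid wrapping a single clause in $and
    return {"$and": clauses}


metadata = flatten_elements(["Mg", "Al", "O"], ["Mg", "Al", "O", "Si"], natoms=16)
where = build_where(contains=["O"], excludes=["Si"])
print(metadata)
print(where)
```

The resulting `where` dict would then be passed to the query call alongside the query embedding. The cost of this approach is one metadata key per element in the vocabulary, which is why native contains/excludes support would still be preferable.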
