[Feature Request]: add_documents should allow passing of embeddings #2404

Open
42degrees opened this issue Jun 23, 2024 · 1 comment
Labels
enhancement New feature or request

@42degrees

Describe the problem

I'm very new to all this, so maybe I'm missing something, but I can't figure out how to do what I want. The documentation shows that to create a whole vector database at once, including embeddings, I can call Chroma.from_documents and pass a set of embeddings for each document. I can also call collection.add and pass a set of embeddings. So why can't I call db.add_documents(documents, ids, embeddings)? It seems like a reasonable request, but the only way to call add_documents is to let Chroma control the call to the embedding function. In my situation I already have precalculated embeddings, but I am gathering them in batches, so I don't have everything at once to pass to from_documents (and I'm not 100% sure that would produce the same database for RAG use).

Describe the proposed solution

All methods of adding documents to Chroma should support the same ways of supplying embeddings.

Alternatives considered

I'm not sure whether db.add_documents actually adds to some default collection anyway, so maybe the solution is to get that default collection and then call collection.add()? I did some googling and found nothing about a "default" collection, or, if one exists, how I would retrieve it with get_collection().

Importance

would make my life easier

Additional Information

No response

@42degrees 42degrees added the enhancement New feature or request label Jun 23, 2024
@jeffchuber
Contributor

@42degrees add_documents is a LangChain API.

You can absolutely add data directly to Chroma with precomputed embeddings. Check out this part of the docs:

https://docs.trychroma.com/reference/py-collection#add
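
For the batch scenario described in the issue, a minimal sketch of that approach might look like the following. It assumes the `chromadb` Python package and its `collection.add(ids=..., documents=..., embeddings=..., metadatas=...)` method from the page linked above. The helper `add_precomputed_batch`, the collection name `my_docs`, and the sample vectors are hypothetical and exist only for illustration. How this collection relates to one created through LangChain's wrapper is not covered here.

```python
from typing import Any, Optional, Sequence


def add_precomputed_batch(
    collection: Any,
    ids: Sequence[str],
    documents: Sequence[str],
    embeddings: Sequence[Sequence[float]],
    metadatas: Optional[Sequence[dict]] = None,
) -> int:
    """Add one batch of documents whose embeddings were computed elsewhere.

    `collection` is expected to be a Chroma collection (anything exposing a
    compatible `add` method works). Returns the number of items submitted.
    """
    # Every id needs exactly one document and one embedding.
    if not (len(ids) == len(documents) == len(embeddings)):
        raise ValueError("ids, documents and embeddings must have the same length")
    if metadatas is not None and len(metadatas) != len(ids):
        raise ValueError("metadatas must have the same length as ids")

    # Because embeddings are supplied explicitly, the vectors are passed in
    # as-is instead of being computed from the documents.
    collection.add(
        ids=list(ids),
        documents=list(documents),
        embeddings=[[float(x) for x in vector] for vector in embeddings],
        metadatas=list(metadatas) if metadatas is not None else None,
    )
    return len(ids)


def demo() -> int:
    """Hypothetical usage: two batches added to an in-memory collection."""
    import chromadb  # imported lazily so the helper has no hard dependency

    client = chromadb.Client()  # in-memory client
    collection = client.get_or_create_collection("my_docs")  # hypothetical name

    # First batch, as it becomes available.
    add_precomputed_batch(
        collection,
        ids=["doc-1", "doc-2"],
        documents=["first text", "second text"],
        embeddings=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    )
    # A later batch goes into the same collection.
    add_precomputed_batch(
        collection,
        ids=["doc-3"],
        documents=["third text"],
        embeddings=[[0.7, 0.8, 0.9]],
    )
    return collection.count()
```

Calling `demo()` requires `chromadb` to be installed. All vectors added to one collection need the same dimensionality, and query-time embeddings should come from the same model that produced the stored ones.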
