[Bug]: Error when loading chromadb collection from a docker container #2522

Open
SHUCHISMIT12 opened this issue Jul 16, 2024 · 4 comments

@SHUCHISMIT12

What happened?

I've created a Docker container that holds my chromadb folder. The function works when I test the container locally, but when deployed on AWS Lambda it throws an error:

"errorMessage": "attempt to write a readonly database",
"errorType": "OperationalError",
"requestId": "43522161-5c70-4a9c-bc71-b8b828d1bb5d",

This is the line where I get the error:

client = chromadb.PersistentClient(path=os.getcwd() + "/chroma_db")

In the Dockerfile I have copied the folder to the working directory:

COPY chroma_db ${LAMBDA_TASK_ROOT}/chroma_db

Please suggest a fix.

Versions

Chromadb 0.5.0
Python 3.11

Relevant log output

"errorMessage": "attempt to write a readonly database",
  "errorType": "OperationalError",
  "requestId": "43522161-5c70-4a9c-bc71-b8b828d1bb5d",
@SHUCHISMIT12 SHUCHISMIT12 added the bug Something isn't working label Jul 16, 2024
@tazarov
Contributor

tazarov commented Jul 16, 2024

@SHUCHISMIT12, I suspect a read-only filesystem is the source of your troubles. We generally do not recommend running the core Chroma package in serverless functions because of cold starts and other issues such as read-only filesystems (as you've encountered).
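
A minimal sketch of one possible workaround for the read-only filesystem, not something confirmed in this thread: on AWS Lambda the deployed code directory is read-only and /tmp is the writable scratch space, so the bundled index can be copied to /tmp before opening it. The helper names `ensure_writable_copy` and `get_client` and the `/tmp/chroma_db` destination are assumptions made for illustration; only `chromadb.PersistentClient(path=...)` comes from the issue.

```python
import os
import shutil

# Assumption: the image bundles the index at <task root>/chroma_db, matching
# the COPY line in the issue. /tmp is the writable path inside Lambda.
BUNDLED_DB = os.path.join(os.environ.get("LAMBDA_TASK_ROOT", os.getcwd()), "chroma_db")
WRITABLE_DB = "/tmp/chroma_db"


def ensure_writable_copy(src: str = BUNDLED_DB, dst: str = WRITABLE_DB) -> str:
    """Copy the bundled index to a writable location if it is not there yet."""
    if not os.path.isdir(dst):
        shutil.copytree(src, dst)
    return dst


def get_client(src: str = BUNDLED_DB, dst: str = WRITABLE_DB):
    """Open a PersistentClient on the writable copy (hypothetical helper)."""
    import chromadb  # imported lazily so the copy step can run without it

    return chromadb.PersistentClient(path=ensure_writable_copy(src, dst))
```

The copy runs once per container and is reused while the container stays warm, so it adds to cold-start time. Anything written to the copy is lost when the container is recycled, which makes this suitable only for read-mostly use. If the bundled files carry read-only permission bits, the copy may also need a chmod.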

@SHUCHISMIT12
Author

@tazarov, thank you for your response. Unfortunately, I only have a serverless option for deployment this time. Do you suggest reading the chromadb collection from an S3 bucket into the Lambda function?

@tazarov
Contributor

tazarov commented Jul 16, 2024

If your use case is read-only Chroma, mounting an S3 bucket and using that might work.
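
A rough sketch of a download-based variant of the S3 idea (a swap for mounting, which is what the comment above actually suggests): fetch the index files from a bucket into /tmp, Lambda's writable scratch space, then open them locally. The helper names, the bucket and prefix arguments, and the `/tmp/chroma_db` destination are all hypothetical. The S3 calls used are boto3's `list_objects_v2` paginator and `download_file`.

```python
import os


def download_prefix(s3_client, bucket: str, prefix: str, dest_dir: str) -> int:
    """Download every object under `prefix` into `dest_dir`; return the file count."""
    count = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            rel = key[len(prefix):].lstrip("/")
            if not rel or key.endswith("/"):  # skip folder placeholder objects
                continue
            target = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            s3_client.download_file(bucket, key, target)
            count += 1
    return count


def load_index(bucket: str, prefix: str, dest_dir: str = "/tmp/chroma_db"):
    """Hypothetical helper: download once per container, then open the client."""
    import boto3  # imported lazily; both packages are assumed to be installed
    import chromadb

    if not os.path.isdir(dest_dir):
        download_prefix(boto3.client("s3"), bucket, prefix, dest_dir)
    return chromadb.PersistentClient(path=dest_dir)
```

Passing the S3 client in as an argument keeps `download_prefix` easy to exercise with a stub. The download happens on every cold start, so a large index adds directly to cold-start latency.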

When you say "only serverless option", do you exclude the possibility of having an EC2 that runs the actual Chroma server?

@SHUCHISMIT12
Author

@tazarov, I'm currently working on a pilot project within my organisation. Given the project's limited initial scale, it's hard for me to justify a separate instance solely for hosting the index. Also, hibernating the instance after each query would hurt the user experience.
