[Bug]: chromadb 0.5.4 crashes on windows #2513

Open
petacube opened this issue Jul 13, 2024 · 38 comments
Labels
bug Something isn't working

Comments

@petacube

What happened?

Running the collection.add function crashes after 100 documents are inserted.

Versions

chromadb 0.5.4, Python 3.9

Relevant log output

No response

@petacube added the bug label Jul 13, 2024
@petacube
Author

Rolling back to the 0.5.0 release of chromadb resolves the issue.
Please explain what is causing the crash.

@HammadB
Collaborator

HammadB commented Jul 13, 2024

Do you have a stack trace or any output?

@petacube
Author

It crashes silently. The whole Python process dies; no exception is even thrown.
I can try testing on Linux tomorrow to see if I can replicate the crash, and run systrace to see if a core dump can be captured.
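Editor's note: when a native extension kills the interpreter without raising a Python exception, the standard-library faulthandler module can sometimes still record where the process died. The sketch below is only one way to set that up. The helper name `enable_crash_log` and the log file name are hypothetical, and writing to a file rather than stderr is an assumption made so that the output survives in environments that swallow stderr.

```python
import faulthandler


def enable_crash_log(path="chroma_crash.log"):
    """Ask faulthandler to dump Python tracebacks to `path` on a fatal error.

    The file object is returned and must stay open for as long as the
    handler is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    crash_log = enable_crash_log()
    # Run the failing code here, e.g. the collection.add call from the report.
    # If the process dies, inspect chroma_crash.log afterwards.
```

Call this before the first collection.add. If the crash happens inside compiled code, the log should at least show which Python line was executing at the time.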

@tazarov
Contributor

tazarov commented Jul 13, 2024

similar/same issue reported in discord - https://discord.com/channels/1073293645303795742/1261229903383236720

Windows Fatal Exception: Access Violation

image

@tazarov
Contributor

tazarov commented Jul 13, 2024

@petacube, unable to reproduce on GH windows-latest

Here's the test code - https://github.com/amikos-tech/chrm-2513-exp/blob/main/test_import.py

With the following workflow - https://github.com/amikos-tech/chrm-2513-exp/actions/runs/9919966674/workflow

Conda env with Python 3.9 and Chroma 0.5.4

I tried adding things in bulk and separately. I also intentionally have high dimensional vectors (4096).

Let me know if you encounter the error in a similar setting.

@HammadB
Collaborator

HammadB commented Jul 13, 2024

Hmm, I wonder if this is due to a chroma-hnswlib version mismatch. Can you run pip show chroma-hnswlib? It should be 0.7.5 for chroma 0.5.4

@petacube
Author

My version of chroma-hnswlib is 0.7.3.
Shouldn't a dependency like this be handled at the chromadb level?

@HammadB
Collaborator

HammadB commented Jul 14, 2024

'chroma-hnswlib==0.7.5',

It is pinned here. I am not sure how you updated, but maybe something went wrong. Can you upgrade the dependency and try again?
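Editor's note: the mismatch discussed above (chromadb 0.5.4 pins chroma-hnswlib 0.7.5, while 0.7.3 was installed) can also be checked from Python using importlib.metadata. This is a minimal sketch, not part of Chroma. The helper names `pinned_version` and `installed_version` are hypothetical, and the parsing only handles simple `==` pins.

```python
import re
from importlib import metadata


def _normalize(name):
    # Compare distribution names case-insensitively, treating -, _ and . alike.
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_version(requirements, dep_name):
    """Return the '==' pin for dep_name found in requirement strings, or None."""
    wanted = _normalize(dep_name)
    for req in requirements or []:
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if not match or _normalize(match.group(1)) != wanted:
            continue
        pin = re.search(r"==\s*([A-Za-z0-9.]+)", req)
        return pin.group(1) if pin else None
    return None


def installed_version(dist_name):
    """Return the installed version of dist_name, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    if installed_version("chromadb") is not None:
        expected = pinned_version(metadata.requires("chromadb"), "chroma-hnswlib")
        actual = installed_version("chroma-hnswlib")
        print(f"pinned by chromadb: {expected}, installed: {actual}")
```

If the two printed versions differ, the environment does not match what chromadb declares.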

@petacube
Author

I ran pip install --upgrade chromadb==0.5.4, so perhaps that does not upgrade dependencies?

@kaixxx

kaixxx commented Jul 16, 2024

I had the same issue: Silent crash after updating to chromadb 0.5.4 on Windows EVEN WITH chroma-hnswlib vers. 0.7.5

I moved back to chromadb 0.5.0 and chroma-hnswlib 0.7.3 and everything is working like before.

@tazarov
Contributor

tazarov commented Jul 16, 2024

@kaixxx, can you confirm whether you were using anaconda? A user in Discord reported that the problems were resolved when he switched from anaconda to pip.

On a related note: if your environment rebuilds chroma-hnswlib, that can be the culprit. Can you let me know what Python version and CPU architecture you have? We have prebuilt wheels for amd64 only on Windows (py39-py312).
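Editor's note: the details requested here (Python version and CPU architecture) can be collected with the standard library alone. The following is a small sketch. The function name `describe_runtime` and the choice of fields are assumptions made for illustration.

```python
import platform
import struct
import sys


def describe_runtime():
    """Collect interpreter and machine details relevant to wheel selection."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "machine": platform.machine(),  # e.g. AMD64 on 64-bit Windows
        "pointer_bits": struct.calcsize("P") * 8,  # 64 for a 64-bit interpreter
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Pasting this output into a report makes it easier to tell whether a prebuilt wheel should have been used or a local build was attempted.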

@kaixxx

kaixxx commented Jul 16, 2024

Thanks for looking into this. Here is some additional info:

  • I am using anaconda to manage my environments. However, I do not install any packages from Anaconda but use pip for everything.
  • hnswlib: After reading the above message about the possible problem with version 0.7.3, I checked that I had 0.7.5 installed. Chromadb still crashed.
  • I am using Python 3.10.13
  • CPU: AMD Ryzen 7 6800U

@tazarov
Contributor

tazarov commented Jul 16, 2024

@kaixxx, in your venv can you run the following code with python:

import hnswlib
import numpy as np

index = hnswlib.Index(space="l2", dim=1024)
index.init_index(max_elements=1000, ef_construction=100, M=16)
vectors = np.random.randn(1000, 1024).astype(np.float32)
index.add_items(vectors,ids=np.arange(1000))

Let me know if this crashes

@kaixxx

kaixxx commented Jul 16, 2024

Yes, it seems to crash.
I've created a new environment, installed chromadb (0.5.4 with chroma-hnswlib 0.7.5).
Then I've added the line print('finished') to the end of your script. This line is never reached. The script exits silently without any error message.
In my other environment with chromadb 0.5.0, the script runs fine and prints 'finished'.

@kaixxx

kaixxx commented Jul 16, 2024

Another test: I've now downgraded to chroma-hnswlib 0.7.3 but kept chromadb 0.5.4 and your script runs fine!

@tazarov
Contributor

tazarov commented Jul 16, 2024

@kaixxx thanks for confirming. Can you add debug prints like this to identify whether it fails in the init of the index or when adding vectors:

import hnswlib
import numpy as np

index = hnswlib.Index(space="l2", dim=1024)
print("New index - ok")
index.init_index(max_elements=1000, ef_construction=100, M=16)
print("Init index - ok")
vectors = np.random.randn(1000, 1024).astype(np.float32)
index.add_items(vectors,ids=np.arange(1000))
print("All good")

@kaixxx

kaixxx commented Jul 16, 2024

Yes, output:

New index - ok
Init index - ok

(no "All good")
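Editor's note: because the script exits without any error message, one option is to run the reproduction in a child process and inspect its exit status from the parent. The sketch below assumes nothing about Chroma itself. `run_isolated` and `describe_exit` are hypothetical helper names, and the Windows status value 0xC0000005 (access violation) is included only as a common case to recognize.

```python
import subprocess
import sys

# Windows NTSTATUS for an access violation, as seen in a process exit code.
WINDOWS_ACCESS_VIOLATION = 0xC0000005


def run_isolated(code, timeout=300):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr) so a silent crash in the child
    becomes visible to the parent instead of killing it.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def describe_exit(returncode):
    """Turn a child exit status into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:  # POSIX: the child was terminated by a signal
        return f"killed by signal {-returncode}"
    if returncode == WINDOWS_ACCESS_VIOLATION:
        return "access violation (0xC0000005)"
    return f"exit code {returncode}"


if __name__ == "__main__":
    rc, out, err = run_isolated("print('finished')")
    print(describe_exit(rc), out.strip(), err.strip())
```

Passing the hnswlib test script from this thread as the `code` string would show whether 'finished' is printed and what status the child returned.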

@tazarov
Contributor

tazarov commented Jul 16, 2024

@kaixxx, fantastic. Thank you for following up. 0.7.5 adds this change to add_items functionality - chroma-core/hnswlib@408c5d1?diff=split&w=0#diff-ab27cbb27975c68cb0c6da824871058623f7f76a761c3c8365ef2e1395cf7cd9R1706-R1708

Can I ask you to rebuild the HNSW lib locally (if you have the necessary deps):

pip install --no-binary :all: chroma-hnswlib==0.7.5

@kaixxx

kaixxx commented Jul 17, 2024

Hey @tazarov, I've tried to build it but it results in an error from the linker that a certain file could not be opened. It may be that my build environment is not set up properly, but I don't have the time to dig into that. Is there anything else I can do?

@dddxst

dddxst commented Jul 18, 2024

When the documents are long enough, the bug occurs on inserting the 100th one, whether you insert data one by one or all at once.

@atroyn
Contributor

atroyn commented Jul 19, 2024

Reproduced for python 3.12 and 3.10 on our windows machine (though this does not show up in CI, we should figure out why - perhaps the number of embeddings we insert in CI is not large enough to trigger this).

@HammadB and I are looking into it.

@HammadB
Collaborator

HammadB commented Jul 19, 2024

I have confirmed that running with --no-binary (building from source) fixes this as a workaround. This points to an issue in the wheel build. Investigating further.
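Editor's note: the source-build workaround can be scripted so that it runs against the same interpreter the application uses. This is a sketch under assumptions. It limits --no-binary to chroma-hnswlib rather than using :all: as in the earlier comment, and it adds --force-reinstall, --no-cache-dir and --no-deps so that an already installed wheel is replaced without touching other packages. The helper name `source_build_command` is hypothetical, and a working C++ toolchain is still required.

```python
import subprocess
import sys


def source_build_command(package="chroma-hnswlib", version="0.7.5"):
    """Build the pip command that reinstalls `package` from source."""
    return [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall",      # replace the copy that is already installed
        "--no-cache-dir",         # do not reuse a previously cached wheel
        "--no-deps",              # leave numpy and other dependencies alone
        "--no-binary", package,   # forbid prebuilt wheels for this package only
        f"{package}=={version}",
    ]


if __name__ == "__main__":
    command = source_build_command()
    print("Would run:", " ".join(command))
    # Uncomment to actually rebuild (needs a compiler on the machine):
    # subprocess.run(command, check=True)
```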

@HammadB
Collaborator

HammadB commented Jul 19, 2024

It seems the Windows wheels were built with AVX/SSE enabled whenever the runners they were compiled on supported it. I guess that for 0.7.3 the runner just happened not to have AVX/SSE, but now it does. I have pushed an alpha release, 0.7.6.alpha1.

@dddxst and @kaixxx and @petacube can you pip install chroma-hnswlib==0.7.6a1 and let me know if that fixes your issue? If so, I can issue a main release. Thanks.

@kaixxx

kaixxx commented Jul 19, 2024

Thanks!
I've tested chroma-hnswlib 0.7.6a1 with the above script and it still crashes, unfortunately. Exactly the same behavior as described in #2513 (comment)

@atroyn
Contributor

atroyn commented Jul 19, 2024

Have reproduced the 0.7.6a1 failure on our windows machine. The next step is to put a debugger on the cpp code itself. This will be a bit hairy but will coordinate with @HammadB to ship a fix.

@EricBLivingston

I had the same problem with 0.5.5 and downgrading to 0.5.3/0.7.3 has solved it for now!

@HammadB
Collaborator

HammadB commented Jul 23, 2024

@EricBLivingston what version of python are you on?

@EricBLivingston

> @EricBLivingston what version of python are you on?

Version 3.11.9

@tazarov
Contributor

tazarov commented Jul 25, 2024

It would appear that the issue exists on hnswlib 0.7.3 too (Windows 10, AMD Ryzen 5) - https://discord.com/channels/1073293645303795742/1265778818422145149

@atroyn
Contributor

atroyn commented Jul 25, 2024

> It would appear that the issue exists on hnswlib 0.7.3 too (Windows 10, AMD Ryzen 5) - https://discord.com/channels/1073293645303795742/1265778818422145149

@tazarov can you please post a summary of this long conversation here for easy reference? There is a lot going on and it's unclear to me what the issue is. Which python version is the user on?

@tazarov
Contributor

tazarov commented Jul 25, 2024

@atroyn, the user has the following config:

Windows 10
AMD Ryzen 5 3600xt
Python 3.12
Running in local jupyter notebook

Versions where they manage to reproduce the bug: 0.5.3 and 0.5.4

They had a build chain (MSVC) and tried to build from source, but encountered a build error (something related to ninja). Attaching the build failure here.
chroma-hnswlib==0.7.3-build-failure.txt

@atroyn
Contributor

atroyn commented Jul 28, 2024

We should advise users on Windows to downgrade to python 3.10 for their Chroma environments.

@Latetide

Latetide commented Aug 12, 2024

I just had a similar error. Using chroma in Jupyter Notebook, the kernel shuts down and restarts after trying to insert the 100th element.

Versions:
chromadb: 0.5.5
chroma-hnswlib: 0.7.6

I tested this with the code below, based on the above. It works in command-line Python without any issues, but it crashes again when I try to run it in the notebook.

import hnswlib
import numpy as np

print("Starting")

index = hnswlib.Index(space="l2", dim=1024)
print("After index declaration")

index.init_index(max_elements=1000, ef_construction=100, M=16)
print("After Init Index")

vectors = np.random.randn(1000, 1024).astype(np.float32)
print("After Vector creation")

index.add_items(vectors,ids=np.arange(1000))

print("Done")

The kernel log only says this with Debug mode: (the first json block is the last thing I printed before the crash)

{'buffers': [],
 'content': b'{"name": "stdout", "text": "Number of existing docs in DB: 99\\nN'
            b'umber of new chunks: 1\\n"}',
 'header': {'date': datetime.datetime(2024, 8, 12, 13, 33, 23, 677442, tzinfo=tzutc()),
            'msg_id': '91678410-7c9cf6a1b73f88beb4462c14_10376_136',
            'msg_type': 'stream',
            'session': '91678410-7c9cf6a1b73f88beb4462c14',
            'username': 'username',
            'version': '5.3'},
 'metadata': {},
 'msg_id': '91678410-7c9cf6a1b73f88beb4462c14_10376_136',
 'msg_type': 'stream',
 'parent_header': {'date': datetime.datetime(2024, 8, 12, 13, 33, 23, 439000, tzinfo=tzutc()),
                   'msg_id': 'ea6cbfb9-0227-4f8e-9717-e8bc860aef91',
                   'msg_type': 'execute_request',
                   'session': '8ba47159-9da6-45dc-940c-df463dd8f2c0',
                   'username': '',
                   'version': '5.2'}}
[I 2024-08-12 14:33:27.083 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports
[W 2024-08-12 14:33:27.083 ServerApp] kernel 2975483f-61dd-4ee6-bdf0-a04c8a086712 restarted

When executing the lines one by one, this is what crashes (as expected):
index.add_items(vectors,ids=np.arange(1000))
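Editor's note: when the same script behaves differently on the command line and in a notebook, it can help to confirm that both are using the same interpreter and the same compiled hnswlib file. The sketch below locates a module without importing it, which avoids loading the extension under investigation. The helper name `module_origin` is hypothetical.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None.

    find_spec only locates the module; it does not execute it.
    """
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("hnswlib    :", module_origin("hnswlib"))
```

Running this once in a terminal and once in a notebook cell, then comparing the two outputs, would show whether the kernel points at a different environment.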

@atroyn
Contributor

atroyn commented Aug 15, 2024

@Latetide are you also on Windows? Could you please post the results of running msinfo32 if so?

@Latetide

Yes, Windows 10. Python version 3.12.4, jupyter version 7.2.1

Here is the msinfo file:
msinfo.txt

@atroyn
Contributor

atroyn commented Aug 16, 2024

Could you downgrade to python 3.10 in your Chroma environment, reinstall, and try again?

@mferris77

mferris77 commented Aug 18, 2024

FYI, I stumbled across this while working through a langchain demo. :) Same experience as others above: in VSCode, the Jupyter notebook crashes when I try to load text chunks. It loaded 10 fine; when I bumped it to 50, it crashed. The test script above (import hnswlib, numpy, etc.) also ended in a crash.

Windows 11
Chroma 0.5.5
chroma-hnswlib: 0.7.6
Python 3.10.11
ipykernel 6.29.5
pytorch 2.4.0
cuda 12.4

Hope this helps! I'll roll back to a previous working version noted above.

@Tony1040

I am getting this error too on Windows 11 while using langchain chroma. These are the details of my system:

Windows 11
Ryzen 5 2400G
Python 3.12
chromadb==0.5.5

Downgrading to python 3.10 seems to work/fix the issue

python --version
Python 3.10.14
pip freeze | grep -i chroma
chroma-hnswlib==0.7.6
chromadb==0.5.5
langchain-chroma==0.1.2
