[Bug]: HTTPClient SSL Flag Does Not Work As Expected #2447

Open
erincetin opened this issue Jul 3, 2024 · 7 comments
@erincetin
erincetin commented Jul 3, 2024

What happened?

When using Chroma in client-server mode, HttpClient does not work with self-signed certificates. Disabling SSL verification does not work either.

My chroma server is behind the OpenShift reverse proxy. The certificate that I am using is for this OpenShift cluster, and it works for other applications.

For example, the raw requests below work as expected:

import requests

chroma_server_url = ... # Your Chroma server url here

requests.get(chroma_server_url + '/api/v1/tenants/default_tenant', verify=True) # Does work as expected, gives self-signed certificate error
requests.get(chroma_server_url + '/api/v1/tenants/default_tenant', verify=False) # Works as certification is disabled
requests.get(chroma_server_url + '/api/v1/tenants/default_tenant', verify='ca.crt') # Works as ca.crt contains corporate certificate chain

However, HttpClient does not behave as expected with ssl=False or when a certificate is passed.

import chromadb
from chromadb.config import Settings

chroma_server_url = ... # Your Chroma server url here

client = chromadb.HttpClient(chroma_server_url, ssl=True) # Does work as expected, gives self-signed certificate error
client = chromadb.HttpClient(chroma_server_url, ssl=False) # Does not work, gives self-signed certificate error
client = chromadb.HttpClient(chroma_server_url, ssl=True, settings=Settings(chroma_server_ssl_verify='ca.crt')) # Does not work again
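
For reference, the three `verify` values used in the requests example correspond roughly to three TLS configurations. The sketch below expresses them with the standard-library `ssl` module only. `context_for_verify` is a hypothetical helper name, not part of requests or chromadb, and requests itself trusts the certifi bundle by default rather than the platform store, so treat this as an approximation of the behaviour rather than what either library does internally.

```python
import ssl
from typing import Union


def context_for_verify(verify: Union[bool, str]) -> ssl.SSLContext:
    """Hypothetical helper: map a requests-style `verify` value to an SSLContext."""
    if verify is False:
        # No chain validation and no hostname check: any certificate is accepted.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before setting CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    if verify is True:
        # Validate the chain against the platform's default trust store.
        return ssl.create_default_context()
    # Otherwise treat the value as a path to a PEM file and trust only its CAs.
    return ssl.create_default_context(cafile=verify)
```

A client that honours a "disable verification" switch would be expected to end up in the first branch; the reports above suggest that `ssl=False` on HttpClient does not.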

Reproduction Steps:

  1. Install OpenSSL on your system and the testcontainers package for Python.
  2. Run openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes -subj '/CN=localhost'
  3. Run the script below:
import time
from testcontainers.core.container import DockerContainer
from testcontainers.compose import DockerCompose
import requests
import os
import chromadb


# Define the Docker Compose setup using testcontainers
class ChromaDBTest:
    def __init__(self):
        self.compose = DockerCompose(os.getcwd())

    def start(self):
        self.compose.start()

    def stop(self):
        self.compose.stop()


# Create Docker Compose file content
docker_compose_content = """
version: '3'

services:
  chromadb:
    image: chromadb/chroma
    ports:
      - "8000:8000"

  nginx-proxy:
    image: nginx:latest
    depends_on:
      - chromadb
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf
      - ./certs:/etc/nginx/certs
"""

# Create necessary directories and files
os.makedirs("certs", exist_ok=True)
os.makedirs("nginx", exist_ok=True)
with open("certs/cert.pem", "w") as cert_file:
    with open("cert.pem", "r") as source_cert:
        cert_file.write(source_cert.read())
with open("certs/key.pem", "w") as key_file:
    with open("key.pem", "r") as source_key:
        key_file.write(source_key.read())
with open("nginx.conf", "w") as nginx_conf_file:
    nginx_conf_file.write("""
server {
    listen 443 ssl;
    server_name localhost;

    ssl_certificate /etc/nginx/certs/cert.pem;
    ssl_certificate_key /etc/nginx/certs/key.pem;

    location / {
        proxy_pass http://chromadb:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
""")

with open("docker-compose.yml", "w") as compose_file:
    compose_file.write(docker_compose_content)


# Test the connection
def test_chromadb():
    chromadb_test = ChromaDBTest()
    try:
        chromadb_test.start()
        time.sleep(10)  # Wait for services to start

        try:
            response = requests.get("https://localhost:443", verify=True)
            print(response.text)
        except requests.exceptions.SSLError as e:
            print(f"SSL Error: {e}")

        try:
            client = chromadb.HttpClient("https://localhost", port=443, ssl=False)
        except requests.exceptions.SSLError as e:
            print(f"SSL Error for HTTP Client : {e}")

    finally:
        chromadb_test.stop()


if __name__ == "__main__":
    test_chromadb()

The result of the reproduction script is shown under "Relevant log output" below.

Versions

Chroma v0.5.4, Python 3.11, CentOS 8

Reproduction: Chroma v0.5.4, Python 3.11, Windows 10

Relevant log output

SSL Error: HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)')))
Traceback (most recent call last):
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py", line 1092, in _validate_conn
    conn.connect()
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connection.py", line 642, in connect
    sock_and_verified = _ssl_wrap_socket_and_match_hostname(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connection.py", line 783, in _ssl_wrap_socket_and_match_hostname
    ssl_sock = ssl_wrap_socket(
               ^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\util\ssl_.py", line 469, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\util\ssl_.py", line 513, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\ssl.py", line 517, in wrap_socket
    return self.sslsocket_class._create(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\ssl.py", line 1108, in _create
    self.do_handshake()
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\ssl.py", line 1379, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py", line 790, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py", line 491, in _make_request
    raise new_e
urllib3.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\connectionpool.py", line 844, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\urllib3\util\retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: /api/v1/tenants/default_tenant (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\api\client.py", line 438, in _validate_tenant_database
    self._admin_client.get_tenant(name=tenant)
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\api\client.py", line 486, in get_tenant
    return self._server.get_tenant(name=name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\telemetry\opentelemetry\__init__.py", line 143, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\api\fastapi.py", line 182, in get_tenant
    resp = self._session.get(
           ^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\adapters.py", line 517, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='localhost', port=443): Max retries exceeded with url: /api/v1/tenants/default_tenant (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate (_ssl.c:1006)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\SnA\Desktop\ttkom\debug-chroma\debug.py", line 97, in <module>
    test_chromadb()
  File "C:\Users\SnA\Desktop\ttkom\debug-chroma\debug.py", line 88, in test_chromadb
    client = chromadb.HttpClient("https://localhost", port=443, ssl=False)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\__init__.py", line 197, in HttpClient
    return ClientCreator(tenant=tenant, database=database, settings=settings)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\api\client.py", line 144, in __init__
    self._validate_tenant_database(tenant=tenant, database=database)
  File "C:\Users\SnA\AppData\Local\Programs\Python\Python311\Lib\site-packages\chromadb\api\client.py", line 440, in _validate_tenant_database
    raise ValueError(
ValueError: Could not connect to a Chroma server. Are you sure it is running?
@erincetin erincetin added the bug Something isn't working label Jul 3, 2024
@erincetin erincetin changed the title from "[Bug]: HTTPClient Verify Flag Does Not Work As Expected" to "[Bug]: HTTPClient SSL Flag Does Not Work As Expected" Jul 3, 2024
@tazarov tazarov self-assigned this Jul 4, 2024
@tazarov
Contributor

tazarov commented Jul 4, 2024

@erincetin, thanks for reporting this. We've recently refactored the HttpClient and started using httpx, although our SSL-related tests seem to be passing. I will investigate further, suggest an approach, or create a PR to fix the problem.

@erincetin
Author

@tazarov Thanks for your reply. This was my first issue on GitHub, and I'm glad I could contribute.

In the meantime, I want to downgrade chromadb on the client side. Do you know the latest version of chromadb that did not use httpx for HttpClient?

@erincetin
Author

I tested versions 0.4.12 through 0.5.3 with tox, using the reproduction script I added above. The latest version that seems to work fine is 0.4.14; every version from 0.4.15 to 0.5.3 fails.

@tazarov
Contributor

tazarov commented Jul 9, 2024

@erincetin, I remember that when adding this feature to Chroma client we faced the exact same issue.

The root cause of the example below not working (with either a self-signed cert or a private PKI):

client = chromadb.HttpClient(host="https://localhost", ssl=True, settings=Settings(chroma_server_ssl_verify='certs-no-san/servercert.pem')) # Does not work again

would be the lack of a SAN (Subject Alternative Name) extension. We have a config example here: https://github.com/chroma-core/chroma/blob/main/chromadb/test/openssl.cnf.

The second example you have is expected to fail with self-signed or private PKI.

client = chromadb.HttpClient(host="https://localhost", ssl=False) # Does not work, gives self-signed certificate error

I've tested certs both with and without a SAN, and it works as intended with the SAN.

Try generating certs with this command:

openssl req -x509 -newkey rsa:4096 -keyout serverkey.pem -out servercert.pem -days 365 -nodes -subj "/CN=localhost" -config "chromadb/test/openssl.cnf"

Note: Update/replace the openssl.cnf as needed.
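
To see whether a given server certificate actually carries a SAN for the host you connect to, something like the sketch below may help. It uses only the standard library and relies on the dict shape documented for `SSLSocket.getpeercert()`. The helper names (`san_entries`, `san_covers`, `fetch_cert`) are hypothetical, the wildcard handling is deliberately simplified, and `fetch_cert` assumes the certificate (or its issuing CA) is available locally as a PEM file so that the handshake validates and `getpeercert()` returns a populated dict.

```python
import socket
import ssl
from typing import Any, Dict, List


def san_entries(cert: Dict[str, Any]) -> List[str]:
    """Return the DNS and IP Address SAN values from a getpeercert() dict."""
    return [value for kind, value in cert.get("subjectAltName", ())
            if kind in ("DNS", "IP Address")]


def san_covers(cert: Dict[str, Any], hostname: str) -> bool:
    """Simplified check: exact match, or a '*.' wildcard for the leftmost label."""
    host = hostname.lower()
    for entry in san_entries(cert):
        name = entry.lower()
        if name == host:
            return True
        if name.startswith("*.") and "." in host and host.split(".", 1)[1] == name[2:]:
            return True
    return False


def fetch_cert(host: str, port: int, cafile: str) -> Dict[str, Any]:
    """Fetch the peer certificate, trusting only `cafile` and skipping the hostname check."""
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.check_hostname = False  # the SAN is inspected manually instead
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

For the reproduction setup, `san_covers(fetch_cert("localhost", 443, "cert.pem"), "localhost")` would be the call to try; a certificate generated with only `-subj '/CN=localhost'` has no `subjectAltName` key at all, so the check returns `False`.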

@tazarov
Contributor

tazarov commented Jul 11, 2024

@erincetin, did adding SAN to your SSL cert solve the issue?

@erincetin
Author

@tazarov, unfortunately I am on a corporate network for this project, and the company issues its own certificates. I cannot add a SAN because I am not the person who creates or administers them.

@tazarov
Contributor

tazarov commented Jul 11, 2024

@erincetin, this is super strange. In corporate PKIs, SANs should be an essential requirement, as they increase security. For example, k8s won't work with certs without SANs.

It is fine for the PKI to issue certs; however, I imagine that someone initiates the process with a CSR; if that is in your control, then SANs can be added to the CSR, too. Let me know if you need help with that.
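
As a rough illustration of adding SANs to a CSR, the sketch below renders a minimal OpenSSL config with a `subjectAltName` section. `render_openssl_san_config` is a hypothetical helper, the section names (`dn`, `v3_req`, `alt_names`) are arbitrary labels, and the host names are placeholders. Whether the issuing CA keeps the requested SANs is up to the CA's policy.

```python
from typing import Sequence


def render_openssl_san_config(common_name: str,
                              dns_names: Sequence[str],
                              ip_addresses: Sequence[str] = ()) -> str:
    """Render a minimal OpenSSL config that requests SANs in a CSR (hypothetical helper)."""
    lines = [
        "[req]",
        "prompt = no",
        "distinguished_name = dn",
        "req_extensions = v3_req",  # extensions to include in the CSR
        "",
        "[dn]",
        f"CN = {common_name}",
        "",
        "[v3_req]",
        "subjectAltName = @alt_names",
        "",
        "[alt_names]",
    ]
    # OpenSSL expects numbered keys: DNS.1, DNS.2, ... and IP.1, IP.2, ...
    lines += [f"DNS.{i} = {name}" for i, name in enumerate(dns_names, start=1)]
    lines += [f"IP.{i} = {addr}" for i, addr in enumerate(ip_addresses, start=1)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; replace with the real host names of the Chroma endpoint.
    with open("san.cnf", "w") as fh:
        fh.write(render_openssl_san_config("localhost", ["localhost"], ["127.0.0.1"]))
```

The resulting file could then be passed to `openssl req -new -key serverkey.pem -out server.csr -config san.cnf` to produce a CSR that carries the SAN request.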
