Unpatched ChromaDB flaw leaves servers open to remote code execution

Researchers have published details about a critical vulnerability in ChromaDB that could allow unauthenticated attackers to execute arbitrary code and access sensitive data on machines running the open-source vector database.

The issue, tracked as CVE-2026-45829, is located in ChromaDB’s API server. Researchers at HiddenLayer disclosed it publicly after reportedly failing to reach the developers of ChromaDB, one of the most popular vector databases used for AI applications.

The vulnerability stems from the order in which ChromaDB parses embedding model references and performs its authentication check: the model reference is processed first. Attackers can exploit the flaw by sending requests that make the server load malicious model configurations hosted on Hugging Face.

“The authentication is not missing, it’s just in the wrong place,” researchers from security firm HiddenLayer said in their report. “By the time it fires, the model has already been fetched and executed. The server rejects the request, returns a 500, and the attacker’s payload has already run.”

According to HiddenLayer, the flaw exists in ChromaDB from version 1.0.0 up to 1.5.8. Multiple attempts to report it to the developers since February, using different communication channels, have gone unanswered, prompting public disclosure. More than 73% of the internet-exposed ChromaDB instances that can be found through the Shodan search engine are running a vulnerable version.
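As a rough illustration of how an operator might compare an installed version against the reported range, here is a minimal Python sketch. The function names are hypothetical, and treating both ends of the range as inclusive is an assumption, since the report's wording ("from version 1.0.0 up to 1.5.8") does not settle it.

```python
def parse_version(text: str) -> tuple:
    """Turn a plain 'MAJOR.MINOR.PATCH' string into a 3-tuple of ints."""
    parts = tuple(int(part) for part in text.strip().split("."))
    # Pad short versions such as "1.0" so tuple comparison behaves as expected.
    return parts + (0,) * (3 - len(parts))


# Range reported by HiddenLayer; inclusive bounds are an assumption.
FIRST_AFFECTED = (1, 0, 0)
LAST_AFFECTED = (1, 5, 8)


def is_in_reported_range(version: str) -> bool:
    """Return True if the version falls inside the reported affected range."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED
```

The installed version string could be obtained with the standard library call `importlib.metadata.version("chromadb")`. A version number alone does not settle exposure, since according to the researchers the Rust implementation is not affected.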

Until a patch becomes available, the researchers advise deploying ChromaDB servers using the Rust implementation, which is not affected, instead of the Python FastAPI server. Network access to the ChromaDB port should also be restricted to trusted IP addresses only.
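The idea of limiting access to trusted addresses can be sketched in a few lines of Python using the standard `ipaddress` module. The network ranges below are placeholders chosen for illustration, not values from the report, and in practice this kind of restriction would normally be enforced by a firewall or reverse proxy in front of the server rather than in application code.

```python
import ipaddress

# Placeholder allowlist; real deployments would list their own trusted ranges.
TRUSTED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("127.0.0.0/8", "10.0.0.0/8")
]


def is_trusted(client_ip: str) -> bool:
    """Return True only if the client address falls inside a trusted network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that does not parse as an IP address is treated as untrusted.
        return False
    return any(address in network for network in TRUSTED_NETWORKS)
```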

Two separate issues combine into unauthenticated RCE

Vector databases like ChromaDB are often used to enhance the knowledge of LLMs with third-party or company-specific data as part of retrieval-augmented generation (RAG) workflows. That data, typically unstructured in origin, is stored in a vector database as mathematical representations called vector embeddings.

To convert unstructured data such as text, images, or audio into vector embeddings, specialized machine learning algorithms known as embeddings models must be used. These models can be tailored to specific use cases. As a result, ChromaDB and other vector databases give users the ability to choose between various embeddings models for these conversions.

ChromaDB organizes documents into collections, and each collection can be assigned a specific embeddings function that dictates how documents are embedded, with what model, and with what parameters. One of those parameters can be trust_remote_code: true, which tells the model loader to download and execute any additional Python module files shipped with the model.

As a result, unauthenticated attackers can send a request to the ChromaDB API server to set up a new collection with a custom embeddings function that points to a malicious model they published on Hugging Face, HiddenLayer’s researchers found.
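One defensive idea that follows from this is to inspect incoming collection requests and refuse any that ask for remote code to be trusted. The Python sketch below assumes the request body has already been parsed into nested dictionaries and lists. The body layout and every field name other than `trust_remote_code` are hypothetical and are not taken from ChromaDB's actual API.

```python
from typing import Any


def _is_truthy(value: Any) -> bool:
    """Treat booleans, numbers, and common string spellings of 'true' alike."""
    if isinstance(value, str):
        return value.strip().lower() in {"true", "1", "yes"}
    return bool(value)


def requests_remote_code(config: Any) -> bool:
    """Return True if any nested mapping enables trust_remote_code."""
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "trust_remote_code" and _is_truthy(value):
                return True
            if requests_remote_code(value):
                return True
    elif isinstance(config, list):
        return any(requests_remote_code(item) for item in config)
    return False


# Hypothetical request body, for illustration only.
example_body = {
    "name": "docs",
    "configuration": {
        "embedding_function": {
            "config": {
                "model_name": "some-org/some-model",
                "trust_remote_code": True,
            }
        }
    },
}
```

A filter like this would have to run in a component that sees the request before the vulnerable server does, such as a proxy. It narrows exposure but is not a substitute for a fix in the server itself.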

“This is the same class of risk we have written about before in the context of malicious models on Hugging Face and unsafe deserialization in ML artifacts,” the HiddenLayer researchers said. “A model is not passive data. It is code, and loading one from an untrusted source is equivalent to running untrusted code.”

But shouldn’t ChromaDB’s API endpoint authentication prevent this from happening?

This is where the second issue comes into play. It turns out that ChromaDB’s server code processes such requests before checking for authentication. And while processing the request, it fetches the model reference from Hugging Face to set up the embeddings configuration.

So even if the collection is ultimately not created because the eventual authentication check fails, the malicious Python code accompanying the model is still downloaded and executed.
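The effect of that ordering can be shown with a toy Python model. Nothing here is ChromaDB code: the handler names, the token value, and the stand-in loader are all hypothetical, and the loader only records that it was called instead of fetching anything.

```python
class Unauthorized(Exception):
    """Raised when a request carries no valid credentials."""


executed = []  # records loader side effects so the two orderings can be compared


def load_embedding_model(model_ref: str) -> None:
    """Stand-in for a loader that would run code shipped with the model."""
    executed.append(model_ref)


def check_auth(token) -> None:
    if token != "valid-token":
        raise Unauthorized("missing or invalid credentials")


def handle_request_flawed(model_ref: str, token=None) -> str:
    load_embedding_model(model_ref)  # side effect happens first
    check_auth(token)                # rejection comes too late
    return "collection created"


def handle_request_safe(model_ref: str, token=None) -> str:
    check_auth(token)                # reject before any processing
    load_embedding_model(model_ref)
    return "collection created"


def run(handler, model_ref: str) -> str:
    """Call a handler without credentials and report what the client sees."""
    try:
        handler(model_ref)
        return "ok"
    except Unauthorized:
        return "rejected"
```

Both handlers reject an unauthenticated caller, so the two look identical from the client side. The difference is that only the flawed ordering leaves an entry in `executed`, which mirrors the researchers' point that the check is present but placed too late.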

“From the outside, it appears to be a failed API call,” the researchers said. “[But] on the attacker’s end, there is a shell on the server.”

Because the attacker’s code inherits the permissions of the user running the ChromaDB API server, it can reach everything the server process can: environment variables, API keys, mounted secrets, and the data stored on disk.

This is the latest in a string of attacks made possible by maliciously crafted AI models and their corresponding configuration files. Earlier this month, HiddenLayer’s researchers showed how remote code execution can be achieved by making minor changes to a model’s tokenizer.json file, which is used to map token IDs to words or characters, creating an alphabet the model uses to generate its outputs.

Last year, researchers showed how attackers can hide malicious code inside Python Pickle files, a format that is commonly used to distribute AI models.
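The reason Pickle files are risky can be seen in a harmless Python sketch. An object's `__reduce__` method may return any callable and its arguments, and `pickle.loads` calls that callable during loading. The built-in `sorted` stands in here for whatever an attacker would actually choose, and the class name is made up for the example.

```python
import pickle


class Demo:
    """Object whose pickled form tells the loader to call a function."""

    def __reduce__(self):
        # The loader will call sorted([3, 1, 2]) instead of rebuilding a Demo.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Demo())

# Loading the bytes invokes the callable chosen by whoever created the pickle.
result = pickle.loads(blob)
```

After loading, `result` holds the return value of the call and not a `Demo` instance, which is why Python's own documentation warns against unpickling data from untrusted sources.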
