Local Memory: Getting Started with ChromaDB
Dec 29, 2025 • 10 min read
Not every project needs a managed cloud database like Pinecone. For local development, privacy-focused apps, or rapid prototyping, ChromaDB has become the go-to open-source vector database. It's fast to set up, free to use, and surprisingly capable for production workloads under 10 million vectors.
What is a Vector Database and Why Do You Need One?
Before diving into ChromaDB specifically, let's understand the problem it solves. Traditional databases store and retrieve data by exact value matching—find all rows where name = "Alice". This works perfectly for structured data, but completely breaks down for AI applications where you need to find semantically similar content.
Imagine you have a knowledge base of 10,000 support articles. A user asks "my payment isn't going through." You can't exact-match this—the article that helps them might be titled "Resolving Credit Card Authorization Failures." A vector database stores documents as mathematical vectors (embeddings) that capture meaning, allowing you to find the most semantically similar results regardless of exact word choice.
This is the foundation of every RAG (Retrieval-Augmented Generation) system. ChromaDB handles the storage, indexing, and retrieval of these vectors so you can focus on building your AI application.
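The "semantic similarity" that powers all of this usually boils down to cosine similarity between embedding vectors. Here's a minimal sketch with hand-made toy vectors (real embedding models produce hundreds of dimensions; the numbers below are invented purely for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (a real model would produce these from text)
payment_failing = [0.9, 0.1, 0.0]
card_auth_error = [0.8, 0.2, 0.1]
password_reset = [0.1, 0.0, 0.9]

print(cosine_similarity(payment_failing, card_auth_error))  # high -> similar meaning
print(cosine_similarity(payment_failing, password_reset))   # low -> unrelated
```

A vector database is essentially this comparison done efficiently across millions of stored vectors, using an approximate index instead of a brute-force loop.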
1. Why ChromaDB?
ChromaDB was purpose-built for AI developers. Here's what makes it stand out from generic databases:
- Open Source (Apache 2.0): No usage fees, no vendor lock-in. Run it anywhere—Docker, Kubernetes, or embedded in your Python process.
- Embedded Mode: ChromaDB can run inside your Python script without a separate server process. Perfect for local development and testing.
- Batteries-Included Embeddings: Ships with all-MiniLM-L6-v2 as the default embedding model, so you can test without an OpenAI API key.
- LangChain & LlamaIndex Integration: Drop-in compatible with both major AI frameworks—switch from in-memory to ChromaDB with a one-line change.
- Hybrid Search: Supports both semantic (vector) and metadata filtering in a single query, unlike simpler solutions.
- JavaScript & Python SDKs: Full-featured clients for both ecosystems with identical APIs.
2. Setup: Three Ways to Run ChromaDB
Option A: Embedded (Simplest — Great for Development)
No server needed. Chroma runs inside your Python process and stores data on disk:
import chromadb
# Data persists to ./chroma_data directory
client = chromadb.PersistentClient(path="./chroma_data")

Option B: Docker Server (For Production)
# Start ChromaDB server
docker run -p 8000:8000 -v ./chroma_data:/chroma/chroma chromadb/chroma
# Connect from your app
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)

Option C: JavaScript Client
npm install chromadb chromadb-default-embed
import { ChromaClient } from 'chromadb';
const client = new ChromaClient({ path: "http://localhost:8000" });

3. Core Operations
Creating Collections and Adding Documents
import { ChromaClient, OpenAIEmbeddingFunction } from 'chromadb';
const client = new ChromaClient();
// Use OpenAI embeddings for production-quality search
const embedder = new OpenAIEmbeddingFunction({
openai_api_key: process.env.OPENAI_API_KEY,
openai_model: "text-embedding-3-small"
});
const collection = await client.getOrCreateCollection({
name: "support_docs",
embeddingFunction: embedder
});
// Add documents with metadata for filtering
await collection.add({
ids: ["doc1", "doc2", "doc3"],
metadatas: [
{ category: "billing", difficulty: "easy" },
{ category: "technical", difficulty: "hard" },
{ category: "billing", difficulty: "medium" }
],
documents: [
"To update your payment method, go to Settings > Billing > Payment Methods",
"API rate limits are enforced per-minute at the account level using a token bucket algorithm",
"Invoices are generated on the 1st of each month and sent to your billing email"
]
});

Querying with Semantic Search + Metadata Filters
// Find billing-related docs similar to user's question
const results = await collection.query({
queryTexts: ["how do I change my credit card?"],
nResults: 3,
where: { category: "billing" }, // Filter by metadata
include: ["documents", "metadatas", "distances"]
});
// results.documents[0] = array of matched document texts
// results.distances[0] = distances (lower = more similar)
console.log(results.documents[0]);

4. Integration with LangChain
ChromaDB integrates seamlessly with LangChain for RAG pipelines:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load and chunk your documents
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)
# Create vector store from documents
vectorstore = Chroma.from_documents(
documents=chunks,
embedding=OpenAIEmbeddings(),
persist_directory="./chroma_db"
)
# Use as a retriever in your RAG chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
rag_chain = retriever | prompt | llm | StrOutputParser()

5. ChromaDB vs. Alternatives
| Feature | ChromaDB | Pinecone | Weaviate |
|---|---|---|---|
| Cost | Free / Open Source | $70+/mo managed | Free / Open Source |
| Setup Time | 5 minutes | 10 minutes | 30 minutes |
| Max Vectors | ~10M local | 100M+ managed | Unlimited |
| Embedded Mode | ✅ Yes | ❌ No | ❌ No |
| Managed Cloud | ✅ (Chroma Cloud) | ✅ Yes | ✅ Yes |
| LangChain Support | ✅ First-class | ✅ First-class | ✅ First-class |
Troubleshooting Common Issues
Issue: "Collection not found" after restart
Cause: Using in-memory client (chromadb.Client()) which doesn't persist data.
Fix: Switch to chromadb.PersistentClient(path="./chroma_data") to persist data between sessions.
Issue: Slow query performance on large collections
Cause: ChromaDB uses HNSW index which needs tuning for large datasets.
Fix: Configure HNSW parameters when creating collections:
collection = client.create_collection(
name="large_collection",
metadata={"hnsw:space": "cosine", "hnsw:M": 32} # Higher M = better recall
)

Issue: High memory usage with many documents
Cause: ChromaDB loads the entire HNSW index into RAM.
Fix: For collections above 1M vectors, consider switching to Qdrant or Weaviate which support disk-based indices. For smaller collections, add more RAM or reduce embedding dimensions (use text-embedding-3-small instead of text-embedding-3-large).
Frequently Asked Questions
Can ChromaDB handle production workloads?
Yes, for collections under 5-10 million vectors with moderate query rates (<100 QPS). For higher scale, use the managed Chroma Cloud or migrate to Pinecone/Qdrant. Many production RAG apps use ChromaDB successfully with careful capacity planning.
How do I update existing documents?
Use collection.update() with the document's ID. ChromaDB will re-embed and update the index automatically:
await collection.update({
ids: ["doc1"],
documents: ["Updated: To change payment, go to Account > Billing"]
});

What's the difference between cosine, L2, and dot product distance?
- Cosine (recommended for text): Measures the angle between vectors, ignores magnitude. Best for semantic similarity.
- L2 (Euclidean): Measures absolute distance. Good for image embeddings.
- Dot Product: Fast but requires normalized vectors. Use for maximum-inner-product search (MIPS).
For text RAG applications, always use cosine.
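The difference is easiest to see on a small pair of vectors pointing in the same direction but with different magnitudes (toy numbers, not real embeddings):

```python
import math

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

dot = sum(x * y for x, y in zip(a, b))
l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
cosine_sim = dot / (
    math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
)

print(cosine_sim)  # 1.0  -- identical direction, magnitude ignored
print(l2)          # 5.0  -- Euclidean distance still sees the magnitude gap
print(dot)         # 50.0 -- grows with magnitude; normalize vectors first for MIPS
```

Cosine treats `a` and `b` as identical in meaning, which is exactly the behavior you want for text: a long document and a short one about the same topic should score as similar.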
How many documents can one collection hold?
ChromaDB's HNSW index supports millions of vectors, but practical limits depend on your server's RAM. Each 1536-dimension embedding (OpenAI's default) uses ~6KB of memory. A collection with 1 million documents needs roughly 6GB RAM just for the index.
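The estimate above is simple arithmetic you can adapt to your own model, assuming float32 storage (4 bytes per dimension) and ignoring HNSW graph overhead:

```python
# Back-of-envelope index memory estimate (float32 = 4 bytes per dimension)
dims = 1536                      # OpenAI's default embedding dimensionality
bytes_per_vector = dims * 4      # 6144 bytes ~= 6 KB per embedding
n_docs = 1_000_000
total_gb = n_docs * bytes_per_vector / 1e9
print(f"{total_gb:.1f} GB")      # ~6.1 GB for raw vectors alone
```

The HNSW graph itself adds further overhead on top of the raw vectors, so treat this as a floor, not a ceiling, when sizing RAM.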
Next Steps
Now that you understand ChromaDB, here's your learning path:
- Build a PDF Q&A Bot: Combine ChromaDB with PyPDF2 and LangChain to create a knowledge base from your own documents.
- Explore Metadata Filtering: Practice building hybrid search that combines semantic similarity with exact metadata filters.
- Set Up Persistent Storage: Deploy ChromaDB in Docker with volume mounting and connect your LangChain app to it.
- Compare with Qdrant: For production at scale, explore Qdrant as a drop-in ChromaDB replacement with better performance characteristics.
Vivek
AI Engineer

Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.