Local Memory: Getting Started with ChromaDB
Dec 29, 2025 • 10 min read
Not every project needs a managed cloud database like Pinecone. For local development, privacy-focused apps, or rapid prototyping, ChromaDB has become the go-to open-source vector database. It's fast to set up, free to use, and surprisingly capable for production workloads under 10 million vectors.
What is a Vector Database and Why Do You Need One?
Before diving into ChromaDB specifically, let's understand the problem it solves. Traditional databases store and retrieve data by exact value matching—find all rows where name = "Alice". This works perfectly for structured data, but completely breaks down for AI applications where you need to find semantically similar content.
Imagine you have a knowledge base of 10,000 support articles. A user asks "my payment isn't going through." You can't exact-match this—the article that helps them might be titled "Resolving Credit Card Authorization Failures." A vector database stores documents as mathematical vectors (embeddings) that capture meaning, allowing you to find the most semantically similar results regardless of exact word choice.
This is the foundation of every RAG (Retrieval-Augmented Generation) system. ChromaDB handles the storage, indexing, and retrieval of these vectors so you can focus on building your AI application.
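The "semantic similarity" that powers all of this usually boils down to cosine similarity between embedding vectors. Here's a minimal sketch with hand-made toy vectors (real embedding models produce hundreds of dimensions; the numbers below are invented purely for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (a real model would produce these from text)
payment_failing = [0.9, 0.1, 0.0]
card_auth_error = [0.8, 0.2, 0.1]
password_reset = [0.1, 0.0, 0.9]

print(cosine_similarity(payment_failing, card_auth_error))  # high -> similar meaning
print(cosine_similarity(payment_failing, password_reset))   # low -> unrelated
```

A vector database is essentially this comparison done efficiently across millions of stored vectors, using an approximate index instead of a brute-force loop.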
1. Why ChromaDB?
ChromaDB was purpose-built for AI developers. Here's what makes it stand out from generic databases:
- Open Source (Apache 2.0): No usage fees, no vendor lock-in. Run it anywhere—Docker, Kubernetes, or embedded in your Python process.
- Embedded Mode: ChromaDB can run inside your Python script without a separate server process. Perfect for local development and testing.
- Batteries-Included Embeddings: Ships with all-MiniLM-L6-v2 as the default embedding model, so you can test without an OpenAI API key.
- LangChain & LlamaIndex Integration: Drop-in compatible with both major AI frameworks—switch from in-memory to ChromaDB with a one-line change.
- Hybrid Search: Supports both semantic (vector) and metadata filtering in a single query, unlike simpler solutions.
- JavaScript & Python SDKs: Full-featured clients for both ecosystems with identical APIs.
2. Setup: Three Ways to Run ChromaDB
Option A: Embedded (Simplest — Great for Development)
No server needed. Chroma runs inside your Python process and stores data on disk:
import chromadb
# Data persists to ./chroma_data directory
client = chromadb.PersistentClient(path="./chroma_data")

Option B: Docker Server (For Production)
# Start ChromaDB server
docker run -p 8000:8000 -v ./chroma_data:/chroma/chroma chromadb/chroma
# Connect from your app
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)

Option C: JavaScript Client
npm install chromadb chromadb-default-embed
import { ChromaClient } from 'chromadb';
const client = new ChromaClient({ path: "http://localhost:8000" });

3. Core Operations
Creating Collections and Adding Documents
import { ChromaClient, OpenAIEmbeddingFunction } from 'chromadb';
const client = new ChromaClient();
// Use OpenAI embeddings for production-quality search
const embedder = new OpenAIEmbeddingFunction({
openai_api_key: process.env.OPENAI_API_KEY,
openai_model: "text-embedding-3-small"
});
const collection = await client.getOrCreateCollection({
name: "support_docs",
embeddingFunction: embedder
});
// Add documents with metadata for filtering
await collection.add({
ids: ["doc1", "doc2", "doc3"],
metadatas: [
{ category: "billing", difficulty: "easy" },
{ category: "technical", difficulty: "hard" },
{ category: "billing", difficulty: "medium" }
],
documents: [
"To update your payment method, go to Settings > Billing > Payment Methods",
"API rate limits are enforced per-minute at the account level using a token bucket algorithm",
"Invoices are generated on the 1st of each month and sent to your billing email"
]
});

Querying with Semantic Search + Metadata Filters
// Find billing-related docs similar to user's question
const results = await collection.query({
queryTexts: ["how do I change my credit card?"],
nResults: 3,
where: { category: "billing" }, // Filter by metadata
include: ["documents", "metadatas", "distances"]
});
// results.documents[0] = array of matched document texts
// results.distances[0] = distances (lower = more similar)
console.log(results.documents[0]);

4. Integration with LangChain
ChromaDB integrates seamlessly with LangChain for RAG pipelines:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load and chunk your documents
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)
# Create vector store from documents
vectorstore = Chroma.from_documents(
documents=chunks,
embedding=OpenAIEmbeddings(),
persist_directory="./chroma_db"
)
# Use as a retriever in your RAG chain
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
rag_chain = retriever | prompt | llm | StrOutputParser()

5. ChromaDB vs. Alternatives
| Feature | ChromaDB | Pinecone | Weaviate |
|---|---|---|---|
| Cost | Free / Open Source | $70+/mo managed | Free / Open Source |
| Setup Time | 5 minutes | 10 minutes | 30 minutes |
| Max Vectors | ~10M local | 100M+ managed | Unlimited |
| Embedded Mode | ✅ Yes | ❌ No | ❌ No |
| Managed Cloud | ✅ (Chroma Cloud) | ✅ Yes | ✅ Yes |
| LangChain Support | ✅ First-class | ✅ First-class | ✅ First-class |
Troubleshooting Common Issues
Issue: "Collection not found" after restart
Cause: Using in-memory client (chromadb.Client()) which doesn't persist data.
Fix: Switch to chromadb.PersistentClient(path="./chroma_data") to persist data between sessions.
Issue: Slow query performance on large collections
Cause: ChromaDB uses HNSW index which needs tuning for large datasets.
Fix: Configure HNSW parameters when creating collections:
collection = client.create_collection(
name="large_collection",
metadata={"hnsw:space": "cosine", "hnsw:M": 32} # Higher M = better recall
)

Issue: High memory usage with many documents
Cause: ChromaDB loads the entire HNSW index into RAM.
Fix: For collections above 1M vectors, consider switching to Qdrant or Weaviate which support disk-based indices. For smaller collections, add more RAM or reduce embedding dimensions (use text-embedding-3-small instead of text-embedding-3-large).
Frequently Asked Questions
Can ChromaDB handle production workloads?
Yes, for collections under 5-10 million vectors with moderate query rates (<100 QPS). For higher scale, use the managed Chroma Cloud or migrate to Pinecone/Qdrant. Many production RAG apps use ChromaDB successfully with careful capacity planning.
How do I update existing documents?
Use collection.update() with the document's ID. ChromaDB will re-embed and update the index automatically:
await collection.update({
ids: ["doc1"],
documents: ["Updated: To change payment, go to Account > Billing"]
});

What's the difference between cosine, L2, and dot product distance?
- Cosine (recommended for text): Measures the angle between vectors, ignores magnitude. Best for semantic similarity.
- L2 (Euclidean): Measures absolute distance. Good for image embeddings.
- Dot Product: Fast but requires normalized vectors. Use for maximum-inner-product search (MIPS).
For text RAG applications, always use cosine.
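The difference is easiest to see on a small pair of vectors pointing in the same direction but with different magnitudes (toy numbers, not real embeddings):

```python
import math

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

dot = sum(x * y for x, y in zip(a, b))
l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
cosine_sim = dot / (
    math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
)

print(cosine_sim)  # 1.0  -- identical direction, magnitude ignored
print(l2)          # 5.0  -- Euclidean distance still sees the magnitude gap
print(dot)         # 50.0 -- grows with magnitude; normalize vectors first for MIPS
```

Cosine treats `a` and `b` as identical in meaning, which is exactly the behavior you want for text: a long document and a short one about the same topic should score as similar.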
How many documents can one collection hold?
ChromaDB's HNSW index supports millions of vectors, but practical limits depend on your server's RAM. Each 1536-dimension embedding (OpenAI's default) uses ~6KB of memory. A collection with 1 million documents needs roughly 6GB RAM just for the index.
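The estimate above is simple arithmetic you can adapt to your own model, assuming float32 storage (4 bytes per dimension) and ignoring HNSW graph overhead:

```python
# Back-of-envelope index memory estimate (float32 = 4 bytes per dimension)
dims = 1536                      # OpenAI's default embedding dimensionality
bytes_per_vector = dims * 4      # 6144 bytes ~= 6 KB per embedding
n_docs = 1_000_000
total_gb = n_docs * bytes_per_vector / 1e9
print(f"{total_gb:.1f} GB")      # ~6.1 GB for raw vectors alone
```

The HNSW graph itself adds further overhead on top of the raw vectors, so treat this as a floor, not a ceiling, when sizing RAM.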
Next Steps
Now that you understand ChromaDB, here's your learning path:
- Build a PDF Q&A Bot: Combine ChromaDB with PyPDF2 and LangChain to create a knowledge base from your own documents.
- Explore Metadata Filtering: Practice building hybrid search that combines semantic similarity with exact metadata filters.
- Set Up Persistent Storage: Deploy ChromaDB in Docker with volume mounting and connect your LangChain app to it.
- Compare with Qdrant: For production at scale, explore Qdrant as a drop-in ChromaDB replacement with better performance characteristics.
Vivek
AI Engineer

Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.