PydanticAI: Production-Grade Agents
Dec 30, 2025 • 18 min read
The AI agent framework landscape is crowded: LangChain, AutoGen, CrewAI, and LlamaIndex all compete for mindshare. Many suffer from the same problems: excessive abstraction that hides what's actually happening, weak type safety that makes debugging malformed LLM JSON a nightmare, and global state that's a liability in multi-user production environments. PydanticAI, built by the team behind Pydantic, addresses all three: it's type-safe from the ground up, uses explicit dependency injection instead of global state, and stays close enough to plain Python that you always know what's happening.
1. PydanticAI vs Other Frameworks
| Aspect | LangChain | PydanticAI |
|---|---|---|
| Type safety | Minimal — dicts everywhere | Full Pydantic models, mypy compatible |
| Tool definition | JSON schema manually or via decorator | Type hints auto-generate JSON schema |
| Structured output | Output parsers (frequently fail) | Native Pydantic validation with retry |
| Dependencies | Global variables or closures | Explicit RunContext dependency injection |
| Testing | Mocking LLM calls is complex | Built-in TestModel for deterministic tests |
| Learning curve | High — many abstractions to learn | Low — feels like regular Python code |
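The "type hints auto-generate JSON schema" row is the heart of the difference. The idea can be sketched with nothing but the standard library: a function signature plus a docstring already carries everything a tool description needs. The mapping below is a simplified illustration of the principle, not PydanticAI's actual implementation:

```python
import inspect
from typing import get_type_hints

def get_weather(city: str, units: str = "metric") -> str:
    """Return the current weather for a city."""

PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn):
    """Build a JSON-schema-like tool description from a plain function."""
    hints = get_type_hints(fn)
    sig = inspect.signature(fn)
    props = {name: {"type": PY_TO_JSON[hints[name]]} for name in sig.parameters}
    # Parameters without defaults are required
    required = [
        name for name, p in sig.parameters.items()
        if p.default is inspect.Parameter.empty
    ]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

schema = tool_schema(get_weather)
```

PydanticAI does the real version of this (with nested models, docstring argument parsing, and validation), which is why you never hand-write a tool's JSON schema.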
2. Core Concepts
pip install pydantic-ai
from pydantic_ai import Agent
from pydantic import BaseModel
# Structured output: the LLM MUST return valid JSON matching this schema
class MovieReview(BaseModel):
    title: str
    year: int
    rating: float  # 0.0 to 10.0
    pros: list[str]
    cons: list[str]
    verdict: str

# Agent with typed output — auto-generates JSON schema instructions
agent = Agent(
    "openai:gpt-4o",
    result_type=MovieReview,  # ← LLM output is validated as MovieReview
    system_prompt="You are a professional film critic.",
)
result = agent.run_sync("Review Inception (2010)")
movie = result.data # Type: MovieReview (not dict, not str)
print(f"{movie.title} ({movie.year}): {movie.rating}/10")
print(f"Pros: {', '.join(movie.pros)}")
# PydanticAI automatically retries if LLM returns invalid JSON
# Includes the validation error in the next prompt so the LLM can self-correct
3. Dependency Injection with RunContext
from pydantic_ai import Agent, RunContext
from dataclasses import dataclass
import asyncio
# Define the dependencies your agent needs
@dataclass
class AgentDeps:
    user_id: str
    db_connection: object  # Your database connection/session
    api_key: str

# Create agent with typed dependencies
support_agent = Agent(
    "openai:gpt-4o",
    deps_type=AgentDeps,
    system_prompt="You are a customer support agent. Be concise and helpful.",
)
# Dynamic system prompt based on dependencies
@support_agent.system_prompt
async def get_system_prompt(ctx: RunContext[AgentDeps]) -> str:
    user = await ctx.deps.db_connection.users.get(ctx.deps.user_id)
    plan = user.subscription_plan
    return f"""You are a support agent for {user.name}.
Their subscription plan is: {plan}.
{'They have priority support.' if plan == 'enterprise' else ''}"""
# Tools receive context — can access dependencies
@support_agent.tool
async def get_order_history(ctx: RunContext[AgentDeps]) -> list[dict]:
    """Fetch the user's recent order history."""
    orders = await ctx.deps.db_connection.orders.get_by_user(ctx.deps.user_id)
    return [{"id": o.id, "date": o.date.isoformat(), "total": o.total} for o in orders]

@support_agent.tool
async def create_refund(
    ctx: RunContext[AgentDeps],
    order_id: str,
    reason: str,
) -> str:
    """Process a refund for a specific order."""
    result = await ctx.deps.db_connection.refunds.create(
        user_id=ctx.deps.user_id,
        order_id=order_id,
        reason=reason,
    )
    return f"Refund {result.id} created successfully. It will be processed in 3-5 days."
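For intuition about what happens when you run an agent with tools like these: the framework loops between the model and your tool functions, feeding each tool result back to the model, until the model produces a final answer. A dependency-free sketch of that loop (all names here are hypothetical, not PydanticAI API):

```python
def run_tool_loop(model_step, tools, user_message, max_steps=5):
    """Drive model/tool exchanges until the model returns a final answer."""
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        action = model_step(history)  # the LLM call
        if action["type"] == "final":
            return action["text"]
        # Model asked for a tool: execute it and feed the result back
        result = tools[action["tool"]](**action["args"])
        history.append({"role": "tool", "content": result})
    raise RuntimeError("exceeded max steps without a final answer")

# Scripted fake model: first requests a tool, then answers using the result
def scripted_model(history):
    if len(history) == 1:
        return {"type": "tool", "tool": "lookup", "args": {"key": "plan"}}
    return {"type": "final", "text": f"Your plan is {history[-1]['content']}"}

answer = run_tool_loop(scripted_model, {"lookup": lambda key: "enterprise"},
                       "What plan am I on?")
# → "Your plan is enterprise"
```

PydanticAI runs this loop for you inside `agent.run`, with retries, usage tracking, and typed tool arguments layered on top.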
# Run with per-request dependencies — no global state!
async def handle_support(user_id: str, message: str):
    deps = AgentDeps(
        user_id=user_id,
        db_connection=await get_db(),  # Fresh connection per request
        api_key=get_user_api_key(user_id),  # get_db/get_user_api_key: your own helpers
    )
    result = await support_agent.run(message, deps=deps)
    return result.data
4. Testing with TestModel
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel # Built-in test model!
import pytest
# Same agent as production — just swap the model
database_agent = Agent("openai:gpt-4o", result_type=str)
@database_agent.tool_plain  # tool_plain: for tools that don't need a RunContext
def query_database(sql: str) -> str:
    """Execute a database query and return results."""
    return execute_sql(sql)  # Real implementation (stub this in tests)
# Test: verify the agent decides to call the right tool
@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_database_agent_calls_correct_tool():
    # TestModel never hits a real LLM: no API cost, deterministic.
    # call_tools restricts which tools the fake model will invoke; it
    # generates synthetic arguments from the tool's schema, so don't
    # assert on SQL content (and stub execute_sql for the test run).
    test_model = TestModel(call_tools=["query_database"])
    with database_agent.override(model=test_model):
        result = await database_agent.run("How many users signed up last month?")
    # Inspect the message history for the tool call
    tool_calls = [
        part.tool_name
        for message in result.all_messages()
        for part in message.parts
        if part.part_kind == "tool-call"
    ]
    assert tool_calls == ["query_database"]
    # Assert result type is correct
    assert isinstance(result.data, str)
5. FastAPI Integration
from fastapi import FastAPI, Depends, HTTPException
from pydantic_ai import Agent, RunContext
from pydantic import BaseModel
from dataclasses import dataclass
app = FastAPI()
@dataclass
class RequestDeps:
    user_id: str
    db: Database  # Database / get_database below are your own DB layer

class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    reply: str
    tokens_used: int

agent = Agent("openai:gpt-4o", deps_type=RequestDeps)

@agent.tool
async def search_knowledge_base(ctx: RunContext[RequestDeps], query: str) -> str:
    """Search the product knowledge base."""
    results = await ctx.deps.db.search(query)
    return "\n".join(r.content for r in results)
@app.post("/chat/{user_id}")
async def chat_endpoint(
    user_id: str,
    request: ChatRequest,
    db: Database = Depends(get_database),
):
    deps = RequestDeps(user_id=user_id, db=db)
    try:
        result = await agent.run(request.message, deps=deps)
        return ChatResponse(
            reply=result.data,
            tokens_used=result.usage().total_tokens,
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
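The streaming endpoint below emits Server-Sent Events, whose wire format is simply one or more `data:` lines terminated by a blank line. A small stdlib helper makes the framing explicit (hypothetical; the endpoint inlines the same logic for single-line chunks):

```python
def sse_event(text: str) -> str:
    """Frame a text chunk as a Server-Sent Events message."""
    # Each event: one or more 'data:' lines, then a blank line as terminator
    body = "".join(f"data: {line}\n" for line in (text.splitlines() or [""]))
    return body + "\n"

print(repr(sse_event("hello")))   # → 'data: hello\n\n'
print(repr(sse_event("a\nb")))    # → 'data: a\ndata: b\n\n'
```

Browsers consume this directly via `EventSource`, which is why no extra client library is needed for streaming chat.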
# Streaming endpoint for real-time chat
@app.post("/chat/{user_id}/stream")
async def stream_chat(user_id: str, request: ChatRequest, db: Database = Depends(get_database)):
    from fastapi.responses import StreamingResponse

    async def generate():
        deps = RequestDeps(user_id=user_id, db=db)
        async with agent.run_stream(request.message, deps=deps) as streamed_result:
            async for text in streamed_result.stream_text():
                yield f"data: {text}\n\n"  # SSE frame: 'data:' line + blank line

    return StreamingResponse(generate(), media_type="text/event-stream")
Frequently Asked Questions
How does PydanticAI compare to the OpenAI Structured Outputs feature?
OpenAI's Structured Outputs (using response_format={"type": "json_schema"}) guarantees valid JSON matching your schema at the model level. PydanticAI uses it when available (for GPT-4o and newer models), but adds an application-level retry loop with validation error feedback for models that don't support it. PydanticAI also adds the crucial pieces OpenAI's API doesn't provide: tool calling, dependency injection, conversation history management, and multi-model support.
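That application-level retry loop is easy to picture without any framework: validate the model's JSON, and on failure send the error text back as the next prompt. A dependency-free sketch (the validator and its field rules are illustrative, not PydanticAI internals):

```python
def validate_review(data: dict) -> list[str]:
    """Return a list of validation errors (empty means valid)."""
    errors = []
    if not data.get("title"):
        errors.append("title: field required")
    rating = data.get("rating")
    if not isinstance(rating, (int, float)):
        errors.append("rating: must be a number")
    elif not 0.0 <= rating <= 10.0:
        errors.append("rating: must be between 0.0 and 10.0")
    return errors

def run_with_retry(call_model, max_retries=2):
    """Call the model, feeding validation errors back until output is valid."""
    prompt = "Review Inception (2010) as JSON."
    for _ in range(max_retries + 1):
        data = call_model(prompt)
        errors = validate_review(data)
        if not errors:
            return data
        # Self-correction: the error text becomes the next prompt
        prompt = "Fix these validation errors and respond again:\n" + "\n".join(errors)
    raise ValueError("model never produced valid output")

# Fake model: first answer is out of range, second is corrected
responses = iter([{"title": "Inception", "rating": 11},
                  {"title": "Inception", "rating": 8.8}])
review = run_with_retry(lambda prompt: next(responses))
```

In PydanticAI the validator is your Pydantic model itself, and the feedback message is the real `ValidationError` text, which gives the model precise field-level corrections.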
Can PydanticAI agents use multiple LLM providers?
Yes — PydanticAI supports OpenAI, Anthropic, Gemini, Groq, Mistral, Ollama, and any OpenAI-compatible API. You can use different models for different agents in the same application, or switch models via environment variable (PYDANTIC_AI_MODEL). This makes it easy to use Claude for creative/analysis tasks and GPT-4o-mini for high-volume classification tasks in the same pipeline.
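Model selection strings have the shape `provider:model-name`, and the environment-variable switch mentioned above can be mimicked in a few lines. This is a sketch of the pattern only; the library's own handling of `PYDANTIC_AI_MODEL` may differ:

```python
import os

def resolve_model(default: str = "openai:gpt-4o-mini") -> tuple[str, str]:
    """Split a 'provider:model' string, honouring PYDANTIC_AI_MODEL if set."""
    raw = os.environ.get("PYDANTIC_AI_MODEL", default)
    provider, _, model = raw.partition(":")
    return provider, model
```

With the variable set to e.g. `groq:llama-3.3-70b`, the same agent code runs against a different provider with no code change, which is what makes per-environment model swaps cheap.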
Conclusion
PydanticAI's opinionated design — type-safe from agent definition to tool output, explicit dependency injection instead of global state, a built-in test model for unit testing — makes it one of the most production-ready Python agent frameworks available. If you've been frustrated by LangChain's layers of abstraction or by the weak typing in other frameworks, PydanticAI fills the gap with code that feels like idiomatic Python while handling the LLM specifics (retries on validation failure, streaming, tool-call loops) transparently.
Vivek · AI Engineer
Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.