
PydanticAI: Production-Grade Agents

Dec 30, 2025 • 18 min read

The AI agent framework landscape is crowded — LangChain, AutoGen, CrewAI, LlamaIndex all compete for mind-share. Many suffer the same problems: excessive abstraction that hides what's actually happening, weak type safety that makes debugging LLM JSON failures a nightmare, and global state that's a liability in production multi-user environments. PydanticAI, built by the same team that created Pydantic, solves all three: it's type-safe from the ground up, uses Python's dependency injection patterns instead of global state, and stays close enough to raw Python that you always know what's happening.

1. PydanticAI vs Other Frameworks

Aspect            | LangChain                              | PydanticAI
------------------|----------------------------------------|-------------------------------------------
Type safety       | Minimal — dicts everywhere             | Full Pydantic models, mypy compatible
Tool definition   | JSON schema manually or via decorator  | Type hints auto-generate JSON schema
Structured output | Output parsers (frequently fail)       | Native Pydantic validation with retry
Dependencies      | Global variables or closures           | Explicit RunContext dependency injection
Testing           | Mocking LLM calls is complex           | Built-in TestModel for deterministic tests
Learning curve    | High — many abstractions to learn      | Low — feels like regular Python code
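The "type hints auto-generate JSON schema" row is ordinary Pydantic schema generation under the hood. A plain-Pydantic sketch of what an argument schema looks like (the `RefundArgs` model is illustrative, not part of any library):

```python
from pydantic import BaseModel

# Argument model for a hypothetical refund tool
class RefundArgs(BaseModel):
    order_id: str
    reason: str

schema = RefundArgs.model_json_schema()
print(schema["properties"]["order_id"]["type"])  # → string
print(sorted(schema["required"]))                # → ['order_id', 'reason']
```

This is the schema the LLM receives when deciding how to call a tool — no hand-written JSON required.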

2. Core Concepts

pip install pydantic-ai

from pydantic_ai import Agent
from pydantic import BaseModel

# Structured output: the LLM MUST return valid JSON matching this schema
class MovieReview(BaseModel):
    title: str
    year: int
    rating: float  # 0.0 to 10.0
    pros: list[str]
    cons: list[str]
    verdict: str

# Agent with typed output — auto-generates JSON schema instructions
agent = Agent(
    "openai:gpt-4o",
    result_type=MovieReview,   # ← LLM output is validated as MovieReview
    system_prompt="You are a professional film critic.",
)

result = agent.run_sync("Review Inception (2010)")
movie = result.data  # Type: MovieReview (not dict, not str)
print(f"{movie.title} ({movie.year}): {movie.rating}/10")
print(f"Pros: {', '.join(movie.pros)}")

# PydanticAI automatically retries if LLM returns invalid JSON
# Includes the validation error in the next prompt so LLM can self-correct
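What does a failed validation actually look like? A plain-Pydantic sketch of the check PydanticAI runs on the model's reply — the retry loop itself is PydanticAI's; the field names here just mirror the example above:

```python
from pydantic import BaseModel, ValidationError

class MovieReview(BaseModel):
    title: str
    year: int
    rating: float

# Simulate an LLM reply whose `year` doesn't match the schema
bad_reply = '{"title": "Inception", "year": "two thousand ten", "rating": 8.8}'

try:
    MovieReview.model_validate_json(bad_reply)
except ValidationError as exc:
    # A summary like this is what gets fed back to the LLM on retry
    error = exc.errors()[0]
    print(error["loc"], error["type"])  # → ('year',) int_parsing
```

Because the error names the offending field and the expected type, the model usually self-corrects on the first retry.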

3. Dependency Injection with RunContext

from pydantic_ai import Agent, RunContext
from dataclasses import dataclass
import asyncio

# Define the dependencies your agent needs
@dataclass
class AgentDeps:
    user_id: str
    db_connection: object  # Your database connection/session
    api_key: str

# Create agent with typed dependencies
support_agent = Agent(
    "openai:gpt-4o",
    deps_type=AgentDeps,
    system_prompt="You are a customer support agent. Be concise and helpful.",
)

# Dynamic system prompt based on dependencies
@support_agent.system_prompt
async def get_system_prompt(ctx: RunContext[AgentDeps]) -> str:
    user = await ctx.deps.db_connection.users.get(ctx.deps.user_id)
    plan = user.subscription_plan
    return f"""You are a support agent for {user.name}.
Their subscription plan is: {plan}.
{'They have priority support.' if plan == 'enterprise' else ''}"""

# Tools receive context — can access dependencies
@support_agent.tool
async def get_order_history(ctx: RunContext[AgentDeps]) -> list[dict]:
    """Fetch the user's recent order history."""
    orders = await ctx.deps.db_connection.orders.get_by_user(ctx.deps.user_id)
    return [{"id": o.id, "date": o.date.isoformat(), "total": o.total} for o in orders]

@support_agent.tool
async def create_refund(
    ctx: RunContext[AgentDeps],
    order_id: str,
    reason: str,
) -> str:
    """Process a refund for a specific order."""
    result = await ctx.deps.db_connection.refunds.create(
        user_id=ctx.deps.user_id,
        order_id=order_id,
        reason=reason,
    )
    return f"Refund {result.id} created successfully. Will process in 3-5 days."

# Run with per-request dependencies — no global state!
async def handle_support(user_id: str, message: str):
    deps = AgentDeps(
        user_id=user_id,
        db_connection=await get_db(),  # Fresh connection per request
        api_key=get_user_api_key(user_id),
    )
    result = await support_agent.run(message, deps=deps)
    return result.data

4. Testing with TestModel

from pydantic_ai import Agent
from pydantic_ai.messages import ToolCallPart
from pydantic_ai.models.test import TestModel  # Built-in test model!
import pytest

# Same agent as production — just swap the model
database_agent = Agent("openai:gpt-4o", result_type=str)

@database_agent.tool_plain  # tool_plain: this tool doesn't need RunContext
def query_database(sql: str) -> str:
    """Execute a database query and return results."""
    return execute_sql(sql)  # Real implementation

# Test: verify the agent decides to call the right tool
@pytest.mark.anyio
async def test_database_agent_calls_correct_tool():
    # TestModel fakes the LLM — no API cost, deterministic.
    # call_tools restricts which tools the fake model will invoke.
    test_model = TestModel(call_tools=["query_database"])

    result = await database_agent.run(
        "How many users signed up last month?",
        model=test_model,
    )

    # Assert the tool was called by inspecting the message history.
    # (TestModel synthesizes placeholder arguments from the schema,
    # so don't assert on realistic SQL here.)
    tool_calls = [
        part
        for message in result.all_messages()
        for part in message.parts
        if isinstance(part, ToolCallPart)
    ]
    assert any(call.tool_name == "query_database" for call in tool_calls)
    # Assert result type is correct
    assert isinstance(result.data, str)

5. FastAPI Integration

from fastapi import FastAPI, Depends, HTTPException
from pydantic_ai import Agent, RunContext
from pydantic import BaseModel
from dataclasses import dataclass

app = FastAPI()

@dataclass
class RequestDeps:
    user_id: str
    db: Database

class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    reply: str
    tokens_used: int

agent = Agent("openai:gpt-4o", deps_type=RequestDeps)

@agent.tool
async def search_knowledge_base(ctx: RunContext[RequestDeps], query: str) -> str:
    """Search the product knowledge base."""
    results = await ctx.deps.db.search(query)
    return "\n".join(r.content for r in results)

@app.post("/chat/{user_id}")
async def chat_endpoint(
    user_id: str,
    request: ChatRequest,
    db: Database = Depends(get_database),
):
    deps = RequestDeps(user_id=user_id, db=db)
    
    try:
        result = await agent.run(request.message, deps=deps)
        return ChatResponse(
            reply=result.data,
            tokens_used=result.usage().total_tokens,
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

# Streaming endpoint for real-time chat
@app.post("/chat/{user_id}/stream")
async def stream_chat(user_id: str, request: ChatRequest, db: Database = Depends(get_database)):
    from fastapi.responses import StreamingResponse
    
    async def generate():
        deps = RequestDeps(user_id=user_id, db=db)
        async with agent.run_stream(request.message, deps=deps) as streamed_result:
            async for text in streamed_result.stream_text():
                yield f"data: {text}\n\n"
    
    return StreamingResponse(generate(), media_type="text/event-stream")

Frequently Asked Questions

How does PydanticAI compare to the OpenAI Structured Outputs feature?

OpenAI's Structured Outputs (using response_format={"type": "json_schema"}) guarantees valid JSON matching your schema at the model level. PydanticAI uses it when available (for GPT-4o and newer models), but adds an application-level retry loop with validation error feedback for models that don't support it. PydanticAI also adds the crucial pieces OpenAI's API doesn't provide: tool calling, dependency injection, conversation history management, and multi-model support.
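For reference, the `response_format` payload OpenAI's Structured Outputs expects can be built straight from a Pydantic model — a hand-assembled sketch of what PydanticAI does for you (note that strict mode additionally requires `additionalProperties: false` in the schema, omitted here for brevity):

```python
from pydantic import BaseModel

class Sentiment(BaseModel):
    label: str
    confidence: float

# The wrapper dict the Chat Completions API expects for Structured Outputs
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "sentiment",
        "strict": True,
        "schema": Sentiment.model_json_schema(),
    },
}
print(sorted(response_format["json_schema"]["schema"]["properties"]))
# → ['confidence', 'label']
```

PydanticAI builds and sends this wrapper automatically when the model supports it, and falls back to its validation-and-retry loop when it doesn't.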

Can PydanticAI agents use multiple LLM providers?

Yes — PydanticAI supports OpenAI, Anthropic, Gemini, Groq, Mistral, Ollama, and any OpenAI-compatible API. You can use different models for different agents in the same application, or switch models at runtime by passing a different model string (for example, one read from an environment variable). This makes it easy to use Claude for creative/analysis tasks and GPT-4o-mini for high-volume classification tasks in the same pipeline.
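A sketch of the per-task selection pattern — the task mapping and the `AGENT_MODEL_OVERRIDE` variable name are illustrative, not part of PydanticAI:

```python
import os

# Provider-prefixed model strings, chosen per task
MODELS = {
    "analysis": "anthropic:claude-3-5-sonnet-latest",
    "classify": "openai:gpt-4o-mini",
}

def model_for(task: str) -> str:
    # An env var override makes model swaps a deploy-time decision
    return os.environ.get("AGENT_MODEL_OVERRIDE", MODELS[task])

print(model_for("classify"))  # → openai:gpt-4o-mini
```

Each string is then passed as the first argument when constructing an Agent, e.g. `Agent(model_for("classify"), ...)`.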

Conclusion

PydanticAI's opinionated design — type-safe from agent definition to tool output, explicit dependency injection instead of global state, a built-in test model for unit testing — makes it one of the most production-ready Python agent frameworks available. If you've been frustrated by LangChain's layers of abstraction or the weak typing in other frameworks, PydanticAI fills the gap with code that feels like idiomatic Python while handling the LLM specifics (retry on validation failure, streaming, tool-call loops) for you.

Written by

Vivek

AI Engineer

Full-stack AI engineer with 4+ years building LLM-powered products, autonomous agents, and RAG pipelines. I've shipped AI features to production for startups and worked hands-on with GPT-4o, LangChain, LlamaIndex, and the Vercel AI SDK. I started OpnCrafter to share everything I wish I had when learning — no fluff, just working code and real-world context.
