AI Engineering
The practical skills that separate AI hobbyists from AI engineers.
Knowing how to call the OpenAI API is only the beginning. Building a production AI system — one that runs reliably, costs predictably, survives load spikes, and improves over time — requires a broader skill set: prompt engineering, local model deployment, fine-tuning, evaluation, observability, and cloud deployment. This track covers all of it.
I start with prompt engineering beyond the basics — Chain-of-Thought, ReAct prompting, few-shot structuring, and how to systematically test prompt changes. From there I move to local deployment with Ollama (running Llama 3.1 on your laptop), quantization to fit large models on consumer hardware, fine-tuning with LoRA so a model learns your specific domain, and AI evaluation metrics (Ragas, TruLens, LLM-as-a-Judge).
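As a taste of the prompting material, here is a minimal sketch of few-shot Chain-of-Thought prompt construction. The worked examples, the `build_cot_prompt` helper, and the "Let's think step by step" cue placement are illustrative choices, not taken from any specific library:

```python
# Minimal sketch of a few-shot Chain-of-Thought (CoT) prompt.
# The worked examples demonstrate step-by-step reasoning before the
# answer; the final question is left open for the model to complete.

FEW_SHOT_EXAMPLES = [
    {
        "question": "A bakery sells 12 rolls per tray. How many rolls are in 3 trays?",
        "reasoning": "Each tray holds 12 rolls. 3 trays x 12 rolls = 36 rolls.",
        "answer": "36",
    },
    {
        "question": "Tickets cost $8 each. How much do 5 tickets cost?",
        "reasoning": "One ticket is $8. 5 tickets x $8 = $40.",
        "answer": "$40",
    },
]

def build_cot_prompt(question: str) -> str:
    """Assemble a few-shot CoT prompt: worked examples, then the new question."""
    parts = []
    for ex in FEW_SHOT_EXAMPLES:
        parts.append(
            f"Q: {ex['question']}\n"
            f"Let's think step by step. {ex['reasoning']}\n"
            f"A: {ex['answer']}\n"
        )
    # The new question ends with the reasoning cue but no answer,
    # so the model continues the established pattern.
    parts.append(f"Q: {question}\nLet's think step by step.")
    return "\n".join(parts)

prompt = build_cot_prompt("A pack holds 6 pens. How many pens are in 4 packs?")
```

Keeping prompt assembly in a plain function like this is what makes "systematically testing prompt changes" possible: each variant is just code you can diff, version, and run through your eval suite.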
The engineering track ends with production deployment: Dockerizing your agent, serverless deployment on Lambda and Vercel Edge, and a complete project — deploying a production agent to AWS ECS Fargate with auto-scaling. These are the patterns I've used to take AI features from prototype to production.
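To give a flavor of the deployment step, a minimal Dockerfile for a Python-based agent might look like the sketch below — the `app.py` entrypoint, `requirements.txt`, and port 8000 are placeholders for your own project:

```dockerfile
# A slim base image keeps the deployed artifact small.
FROM python:3.12-slim
WORKDIR /app

# Install dependencies first so Docker caches this layer
# across code-only changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Run the agent; an ECS/Fargate health check can hit the same port.
EXPOSE 8000
CMD ["python", "app.py"]
```

The dependency-layer-first ordering matters in CI: pushes that only change agent code reuse the cached `pip install` layer, which keeps Fargate deploys fast.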
📚 Learning Path
- Advanced prompt engineering (CoT, ReAct)
- Local LLMs with Ollama and quantization
- Fine-tuning with LoRA and PEFT
- AI evaluation: Ragas, TruLens, LLM-as-judge
- Dockerizing and deploying agents to AWS ECS