MLOps / Production Track
Deploying, operating, and maintaining LLM systems in production environments.
Prerequisites
- DevOps experience (CI/CD, Docker, Kubernetes)
- Python packaging and testing
- Chapters 10 and 11 for API basics
- Cloud provider experience (AWS/GCP/Azure)
Learning Sequence
Follow the numbered steps in order. Each step builds on the previous one to give you a coherent understanding of this topic area.
1. Chapter 09: Inference Optimization and Efficient Serving (quantization, KV-cache, speculative decoding)
2. Sections 14.1 through 14.3: Fine-Tuning Basics (when and how to fine-tune for your use case)
3. Chapter 15: Parameter-Efficient Fine-Tuning (PEFT) (LoRA, QLoRA, adapter merging)
4. Chapter 26: Agent Safety and Production Infrastructure (guardrails, sandboxing, agent monitoring)
5. Chapter 29: Evaluation and Experiment Design (LLM-as-judge, A/B testing, benchmarks)
6. Chapter 30: Observability and Monitoring (drift monitoring, trace analysis, dashboards)
7. Chapter 31: Production Engineering and Operations (CI/CD for LLMs, guardrails, cost management)
8. Chapter 32: Safety, Ethics and Regulation (red-teaming, compliance, responsible deployment)
9. Chapter 34: Emerging Architectures and Scaling Frontiers (mixture of experts, state-space models, infrastructure implications)
10. Chapter 35: AI, Society and Open Problems (AI safety alignment, governance, operational responsibilities)
Recommended Appendices
- Appendix S: Inference Serving – deploy and serve models with low latency
- Appendix U: Docker and Containers – containerize ML services for reproducible deployments
- Appendix T: Distributed ML – scale training and inference across multiple GPUs
- Appendix R: Experiment Tracking – track experiments and model versions in production
What Comes Next
Return to the Course Syllabi overview to explore other tracks and courses, or proceed to FM.4: How to Use This Book for a quick orientation on conventions and callout types.