Capstone Project: End-to-End LLM System

Design, build, evaluate, and present a production-grade LLM application that integrates every major skill from this book.

"Reading thirty-five chapters about LLM systems teaches you a vocabulary. Shipping one teaches you the craft."

Sage, Newly-Graduated AI Agent
Big Picture

The capstone is the integrating exercise for everything in the book: a single production-grade LLM application built end-to-end. You will make architectural decisions that balance competing concerns (quality vs. latency, accuracy vs. cost, flexibility vs. reliability) and produce a working system, a published model and dataset, a technical report, and a 15-minute presentation. This is where you cross from reading about LLM systems to actually shipping one.

Project Overview

The capstone project is the culminating experience of this book. You will design, build, and present a complete LLM-powered system that demonstrates mastery across the full stack: data preparation, model training and adaptation, retrieval-augmented generation, agent orchestration, production deployment, evaluation, and business strategy.

Unlike individual module labs that focus on a single technique, the capstone requires architectural decisions that balance competing concerns: model quality versus latency, accuracy versus cost, flexibility versus reliability. These tradeoffs are what distinguish a classroom exercise from a production system.
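One way to make such a tradeoff explicit is to treat latency and cost as hard budgets and then pick the highest-quality configuration that fits within them. The sketch below is illustrative only: the `Candidate` class, the `pick` helper, the configuration names, and every number are hypothetical placeholders, not measurements or requirements from this book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical system configuration with its measured metrics."""
    name: str
    quality: float          # e.g. task accuracy in [0, 1]; higher is better
    p95_latency_s: float    # 95th-percentile latency in seconds; lower is better
    cost_per_1k_usd: float  # dollars per 1,000 requests; lower is better


def pick(candidates, max_latency_s, max_cost_usd):
    """Return the highest-quality candidate that meets both budgets, or None."""
    feasible = [
        c for c in candidates
        if c.p95_latency_s <= max_latency_s and c.cost_per_1k_usd <= max_cost_usd
    ]
    return max(feasible, key=lambda c: c.quality, default=None)


# Illustrative numbers only (assumed, not measured).
candidates = [
    Candidate("large-model-only", quality=0.90, p95_latency_s=4.0, cost_per_1k_usd=30.0),
    Candidate("small-finetuned-plus-rag", quality=0.85, p95_latency_s=1.5, cost_per_1k_usd=4.0),
    Candidate("small-model-only", quality=0.70, p95_latency_s=0.8, cost_per_1k_usd=1.0),
]

choice = pick(candidates, max_latency_s=2.0, max_cost_usd=10.0)
print(choice.name if choice else "no candidate meets the budgets")
```

With these placeholder budgets, the highest-quality option is excluded on both latency and cost, so the selection falls to the middle configuration. Writing the budgets down before measuring anything keeps the decision from drifting toward whichever metric happens to look best.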

You will work on this project over approximately 4 to 6 weeks; the suggested timeline below assumes the full six. The project culminates in a GitHub repository with working code, a model and dataset published on Hugging Face Hub, a written technical report, and a 15-minute presentation.

Key Takeaways: What Makes a Strong Capstone

Learning Objectives

Suggested Timeline (6 weeks)

Week     Focus
Week 1   Design. Select use case, define requirements, design architecture, identify datasets.
Week 2   Data + Model. Prepare synthetic dataset, begin fine-tuning or adapter training.
Week 3   RAG + Agent. Build RAG pipeline, implement agent with tools, integrate components.
Week 4   Deploy + Evaluate. Deploy to cloud, set up monitoring, run evaluation suite.
Week 5   Refine. Address evaluation findings, add safety guardrails, optimize performance.
Week 6   Report + Present. Write technical report, prepare presentation, publish artifacts.

Deliverable Summary

  1. GitHub Repository with clean code, README, and deployment instructions.
  2. Hugging Face Hub artifacts: fine-tuned model and curated dataset.
  3. Technical Report (8 to 12 pages) with architecture, evaluation, and limitations.
  4. Interpretability Analysis documenting attention patterns, token attributions, or probing results.
  5. Live Demo (deployed or screencast) showing the system in action.
  6. Presentation (15 minutes) covering motivation, architecture, results, and lessons learned.

What Comes Next

Read Capstone C.1: Requirements & Deliverables for the per-component technical bar and the deliverable specifications. After that, the work is yours.