Part II: Understanding LLMs

How large language models are trained, how they differ, and how to serve them efficiently at scale.

Part Overview

Part II takes you inside the black box. You will learn how LLMs are pretrained on massive corpora, what scaling laws predict about model performance, and how modern frontier and open-weight families differ in their design choices. The part also covers reasoning models that trade more inference-time compute for higher quality, inference optimization techniques that make models practical in production, and the mechanistic interpretability work that has begun to reveal what a trained transformer actually computes. It closes with a Tools of the Trade chapter on the 2026 model zoo and the tokenizer/interpretability stack.

Chapters: 5 (Chapters 6 through 10). This part builds directly on the Transformer foundations from Part I and prepares you for hands-on LLM work in Part III.

Big Picture

Before you can effectively use or customize an LLM, you need to understand how it was built and how it runs. Part II reveals the training recipes, architectural trade-offs, reasoning-model designs, serving strategies, and interpretability methods that determine what a model can do and how much it costs to run.

What's Next?

This part begins with Chapter 6: Pretraining, Scaling Laws & Data Curation. Each chapter builds on the previous one, so we recommend reading Part II in order.