"Give me a lever long enough and a fulcrum on which to place it, and I shall move the world."
Archimedes
Part Overview
Part IV is the heart of the book for practitioners who want to customize LLMs. You will learn to generate synthetic training data, fine-tune models (full and parameter-efficient), distill large models into smaller ones, merge model weights, and align models with human preferences via RLHF and DPO. This is the most technically dense part of the book; take your time with the labs.
Chapters: 5 (Chapters 13 through 17). This part builds on the API and prompting skills from Part III and supplies the trained models used in Part V and beyond.
Off-the-shelf models only get you so far. Part IV teaches you to bend LLMs to your needs through synthetic data, fine-tuning, distillation, and alignment, turning general-purpose models into specialized tools you can trust.
Chapter 13 covers using LLMs to generate training data: Self-Instruct, Evol-Instruct, persona-driven generation, quality assurance, LLM-assisted labeling, weak supervision, and avoiding model collapse.
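Quality filtering is central to Self-Instruct-style generation: a newly generated instruction is kept only if it is sufficiently different from what the pool already contains. A minimal sketch of that novelty filter, using the standard library's difflib in place of the ROUGE scoring used in the original pipeline (the function name, example instructions, and the 0.7 threshold are all illustrative):

```python
import difflib

def keep_if_novel(candidate, accepted, threshold=0.7):
    # Reject a candidate instruction that is too similar to one already kept.
    for prev in accepted:
        sim = difflib.SequenceMatcher(None, candidate, prev).ratio()
        if sim >= threshold:
            return False
    return True

pool = []
for inst in ["Summarize this article.",
             "Summarize the article.",   # near-duplicate of the first
             "Write a haiku about rain."]:
    if keep_if_novel(inst, pool):
        pool.append(inst)

print(pool)  # the near-duplicate is filtered out
```

Production pipelines add further gates (length checks, toxicity filters, LLM-based grading), but the keep-if-novel loop is the structural core.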
Chapter 14 walks through end-to-end fine-tuning: data preparation, training configuration, hyperparameter selection, monitoring, evaluation, and when fine-tuning is (and is not) the right approach.
Chapter 15 adapts LLMs without updating all parameters: LoRA, QLoRA, adapters, prefix tuning, and prompt tuning, with practical guidance on choosing a method and managing multiple adapters.
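The appeal of parameter-efficient methods is easiest to see in numbers. Assuming a hypothetical 4096x4096 projection matrix and rank 8, LoRA trains two small low-rank factors instead of the full weight:

```python
# Hypothetical layer size: one 4096x4096 projection in a transformer block.
d_in, d_out = 4096, 4096
full_params = d_in * d_out              # full fine-tuning updates every weight

r = 8                                   # LoRA rank (typical values: 4 to 64)
lora_params = d_in * r + r * d_out      # factors A (d_in x r) and B (r x d_out)

print(full_params)                      # 16777216
print(lora_params)                      # 65536
print(full_params // lora_params)       # 256x fewer trainable parameters
```

The same ratio holds layer by layer across the model, which is why a LoRA checkpoint is typically megabytes rather than gigabytes.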
Chapter 16 creates smaller, faster models: knowledge distillation from teacher to student, model merging techniques (TIES, SLERP, DARE), and practical recipes for building optimized models.
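Of the merging techniques listed, SLERP is the simplest to sketch: instead of averaging two weight vectors linearly, it interpolates along the arc between them, which preserves their scale better. A toy version over plain Python lists (real merges apply this tensor by tensor across two full checkpoints):

```python
import math

def slerp(w0, w1, t):
    # Spherical linear interpolation between two weight vectors, t in [0, 1].
    dot = sum(a * b for a, b in zip(w0, w1))
    n0 = math.sqrt(sum(a * a for a in w0))
    n1 = math.sqrt(sum(b * b for b in w1))
    cos_theta = max(-1.0, min(1.0, dot / (n0 * n1)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to ordinary linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(w0, w1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(w0, w1)]
```

For two orthogonal unit vectors at t=0.5, linear averaging would shrink the norm to about 0.71, while SLERP returns a point still on the unit sphere.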
Chapter 17 makes models helpful, harmless, and honest: reward modeling, RLHF with PPO, Direct Preference Optimization (DPO), constitutional AI, and the alignment research frontier.
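DPO replaces the reward-model-plus-PPO loop of RLHF with a direct classification-style loss over preference pairs. A minimal sketch of the per-example loss on summed log-probabilities (argument names and beta=0.1 are illustrative):

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    # Implicit rewards are log-ratios of the policy against a frozen reference.
    chosen_ratio = policy_chosen_lp - ref_chosen_lp
    rejected_ratio = policy_rejected_lp - ref_rejected_lp
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log sigmoid(margin): low when the chosen answer out-scores the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; as the policy grows more likely to produce the chosen answer relative to the rejected one, the loss falls toward zero.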
What Comes Next
Continue to Part V: Retrieval and Conversation.