
A practitioner's guide to large language models, retrieval-augmented generation, fine-tuning, and agentic systems.
Language is the interface through which intelligence becomes useful. This book is one connected journey through the ideas, models, code, libraries, worked examples, engineering practices, and tools that let machines understand, generate, and reason through language. It runs from tokens and the mathematics of attention through training, multimodal models, and retrieval-augmented LLMs, into autonomous agents that reason and use tools, and on to the evaluation, safety, and governance of real systems.
Each part stands on the one before it; together they carry you from a single token to an autonomous agent.
I. PyTorch and transformers from scratch: tensors and gradients, tokenization, attention, a working transformer block, and how a next-token distribution becomes text.
6 chapters · 39 sections
II. What models are and why they work: pretraining and scaling laws, the open-vs-closed landscape, reasoning models, efficient inference, and interpretability.
5 chapters · 45 sections
III. Using LLMs in practice: provider APIs as the primary interface, prompt engineering as a measurable craft, and when classical ML beats an LLM call.
4 chapters · 20 sections
IV. Make a model your own: synthetic data, supervised fine-tuning, LoRA and QLoRA, distillation and merging, then alignment with RLHF, RLAIF, and DPO.
5 chapters · 45 sections
V. Beyond text: vision-language models, speech and music, 3D and video, and vision-language-action models for LLM-powered robotics.
6 chapters · 52 sections
VI. Agents that act: the ReAct loop, function calling and the MCP, A2A, and AG-UI protocols, multi-agent coordination, and specialized agents.
5 chapters · 28 sections
VII. Grounding answers in your data: embeddings and vector search, full RAG pipelines, multimodal and graph RAG, and structured information extraction.
6 chapters · 36 sections
VIII. Systems that talk: dialogue, memory, and retrieval combined into assistants, plus voice and realtime multimodal agents over streaming audio.
4 chapters · 25 sections
IX. Knowing it works: quality metrics, RAG and agent evaluation, LLM-as-judge ensembles, and online observability with distributed tracing.
5 chapters · 34 sections
X. Holding up under attack: adversarial attacks and red-teaming, runtime guardrails, agent threat models, and privacy, memorization, and unlearning.
5 chapters · 23 sections
XI. Responsible by design: bias and hallucination mitigation, the EU AI Act and NIST RMF, watermarking and provenance, and carbon accounting.
6 chapters · 25 sections
XII. Scaling the system: compute planning, distributed training, non-NVIDIA silicon and decentralized training, and edge and on-device deployment.
5 chapters · 22 sections
XIII. Running it in production: AI gateways and routing, durable workflow orchestration, Kubernetes-native deployment, SLOs, and a model registry.
5 chapters · 17 sections
XIV. LLMs at work: deep dives across legal, finance, healthcare, education, security, government, manufacturing, and the creative industries.
8 chapters · 45 sections
XV. Where it is heading: frontier architectures, theories of reasoning and agency, interpretability at scale, and the road toward AGI.
4 chapters · 19 sections
Six habits, kept in every chapter from the first token to the last agent.
Attention, a transformer block, a RAG pipeline, an agent loop: you implement the core before reaching for PyTorch, Hugging Face, or LangChain, so the abstractions never feel like magic.
Code appears as complete, runnable pipelines rather than disconnected snippets, with the inputs, outputs, and failure modes you actually hit.
A consistent set of callouts flags definitions, common pitfalls, and "in production" caveats, so you can skim for exactly the layer you need.
Each chapter closes with exercises and hands-on labs that turn the reading into working code and measurable results.
Eight goal-based reading pathways and five course syllabi route you to the chapters that matter for your goal.
Decision frameworks show where a classical ML pipeline still beats an LLM call, and how to combine the two, so you build the right system rather than the trendy one.
Building Language AI is one of four connected books, each a deep, build-it-yourself guide to a major field of AI.
Hands-On AI Science is a series of in-depth guides to the major fields of artificial intelligence. Every book goes deep into the theory, models, and internals, covering the classical foundations and the most recent ideas, then shows you how to build each one in Python with the modern libraries and tools that get the job done. The writing stays plain and light (illustrations, analogies, mental models, worked examples, and a little fun) without trading away rigor or coverage. Each volume is self-contained and complete enough to anchor a full course on its subject.
From Tokens to Agents.