Sixteenth Edition · 2026

Building Language AI: From Tokens to Agents

A practitioner's guide to large language models, retrieval-augmented generation, fine-tuning, and agentic systems.

Alexander (Sasha) Apartsin, Ph.D. & Yehudit Aperstein, Ph.D.

Language is the interface through which intelligence becomes useful. This book is one connected journey through the ideas, models, code, libraries, worked examples, engineering practices, and tools that let machines understand, generate, and reason through language. It runs from tokens and the mathematics of attention through LLM training, multimodal models, and retrieval augmentation, into autonomous agents that reason and use tools, and on to the evaluation, safety, and governance of real systems.

15 parts · 79 chapters · 475 sections · 6 appendices & a capstone

The Fifteen-Part Arc

Each part stands on the one before it; together they carry you from a single token to an autonomous agent.

Part I: LLM Building Blocks

PyTorch and transformers from scratch: tensors and gradients, tokenization, attention, a working transformer block, and how a next-token distribution becomes text.

6 chapters · 39 sections

Part II: Understanding LLMs

What models are and why they work: pretraining and scaling laws, the open-vs-closed landscape, reasoning models, efficient inference, and interpretability.

5 chapters · 45 sections

Part III: Working with LLMs

Using LLMs in practice: provider APIs as the primary interface, prompt engineering as a measurable craft, and when classical ML beats an LLM call.

4 chapters · 20 sections

Part IV: LLM Training and Adaptation

Make a model your own: synthetic data, supervised fine-tuning, LoRA and QLoRA, distillation and merging, then alignment with RLHF, RLAIF, and DPO.

5 chapters · 45 sections

Part V: Multimodal LLMs

Beyond text: vision-language models, speech and music, 3D and video, and vision-language-action models for LLM-powered robotics.

6 chapters · 52 sections

Part VI: Agentic AI

Agents that act: the ReAct loop, function calling and the MCP, A2A, and AG-UI protocols, multi-agent coordination, and specialized agents.

5 chapters · 28 sections

Part VII: Retrieval & Information Extraction

Grounding answers in your data: embeddings and vector search, full RAG pipelines, multimodal and graph RAG, and structured information extraction.

6 chapters · 36 sections

Part VIII: Conversational AI with LLMs

Systems that talk: dialogue, memory, and retrieval combined into assistants, plus voice and realtime multimodal agents over streaming audio.

4 chapters · 25 sections

Part IX: Evaluation & Observability

Knowing it works: quality metrics, RAG and agent evaluation, LLM-as-judge ensembles, and online observability with distributed tracing.

5 chapters · 34 sections

Part X: Security & Runtime Safety

Holding up under attack: adversarial attacks and red-teaming, runtime guardrails, agent threat models, and privacy, memorization, and unlearning.

5 chapters · 23 sections

Part XI: Ethics, Trust & Governance

Responsible by design: bias and hallucination mitigation, the EU AI Act and NIST RMF, watermarking and provenance, and carbon accounting.

6 chapters · 25 sections

Part XII: LLM Systems at Scale

Scaling the system: compute planning, distributed training, non-NVIDIA silicon and decentralized training, and edge and on-device deployment.

5 chapters · 22 sections

Part XIII: LLMOps Lifecycle

Running it in production: AI gateways and routing, durable workflow orchestration, Kubernetes-native deployment, SLOs, and a model registry.

5 chapters · 17 sections

Part XIV: Applications Across Industries

LLMs at work: deep dives across legal, finance, healthcare, education, security, government, manufacturing, and the creative industries.

8 chapters · 45 sections

Part XV: Research Frontiers

Where it is heading: frontier architectures, theories of reasoning and agency, interpretability at scale, and the road toward AGI.

4 chapters · 19 sections

How This Book Teaches

Six habits, kept in every chapter from the first token to the last agent.

From Scratch, Then Frameworks

Attention, a transformer block, a RAG pipeline, an agent loop: you implement the core before reaching for PyTorch, Hugging Face, or LangChain, so the abstractions never feel like magic.
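As a taste of what "from scratch" can look like, here is a minimal sketch of scaled dot-product attention in plain Python, with no framework involved. It is an illustration only, not code taken from the book: the function names `softmax` and `attention` and the toy vectors are assumptions made for this example, and it handles a single head with no masking or batching.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (hypothetical helper).

    queries and keys must share one vector length; keys and values must
    have the same number of rows.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Toy data: the query matches the first key more closely than the second.
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0, 0.0], [0.0, 10.0]]
    print(attention(q, k, v))
```

Because the query aligns with the first key, the first value vector receives the larger weight. A framework version replaces the Python loops with batched matrix multiplications, but the arithmetic is the same.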

Worked Pipelines

Code appears as complete, runnable pipelines rather than disconnected snippets, with the inputs, outputs, and failure modes you actually hit.

A Callout System

A consistent set of callouts flags definitions, common pitfalls, and "in production" caveats, so you can skim for exactly the layer you need.

Exercises & Labs

Each chapter closes with exercises and hands-on labs that turn the reading into working code and measurable results.

Reading Pathways

Eight goal-based reading pathways and five course syllabi route you to the chapters that matter most for what you want to do.

Classical ML Returns

Decision frameworks show where a classical ML pipeline still beats an LLM call, and how to combine the two, so you build the right system rather than the trendy one.

The Hands-On AI Science Series

Building Language AI is one of four connected books, each a deep, build-it-yourself guide to a major field of AI.

Hands-On AI Science is a series of in-depth guides to the major fields of artificial intelligence. Every book goes deep into the theory, models, and internals, covering the classical foundations and the most recent ideas, then shows you how to build each one in Python with the modern libraries and tools that get the job done. The writing stays plain and light (illustrations, analogies, mental models, worked examples, and a little fun) without trading away rigor or coverage. Each volume is self-contained and complete enough to anchor a full course on its subject.

Building Language AI

From Tokens to Agents.

You are here

Building Vision AI

From Pixels to Generative Models.

Read online · Kindle

Building Temporal AI

From Forecasting to Sequential Decision Making.

Read online · Kindle

Building Scalable AI

From Big Data Algorithms to Distributed Intelligence.

Read online · Kindle

Read the full About the Hands-On AI Science Series note.