Front Matter · Why This Book Exists
8 entries
Part I · LLM Building Blocks
6 chapters · 35 sections
Math, ML/PyTorch prerequisites, NLP and text representation, tokenization, attention, transformers, decoding.
- 0 ML and PyTorch Foundations PyTorch is the lingua franca of modern LLM engineering.
- 1 Foundations of NLP & Text Representation Every LLM is built on top of representations of text: how you turn words into numbers determines what the model can learn.
- 2 Sequence Models & the Attention Mechanism Attention solves the problem that ended the RNN era: how to let any position in a sequence look at any other position without the path between them growing linearly with their distance.
- 3 The Transformer Architecture The transformer is the architecture every chapter after this assumes you understand.
- 4 Decoding Strategies & Text Generation A trained transformer defines a probability distribution over the next token; turning that into useful text requires a decoding strategy.
- 5 Tools of the Trade: Foundations Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part II · Understanding LLMs
5 chapters · 39 sections
Pretraining, scaling laws, modern landscape, reasoning, inference optimization, interpretability.
- 6 Pretraining, Scaling Laws & Data Curation Part I built the language of foundations: tensors, gradients, sequence models, the attention head, the transformer block.
- 7 Modern LLM Landscape & Model Internals The LLM landscape spans a spectrum from closed-source frontier APIs (maximum capability, least control) to open-weight models (full transparency, deployment flexibility).
- 8 Reasoning Models & Test-Time Compute Recent breakthroughs show that LLMs can improve their outputs by "thinking longer" at inference time.
- 9 Inference Optimization & Efficient Serving Even the most capable model is useless if it is too slow or too expensive to serve.
- 10 Interpretability & Mechanistic Understanding As LLMs become more capable, understanding what they have learned and why they produce specific outputs becomes critical.
Part III · Working with LLMs
4 chapters · 20 sections
LLM APIs, prompt engineering, hybrid ML+LLM application patterns.
- 11 Working with LLM APIs For most practitioners, LLM APIs are the primary interface to model capabilities.
- 12 Prompt Engineering & Advanced Techniques Prompt engineering is the most accessible and often the most cost-effective way to improve LLM output quality.
- 13 Hybrid ML+LLM Architectures & Decision Frameworks Not every problem needs a large language model, and not every LLM output should be trusted without verification.
- 14 Tools of the Trade: LLM API Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part IV · LLM Training and Adaptation
5 chapters · 44 sections
Synthetic data, supervised fine-tuning, PEFT, RLHF / DPO / preference tuning, training tools.
- 15 Synthetic Data Generation & LLM Simulation Part III walked you through provider APIs, prompt engineering, and hybrid ML+LLM design.
- 16 Fine-Tuning Fundamentals Fine-tuning transforms a general-purpose LLM into a specialist for your domain.
- 17 Parameter-Efficient Fine-Tuning, Distillation & Model Merging Full fine-tuning is expensive and often unnecessary.
- 18 Alignment: RLHF, DPO & Preference Tuning Alignment is what separates a raw language model from a helpful, harmless assistant.
- 19 Tools of the Trade: Training & Adaptation Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part V · Multimodal LLMs
6 chapters · 52 sections
Vision-language & Omni models, image/video/audio generation, document understanding, 3D, embodied AI / VLA / robotics.
- 20 Audio, Music, and Video Generation TTS, voice cloning, music generation, audio editing, and the production stack for synthetic audio.
- 21 Document Understanding and OCR Modern OCR (TrOCR), layout-aware models, VLM-based document understanding, and document AI pipelines.
- 22 Vision-Language and Omni Models ViT, CLIP, SigLIP, BLIP-3, LLaVA, GPT-4V, and the multimodal reasoning landscape.
- 23 3D Generation and Neural Scenes 3D Gaussian Splatting, NeRF, Stable Zero123, Trellis, 4D splats, and scene relighting.
- 24 VLA Models and LLM-Powered Robotics RT-2, OpenVLA, pi-0, action tokenization, cross-embodiment transfer, and VLA limitations.
- 25 Tools of the Trade: Multimodal Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part VI · Agentic AI
5 chapters · 26 sections
Agent foundations, tool use (MCP / A2A), multi-agent systems, specialized agents.
- 26 AI Agent Foundations Part IV moved from "calling models" to "shaping models": SFT, instruction tuning, RLHF/RLAIF, DPO, and the parameter-efficient methods (LoRA, QLoRA).
- 27 Tool Use, Function Calling & Protocols Agents become truly powerful when they can call external tools: APIs, databases, code interpreters, and more.
- 28 Multi-Agent Systems Complex tasks often exceed what a single agent can handle.
- 29 Specialized Agents While Chapters 26 through 28 cover general agent principles, this chapter focuses on domain-specific agent types: coding assistants, research agents, data analysis agents, and more.
- 30 Tools of the Trade: Agent Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part VII · Retrieval & Information Extraction with LLMs
6 chapters · 34 sections
Embeddings, structured information extraction & NER, RAG, knowledge graphs, cross-modal retrieval.
- 31 Embeddings, Vector Databases & Semantic Search Part VI built agents: planning, tool use, multi-agent coordination, memory, and the protocols (MCP, A2A, AG-UI) that let agents talk to tools, to each other, and to users.
- 32 Retrieval-Augmented Generation (RAG) Retrieval-augmented generation is one of the most widely deployed LLM patterns in production.
- 33 Cross-Modal Reasoning and Multimodal RAG Joint embedding spaces, multimodal retrieval, when to retrieve vs reason, and production multimodal reasoning.
- 34 Structured Information Extraction & NER Information extraction landscape, classical and open IE, hybrid LLM architectures, production deployment, coreference resolution and document pipelines.
- 35 Advanced RAG Knowledge graphs, GraphRAG, ingestion pipelines, frameworks and orchestration.
- 36 Retrieval Tools of the Trade Tools of the trade reference.
Part VIII · Conversational AI with LLMs
4 chapters · 24 sections
Dialogue architecture, memory and context management, multi-turn flows, voice and realtime multimodal assistants.
- 37 Building Conversational AI Systems Conversational AI brings together everything from prompt engineering to memory management to retrieval.
- 38 LLM-Powered Recommender Systems From query understanding and item enrichment to conversational and generative recsys (TIGER, LLaRA, P5), with eval and production patterns.
- 39 Voice and Realtime Multimodal Assistants Speech interfaces, streaming audio, realtime APIs.
- 40 Conversational AI Tools of the Trade
Part IX · LLM Evaluation & Observability
5 chapters · 33 sections
Quality metrics, LLM-as-judge, specialized evaluation, online monitoring, eval tools.
- 42 LLM Evaluation & Quality Metrics You cannot improve what you cannot measure.
- 43 Specialized Evaluation: RAG, Agents, Multimodal, Long-Context Evaluation methodologies for the 2026 frontier: RAG faithfulness, agentic trajectories, simulation-based eval, code-gen pass@k, multimodal grounding, and long-context benchmarks.
- 44 Online Evaluation, Observability, and Production Monitoring Evaluation of production traffic: distributed tracing, observability platforms, OpenTelemetry, online A/B testing, drift detection, and eval-as-product workflows.
- 45 Tools of the Trade: Eval & Production Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
- 46 LLM-as-Judge & Automated Evaluation Judge reliability, debiasing techniques, training judge models, multi-judge ensembles, production patterns.
Part X · LLM Security & Runtime Safety
5 chapters · 22 sections
Adversarial threats, guardrails, agent safety, privacy, security tooling.
- 47 Adversarial Security and Red Teaming As LLMs become embedded in high-stakes decisions, safety and ethics move from nice-to-have to regulatory requirements.
- 48 Guardrails and Runtime Safety Runtime content safety, output filtering, policy enforcement, and the difference between guardrails and alignment.
- 49 Agent Safety & Security Threat models, prompt injection defenses, sandboxed execution, agentic benchmarks, and supply-chain security.
- 50 Privacy and Data Protection Memorization, extraction attacks, differential privacy, federated learning, machine unlearning, and confidential inference.
- 51 Tools of the Trade: Safety & Guardrails Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part XI · LLM Ethics, Trust & Governance
6 chapters · 25 sections
Bias and hallucination, provenance and transparency, regulation and compliance, frontier safety.
- 52 Bias, Fairness & Hallucinations Sources of bias, measurement, cross-cultural NLP, pluralistic alignment, and mitigation patterns.
- 53 Regulation, Compliance, and Governance EU AI Act, GDPR, NIST AI RMF, sector-specific regs, risk governance, and compliance-as-code.
- 54 Watermarking and Provenance Text and image watermarking, C2PA, synthetic-media detection, and the cat-and-mouse game.
- 54 Transparency and Disclosure Model cards, datasheets, system cards, audit trails, and explainability for high-stakes LLM decisions.
- 55 Environmental Impact & Green AI Carbon accounting, Green AI, CodeCarbon, tokens-per-joule, and training-time carbon optimization.
- 56 Responsible AI Tools of the Trade Tools of the trade reference.
Part XII · LLM Systems at Scale
5 chapters · 22 sections
Compute planning, distributed training systems, hardware and chip diversity, edge and on-device LLMs.
- 57 Compute Planning & Infrastructure Sizing infrastructure for the workload you'll actually run.
- 58 Frontier Systems & Hardware Non-NVIDIA silicon, decentralized training, edge LLMs, training-inference co-design.
- 59 Distributed Training Systems Tools of the trade reference.
- 60 Edge & On-Device LLMs Production LLM systems engineering.
- 61 Scale Tools of the Trade Tools of the trade reference.
Part XIII · LLMOps & Lifecycle Management
5 chapters · 16 sections
AI gateways and routing, workflow orchestration, containers, reliability and SLOs, model registry and lifecycle.
- 62 Production Engineering for LLM Systems Infrastructure-heavy engineering for LLM systems: scaling, AI gateways, workflow orchestration, edge deployment, reliability, and Kubernetes-native operations.
- 63 AI Gateways & Model Routing Production LLM systems engineering.
- 64 Workflow Orchestration & Durable Execution Production LLM systems engineering.
- 65 Containers, Kubernetes & Deployment Production LLM systems engineering.
- 66 Reliability, SLOs & Model Registry Production LLM systems engineering.
Part XIV · Applications of LLMs Across Industries
8 chapters · 45 sections
LLM use across legal, finance, healthcare, education, cybersecurity, government, and other domains.
- 67 LLMs in Legal Practice Contract review, e-discovery, citation, and regulatory research. What works, what fails, and the bar-association rules that bind you.
- 68 LLMs in Finance Research synthesis, sentiment, code generation, compliance, customer operations. What's deployed, what's regulated, what blows up.
- 69 LLMs in Healthcare Ambient documentation, clinical decision support, patient-facing chat, drug discovery. Where LLMs help, where they hurt, and what FDA and HIPAA actually require.
- 70 LLMs in Education Tutoring, assessment, content generation, accessibility. Pedagogical evidence, integrity considerations, and what works in K-12 vs. higher ed.
- 71 LLMs in Cybersecurity SOC automation, code review, threat intel, defense and offense. What 2026 settled about LLMs in blue-team and red-team work.
- 72 LLMs in Government & Public Sector Constituent services, regulatory drafting, FOIA processing, benefits eligibility, fraud detection. Procurement, accountability, and the unique constraints of building AI for the public.
- 73 Manufacturing, Creative Industries, Search & Recommendation Maintenance copilots, BOM and ERP integration, supplier risk, shop-floor agents, predictive maintenance assistance. The IT/OT boundary, safety-critical constraints, and the realities of factory-floor deployment.
- 74 Tools of the Trade: Industry Solution Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Part XV · LLM & Agentic AI Research Frontiers
4 chapters · 18 sections
Frontier architectures, theory and cognition, AGI trajectories, frontier research tooling.
- 75 Frontier Architectures & Scaling Emergent abilities, scaling frontiers, alternative architectures, and LLMs as universal sequence machines.
- 76 Frontier Theory & Cognition Formal theories of reasoning, memory primitives, mechanistic interpretability at scale, and the nature of agency.
- 77 AGI Trajectories & Open Questions Frontier benchmarks, timeline debate, alignment-at-frontier, economic implications.
- 78 Tools of the Trade: Frontier Research Stack Consolidated reference: platforms, libraries, datasets, models, and external resources for this part.
Appendices · Reference and Pedagogy
6 appendices
- A Mathematical Foundations The essential linear algebra, probability, calculus, and information theory that power every transformer.
- B Course Syllabi Five tested course tracks (undergraduate engineering, undergraduate research, graduate engineering, graduate research, professional bootcamp) with week-by-week schedules.
- C Reading Pathways Per-audience reading guides for engineers, researchers, founders / PMs, and self-study learners.
- D Agents That Helped to Write This Book Roster of the 42 specialist AI agents in the writing pipeline that produced this manuscript, with a card per agent.
- E PyTorch Reference Standalone mini-book on PyTorch: tensors, autograd, nn.Module, data pipeline, training loops, mixed precision, distributed (FSDP), torch.compile, profiler, debugging recipes, deployment.
- G Signal Processing for Audio Sampling, framing, windows, DFT/FFT/STFT, mel scale and log-mel spectrograms, MFCC, and a Z-transform primer. The math prerequisites for Chapter 20.