Part V: Retrieval and Conversation

Chapter 20: Retrieval-Augmented Generation (RAG)

"The best answer is not always inside the model. Sometimes the smartest thing an AI can do is look it up."

— RAG, Bookishly Wise AI Agent
Figure 20.0.1: RAG is the open-book exam of AI: instead of memorizing everything, the model looks up what it needs and weaves the answer on the fly.

Chapter Overview

Large language models are powerful generators but inherently limited by their training data cutoff, their tendency to hallucinate, and the impossibility of encoding all world knowledge in model parameters. Retrieval-Augmented Generation (RAG) addresses these limitations by connecting LLMs to external knowledge sources at inference time, grounding responses in retrieved evidence rather than relying solely on parametric memory. Building on the embedding and vector database foundations from Chapter 19, RAG closes the gap between static model knowledge and dynamic, real-world information.
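The core RAG loop described above can be sketched in a few lines: embed the query, find the most similar documents, and prepend them to the prompt before generation. The sketch below is illustrative only; it uses a toy bag-of-words similarity in place of the learned embeddings and vector databases from Chapter 19, and the `CORPUS` contents, function names, and prompt wording are all hypothetical.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding"; real systems use learned
    # embedding models and a vector database (see Chapter 19).
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny stand-in knowledge base (hypothetical contents).
CORPUS = [
    "RAG grounds model answers in retrieved documents.",
    "Vector databases store embeddings for similarity search.",
    "Text-to-SQL lets LLMs query relational databases.",
]

def retrieve(query, k=2):
    # Rank documents by similarity to the query and keep the top k.
    ranked = sorted(CORPUS, key=lambda d: cosine(embed(query), embed(d)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query):
    # Ground the generation step: the LLM is told to answer only
    # from the retrieved evidence, not from parametric memory.
    context = "\n".join(f"- {d}" for d in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The prompt returned by `build_prompt` would then be sent to the LLM; because the evidence is injected at inference time, the knowledge base can be updated without retraining the model.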

This chapter covers the complete RAG landscape, from fundamental architectures through advanced retrieval techniques. You will learn how to build ingestion pipelines, implement query transformations, combine dense and sparse retrieval, and leverage knowledge graphs for structured reasoning. The chapter also explores agentic RAG systems that can decompose complex queries, perform iterative research, and synthesize information from multiple sources.

On the structured data side, you will learn how LLMs can query databases through text-to-SQL, process tabular data, and combine structured and unstructured retrieval. Finally, the chapter surveys the major RAG frameworks (LangChain, LlamaIndex, Haystack) that provide production-ready tooling for building retrieval-augmented applications.
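For the structured-data side, the text-to-SQL pattern gives the model the database schema plus the user's question and asks it to produce SQL, which the application then executes. A minimal sketch under stated assumptions: the schema and data are invented for illustration, and the model call is stubbed with a hand-written query standing in for generated SQL.

```python
import sqlite3

# Hypothetical schema shown to the model as part of the prompt.
SCHEMA = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)"

def text_to_sql_prompt(question):
    # The model sees the schema and the question and returns SQL.
    return f"Schema:\n{SCHEMA}\n\nWrite a SQL query answering: {question}"

# In-memory database with toy rows.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "Ada", 30.0), (2, "Bo", 45.0)])

# Stand-in for the model's output; a production system would
# validate and sandbox generated SQL before executing it.
generated_sql = "SELECT SUM(total) FROM orders"
(total,) = conn.execute(generated_sql).fetchone()
```

The same retrieval loop can then merge these structured results with unstructured passages before the final generation step.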

Big Picture

Retrieval-augmented generation is one of the most widely deployed LLM patterns in production. By combining retrieval with generation, you can reduce hallucinations, keep responses current, and ground outputs in authoritative sources. This chapter is central to building the knowledge-intensive applications covered in Part VI and Part VIII.

Learning Objectives

Prerequisites

Sections

What's Next?

In the next chapter, Chapter 21: Conversational AI, we explore dialogue management, memory, and the patterns that make conversational AI systems effective.