"An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators."
Agent X, Textbook-Quoting AI Agent
Chapter Overview
AI agents extend LLMs beyond single-turn question answering into autonomous problem solving. An agent perceives its environment, reasons about what to do next, takes actions through tools, and learns from the results. This perception-reasoning-action loop, popularized for LLMs by the ReAct pattern of interleaved reasoning and acting, underlies most agentic systems.
This chapter covers the full agent foundation stack. It begins with the core agent paradigm, contrasting agents with chains and static workflows and introducing the four agentic design patterns (reflection, tool use, planning, and multi-agent collaboration). It then explores agent memory systems, including episodic, semantic, and procedural memory architectures like MemGPT/Letta and Mem0 (building on the vector database infrastructure from Chapter 19). The chapter covers planning strategies from simple plan-and-execute to tree search methods like LATS, examines reasoning models as agent backbones, and concludes with agent evaluation using benchmarks such as SWE-bench, GAIA, and WebArena.
AI agents represent a paradigm shift from reactive question-answering to proactive problem-solving. This chapter introduces the core agent loop: perceive, reason, plan, and act. The architectural patterns here form the foundation for tool use (Chapter 23), multi-agent systems (Chapter 24), and production agent deployment (Chapter 26).
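The perceive-reason-act loop described above can be illustrated with a minimal sketch. This is not a reference implementation of ReAct: the `scripted_policy` function stands in for an LLM call, and the `TOOLS` registry, the `run_agent` helper, and the `finish` action name are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the tool name and behavior are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

Policy = Callable[[List[str]], Tuple[str, str, str]]


def scripted_policy(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call: returns (thought, action, action_input)."""
    prefix = "Observation: "
    observations = [h for h in history if h.startswith(prefix)]
    if not observations:
        # Nothing observed yet, so choose a tool call.
        return ("I need to compute the sum.", "add", "2+3")
    # A tool result is available, so finish with it as the answer.
    return ("I have the result.", "finish", observations[-1][len(prefix):])


def run_agent(task: str, policy: Policy, max_steps: int = 5) -> str:
    """Run a reason -> act -> observe loop until the policy finishes."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, action_input = policy(history)  # reason
        history.append(f"Thought: {thought}")
        if action == "finish":
            return action_input
        tool = TOOLS.get(action)
        if tool is None:
            observation = f"unknown tool: {action}"
        else:
            observation = tool(action_input)  # act
        history.append(f"Action: {action}[{action_input}]")
        history.append(f"Observation: {observation}")  # perceive
    return "stopped: step limit reached"


if __name__ == "__main__":
    print(run_agent("What is 2 + 3?", scripted_policy))
```

The step limit is a design choice worth noting: unlike a fixed chain, an agent decides at run time how many steps to take, so the loop needs an explicit bound to guarantee termination.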
Learning Objectives
- Explain the perception-reasoning-action loop (ReAct) and contrast agents with chains and static workflows
- Design agent memory systems using episodic, semantic, and procedural memory with architectures like MemGPT/Letta and Mem0
- Apply planning strategies including Tree of Thoughts, LATS, plan-and-execute, and reflection loops for complex multi-step tasks
- Select reasoning model backbones (o1/o3, Claude Extended Thinking, DeepSeek-R1) and configure thinking budgets for agent loops
- Evaluate agent performance using benchmarks such as SWE-bench, GAIA, WebArena, and OSWorld, and build custom evaluation harnesses
- Architect end-to-end agent systems with orchestration layers, state management, and observability integration
Prerequisites
- Chapter 10: LLM APIs (chat completions, message formatting, streaming)
- Chapter 11: Prompt Engineering (system prompts, chain-of-thought, structured outputs)
- Chapter 8: Reasoning & Test-Time Compute (reasoning models, thinking tokens)
- Familiarity with Python async programming and basic state machine concepts
Sections
- 22.1 The Agent Paradigm: From Chains to Autonomous Agents. Perception-reasoning-action loop, ReAct, agents vs. chains vs. workflows, four agentic design patterns, state machines, cognitive architectures.
- 22.2 Agent Memory Systems. MemGPT, Mem0, A-MEM Zettelkasten, Letta; episodic, semantic, and procedural memory; context window management and long-term persistence.
- 22.3 Planning & Agentic Reasoning. Tree of Thoughts, LATS, MAP, plan-and-execute, reflection loops, self-correction strategies, and human-in-the-loop planning.
- 22.4 Reasoning Models as Agent Backbones. o1/o3, Claude Extended Thinking, DeepSeek-R1, thinking budgets, when to use reasoning models vs. standard models in agent loops.
- 22.5 Agent Evaluation & Benchmarks. SWE-bench, GAIA, AgentBench, WebArena, OSWorld, evaluation metrics, building custom agent benchmarks.
- 22.6 End-to-End Agent System Architecture. A deployment blueprint covering agent system design, orchestration layers, state management, observability integration, and production deployment patterns.
- 22.7 Memory Architecture for Agents. Memory taxonomy, short-term and long-term storage, retrieval policies, episodic and semantic memory, and memory management patterns for production agents.
- 22.8 Research Replication Benchmarks. ML engineering agent evaluation, research replication tasks, benchmark design for measuring agent scientific reasoning and reproducibility.
What's Next?
The next chapter, Chapter 23: Tool Use and Protocols, covers the mechanisms that let agents take actions, including function calling and MCP.
