Chapter 30: Tools of the Trade: Agent Stack

Chapter opener illustration: Tools of the Trade: Agent Stack.

"An agent is a loop with permission to act. Most of the work is defining the loop. The rest is defining the permission."
Pip, Stack-Building AI Agent

Looking Back

Chapters 26 through 29 designed agents. This chapter surveys the frameworks that host them: LangGraph, OpenAI Agents SDK, Claude Code, Mastra, smolagents, and the open question of which abstraction layer is worth its complexity.

Big Picture

Part VI built agents: planning, tool use, multi-agent coordination, memory, and the protocols (MCP, A2A, AG-UI) that let agents talk to tools, to each other, and to users. This chapter is the toolbox: the orchestration frameworks (LangGraph, CrewAI, AutoGen, smolagents, PydanticAI), the new agent-native protocols, and the eval harnesses (SWE-bench, WebArena, SWE-Lancer) that measure whether an agent actually finishes the task.

Chapter Overview

Part VI introduced agents: the LLM-powered programs that can use tools, plan, and act with side effects. This chapter consolidates the agent toolchain: the agent-native platforms (Model Context Protocol, Agent Protocol, runtimes) that did not exist as standardized infrastructure before 2024, the libraries (LangChain, LangGraph, OpenAI Assistants, AutoGen, CrewAI, AG2), the benchmarks (AgentBench, SWE-bench, GAIA, tau-bench, WebArena) that test end-to-end task completion, the models that hold up under agentic load, and the rapidly evolving literature.

Agent tooling is the youngest stack in this book. By 2026 it has converged enough to ship products but is still moving fast enough to obsolete a year-old runbook. This chapter is the index of what stuck.

Note: Learning Objectives

Compare agent runtimes (LangGraph, AutoGen, CrewAI, AG2) on routing, memory, and human-in-the-loop support.
Configure an agent stack with Model Context Protocol and Agent Protocol for tool interoperability.
Choose an agent benchmark (AgentBench, SWE-bench, GAIA, tau-bench, WebArena) for a target capability.
Identify the models (Claude, GPT, Gemini, open-weight) that hold up under multi-step tool-use load.
Track the agent literature through the venues, blogs, and communities that maintain it.

Library Shortcut

If you want one stack that scales from a single-agent toy to a multi-agent system:

pip install langgraph anthropic

LangGraph is the graph-based agent runtime from the LangChain team; combined with a frontier LLM (the anthropic package installed above is Anthropic's Python SDK for Claude) it covers roughly 80 percent of the patterns in Part VI. For function-calling-heavy structured agents, swap in PydanticAI.
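To make "graph-based agent runtime" concrete, here is a minimal sketch in plain Python of the pattern such a runtime formalizes: nodes are functions over a shared state, a routing function picks the next node, and the loop stops at a terminal marker or a step budget. This is not LangGraph's API. Every name here (call_model, run_tool, route, run_graph, the state keys) is hypothetical, and the "model" is a stub that requests one tool call and then answers.

```python
from typing import Callable, Dict

State = Dict[str, object]
END = "end"  # hypothetical terminal marker, not a LangGraph constant


def call_model(state: State) -> State:
    # Stand-in for an LLM call: ask for the tool once, then produce an answer.
    if "tool_result" not in state:
        return {**state, "next_action": "use_tool"}
    return {**state, "next_action": "finish",
            "answer": f"result={state['tool_result']}"}


def run_tool(state: State) -> State:
    # Stand-in for a tool with side effects; here it only adds two numbers.
    a, b = state["args"]
    return {**state, "tool_result": a + b}


# The "graph": named nodes plus a routing function that acts as the edges.
NODES: Dict[str, Callable[[State], State]] = {
    "model": call_model,
    "tool": run_tool,
}


def route(current: str, state: State) -> str:
    # After a tool call, control always returns to the model.
    if current == "tool":
        return "model"
    # After the model, either call the tool or stop.
    return "tool" if state["next_action"] == "use_tool" else END


def run_graph(state: State, entry: str = "model", max_steps: int = 10) -> State:
    # The loop: run a node, route, repeat. The step budget bounds the agent.
    node = entry
    for _ in range(max_steps):
        state = NODES[node](state)
        node = route(node, state)
        if node == END:
            return state
    raise RuntimeError("step budget exhausted before reaching END")


final = run_graph({"args": (2, 3)})
print(final["answer"])
```

The step budget and the explicit routing function are the two places where the "permission" from the chapter epigraph would be enforced; a real runtime adds persistence, streaming, and human-in-the-loop interrupts on top of this skeleton.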

Prerequisites

Agent foundations from Chapter 26
LLM tooling stack from Chapter 14
Python and shell comfort for hands-on framework comparisons

What Comes Next

Next: Chapter 31: Embeddings, Vector Databases & Semantic Search, opening Part VII. The agents you just learned to build are limited by what is in their training data and what is in the prompt. Part VII fixes both problems: embedding-based retrieval gives the agent a queryable memory across millions of documents, and structured extraction lets it pull facts out of unstructured text. The shift is from "the model knows everything at training time" to "the system learns at runtime".