Agent Libraries: LangChain & Framework Deep Dive

Section 30.2

Agent libraries cluster into three styles: graph-based runtimes (LangGraph, AutoGen), role-based coordinators (CrewAI), and minimal "agent is a Python function" libraries (smolagents, PydanticAI). The choice maps to how much structure you want imposed.

30.2.1 Graph-based agent runtimes

30.2.2 Role-based coordinators

30.2.3 Minimal "agent as function" libraries

30.2.4 Comparing the libraries

Table 30.2.1: 30.2.1 Agent frameworks by style (2026).
Library Style Best for Tradeoff
LangGraph Graph state machine Complex workflows Steeper learning curve
AutoGen Multi-agent chat Agent-to-agent dialog Less fine control over state
CrewAI Role-based Quick multi-role demos Hides control flow
smolagents CodeAct minimal Code-writing agents Smaller ecosystem
PydanticAI Typed tools Structured-output agents Newer, fewer examples
Key Insight: Frameworks vs no framework

For a single-agent system with fewer than 10 tools, you do not need a framework. A while-loop, a function-calling-capable LLM, and a few hand-written tool definitions are sufficient. Frameworks pay off when you need persistence, human-in-the-loop checkpoints, multi-agent coordination, or production observability. Anthropic's "Building Effective Agents" essay makes this case in detail.

LangChain Agents (Legacy) and Callbacks

Big Picture

Agents are LLM-powered programs that can use tools to accomplish tasks. Instead of following a fixed chain of steps, an agent observes the current state, decides which tool to call (if any), interprets the result, and repeats until the task is complete. LangChain provides the primitives to define tools, wire them to a model, and run the agent loop. For production-grade agent workflows with branching, cycles, and human-in-the-loop controls, see the LangGraph documentation.

1. Defining Tools

A tool is a function that the LLM can choose to call. Each tool has a name, a description (which the model reads to decide when to use it), and a function that executes the action. LangChain provides the @tool decorator for creating tools from ordinary Python functions.

This example defines two custom tools: one that performs a calculation and one that looks up the current weather.

from langchain_core.tools import tool
from typing import Annotated

@tool
def calculate(expression: Annotated[str, "A mathematical expression to evaluate"]) -> str:
    """Evaluate a mathematical expression and return the result.
    Use this for any arithmetic or math questions."""
    try:
        result = eval(expression)  # In production, use a safe math parser
        return str(result)
    except Exception as e:
        return f"Error: {e}"

@tool
def get_weather(
    city: Annotated[str, "The city name"],
    unit: Annotated[str, "Temperature unit: celsius or fahrenheit"] = "celsius"
) -> str:
    """Get the current weather for a city.
    Use this when the user asks about weather conditions."""
    # In production, this would call a real weather API
    return f"The weather in {city} is 22 degrees {unit} and sunny."

# Inspect the tool's schema (this is what the model sees)
print(calculate.name)         # "calculate"
print(calculate.description)  # "Evaluate a mathematical expression..."
print(calculate.args_schema.schema())  # JSON schema for arguments
Output: calculate Evaluate a mathematical expression and return the result. Use this for any arithmetic or math questions. {'properties': {'expression': {'description': 'A mathematical expression to evaluate', 'type': 'string'}}, 'required': ['expression'], 'type': 'object'}
Code Fragment 30.2.12: Import from langchain_core.tools
Tip

Write clear, specific tool descriptions. The model uses these descriptions to decide which tool to call, so vague descriptions lead to incorrect tool selection. Include examples of when to use (and when not to use) each tool. The docstring becomes the tool's description; type annotations with Annotated become the parameter descriptions.

2. Built-in Tools

LangChain provides pre-built tool integrations for common operations. These save you from writing boilerplate for popular APIs and services.

from langchain_community.tools import DuckDuckGoSearchRun
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Web search via DuckDuckGo (no API key required)
search_tool = DuckDuckGoSearchRun()
result = search_tool.invoke("LangChain framework latest version 2025")
print(result[:200])

# Wikipedia lookup
wiki_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper(top_k_results=1))
result = wiki_tool.invoke("Transformer neural network architecture")
print(result[:200])
Output: LangChain 0.3 was released in late 2025 with major improvements to the LCEL runtime, native async support, and a redesigned callback system. The framework now supports over 80 model providers... The transformer is a deep learning architecture introduced in the 2017 paper "Attention Is All You Need" by Vaswani et al. It relies on self-attention mechanisms to process input sequences in parallel...
Code Fragment 30.2.13: Web search via DuckDuckGo (no API key required)

3. Creating an Agent with create_tool_calling_agent

The create_tool_calling_agent function creates an agent that uses the model's native tool-calling capability (available in GPT-4o, Claude, Gemini, and others). The model decides which tools to call based on the user's input, and LangChain handles the execution loop via AgentExecutor.

This example creates an agent with access to the calculator and weather tools, then runs it on a multi-step question.

from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Define the prompt with a placeholder for agent scratchpad
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to tools. "
               "Use them when needed to answer questions accurately."),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

# Initialize model and tools
model = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [calculate, get_weather]

# Create the agent
agent = create_tool_calling_agent(model, tools, prompt)

# Wrap in AgentExecutor to handle the tool-call loop
executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,    # Print each step for debugging
    max_iterations=5  # Safety limit on tool-call loops
)

# Run the agent
result = executor.invoke({
    "input": "What is the weather in Paris? Also, what is 15% of 847?"
})
print(result["output"])
Output: > Entering new AgentExecutor chain... Invoking: `get_weather` with {'city': 'Paris', 'unit': 'celsius'} The weather in Paris is 22 degrees celsius and sunny. Invoking: `calculate` with {'expression': '0.15 * 847'} 127.05 The weather in Paris is currently 22 degrees Celsius and sunny. Also, 15% of 847 is 127.05.
Code Fragment 30.2.14: Define the prompt with a placeholder for agent scratchpad
Diagram
Figure 30.2.1: The agent execution loop. The LLM receives the user's input, decides whether to call a tool, observes the result, and repeats. When no more tool calls are needed, the LLM produces a final answer.

4. Callbacks and Tracing

Callbacks let you hook into every step of chain and agent execution: model calls, tool invocations, retriever queries, parsing, and errors. LangChain fires callback events at well-defined points in the execution lifecycle, and you can attach handlers to log, monitor, or modify behavior.

This example creates a custom callback handler that logs each step of agent execution.

from langchain_core.callbacks import BaseCallbackHandler
from typing import Any, Dict

class LoggingHandler(BaseCallbackHandler):
    """Logs every LLM call and tool invocation."""

    def on_llm_start(self, serialized: Dict[str, Any], prompts: list, **kwargs):
        print(f"[LLM START] Model: {serialized.get('id', ['unknown'])}")

    def on_llm_end(self, response, **kwargs):
        print(f"[LLM END] Tokens used: {response.llm_output}")

    def on_tool_start(self, serialized: Dict[str, Any], input_str: str, **kwargs):
        print(f"[TOOL START] {serialized.get('name', 'unknown')}: {input_str}")

    def on_tool_end(self, output: str, **kwargs):
        print(f"[TOOL END] Result: {output[:100]}")

    def on_agent_action(self, action, **kwargs):
        print(f"[AGENT ACTION] {action.tool}: {action.tool_input}")

    def on_agent_finish(self, finish, **kwargs):
        print(f"[AGENT FINISH] {finish.return_values}")

# Attach the handler to the executor
result = executor.invoke(
    {"input": "What is 2^10?"},
    config={"callbacks": [LoggingHandler()]}
)
Output: [LLM START] Model: ['openai', 'chat_models', 'ChatOpenAI'] [AGENT ACTION] calculate: {'expression': '2**10'} [TOOL START] calculate: {'expression': '2**10'} [TOOL END] Result: 1024 [LLM END] Tokens used: {'token_usage': {'prompt_tokens': 85, 'completion_tokens': 12}} [AGENT FINISH] {'output': '2^10 = 1024'}
Code Fragment 30.2.15: Import from langchain_core.callbacks

5. LangSmith Tracing

For production observability, LangChain integrates with LangSmith, a hosted tracing and evaluation platform. When enabled, every chain and agent execution is automatically traced with full input/output logging, latency breakdowns, token counts, and error tracking. No code changes are required beyond setting environment variables.

import os

# Enable LangSmith tracing (set these before importing LangChain)
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "ls__..."      # Your LangSmith API key
os.environ["LANGCHAIN_PROJECT"] = "my-agent-app"  # Project name in LangSmith

# All subsequent chain/agent invocations are automatically traced.
# View traces at https://smith.langchain.com

# You can also add custom metadata to traces
result = executor.invoke(
    {"input": "Summarize the weather in Tokyo"},
    config={
        "metadata": {"user_id": "user-456", "request_id": "req-789"},
        "tags": ["production", "weather-agent"]
    }
)
Code Fragment 30.2.16: Enable LangSmith tracing (set these before importing LangChain)
Tip

LangSmith tracing is invaluable during development and debugging. Enable it early in your project. The free tier is generous enough for development use. In production, use tags and metadata to filter traces by user, session, or feature, making it easy to investigate specific issues.

6. Custom Agent Loops

The AgentExecutor handles the most common agent pattern, but some applications need custom control flow: conditional tool selection, parallel tool calls, human approval before executing certain tools, or complex error recovery. For these cases, you can build a custom agent loop using LangChain's primitives directly.

This example implements a minimal agent loop that gives you full control over the execution cycle.

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage

model = ChatOpenAI(model="gpt-4o", temperature=0)
model_with_tools = model.bind_tools([calculate, get_weather])

def run_agent(user_input: str, max_steps: int = 5) -> str:
    """A custom agent loop with full control over execution."""
    messages = [HumanMessage(content=user_input)]
    tools_map = {"calculate": calculate, "get_weather": get_weather}

    for step in range(max_steps):
        # Ask the model
        response = model_with_tools.invoke(messages)
        messages.append(response)

        # If no tool calls, we have the final answer
        if not response.tool_calls:
            return response.content

        # Execute each tool call
        for tool_call in response.tool_calls:
            tool_name = tool_call["name"]
            tool_args = tool_call["args"]

            print(f"Step {step + 1}: Calling {tool_name}({tool_args})")

            # Optional: add approval logic here
            # if tool_name == "dangerous_tool":
            #     approval = input("Approve? (y/n): ")
            #     if approval != "y":
            #         continue

            # Execute the tool
            tool_fn = tools_map[tool_name]
            result = tool_fn.invoke(tool_args)

            # Add the tool result as a ToolMessage
            messages.append(ToolMessage(
                content=str(result),
                tool_call_id=tool_call["id"]
            ))

    return "Agent reached maximum steps without a final answer."

# Run the custom agent
answer = run_agent("What is the weather in London and what is 42 * 17?")
print(answer)
Output: Step 1: Calling get_weather({'city': 'London', 'unit': 'celsius'}) Step 1: Calling calculate({'expression': '42 * 17'}) The weather in London is 22 degrees celsius and sunny. And 42 * 17 = 714.
Code Fragment 30.2.17: Import from langchain_openai
Warning
When to Use Custom Loops vs. AgentExecutor vs. LangGraph

Use AgentExecutor for straightforward tool-calling agents. Use a custom loop (as above) for simple cases where you need one specific customization such as approval gates. For anything involving branching logic, cycles, persistent state, or multi-agent coordination, use LangGraph, which provides a graph-based execution framework built for complex agent architectures.

7. Best Practices for Production Agents

Deploying agents in production requires careful attention to reliability, cost, and safety. The following guidelines apply to all agent implementations, whether you use AgentExecutor, a custom loop, or LangGraph.

Concern Recommendation
Runaway loops Set max_iterations (AgentExecutor) or a step counter (custom loop). A typical limit is 5 to 10 iterations.
Cost control Use callback handlers to track token usage per request. Set hard budget limits and abort if exceeded.
Tool safety Validate tool inputs before execution. Never pass user-controlled strings to eval() or shell commands without sanitization.
Error handling Set handle_parsing_errors=True on AgentExecutor. In custom loops, catch tool exceptions and return error messages to the model.
Observability Enable LangSmith tracing in all environments. Tag traces with user and session IDs for debugging.
Timeouts Set max_execution_time on AgentExecutor or use asyncio.wait_for() in async agents.
Figure 30.2.2: Production checklist for LangChain agents. Address each concern before deploying an agent to users.
Key Insight

LangChain agents combine an LLM's reasoning with tool execution in a loop. Use the @tool decorator to define tools with clear descriptions, create_tool_calling_agent for standard agent creation, and AgentExecutor to run the loop. Enable LangSmith tracing for observability, and set iteration limits and timeouts for safety. For complex agent architectures, graduate to LangGraph.

Agent Frameworks Deep Dive

Big Picture

Agent frameworks enable LLMs to take actions, use tools, plan multi-step workflows, and collaborate with other agents. The seven frameworks compared here represent three distinct architecture patterns: graph-based (LangGraph), role-based multi-agent (CrewAI, AutoGen), and code-first (OpenAI Agents SDK, Semantic Kernel, smolagents, PydanticAI). Each pattern optimizes for different trade-offs between control, simplicity, and multi-agent coordination.

1. Architecture Patterns

See Chapter 26 (AI Agents) and Chapter 28.2 (Architecture Patterns) for the agent-pattern theory. This appendix shows how each pattern looks in code across the seven frameworks.

1.1 Graph-Based Architecture (LangGraph)

See Chapter 26 for the graph-based agent pattern. The LangGraph implementation is:

The following example illustrates a LangGraph agent with a tool-calling loop and a human-in-the-loop approval step.

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from typing import TypedDict, Annotated
import operator

class AgentState(TypedDict):
    messages: Annotated[list, operator.add]

llm = ChatOpenAI(model="gpt-4o").bind_tools([search_tool, calculator_tool])

def call_model(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: AgentState):
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", ToolNode([search_tool, calculator_tool]))
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")

app = graph.compile()
result = app.invoke({"messages": [HumanMessage("What is 15% of the GDP of France?")]})
Code Fragment 30.2.18: The following example illustrates a LangGraph agent with a tool-calling loop and a human-in-the-loop approval step.

1.2 Role-Based Multi-Agent Architecture (CrewAI, AutoGen)

See Chapter 28.2 for the role-based multi-agent pattern. The CrewAI implementation below shows a research team with two specialized agents collaborating on a report.

CrewAI versus AutoGen role-based architectures with task pipeline and group-chat topologies
Figure 30.2.f1: Two role-based multi-agent topologies. CrewAI compiles a declarative Crew of Agents and ordered Tasks into a sequential or hierarchical pipeline; AutoGen wires agents into a GroupChat whose manager picks the next speaker each turn. Both expose the same role / goal / tools abstraction; the topology decides how messages flow.

The choice between the two libraries is essentially a choice of message-flow topology. Let $A$ be the set of agents and let $\sigma_t \in A$ be the speaker at turn $t$. CrewAI fixes $\sigma_t$ by the static task graph (each Task names its agent), so the next speaker is a deterministic function $\sigma_{t+1} = \mathrm{next\_task}(\sigma_t)$. AutoGen instead lets a manager LLM pick the speaker from a conditional distribution over agents:

$$ \sigma_{t+1} \sim p_{\text{manager}}(\cdot \mid \text{history}_{1:t}, A) $$

The deterministic CrewAI flow is easier to reason about and to evaluate; the AutoGen flow is more adaptive but introduces a second decision (who speaks next) that can itself fail. In practice, teams pick CrewAI when the workflow is known upfront (research then write then review) and AutoGen when the conversation shape depends on intermediate results (the critic only speaks if the coder produced code).

from crewai import Agent, Task, Crew, LLM

llm = LLM(model="gpt-4o")

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data on the given topic",
    backstory="You are an expert researcher with 20 years of experience.",
    tools=[search_tool, web_scraper],
    llm=llm,
)

writer = Agent(
    role="Technical Writer",
    goal="Create a clear, well-structured report from research findings",
    backstory="You specialize in making complex topics accessible.",
    llm=llm,
)

research_task = Task(
    description="Research the current state of LLM inference optimization.",
    expected_output="A detailed list of findings with sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 1000-word report based on the research findings.",
    expected_output="A polished report in markdown format.",
    agent=writer,
)

crew = Crew(agents=[researcher, writer], tasks=[research_task, writing_task], verbose=True)
result = crew.kickoff()
Code Fragment 30.2.2: See Section 28.2 for the role-based multi-agent pattern.

The AutoGen counterpart replaces CrewAI's declarative Crew + ordered Tasks with a GroupChat wired through a GroupChatManager. The same four lines define a Coder and a User Proxy, hand them to a group chat, and let the manager pick the next speaker on every turn.

import autogen

coder = autogen.AssistantAgent(name="coder", llm_config={"model": "gpt-4o"})
user = autogen.UserProxyAgent(name="user", human_input_mode="NEVER",
                              code_execution_config={"work_dir": "out"})
chat = autogen.GroupChat(agents=[user, coder], messages=[], max_round=8)
manager = autogen.GroupChatManager(groupchat=chat,
                                   llm_config={"model": "gpt-4o"})
user.initiate_chat(manager, message="Plot prime gaps below 1000 with matplotlib.")

Code Fragment 30.2.2b: An AutoGen group chat with a Coder and a User Proxy. The GroupChatManager is itself an LLM-driven router that picks the next speaker on every turn, the topology drawn on the right side of Figure 30.2.f1.

Real-World Scenario: a CrewAI research-then-write run

Take the crew defined above and ask it to produce a 1000-word report on LLM inference optimization. A typical successful run looks like this turn-by-turn:

  1. Turn 1 (Researcher). The orchestrator activates research_task, binds it to the Researcher, and prompts the agent with its role + goal + backstory + task description. The agent calls search_tool("LLM inference optimization 2025"), gets 8 hits, then calls web_scraper twice on the most relevant URLs. After 3 tool calls and 4 reasoning steps it emits the expected output: a markdown bullet list of 12 findings with sources, totalling 780 tokens.
  2. Turn 2 (Writer). The orchestrator activates writing_task, binds it to the Writer, and prepends the Researcher's output as task context (CrewAI passes upstream task outputs into downstream task prompts automatically). The Writer has no tools, so it produces the report in one LLM call: 1024 tokens of polished markdown.
  3. Result. crew.kickoff() returns the Writer's output. Total: 2 agents, 2 tasks, 5 LLM calls (4 Researcher + 1 Writer), 3 tool calls, end-to-end 18 seconds with gpt-4o.

The same workflow in AutoGen would be a 4-agent group chat (User, Researcher, Writer, Manager) where the Manager picks the speaker each turn; the same task takes roughly twice as many LLM calls because the Manager adds an extra "who-speaks-next" turn between every content turn. This is the concrete cost of moving from deterministic CrewAI task flow to AutoGen's adaptive routing.

1.3 Code-First Architecture (OpenAI Agents SDK, smolagents, PydanticAI)

See Chapter 26 for the code-first agent loop pattern. The OpenAI Agents SDK implementation below shows a minimal agent definition.

from agents import Agent, Runner, function_tool

@function_tool
def search(query: str) -> str:
    """Search the web for information."""
    return web_search(query)

@function_tool
def calculate(expression: str) -> float:
    """Evaluate a mathematical expression."""
    return eval(expression)

agent = Agent(
    name="Research Assistant",
    instructions="You help users find information and perform calculations.",
    tools=[search, calculate],
    model="gpt-4o",
)

result = Runner.run_sync(agent, "What is 15% of the GDP of France?")
Code Fragment 30.2.3: See Chapter 26 for the code-first agent loop pattern.

2. Comprehensive Feature Comparison

The following table compares all seven agent frameworks across key dimensions relevant to production agent development.

Table 30.2.2: Agent Frameworks: Comprehensive Feature Comparison (as of 2026).
Feature LangGraph CrewAI AutoGen OpenAI Agents SDK Semantic Kernel smolagents PydanticAI
Architecture Graph/state machine Role-based crew Conversation-based Code-first loop Plugin/planner Code-first minimal Type-safe agents
Maintainer LangChain Inc. CrewAI Inc. Microsoft OpenAI Microsoft Hugging Face Pydantic team
Language Python, JS/TS Python Python, .NET Python Python, C#, Java Python Python
Multi-agent support Via sub-graphs Native (crews) Native (groups) Via handoffs Via planners Basic Manual composition
Tool calling Full (any provider) Full (decorator) Full (function map) Full (decorator) Full (plugins) Full (decorator) Full (Pydantic)
Human-in-the-loop Native (interrupt) Built-in Built-in Manual Approval hooks Manual Manual
State persistence Checkpointers Memory system Conversation store None built-in Memory stores None built-in None built-in
Streaming Full Event-based Full Full Full Basic Full
LLM provider lock-in None (any provider) None (litellm) None (configurable) OpenAI only None (connectors) None (any provider) None (any provider)
Observability LangSmith native CrewAI+ dashboard AutoGen Studio OpenAI dashboard OpenTelemetry Basic logging Logfire native
License MIT MIT MIT (CC-BY-4.0 docs) MIT MIT Apache 2.0 MIT
GitHub stars (approx.) 15k+ 25k+ 38k+ 15k+ 24k+ 15k+ 8k+

3. Multi-Agent Patterns

See Chapter 28.2 (Architecture Patterns) for the supervisor / collaborative / hierarchical taxonomy. The framework-specific implementations are referenced in the feature table above (Section 2): LangGraph uses conditional edges and nested sub-graphs; CrewAI exposes hierarchical and sequential processes; AutoGen relies on group chats and nested group chats; the OpenAI Agents SDK uses the handoff mechanism.

Key Insight

Multi-agent systems add complexity that is rarely justified for simple tasks. A single agent with multiple tools often outperforms a multi-agent system on straightforward workflows, with lower latency and easier debugging. Reserve multi-agent patterns for genuinely complex workflows where different steps require different expertise, different LLM configurations, or different trust boundaries. Start with a single agent and add agents only when you hit the limits of the single-agent approach.

4. Production Readiness

Agent frameworks vary widely in their production readiness. The following assessment focuses on features that matter when running agents in production environments with real users.

Table 30.2.3: Agent Frameworks: Production Readiness (as of 2026).
Production Feature LangGraph CrewAI AutoGen OpenAI Agents SDK Semantic Kernel
State recovery after failure Checkpointers (Redis, SQL) Memory persistence Conversation replay Manual Memory stores
Timeout and retry handling Built-in Built-in Configurable Built-in Built-in
Cost control (token budgets) Via callbacks Built-in budgets Token counting Via API settings Via filters
Guardrails integration Custom nodes Guardrails config Custom agents Native guardrails Filters
Deployment platform LangGraph Cloud CrewAI Enterprise AutoGen Studio OpenAI platform Azure AI
Long-running task support Native (async nodes) Background tasks Async groups Async runner Step-based

5. When to Use Each Framework

The following decision table provides concrete recommendations based on common project requirements and team characteristics.

Decision tree for choosing an agent framework
Figure 30.2.3: A four-question decision tree for picking an agent framework. The first split is architectural (single vs multi-agent); subsequent splits refine to a specific framework. Architecture fit comes before feature checklists. Q1 single agent vs multi-agent crew splits into two branches. Single-agent branch asks Q2 strict typing then Q2b open-source flexibility, leading to PydanticAI, smolagents, or OpenAI Agents SDK. Multi-agent branch asks Q3 state checkpoints, then Q3b code execution, then Q3c .NET stack, leading to LangGraph, AutoGen, Semantic Kernel, or CrewAI. A panel at the bottom explains the three underlying architecture patterns: graph-based, role-based, and code-first.
Table 30.2.4: Agent Framework Selection Guide (as of 2026).
If you need... Best Fit Runner-Up Rationale
Maximum control over agent flow LangGraph Semantic Kernel Graph-based architecture makes every state transition explicit
Quick multi-agent prototype CrewAI AutoGen Role-based definition is intuitive; minimal boilerplate
Enterprise .NET/Java ecosystem Semantic Kernel AutoGen (.NET) Microsoft backing; native C# and Java SDKs; Azure integration
OpenAI-only deployment OpenAI Agents SDK LangGraph Tightest integration with OpenAI models and platform
Minimal dependencies smolagents PydanticAI Lightest footprint; no heavy framework overhead
Type-safe structured outputs PydanticAI Semantic Kernel Built on Pydantic; native structured output validation
Research agent with code execution AutoGen CrewAI Built-in code executor; designed for code-writing agents
Production multi-agent with state LangGraph CrewAI Checkpointing and state recovery for long-running agents
Open-source model flexibility smolagents LangGraph Hugging Face ecosystem; works with any model on the Hub
Note

The agent framework landscape is evolving faster than any other LLM tooling category. New frameworks appear monthly, and existing frameworks add features rapidly. The architectural patterns (graph, role-based, code-first) are more stable than specific framework features. Choose based on architecture fit first, then evaluate features within your preferred architecture pattern.

6. Integration and Interoperability

Agent frameworks do not exist in isolation. They connect to orchestration layers (the Orchestration Frameworks overview), evaluation tools (Experiment Tracking), and serving infrastructure (vLLM & Inference Servers). Key integration points to evaluate include:

Summary

Agent frameworks divide into three architecture patterns: graph-based (LangGraph) for maximum control, role-based (CrewAI, AutoGen) for intuitive multi-agent collaboration, and code-first (OpenAI Agents SDK, smolagents, PydanticAI) for simplicity and transparency. Semantic Kernel bridges the enterprise world with multi-language support and Azure integration. For production systems requiring state persistence, failure recovery, and human-in-the-loop approval, LangGraph and Semantic Kernel offer the most mature feature sets. For rapid prototyping of multi-agent systems, CrewAI provides the fastest path. For minimal-dependency single-agent use cases, smolagents or PydanticAI keep your stack lean.

30.2.5 LangGraph Tutorial: From Chain to ReAct Agent

Big Picture

The earlier subsections compared LangGraph against alternative agent frameworks at the level of API style. This tutorial works through the framework end-to-end: typed state with reducers, a simple chain graph, the prebuilt tools_condition router, a full ReAct chat agent, persistent memory via Checkpointer and Thread, the two streaming modes, the human-in-the-loop interrupt pattern, and the LangGraph Studio visual debugger. The goal is to give a reader the working mental model needed to debug a LangGraph application, not just call it.

Core concepts: state, nodes, edges

A LangGraph application is a directed graph whose nodes are pure Python functions that receive the current state and return an update, and whose edges decide which node runs next. The graph carries a single typed State object that is threaded through every node call. Four node names are reserved: START (the implicit entry point), END (the implicit exit), ERROR (raised when a node throws), and TOOLS (the prebuilt tool-execution node). Every other node name is user-defined.

The simplest graph builds the state machine, registers two function nodes, and connects them with a single edge. The state is a TypedDict whose fields are mutated by each node:

from langgraph.graph import StateGraph, START, END
from typing import TypedDict

class CountState(TypedDict):
    value: int

def increment(state: CountState) -> dict:
    return {"value": state["value"] + 1}

def double(state: CountState) -> dict:
    return {"value": state["value"] * 2}

graph = StateGraph(CountState)
graph.add_node("inc", increment)
graph.add_node("dbl", double)
graph.add_edge(START, "inc")
graph.add_edge("inc", "dbl")
graph.add_edge("dbl", END)

app = graph.compile()
print(app.invoke({"value": 3}))   # {'value': 8}: (3+1)*2
Code Fragment 30.2.5: A two-node LangGraph chain. StateGraph takes the state type, nodes are added by name, and the chain runs in the order of add_edge(...) calls until it hits END.

Reducers: how nodes append to state

Each node's return value is normally merged field-by-field into the state, overwriting the previous value. For lists (messages, tool calls, observations), overwriting is the wrong default: a node that wants to append a new chat message would erase the entire history. LangGraph solves this with reducer functions declared in the state schema via typing.Annotated. The reducer takes the old field value and the node's update and returns the combined value. For message lists, the convention is to use operator.add (or LangGraph's add_messages, which deduplicates by ID):

from langgraph.graph.message import add_messages
from langchain_core.messages import AnyMessage
from typing import Annotated, TypedDict

class ChatState(TypedDict):
    # The reducer is the second tuple element. add_messages appends new
    # messages instead of overwriting the list and dedupes by message id.
    messages: Annotated[list[AnyMessage], add_messages]
Code Fragment 30.2.6: A chat-style state with a reducer. AnyMessage is the union of HumanMessage, AIMessage, and ToolMessage; the reducer makes every node's returned messages append rather than overwrite.

A chain graph: LLM-only chat

The minimal chat application wires a single LLM node into the chain. The node receives the message list, calls the model, and returns the new AIMessage; the reducer appends it to history. This is functionally identical to a hand-rolled chat loop, but expressed as a graph it composes cleanly with the additions that follow.

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(model="gpt-4o")

def chat_node(state: ChatState) -> dict:
    response = llm.invoke(state["messages"])
    return {"messages": [response]}   # the reducer appends, not replaces

graph = StateGraph(ChatState)
graph.add_node("chat", chat_node)
graph.add_edge(START, "chat")
graph.add_edge("chat", END)

app = graph.compile()
out = app.invoke({"messages": [HumanMessage("What is the boiling point of water?")]})
print(out["messages"][-1].content)
Output: The boiling point of water at standard atmospheric pressure (1 atm, or 101.325 kPa) is 100 degrees Celsius (212 degrees Fahrenheit). The exact temperature decreases with altitude because atmospheric pressure drops, which is why water boils at roughly 95 C in Denver and below 70 C at the top of Mount Everest.
Code Fragment 30.2.7: A pure-chat chain graph. One node, one edge, one reducer: the same idea will scale to a ReAct agent in two more steps.

Adding tools: ToolNode and tools_condition

Turning the chain into a ReAct agent requires three additions: bind tools to the LLM so it can emit tool_calls, add LangGraph's prebuilt ToolNode (registered under the reserved TOOLS name) to execute those calls and append the result, and add a conditional edge driven by the prebuilt tools_condition router. The router inspects the last message: if it contains tool calls it routes to tools; otherwise it routes to END. The tools node's output flows back into the chat node, closing the ReAct loop.

from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"{city}: 18C, partly cloudy"

@tool
def calculator(expression: str) -> str:
    """Evaluate a simple arithmetic expression."""
    return str(eval(expression))   # toy: in production use a safe parser

tools = [get_weather, calculator]
llm_with_tools = ChatOpenAI(model="gpt-4o").bind_tools(tools)

def chat_node(state: ChatState) -> dict:
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

graph = StateGraph(ChatState)
graph.add_node("chat", chat_node)
graph.add_node("tools", ToolNode(tools))          # the reserved TOOLS node
graph.add_edge(START, "chat")
graph.add_conditional_edges("chat", tools_condition)  # routes to "tools" or END
graph.add_edge("tools", "chat")                   # close the ReAct loop

app = graph.compile()
out = app.invoke({"messages": [HumanMessage(
    "What is the weather in Paris, and what is 23 * 17?")]})
for m in out["messages"]:
    print(type(m).__name__, getattr(m, "content", "")[:80])
Output: HumanMessage What is the weather in Paris, and what is 23 * 17? AIMessage ToolMessage Paris: 18C, partly cloudy ToolMessage 391 AIMessage The weather in Paris is 18 C and partly cloudy, and 23 * 17 = 391.
Code Fragment 30.2.8: A full ReAct chat agent in LangGraph. Two prebuilts (ToolNode, tools_condition) plus one conditional edge turn the chain graph into the ReAct loop. The model may emit only one tool call per turn; multiple tools are handled by repeated chat-tools cycles.

This is the canonical ReAct chat agent. Adding more tools is a one-line change (extend the tools list and rebind); adding planning, reflection, or memory is a matter of adding more nodes and wiring more edges, exactly as the planning and agentic-flow sections of Chapter 26 do.

Memory: Checkpointer and Thread

By default no state is preserved between invoke calls: each invocation starts from the input dict. Production agents need persistence for at least three reasons: multi-turn conversation, resume-after-crash, and human-in-the-loop interrupts that may pause for minutes or days. LangGraph models this with two abstractions. A Checkpoint is the state plus the execution-progress marker captured after every node returns. A Thread is the ordered collection of checkpoints belonging to one logical session, identified by a thread_id. Compiling the graph with a checkpointer (MemorySaver for tests, PostgresSaver or SqliteSaver in production) makes both first-class:

from langgraph.checkpoint.memory import MemorySaver

memory = MemorySaver()
app = graph.compile(checkpointer=memory)

# One thread (one conversation), two turns: the second sees the first.
config = {"configurable": {"thread_id": "session-42"}}
app.invoke({"messages": [HumanMessage("My name is Yossi.")]}, config=config)
out = app.invoke({"messages": [HumanMessage("What is my name?")]}, config=config)
print(out["messages"][-1].content)   # "Your name is Yossi."

# Inspect the thread: a list of checkpoints, newest first.
for ckpt in app.get_state_history(config):
    print(ckpt.config["configurable"]["checkpoint_id"], len(ckpt.values["messages"]))

# Resume from step 5: pick a past checkpoint and continue from it.
old_config = list(app.get_state_history(config))[4].config
app.invoke(None, config=old_config)   # replays from that checkpoint forward
Output: Your name is Yossi. 1f0d8c4a-... 4 1f0d8c49-... 3 1f0d8c48-... 2 1f0d8c47-... 1
Code Fragment 30.2.9: Checkpoints and threads. Every node return is saved under the thread; get_state_history lists past checkpoints and any one of them can be the starting config for a fresh invoke, which is the "time-travel debugging" feature.

Streaming modes: values vs updates

For UIs and tracing, app.stream(...) yields events as the graph runs instead of returning only the final state. The two most common modes differ in what they yield. stream_mode="values" emits the complete state after each reducer runs: every event is a full state dict, which is convenient when the UI re-renders from scratch but verbose. stream_mode="updates" emits only the per-node update dict, keyed by node name: events are small and tell you exactly which node produced the change, which is what tracing and observability backends want.

# Full state after every reducer (UI-friendly, verbose).
for event in app.stream({"messages": [HumanMessage("Weather in Tokyo?")]},
                        config=config, stream_mode="values"):
    print("STATE", len(event["messages"]), "msgs")

# Just the per-node update (logs / tracing-friendly, compact).
for event in app.stream({"messages": [HumanMessage("And in Lima?")]},
                        config=config, stream_mode="updates"):
    # event is e.g. {"chat": {"messages": [AIMessage(...)]}}
    node_name = next(iter(event))
    print("UPDATE from", node_name)
Output: STATE 4 msgs STATE 5 msgs STATE 6 msgs UPDATE from chat UPDATE from tools UPDATE from chat
Code Fragment 30.2.10: Two streaming modes for the same graph. values emits the full post-reducer state; updates emits only the per-node delta. Pick values for live-rendered UIs, updates for tracing and observability.

Human-in-the-loop: interrupt_before

For agents that take consequential actions (send email, run SQL writes, transfer funds), the right design pattern is to pause the graph at a known node, surface the proposed action to a human, and resume only after approval. LangGraph implements this by compiling the graph with interrupt_before=[...] (or interrupt_after). When execution reaches a listed node the graph saves a checkpoint, returns control to the caller, and waits. The caller inspects the pending state, optionally edits it with app.update_state(...), and resumes by calling invoke(None, config=...) on the same thread. The mechanism rides on top of checkpointing, which is why thread_id is required for HITL.

app = graph.compile(checkpointer=memory, interrupt_before=["tools"])
config = {"configurable": {"thread_id": "approval-1"}}

# Step 1: run until just before any tool call.
app.invoke({"messages": [HumanMessage(
    "Send a thank-you email to alice@example.com.")]}, config=config)
pending = app.get_state(config)
print("Proposed action:", pending.values["messages"][-1].tool_calls)

# Step 2 (human review): operator edits the args, then resumes.
edited = pending.values["messages"][-1]
edited.tool_calls[0]["args"]["body"] = "Thanks, Alice. From the LangGraph agent."
app.update_state(config, {"messages": [edited]})

# Step 3: resume from the saved checkpoint; the tool now executes.
final = app.invoke(None, config=config)
print(final["messages"][-1].content)
Output: Proposed action: [{'name': 'send_email', 'args': {'to': 'alice@example.com', 'body': 'Thank you, Alice.'}, 'id': 'call_xR9'}] I have sent the thank-you email to alice@example.com.
Code Fragment 30.2.11: Human-in-the-loop interrupt. interrupt_before=["tools"] pauses execution before the tools node runs; the operator inspects the pending tool call, edits arguments if needed via update_state, then resumes by invoking with None on the same thread.
See Also

The architectural rationale for HITL and the broader approval-flow design pattern live in Section 28.3; the code above is the LangGraph wire-level implementation.

LangGraph Studio: visual debugging

LangGraph Studio is the in-browser graph debugger that ships with the LangGraph platform. It renders the compiled graph as a live diagram, highlights the currently-executing node, exposes each checkpoint's state for inspection and edit, and lets the developer rewind to any past checkpoint and re-run forward from there. The same time-travel mechanism described in Code Fragment 30.2.9 is what Studio is showing graphically: every event the runtime emits already has a checkpoint ID, and Studio is the UI on top. For teams adopting LangGraph in production, Studio is the single best argument for using the framework rather than rolling a custom loop: a hand-rolled while-loop has no equivalent.

Key Insight: eight concepts cover LangGraph

Eight concepts cover essentially every LangGraph application: typed state, function nodes, conditional edges, reducers for append-style updates, the prebuilt ToolNode + tools_condition router that turns a chain into a ReAct agent, a checkpointer + thread for persistence and resume, two streaming modes (values for full state, updates for per-node deltas), and interrupts for human-in-the-loop pauses. Everything else (subgraphs, parallel branches, send/recv APIs) layers on top of these primitives.

Exercise 30.2.5.1: MemGPT-style memory agent in LangGraph Coding

Build a MemGPT-lite agent in LangGraph that adds two tools, load_memory(query) (vector search over a long-term store) and save_memory(text) (insert into the same store), and an agent node that prepends the top three recalled memories to every prompt before calling the LLM. Wire tools_condition so the agent can choose to load, save, or skip on each turn. Verify that across two threads sharing the same memory store, a fact saved in thread A becomes retrievable in thread B.

What's Next?

In the next part of this section, Section 30.3: Multi-Agent Patterns & Topologies, we move from single-agent libraries to the topologies that combine multiple agents into one system.

Further Reading
LangChain (2024). "LangGraph Documentation." The reference documentation for the graph-based agent framework discussed in this section, with worked examples of state machines, checkpointing, and human-in-the-loop interrupts.
CrewAI (2024). "CrewAI Documentation." Role-based multi-agent orchestration with declarative agent and task definitions. Includes the canonical "research crew" example.
Microsoft (2024). "AutoGen: Multi-Agent Conversation Framework." The Microsoft Research framework that popularized conversational multi-agent patterns. Covers GroupChat, nested chats, and code-execution agents.
Hugging Face (2024). "smolagents: Minimal Code-First Agents." The minimal-dependency code-first framework that emerged in late 2024 as a lightweight alternative to LangGraph and AutoGen.
LangChain (2024). "LangGraph Tutorials: Build a Basic Chatbot, Add Tools, Add Memory, Time Travel." The official tutorial series whose step-by-step structure (chain, then router, then ReAct, then checkpoints, then HITL, then Studio) mirrors Section 30.2.5; recommended companion for a hands-on first run.
LangChain (2024). "LangGraph Persistence: Checkpointers and Threads." Reference for the MemorySaver, SqliteSaver, and PostgresSaver backends; the thread / checkpoint abstractions that drive both Section 30.2.5's memory subsection and Section 26.6's resume test.