Building Conversational AI with LLMs and Agents
Appendix V: LLM Tooling Ecosystem

Agent Frameworks: LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Semantic Kernel, smolagents, PydanticAI

Big Picture

Agent frameworks enable LLMs to take actions, use tools, plan multi-step workflows, and collaborate with other agents. The seven frameworks compared here represent three distinct architecture patterns: graph-based (LangGraph), role-based multi-agent (CrewAI, AutoGen), and code-first (OpenAI Agents SDK, smolagents, PydanticAI), with Semantic Kernel occupying a plugin-and-planner middle ground. Each pattern optimizes for a different trade-off among control, simplicity, and multi-agent coordination.

1. Architecture Patterns

Agent frameworks differ fundamentally in how they model agent behavior and control flow. Understanding the three dominant patterns helps you narrow the field before evaluating individual frameworks.

1.1 Graph-Based Architecture (LangGraph)

In a graph-based architecture, you define agent behavior as a state machine where nodes represent actions (LLM calls, tool use, human review) and edges represent transitions. The agent traverses the graph based on the current state and the LLM's decisions. This pattern provides maximum control over execution flow, supports cycles (the agent can loop back to previous steps), and makes complex workflows explicit and debuggable.

The following example illustrates a LangGraph agent with a tool-calling loop.

from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from typing import TypedDict, Annotated
import operator

class AgentState(TypedDict):
    messages: Annotated[list, operator.add]

# search_tool and calculator_tool are assumed to be @tool-decorated functions defined elsewhere
llm = ChatOpenAI(model="gpt-4o").bind_tools([search_tool, calculator_tool])

def call_model(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: AgentState):
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"
    return END

graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", ToolNode([search_tool, calculator_tool]))
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")

app = graph.compile()
result = app.invoke({"messages": [HumanMessage("What is 15% of the GDP of France?")]})

1.2 Role-Based Multi-Agent Architecture (CrewAI, AutoGen)

Role-based frameworks model agents as team members with defined roles, goals, and capabilities. Instead of programming control flow explicitly, you define agents and tasks, and the framework orchestrates their collaboration. This pattern excels at multi-agent scenarios where agents with different specializations need to cooperate.

The following CrewAI example shows a research team with two specialized agents collaborating on a report.

from crewai import Agent, Task, Crew, LLM

llm = LLM(model="gpt-4o")

researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data on the given topic",
    backstory="You are an expert researcher with 20 years of experience.",
    tools=[search_tool, web_scraper],  # assumed pre-built tool instances defined elsewhere
    llm=llm,
)

writer = Agent(
    role="Technical Writer",
    goal="Create a clear, well-structured report from research findings",
    backstory="You specialize in making complex topics accessible.",
    llm=llm,
)

research_task = Task(
    description="Research the current state of LLM inference optimization.",
    expected_output="A detailed list of findings with sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 1000-word report based on the research findings.",
    expected_output="A polished report in markdown format.",
    agent=writer,
    context=[research_task],  # make the research output available to the writer
)

crew = Crew(agents=[researcher, writer], tasks=[research_task, writing_task], verbose=True)
result = crew.kickoff()

1.3 Code-First Architecture (OpenAI Agents SDK, smolagents, PydanticAI)

Code-first frameworks provide minimal abstractions, giving you an agent loop with tool calling built on standard Python patterns. These frameworks prioritize simplicity and transparency over sophisticated orchestration features. They are ideal when you want full control over agent behavior without learning a framework-specific DSL.

The following example uses the OpenAI Agents SDK's straightforward agent definition.

from agents import Agent, Runner, function_tool

@function_tool
def search(query: str) -> str:
    """Search the web for information."""
    return web_search(query)  # web_search is assumed to be defined elsewhere

@function_tool
def calculate(expression: str) -> float:
    """Evaluate a mathematical expression."""
    # NOTE: eval on untrusted input is unsafe; use a dedicated
    # expression parser in production code.
    return eval(expression)

agent = Agent(
    name="Research Assistant",
    instructions="You help users find information and perform calculations.",
    tools=[search, calculate],
    model="gpt-4o",
)

result = Runner.run_sync(agent, "What is 15% of the GDP of France?")
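
PydanticAI, the third code-first framework, centers on validating model output against a Pydantic schema. That core idea can be sketched with Pydantic (v2) alone; here the raw_llm_output dict is a hypothetical stand-in for the JSON payload an LLM would return.

```python
from pydantic import BaseModel, ValidationError

class GdpAnswer(BaseModel):
    country: str
    gdp_usd_billions: float
    fifteen_percent: float

# Stand-in for the structured JSON an LLM would return; PydanticAI
# performs this validation (and retries on failure) automatically.
raw_llm_output = {
    "country": "France",
    "gdp_usd_billions": 3030.0,
    "fifteen_percent": 454.5,
}

try:
    answer = GdpAnswer.model_validate(raw_llm_output)
    print(answer.fifteen_percent)
except ValidationError as err:
    # In PydanticAI, a validation failure is fed back to the model
    # so it can correct its output on the next attempt.
    print(err)
```

The type annotation on each field is what turns a free-form LLM reply into a checked, IDE-friendly Python object.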

2. Comprehensive Feature Comparison

The following table compares all seven agent frameworks across key dimensions relevant to production agent development.

Agent Frameworks: Comprehensive Feature Comparison

| Feature | LangGraph | CrewAI | AutoGen | OpenAI Agents SDK | Semantic Kernel | smolagents | PydanticAI |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Architecture | Graph/state machine | Role-based crew | Conversation-based | Code-first loop | Plugin/planner | Code-first minimal | Type-safe agents |
| Maintainer | LangChain Inc. | CrewAI Inc. | Microsoft | OpenAI | Microsoft | Hugging Face | Pydantic team |
| Language | Python, JS/TS | Python | Python, .NET | Python | Python, C#, Java | Python | Python |
| Multi-agent support | Via sub-graphs | Native (crews) | Native (groups) | Via handoffs | Via planners | Basic | Manual composition |
| Tool calling | Full (any provider) | Full (decorator) | Full (function map) | Full (decorator) | Full (plugins) | Full (decorator) | Full (Pydantic) |
| Human-in-the-loop | Native (interrupt) | Built-in | Built-in | Manual | Approval hooks | Manual | Manual |
| State persistence | Checkpointers | Memory system | Conversation store | None built-in | Memory stores | None built-in | None built-in |
| Streaming | Full | Event-based | Full | Full | Full | Basic | Full |
| LLM provider lock-in | None (any provider) | None (litellm) | None (configurable) | OpenAI-first (others via LiteLLM) | None (connectors) | None (any provider) | None (any provider) |
| Observability | LangSmith native | CrewAI+ dashboard | AutoGen Studio | OpenAI dashboard | OpenTelemetry | Basic logging | Logfire native |
| License | MIT | MIT | MIT (CC-BY-4.0 docs) | MIT | MIT | Apache 2.0 | MIT |
| GitHub stars (approx.) | 15k+ | 25k+ | 38k+ | 15k+ | 24k+ | 15k+ | 8k+ |

3. Multi-Agent Patterns

Multi-agent systems represent one of the fastest-growing areas in LLM development. The frameworks differ significantly in how they support agent-to-agent communication and coordination.

3.1 Supervisor Pattern

In the supervisor pattern, one agent coordinates others by deciding which sub-agent to invoke for each step. LangGraph implements this naturally through conditional edges, where a supervisor node routes to specialized agent sub-graphs. The OpenAI Agents SDK supports this through the handoff mechanism, where one agent explicitly transfers control to another.
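
The routing decision at the heart of the supervisor pattern can be sketched framework-free. In the sketch below, a keyword heuristic is a hypothetical stand-in for the supervisor LLM's choice; in LangGraph this function would back a conditional edge, and in the Agents SDK the equivalent behavior comes from listing sub-agents in an agent's handoffs.

```python
# Hypothetical stand-ins for LLM-backed specialist agents.
def research_agent(task: str) -> str:
    return f"[research] findings for: {task}"

def math_agent(task: str) -> str:
    return f"[math] result for: {task}"

SPECIALISTS = {"research": research_agent, "math": math_agent}

def supervisor_route(task: str) -> str:
    """Pick a specialist for the task. A real supervisor asks an LLM
    to choose; a keyword heuristic stands in for that call here."""
    if any(w in task.lower() for w in ("calculate", "percent", "compute")):
        return "math"
    return "research"

def run(task: str) -> str:
    # One supervisor step: route, then invoke the chosen specialist.
    return SPECIALISTS[supervisor_route(task)](task)

print(run("Calculate 15 percent of 3030"))    # routed to math_agent
print(run("Summarize LLM inference trends"))  # routed to research_agent
```

In a real system the supervisor would also inspect the specialist's result and decide whether to route again or finish.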

3.2 Collaborative Pattern

In the collaborative pattern, agents communicate as peers without a central coordinator. CrewAI implements this through its crew abstraction, where agents pass task outputs to the next agent in a sequence or collaborate in parallel. AutoGen uses group chat, where agents take turns responding to a shared conversation thread.
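
AutoGen's group chat can be sketched as round-robin turn-taking over a shared message list. The reply functions below are hypothetical stand-ins for LLM-backed agents, and a real group chat manager would also pick the next speaker dynamically rather than strictly in turn.

```python
from itertools import cycle

# Hypothetical stand-ins for LLM-backed agents: each maps the shared
# conversation so far to its next message.
def writer(history: list[str]) -> str:
    return "writer: revised the draft"

def critic(history: list[str]) -> str:
    return "critic: the draft still needs sources"

def group_chat(agents, opening: str, max_rounds: int = 4) -> list[str]:
    """Round-robin turn-taking over a shared thread, in the spirit of
    AutoGen's group chat."""
    history = [opening]
    for _, agent in zip(range(max_rounds), cycle(agents)):
        history.append(agent(history))
    return history

thread = group_chat([writer, critic], "user: draft a report on vLLM")
for msg in thread:
    print(msg)
```

The shared history is the key design point: every agent sees every prior turn, which is what lets peers build on each other's output without a central coordinator.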

3.3 Hierarchical Pattern

The hierarchical pattern organizes agents in a tree structure where higher-level agents delegate to lower-level specialists. CrewAI supports this with its hierarchical process mode. AutoGen supports it through nested group chats. LangGraph supports it through nested sub-graphs with parent-child state passing.
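
In CrewAI, switching the crew from Section 1.2 to hierarchical mode is largely a configuration change: set process=Process.hierarchical and name a manager model (or supply a custom manager agent) that delegates tasks down the tree. A sketch, reusing the researcher/writer agents and tasks defined earlier:

```python
from crewai import Crew, Process

# Hierarchical mode adds an auto-generated manager that plans,
# delegates work to the specialist agents, and reviews their results.
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    process=Process.hierarchical,
    manager_llm="gpt-4o",  # model that plays the manager role
)
result = crew.kickoff()
```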

Key Insight

Multi-agent systems add complexity that is rarely justified for simple tasks. A single agent with multiple tools often outperforms a multi-agent system on straightforward workflows, with lower latency and easier debugging. Reserve multi-agent patterns for genuinely complex workflows where different steps require different expertise, different LLM configurations, or different trust boundaries. Start with a single agent and add agents only when you hit the limits of the single-agent approach.

4. Production Readiness

Agent frameworks vary widely in their production readiness. The following assessment focuses on features that matter when running agents in production environments with real users.

Agent Frameworks: Production Readiness

| Production Feature | LangGraph | CrewAI | AutoGen | OpenAI Agents SDK | Semantic Kernel |
| --- | --- | --- | --- | --- | --- |
| State recovery after failure | Checkpointers (Redis, SQL) | Memory persistence | Conversation replay | Manual | Memory stores |
| Timeout and retry handling | Built-in | Built-in | Configurable | Built-in | Built-in |
| Cost control (token budgets) | Via callbacks | Built-in budgets | Token counting | Via API settings | Via filters |
| Guardrails integration | Custom nodes | Guardrails config | Custom agents | Native guardrails | Filters |
| Deployment platform | LangGraph Cloud | CrewAI Enterprise | AutoGen Studio | OpenAI platform | Azure AI |
| Long-running task support | Native (async nodes) | Background tasks | Async groups | Async runner | Step-based |
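
For LangGraph, the state-recovery row maps to the checkpointer API: compile the graph with a checkpointer and invoke it under a thread_id, and state is persisted after every super-step so an interrupted run can resume. The following fragment is a sketch against the graph object from Section 1.1 (not standalone); production deployments would swap the in-memory saver for a Redis- or Postgres-backed checkpointer.

```python
from langgraph.checkpoint.memory import MemorySaver

# Persist state after every super-step. MemorySaver is for local
# development; durable backends implement the same interface.
app = graph.compile(checkpointer=MemorySaver())

# The thread_id keys the saved state: re-invoking with the same id
# resumes the conversation instead of starting over.
config = {"configurable": {"thread_id": "user-42"}}
result = app.invoke(
    {"messages": [HumanMessage("What is 15% of the GDP of France?")]},
    config=config,
)
```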

5. When to Use Each Framework

The following decision table provides concrete recommendations based on common project requirements and team characteristics.

Agent Framework Selection Guide

| If you need... | Best Fit | Runner-Up | Rationale |
| --- | --- | --- | --- |
| Maximum control over agent flow | LangGraph | Semantic Kernel | Graph-based architecture makes every state transition explicit |
| Quick multi-agent prototype | CrewAI | AutoGen | Role-based definition is intuitive; minimal boilerplate |
| Enterprise .NET/Java ecosystem | Semantic Kernel | AutoGen (.NET) | Microsoft backing; native C# and Java SDKs; Azure integration |
| OpenAI-only deployment | OpenAI Agents SDK | LangGraph | Tightest integration with OpenAI models and platform |
| Minimal dependencies | smolagents | PydanticAI | Lightest footprint; no heavy framework overhead |
| Type-safe structured outputs | PydanticAI | Semantic Kernel | Built on Pydantic; native structured output validation |
| Research agent with code execution | AutoGen | CrewAI | Built-in code executor; designed for code-writing agents |
| Production multi-agent with state | LangGraph | CrewAI | Checkpointing and state recovery for long-running agents |
| Open-source model flexibility | smolagents | LangGraph | Hugging Face ecosystem; works with any model on the Hub |
Note

The agent framework landscape is evolving faster than any other LLM tooling category. New frameworks appear monthly, and existing frameworks add features rapidly. The architectural patterns (graph, role-based, code-first) are more stable than specific framework features. Choose based on architecture fit first, then evaluate features within your preferred architecture pattern.

6. Integration and Interoperability

Agent frameworks do not exist in isolation. They connect to orchestration layers (Section V.2), evaluation tools (Section V.4), and serving infrastructure (Section V.5). Key integration points to evaluate include: how tools are defined and whether those definitions can be shared across frameworks, which observability backend the framework emits traces to (LangSmith, Logfire, AutoGen Studio, or plain OpenTelemetry), whether the LLM provider is abstracted or fixed, and where conversation state is persisted between runs.

Summary

Agent frameworks divide into three architecture patterns: graph-based (LangGraph) for maximum control, role-based (CrewAI, AutoGen) for intuitive multi-agent collaboration, and code-first (OpenAI Agents SDK, smolagents, PydanticAI) for simplicity and transparency. Semantic Kernel bridges the enterprise world with multi-language support and Azure integration. For production systems requiring state persistence, failure recovery, and human-in-the-loop approval, LangGraph and Semantic Kernel offer the most mature feature sets. For rapid prototyping of multi-agent systems, CrewAI provides the fastest path. For minimal-dependency single-agent use cases, smolagents or PydanticAI keep your stack lean.