Agent frameworks enable LLMs to take actions, use tools, plan multi-step workflows, and collaborate with other agents. The seven frameworks compared here cluster around three dominant architecture patterns: graph-based (LangGraph), role-based multi-agent (CrewAI, AutoGen), and code-first (OpenAI Agents SDK, smolagents, PydanticAI), with Semantic Kernel occupying a plugin-and-planner middle ground aimed at enterprise ecosystems. Each pattern optimizes for a different trade-off between control, simplicity, and multi-agent coordination.
1. Architecture Patterns
Agent frameworks differ fundamentally in how they model agent behavior and control flow. Understanding the three dominant patterns helps you narrow the field before evaluating individual frameworks.
1.1 Graph-Based Architecture (LangGraph)
In a graph-based architecture, you define agent behavior as a state machine where nodes represent actions (LLM calls, tool use, human review) and edges represent transitions. The agent traverses the graph based on the current state and the LLM's decisions. This pattern provides maximum control over execution flow, supports cycles (the agent can loop back to previous steps), and makes complex workflows explicit and debuggable.
The following example illustrates a LangGraph agent with a tool-calling loop; a human-in-the-loop approval step could be added via LangGraph's interrupt mechanism.
```python
from langgraph.graph import StateGraph, END
from langgraph.prebuilt import ToolNode
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage
from typing import TypedDict, Annotated
import operator

class AgentState(TypedDict):
    # operator.add tells LangGraph to append new messages rather than overwrite
    messages: Annotated[list, operator.add]

# search_tool and calculator_tool are assumed to be @tool-decorated
# functions defined elsewhere
llm = ChatOpenAI(model="gpt-4o").bind_tools([search_tool, calculator_tool])

def call_model(state: AgentState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: AgentState):
    last = state["messages"][-1]
    if last.tool_calls:
        return "tools"  # the LLM requested a tool call; route to the tool node
    return END

graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", ToolNode([search_tool, calculator_tool]))
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", END: END})
graph.add_edge("tools", "agent")  # after tool execution, loop back to the model

app = graph.compile()
result = app.invoke({"messages": [HumanMessage("What is 15% of the GDP of France?")]})
```
1.2 Role-Based Multi-Agent Architecture (CrewAI, AutoGen)
Role-based frameworks model agents as team members with defined roles, goals, and capabilities. Instead of programming control flow explicitly, you define agents and tasks, and the framework orchestrates their collaboration. This pattern excels at multi-agent scenarios where agents with different specializations need to cooperate.
The following CrewAI example shows a research team with two specialized agents collaborating on a report.
```python
from crewai import Agent, Task, Crew, LLM

llm = LLM(model="gpt-4o")

# search_tool and web_scraper are assumed to be CrewAI tool instances
# defined elsewhere (e.g. from the crewai_tools package)
researcher = Agent(
    role="Senior Research Analyst",
    goal="Find comprehensive data on the given topic",
    backstory="You are an expert researcher with 20 years of experience.",
    tools=[search_tool, web_scraper],
    llm=llm,
)

writer = Agent(
    role="Technical Writer",
    goal="Create a clear, well-structured report from research findings",
    backstory="You specialize in making complex topics accessible.",
    llm=llm,
)

research_task = Task(
    description="Research the current state of LLM inference optimization.",
    expected_output="A detailed list of findings with sources.",
    agent=researcher,
)

writing_task = Task(
    description="Write a 1000-word report based on the research findings.",
    expected_output="A polished report in markdown format.",
    agent=writer,
)

# Tasks run sequentially by default: the writer receives the researcher's output
crew = Crew(agents=[researcher, writer], tasks=[research_task, writing_task], verbose=True)
result = crew.kickoff()
```
1.3 Code-First Architecture (OpenAI Agents SDK, smolagents, PydanticAI)
Code-first frameworks provide minimal abstractions, giving you an agent loop with tool calling built on standard Python patterns. These frameworks prioritize simplicity and transparency over sophisticated orchestration features. They are ideal when you want full control over agent behavior without learning a framework-specific DSL.
The following example uses the OpenAI Agents SDK's straightforward agent definition.
```python
from agents import Agent, Runner, function_tool

@function_tool
def search(query: str) -> str:
    """Search the web for information."""
    # web_search is assumed to be a search helper defined elsewhere
    return web_search(query)

@function_tool
def calculate(expression: str) -> float:
    """Evaluate a mathematical expression."""
    # eval is acceptable for a demo; use a safe expression parser in production
    return eval(expression)

agent = Agent(
    name="Research Assistant",
    instructions="You help users find information and perform calculations.",
    tools=[search, calculate],
    model="gpt-4o",
)

result = Runner.run_sync(agent, "What is 15% of the GDP of France?")
```
2. Comprehensive Feature Comparison
The following table compares all seven agent frameworks across key dimensions relevant to production agent development.
| Feature | LangGraph | CrewAI | AutoGen | OpenAI Agents SDK | Semantic Kernel | smolagents | PydanticAI |
|---|---|---|---|---|---|---|---|
| Architecture | Graph/state machine | Role-based crew | Conversation-based | Code-first loop | Plugin/planner | Code-first minimal | Type-safe agents |
| Maintainer | LangChain Inc. | CrewAI Inc. | Microsoft | OpenAI | Microsoft | Hugging Face | Pydantic team |
| Language | Python, JS/TS | Python | Python, .NET | Python | Python, C#, Java | Python | Python |
| Multi-agent support | Via sub-graphs | Native (crews) | Native (groups) | Via handoffs | Via planners | Basic | Manual composition |
| Tool calling | Full (any provider) | Full (decorator) | Full (function map) | Full (decorator) | Full (plugins) | Full (decorator) | Full (Pydantic) |
| Human-in-the-loop | Native (interrupt) | Built-in | Built-in | Manual | Approval hooks | Manual | Manual |
| State persistence | Checkpointers | Memory system | Conversation store | None built-in | Memory stores | None built-in | None built-in |
| Streaming | Full | Event-based | Full | Full | Full | Basic | Full |
| LLM provider lock-in | None (any provider) | None (litellm) | None (configurable) | OpenAI-first (others via LiteLLM) | None (connectors) | None (any provider) | None (any provider) |
| Observability | LangSmith native | CrewAI+ dashboard | AutoGen Studio | OpenAI dashboard | OpenTelemetry | Basic logging | Logfire native |
| License | MIT | MIT | MIT (CC-BY-4.0 docs) | MIT | MIT | Apache 2.0 | MIT |
| GitHub stars (approx.) | 15k+ | 25k+ | 38k+ | 15k+ | 24k+ | 15k+ | 8k+ |
3. Multi-Agent Patterns
Multi-agent systems represent one of the fastest-growing areas in LLM development. The frameworks differ significantly in how they support agent-to-agent communication and coordination.
3.1 Supervisor Pattern
In the supervisor pattern, one agent coordinates others by deciding which sub-agent to invoke for each step. LangGraph implements this naturally through conditional edges, where a supervisor node routes to specialized agent sub-graphs. The OpenAI Agents SDK supports this through the handoff mechanism, where one agent explicitly transfers control to another.
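The control flow above can be sketched framework-agnostically. In this minimal sketch, `route()` stands in for the supervisor's LLM call and the workers are plain functions; all names are illustrative, not from any specific framework.

```python
# Stub workers standing in for LLM-backed specialist agents.
def research_agent(task: str) -> str:
    return f"research notes on {task}"

def math_agent(task: str) -> str:
    return f"computed answer for {task}"

WORKERS = {"research": research_agent, "math": math_agent}

def route(task: str) -> str:
    """Stub for the supervisor's LLM decision about which worker to invoke."""
    return "math" if any(c.isdigit() for c in task) else "research"

def supervise(task: str) -> str:
    worker = WORKERS[route(task)]  # supervisor picks the sub-agent for this step
    return worker(task)

print(supervise("What is 15% of 2800?"))
```

In LangGraph, `route()` would be the conditional-edge function and each worker a sub-graph; in the OpenAI Agents SDK, the routing decision becomes a handoff.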
3.2 Collaborative Pattern
In the collaborative pattern, agents communicate as peers without a central coordinator. CrewAI implements this through its crew abstraction, where agents pass task outputs to the next agent in a sequence or collaborate in parallel. AutoGen uses group chat, where agents take turns responding to a shared conversation thread.
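The turn-taking mechanics of a group chat can be sketched with stubbed agents. Here each "agent" is a plain function that reads the shared thread and appends a message; the names and round count are illustrative.

```python
# Stub agents standing in for LLM-backed peers in a shared conversation.
def critic(thread: list[str]) -> str:
    return "critic: the draft needs sources"

def writer(thread: list[str]) -> str:
    return "writer: revised draft with sources"

def group_chat(agents, opening: str, rounds: int = 1) -> list[str]:
    """AutoGen-style round robin: agents take turns appending to one thread."""
    thread = [opening]
    for _ in range(rounds):
        for agent in agents:
            thread.append(agent(thread))  # each agent sees the full history
    return thread

thread = group_chat([critic, writer], "user: draft a report on LLM inference")
print(thread[-1])
```

Real group-chat managers add a speaker-selection step (often itself an LLM call) instead of a fixed round robin, but the shared-thread structure is the same.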
3.3 Hierarchical Pattern
The hierarchical pattern organizes agents in a tree structure where higher-level agents delegate to lower-level specialists. CrewAI supports this with its hierarchical process mode. AutoGen supports it through nested group chats. LangGraph supports it through nested sub-graphs with parent-child state passing.
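The tree-shaped delegation can be sketched the same way: a manager decomposes the task, hands each subtask one level down, and aggregates the results. All functions are stubs standing in for LLM-backed agents.

```python
# Stub specialists one level below the manager.
def data_collector(subtask: str) -> str:
    return f"data for {subtask}"

def summarizer(text: str) -> str:
    return f"summary of {text}"

def manager(task: str) -> str:
    subtasks = [f"{task} / part {i}" for i in (1, 2)]  # decompose the task
    collected = [data_collector(s) for s in subtasks]  # delegate down the tree
    return summarizer("; ".join(collected))            # aggregate at the top

print(manager("LLM inference report"))
```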
Multi-agent systems add complexity that is rarely justified for simple tasks. A single agent with multiple tools often outperforms a multi-agent system on straightforward workflows, with lower latency and easier debugging. Reserve multi-agent patterns for genuinely complex workflows where different steps require different expertise, different LLM configurations, or different trust boundaries. Start with a single agent and add agents only when you hit the limits of the single-agent approach.
4. Production Readiness
Agent frameworks vary widely in their production readiness. The following assessment focuses on features that matter when running agents in production environments with real users.
| Production Feature | LangGraph | CrewAI | AutoGen | OpenAI Agents SDK | Semantic Kernel |
|---|---|---|---|---|---|
| State recovery after failure | Checkpointers (Redis, SQL) | Memory persistence | Conversation replay | Manual | Memory stores |
| Timeout and retry handling | Built-in | Built-in | Configurable | Built-in | Built-in |
| Cost control (token budgets) | Via callbacks | Built-in budgets | Token counting | Via API settings | Via filters |
| Guardrails integration | Custom nodes | Guardrails config | Custom agents | Native guardrails | Filters |
| Deployment platform | LangGraph Cloud | CrewAI Enterprise | AutoGen Studio | OpenAI platform | Azure AI |
| Long-running task support | Native (async nodes) | Background tasks | Async groups | Async runner | Step-based |
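Whatever the framework calls it (callbacks, budgets, filters), cost control reduces to the same guard: accumulate usage per run and abort when a limit is hit. A minimal sketch, with token counts stubbed as word counts; production code would read the provider's usage metadata instead.

```python
class BudgetExceeded(RuntimeError):
    pass

class TokenBudget:
    """Tracks cumulative token usage for one agent run and enforces a cap."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    def charge(self, prompt: str, completion: str) -> None:
        # Stub token count: real systems use response.usage from the provider.
        self.used += len(prompt.split()) + len(completion.split())
        if self.used > self.limit:
            raise BudgetExceeded(f"budget of {self.limit} tokens exhausted")

budget = TokenBudget(limit=10)
budget.charge("what is the GDP of France", "about 3 trillion USD")
try:
    budget.charge("now break it down by sector please and thanks", "a long answer")
except BudgetExceeded as exc:
    print(exc)  # the agent loop would stop or escalate here
```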
5. When to Use Each Framework
The following decision table provides concrete recommendations based on common project requirements and team characteristics.
| If you need... | Best Fit | Runner-Up | Rationale |
|---|---|---|---|
| Maximum control over agent flow | LangGraph | Semantic Kernel | Graph-based architecture makes every state transition explicit |
| Quick multi-agent prototype | CrewAI | AutoGen | Role-based definition is intuitive; minimal boilerplate |
| Enterprise .NET/Java ecosystem | Semantic Kernel | AutoGen (.NET) | Microsoft backing; native C# and Java SDKs; Azure integration |
| OpenAI-only deployment | OpenAI Agents SDK | LangGraph | Tightest integration with OpenAI models and platform |
| Minimal dependencies | smolagents | PydanticAI | Lightest footprint; no heavy framework overhead |
| Type-safe structured outputs | PydanticAI | Semantic Kernel | Built on Pydantic; native structured output validation |
| Research agent with code execution | AutoGen | CrewAI | Built-in code executor; designed for code-writing agents |
| Production multi-agent with state | LangGraph | CrewAI | Checkpointing and state recovery for long-running agents |
| Open-source model flexibility | smolagents | LangGraph | Hugging Face ecosystem; works with any model on the Hub |
The agent framework landscape is evolving faster than any other LLM tooling category. New frameworks appear monthly, and existing frameworks add features rapidly. The architectural patterns (graph, role-based, code-first) are more stable than specific framework features. Choose based on architecture fit first, then evaluate features within your preferred architecture pattern.
6. Integration and Interoperability
Agent frameworks do not exist in isolation. They connect to orchestration layers (Section V.2), evaluation tools (Section V.4), and serving infrastructure (Section V.5). Key integration points to evaluate include:
- Tool protocol: LangGraph and CrewAI use different tool-calling interfaces. Ensure your tools can be shared across frameworks if you are evaluating multiple options.
- Observability hooks: LangGraph integrates natively with LangSmith. CrewAI has its own telemetry. Semantic Kernel uses OpenTelemetry. Verify that your chosen observability tool (Section V.4) can capture traces from your agent framework.
- Model Context Protocol (MCP): MCP is emerging as a standard protocol for connecting agents to external tools and data sources. LangGraph, OpenAI Agents SDK, and smolagents all support MCP clients, enabling agents to connect to any MCP-compliant tool server.
- Deployment model: Some frameworks (LangGraph Cloud, CrewAI Enterprise) offer managed deployment. Others require you to build your own deployment infrastructure. Match the deployment model to your operational capabilities.
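One practical way to keep tools shareable across frameworks, as the tool-protocol point above suggests, is to write each tool as a plain, typed Python function and derive the framework-specific wrapper from it. Most tool decorators consume exactly the information below (name, docstring, typed parameters); the `tool_spec` helper here is an illustrative sketch, not any framework's API.

```python
import inspect

def search(query: str, max_results: int = 5) -> str:
    """Search the web and return a text summary."""
    # Stub body; a real tool would call a search API here.
    return f"results for {query!r} (top {max_results})"

def tool_spec(fn):
    """Extract the framework-neutral metadata most tool decorators rely on."""
    sig = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {
            name: param.annotation.__name__
            for name, param in sig.parameters.items()
        },
    }

print(tool_spec(search))
```

Keeping the plain function as the source of truth means switching frameworks costs one thin wrapper per tool rather than a rewrite.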
Summary
Agent frameworks divide into three architecture patterns: graph-based (LangGraph) for maximum control, role-based (CrewAI, AutoGen) for intuitive multi-agent collaboration, and code-first (OpenAI Agents SDK, smolagents, PydanticAI) for simplicity and transparency. Semantic Kernel bridges the enterprise world with multi-language support and Azure integration. For production systems requiring state persistence, failure recovery, and human-in-the-loop approval, LangGraph and Semantic Kernel offer the most mature feature sets. For rapid prototyping of multi-agent systems, CrewAI provides the fastest path. For minimal-dependency single-agent use cases, smolagents or PydanticAI keep your stack lean.