LLMs generate free-form text by default, but applications need structured data: JSON objects, typed fields, lists, enums. LangChain provides several mechanisms for extracting structured output from model responses, ranging from prompt-based parsers to native model features like tool calling. The modern recommended approach uses with_structured_output(), which leverages the model's built-in structured generation capabilities. This section covers both the legacy parsers and the modern approach.
1. Why Structured Output Matters
Consider a customer support bot that classifies incoming tickets. If the model returns "This seems like a billing issue, probably high priority," your code has to parse that free text to extract the category and priority. This is fragile and error-prone. Structured output lets you define an exact schema (category must be one of "billing", "technical", "general"; priority must be "low", "medium", or "high") and get back a validated object rather than a string to parse.
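The fragility is easy to demonstrate with a short sketch. The response string and regex patterns below are hypothetical, but they show the core problem: a regex that happens to match one phrasing breaks the moment the model rewords its answer.

```python
import re

# A plausible free-text response from the model (hypothetical wording).
response = "This seems like a billing issue, probably high priority."

# Brittle extraction: these patterns only work if the model happens
# to use the exact category and priority words in its prose.
category_match = re.search(r"\b(billing|technical|general)\b", response)
priority_match = re.search(r"\b(low|medium|high)\b", response)

category = category_match.group(1) if category_match else None
priority = priority_match.group(1) if priority_match else None

print(category, priority)  # billing high

# The same classification, reworded, silently yields nothing:
reworded = "Looks like a problem with an invoice. Handle it urgently."
print(re.search(r"\b(billing|technical|general)\b", reworded))  # None
```

A schema-validated response makes this entire class of bug impossible: either you get a well-typed object or the call fails loudly.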
2. The Modern Approach: with_structured_output
The simplest and most reliable way to get structured output from a chat model is with_structured_output(). This method is available on all chat models that support tool calling (OpenAI, Anthropic, Google, Mistral, and others). You pass a Pydantic model or JSON schema, and the method returns a new runnable that outputs validated objects instead of raw text.
This example defines a Pydantic model for ticket classification and uses with_structured_output() to ensure the model's response conforms to the schema.
from pydantic import BaseModel, Field
from typing import Literal
from langchain_openai import ChatOpenAI
class TicketClassification(BaseModel):
    """Classification of a customer support ticket."""
    category: Literal["billing", "technical", "general", "account"] = Field(
        description="The primary category of the ticket"
    )
    priority: Literal["low", "medium", "high", "critical"] = Field(
        description="The urgency level"
    )
    summary: str = Field(
        description="A one-sentence summary of the issue"
    )
    requires_human: bool = Field(
        description="Whether the ticket needs human escalation"
    )
model = ChatOpenAI(model="gpt-4o", temperature=0)
structured_model = model.with_structured_output(TicketClassification)
ticket_text = """
I've been charged twice for my subscription this month.
The first charge was on the 1st and the second on the 15th.
I need a refund for the duplicate charge immediately.
"""
result = structured_model.invoke(
    f"Classify this support ticket:\n\n{ticket_text}"
)
# result is a TicketClassification instance, not a string
print(f"Category: {result.category}") # "billing"
print(f"Priority: {result.priority}") # "high"
print(f"Summary: {result.summary}")
print(f"Needs human: {result.requires_human}") # True
print(f"Type: {type(result)}") # <class 'TicketClassification'>
Always add description fields to your Pydantic model attributes. These descriptions are sent to the model as part of the schema and significantly improve the quality of structured output. Think of them as instructions for each field.
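You can inspect exactly what the model sees: Pydantic's model_json_schema() emits the JSON Schema, including every Field description, and this is essentially the payload with_structured_output() attaches to the request. A small sketch with a trimmed-down version of the ticket schema (Pydantic v2 assumed):

```python
from pydantic import BaseModel, Field
from typing import Literal

class TicketClassification(BaseModel):
    """Classification of a customer support ticket."""
    category: Literal["billing", "technical", "general", "account"] = Field(
        description="The primary category of the ticket"
    )
    priority: Literal["low", "medium", "high", "critical"] = Field(
        description="The urgency level"
    )

schema = TicketClassification.model_json_schema()

# The descriptions travel with the schema, so the model reads them
# as per-field instructions.
print(schema["properties"]["category"]["description"])
# The primary category of the ticket
print(schema["properties"]["category"]["enum"])
# ['billing', 'technical', 'general', 'account']
```

Printing the schema like this is a useful debugging step when the model keeps filling a field incorrectly: often the description is vaguer than you thought.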
Nested and Complex Schemas
Pydantic models can be nested to represent complex structures. The model will populate all levels of the hierarchy.
from pydantic import BaseModel, Field
from typing import List, Literal, Optional

class Entity(BaseModel):
    """A named entity extracted from text."""
    name: str = Field(description="The entity name")
    entity_type: str = Field(description="Type: person, organization, location, date")
    context: str = Field(description="The sentence where this entity appears")

class DocumentAnalysis(BaseModel):
    """Complete analysis of a document."""
    title: str = Field(description="A suitable title for the document")
    language: str = Field(description="The primary language of the text")
    entities: List[Entity] = Field(description="All named entities found")
    key_topics: List[str] = Field(description="3 to 5 main topics")
    sentiment: Literal["positive", "negative", "neutral", "mixed"] = Field(
        description="Overall sentiment"
    )
    word_count_estimate: int = Field(description="Approximate word count")
structured_model = model.with_structured_output(DocumentAnalysis)
analysis = structured_model.invoke("Analyze this text: " + some_text)
# Access nested objects
for entity in analysis.entities:
    print(f"  {entity.name} ({entity.entity_type})")
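Conceptually, the heavy lifting here is Pydantic's own validation: with_structured_output() takes the arguments from the model's tool call and validates them into your schema, recursing through nested models. The sketch below illustrates that step in isolation, with a hand-written dict standing in for the model's tool-call arguments:

```python
from pydantic import BaseModel
from typing import List

class Entity(BaseModel):
    name: str
    entity_type: str

class DocumentAnalysis(BaseModel):
    title: str
    entities: List[Entity]

# Stand-in for the arguments returned in the model's tool call.
raw_args = {
    "title": "Quarterly Report",
    "entities": [{"name": "Acme Corp", "entity_type": "organization"}],
}

# model_validate turns the plain dict into typed, nested objects:
# each item in "entities" becomes a full Entity instance.
analysis = DocumentAnalysis.model_validate(raw_args)
print(type(analysis.entities[0]).__name__)  # Entity
```

This is also why a bad response fails loudly: if the model returned an entity missing "name", model_validate would raise a ValidationError rather than handing you a half-formed object.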
3. Legacy Output Parsers
Before with_structured_output() existed, LangChain used output parsers that instructed the model (via prompt engineering) to format its response as JSON, then parsed that JSON into Python objects. These parsers are still available and useful for models that do not support tool calling.
PydanticOutputParser
The PydanticOutputParser generates format instructions that are injected into the prompt, then validates the model's response against the schema.
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from typing import List
class Recipe(BaseModel):
    name: str = Field(description="Name of the recipe")
    ingredients: List[str] = Field(description="List of ingredients")
    prep_time_minutes: int = Field(description="Preparation time in minutes")
    difficulty: str = Field(description="easy, medium, or hard")
parser = PydanticOutputParser(pydantic_object=Recipe)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful cooking assistant.\n{format_instructions}"),
    ("human", "Give me a recipe for {dish}.")
])
# The parser generates instructions like:
# "The output should be formatted as a JSON instance..."
chain = prompt.partial(
    format_instructions=parser.get_format_instructions()
) | model | parser
recipe = chain.invoke({"dish": "pasta carbonara"})
print(f"{recipe.name}: {recipe.prep_time_minutes} min, {recipe.difficulty}")
print(f"Ingredients: {', '.join(recipe.ingredients)}")
JsonOutputParser
When you do not need Pydantic validation and just want a Python dictionary, JsonOutputParser extracts JSON from the model's response.
from langchain_core.output_parsers import JsonOutputParser
json_parser = JsonOutputParser()
chain = (
    ChatPromptTemplate.from_template(
        "Return a JSON object with keys 'city', 'country', and 'population' "
        "for: {query}\n{format_instructions}"
    ).partial(format_instructions=json_parser.get_format_instructions())
    | model
    | json_parser
)
result = chain.invoke({"query": "Tokyo"})
print(result) # {'city': 'Tokyo', 'country': 'Japan', 'population': 13960000}
print(type(result)) # <class 'dict'>
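At its core, a JSON parser like this has to locate and decode the JSON object inside whatever prose the model wraps around it. The stdlib sketch below illustrates that behavior; it is not LangChain's actual implementation, just the underlying idea:

```python
import json
import re

def extract_json(text: str) -> dict:
    """Pull the first JSON object out of a model response,
    tolerating surrounding prose and markdown code fences."""
    # Prefer a ```json ... ``` fenced block if one is present.
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if fenced:
        candidate = fenced.group(1)
    else:
        # Otherwise take the outermost braces in the raw text.
        candidate = text[text.find("{"): text.rfind("}") + 1]
    return json.loads(candidate)

response = 'Sure! Here is the data:\n```json\n{"city": "Tokyo", "country": "Japan"}\n```'
print(extract_json(response))  # {'city': 'Tokyo', 'country': 'Japan'}
```

When the model ignores the format instructions entirely and returns no JSON at all, json.loads raises: that failure mode is exactly what the fixing and retry parsers in the next sections address.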
Prefer with_structured_output() for any model that supports tool calling (GPT-4o, Claude, Gemini). It is more reliable because it uses the model's native structured generation rather than hoping the model follows format instructions in the prompt. Use legacy parsers only for models that lack tool-calling support or when you need the format instructions to be visible in the prompt for debugging.
4. Output Fixing
Sometimes the model's output almost matches the expected format but has minor issues (a missing closing brace, an extra comma, a field with the wrong type). The OutputFixingParser wraps another parser and, when parsing fails, sends the malformed output back to the LLM with the error message, asking it to fix the formatting.
from langchain.output_parsers import OutputFixingParser
# Wrap the Pydantic parser with auto-fixing
fixing_parser = OutputFixingParser.from_llm(
    parser=parser,  # The PydanticOutputParser from above
    llm=model
)
# Even if the model returns slightly malformed JSON, the fixer will retry
malformed = '{"name": "Carbonara", "ingredients": ["pasta", "eggs"], "prep_time_minutes": "thirty", "difficulty": "medium"}'
# "thirty" is a string but the schema expects int
# The fixing parser will ask the LLM to correct it
fixed = fixing_parser.parse(malformed)
print(f"Prep time: {fixed.prep_time_minutes}") # 30 (corrected to int)
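The fixing loop itself is simple: parse, and on failure hand the broken output plus the error message back to a model for correction. Here is a conceptual stdlib sketch with a stub standing in for the LLM call; the real OutputFixingParser builds a fixing prompt and re-invokes the model you pass to from_llm:

```python
import json

def stub_llm_fix(bad_output: str, error: str) -> str:
    # Stand-in for the LLM call. A real fixer sends bad_output and the
    # error message to the model; here we simulate a corrected reply.
    return '{"name": "Carbonara", "prep_time_minutes": 30}'

def parse_with_fixing(output: str, max_attempts: int = 2) -> dict:
    for _ in range(max_attempts):
        try:
            return json.loads(output)
        except json.JSONDecodeError as err:
            # Parsing failed: ask the "model" to repair the output.
            output = stub_llm_fix(output, str(err))
    return json.loads(output)  # final attempt; raises if still broken

malformed = '{"name": "Carbonara", "prep_time_minutes": 30'  # missing brace
print(parse_with_fixing(malformed))
```

Note that each fix attempt costs an extra model call, so a fixing parser trades latency and tokens for resilience.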
5. Retry Parser
The RetryOutputParser goes further than the fixing parser: when parsing fails, it sends both the original prompt and the malformed output back to the model, giving it the full context to produce a correct response. This is useful when the model's output is fundamentally wrong rather than just syntactically broken.
from langchain.output_parsers import RetryOutputParser
retry_parser = RetryOutputParser.from_llm(
    parser=parser,
    llm=model,
    max_retries=2  # Try up to 2 times before raising an error
)
# The retry parser needs access to the original prompt
# so it can re-ask the model with context
from langchain_core.prompt_values import StringPromptValue
prompt_value = StringPromptValue(text="Give me a recipe for pasta carbonara.")
completion = "Here is a great carbonara recipe..." # No JSON at all
try:
    result = retry_parser.parse_with_prompt(completion, prompt_value)
    print(result)
except Exception as e:
    print(f"Failed after retries: {e}")
6. Streaming Structured Output
When using with_structured_output(), you can stream partial results as the model generates them. This is valuable for user-facing applications where you want to show structured data progressively. LangChain yields partial objects (dictionaries, or partial Pydantic objects where the model and schema support it) as fields become available.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from typing import List
class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    year: int = Field(description="Release year")
    rating: float = Field(description="Rating from 0 to 10")
    pros: List[str] = Field(description="What worked well")
    cons: List[str] = Field(description="What could be improved")
    verdict: str = Field(description="One sentence verdict")
model = ChatOpenAI(model="gpt-4o", temperature=0)
structured_model = model.with_structured_output(MovieReview)
# Stream partial objects
for partial in structured_model.stream("Review the movie Inception (2010)"):
    print(partial)
    # Early chunks: MovieReview(title='Inception', year=None, ...)
    # Later chunks fill in more fields progressively
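The progressive behavior comes from re-parsing the growing JSON prefix on every chunk, speculatively closing whatever is still open. The naive stdlib sketch below illustrates the idea; LangChain's actual partial-JSON parser is more robust (this version, for instance, miscounts escaped quotes):

```python
import json

def try_parse_prefix(buffer: str):
    """Naively complete a truncated JSON object and try to parse it."""
    candidate = buffer
    # Close an unterminated string, then any open brackets and braces.
    if candidate.count('"') % 2 == 1:
        candidate += '"'
    candidate += "]" * (candidate.count("[") - candidate.count("]"))
    candidate += "}" * (candidate.count("{") - candidate.count("}"))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None  # prefix not yet parseable; wait for more chunks

# Simulated token chunks arriving from the model:
buffer = ""
for chunk in ['{"title": "Incep', 'tion", "year": 2010', ', "rating": 8.8}']:
    buffer += chunk
    partial = try_parse_prefix(buffer)
    if partial is not None:
        print(partial)
# {'title': 'Incep'}
# {'title': 'Inception', 'year': 2010}
# {'title': 'Inception', 'year': 2010, 'rating': 8.8}
```

Notice the first partial contains a truncated string value; a UI consuming this stream should treat every field as provisional until the stream ends.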
Use with_structured_output() with Pydantic models as your default approach for structured generation. Add field descriptions to guide the model. For resilience, consider wrapping legacy parsers with OutputFixingParser. Reserve RetryOutputParser for cases where you need full-context retries.