Once you have mastered DSPy's core modules and optimizers, advanced patterns unlock new capabilities: runtime constraints with dspy.Assert and dspy.Suggest, type-safe outputs with TypedPredictor, and multi-hop retrieval for complex knowledge-intensive tasks. These patterns handle the edge cases and quality requirements that separate prototypes from production systems.
1. dspy.Assert: Hard Constraints on Outputs
Assertions enforce invariants on LLM outputs at runtime. When an assertion fails, DSPy automatically retries the generation (with feedback about the failure) up to a configurable number of times. Think of them as runtime guardrails.
import dspy

class SafeAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.generate(question=question)
        # Hard constraint: answer must not be empty
        dspy.Assert(
            len(result.answer.strip()) > 0,
            "The answer must not be empty.",
        )
        # Hard constraint: answer must be under 200 words
        dspy.Assert(
            len(result.answer.split()) < 200,
            "The answer must be under 200 words.",
        )
        return result

# If an assertion fails, DSPy retries with the error message
# included in the prompt, giving the LLM a chance to self-correct.
qa = SafeAnswer()
result = qa(question="Explain photosynthesis")
When an assertion fails, DSPy appends the failure message to the prompt and retries. This self-correction mechanism is remarkably effective: the LLM sees its previous output alongside the constraint it violated and typically produces a compliant response on the retry.
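Conceptually, this retry behavior amounts to a bounded generate-check loop. Here is a framework-free sketch (the names retry_with_feedback, fake_llm, and non_empty are illustrative, not DSPy APIs):

```python
def retry_with_feedback(generate_fn, check, max_retries=3):
    """Call generate_fn until check passes, feeding the failure
    message back in as extra context on each retry."""
    feedback = None
    for _ in range(max_retries + 1):
        output = generate_fn(feedback)
        ok, message = check(output)
        if ok:
            return output
        feedback = message  # shown to the model on the next attempt
    return output  # give up and return the last attempt

# Toy "model" that only complies once it has seen feedback.
def fake_llm(feedback):
    return "A detailed answer." if feedback else ""

def non_empty(text):
    return (len(text.strip()) > 0, "The answer must not be empty.")

print(retry_with_feedback(fake_llm, non_empty))  # "A detailed answer."
```

The first attempt fails the check, the failure message becomes feedback, and the second attempt succeeds, which mirrors the Assert-retry cycle described above.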
2. dspy.Suggest: Soft Constraints
While Assert is a hard constraint (failure triggers a retry), Suggest is a soft constraint (failure logs a warning but does not block execution). Use suggestions for quality preferences that are desirable but not critical.
class DetailedAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.generate(question=question)
        # Hard: must have content
        dspy.Assert(
            len(result.answer.strip()) > 0,
            "Answer cannot be empty.",
        )
        # Soft: prefer detailed answers
        dspy.Suggest(
            len(result.answer.split()) > 20,
            "Answer should be at least 20 words for thoroughness.",
        )
        # Soft: prefer answers with examples
        dspy.Suggest(
            "for example" in result.answer.lower()
            or "such as" in result.answer.lower(),
            "Answer should include a concrete example.",
        )
        return result
Assertions and suggestions serve different roles during optimization and inference. During optimization, the optimizer uses assertion/suggestion feedback to select better few-shot examples and instructions. During inference, assertions enforce runtime safety while suggestions improve quality without blocking. This dual behavior is a key feature of DSPy's constraint system.
3. Activating Assertions
Assertions must be explicitly activated using a context manager or by wrapping your module. This design prevents assertions from interfering during optimization unless you want them to.
# Activate assertions with backtracking (retries on failure)
import functools

from dspy.primitives.assertions import assert_transform_module, backtrack_handler

# Wrap the module to enable assertion handling; max_backtracks caps retries
qa_with_assertions = assert_transform_module(
    DetailedAnswer(),
    functools.partial(backtrack_handler, max_backtracks=3),
)

# Now assertions are active
result = qa_with_assertions(question="What is machine learning?")

# Without activation, assertions are ignored (no retries)
qa_raw = DetailedAnswer()
result = qa_raw(question="What is machine learning?")  # No assertion enforcement
The max_backtracks parameter limits retries to prevent infinite loops. If the LLM cannot satisfy all assertions within the allowed attempts, the last response is returned along with a warning.
4. TypedPredictor: Structured Output with Pydantic
TypedPredictor uses Pydantic models to enforce structured output from LLM calls. Instead of parsing free-text responses, you define a schema and DSPy ensures the output conforms to it.
from pydantic import BaseModel, Field
from typing import Literal

class MovieReview(BaseModel):
    title: str = Field(description="Title of the movie")
    rating: int = Field(ge=1, le=10, description="Rating from 1 to 10")
    genre: Literal["action", "comedy", "drama", "horror", "sci-fi"] = Field(
        description="Primary genre"
    )
    pros: list[str] = Field(description="List of positive aspects")
    cons: list[str] = Field(description="List of negative aspects")

class ReviewAnalyzer(dspy.Signature):
    """Analyze a movie review and extract structured information."""

    review_text: str = dspy.InputField()
    analysis: MovieReview = dspy.OutputField()

analyzer = dspy.TypedPredictor(ReviewAnalyzer)
result = analyzer(
    review_text="Inception is a mind-bending thriller. The visual effects "
    "are stunning and the plot keeps you guessing. However, "
    "the emotional depth could be stronger."
)

# result.analysis is a validated MovieReview instance
print(result.analysis.title)   # "Inception"
print(result.analysis.rating)  # e.g. 8
print(result.analysis.genre)   # "sci-fi"
print(result.analysis.pros)    # ["Stunning visual effects", "Engaging plot"]
print(result.analysis.cons)    # ["Could have stronger emotional depth"]
Use Pydantic's validation features aggressively. Constraints like ge=1, le=10 for ratings, Literal for categorical fields, and min_length for lists are all enforced automatically. If the LLM's output violates a constraint, TypedPredictor retries with the validation error, similar to how assertions work.
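To see how these constraints behave in isolation, here is a standalone sketch using plain Pydantic (no LLM involved; the Rating model is hypothetical and assumes Pydantic v2):

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

class Rating(BaseModel):
    score: int = Field(ge=1, le=10)
    genre: Literal["action", "comedy", "drama"]
    pros: list[str] = Field(min_length=1)

# Valid data passes through unchanged.
ok = Rating(score=8, genre="drama", pros=["strong cast"])
print(ok.score)  # 8

# Out-of-range or off-schema data raises ValidationError; this is the
# kind of error message TypedPredictor feeds back to the LLM on retry.
try:
    Rating(score=42, genre="western", pros=[])
except ValidationError as e:
    print(len(e.errors()))  # 3 violations: score, genre, pros
```

Each violated constraint produces its own error entry, so the retry prompt tells the model exactly which fields to fix.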
5. TypedChainOfThought
Combining typed outputs with chain-of-thought reasoning gives you both structured results and interpretable reasoning traces.
class DiagnosticResult(BaseModel):
    diagnosis: str = Field(description="The identified issue")
    severity: Literal["low", "medium", "high", "critical"] = Field(
        description="Severity level"
    )
    recommended_actions: list[str] = Field(
        description="Ordered list of recommended actions"
    )

class SystemDiagnostic(dspy.Signature):
    """Diagnose a system issue from error logs."""

    error_log: str = dspy.InputField()
    result: DiagnosticResult = dspy.OutputField()

# TypedChainOfThought adds reasoning before the structured output
diagnostician = dspy.TypedChainOfThought(SystemDiagnostic)
result = diagnostician(
    error_log="ERROR 2026-04-04 OOM killed process PID 1234, RSS 32GB"
)

print(result.rationale)        # "The error shows an out-of-memory kill..."
print(result.result.severity)  # "high"
print(result.result.recommended_actions)
6. Multi-Hop Retrieval
Some questions cannot be answered from a single retrieval step. Multi-hop retrieval iteratively searches for information, using each retrieved passage to refine subsequent queries.
class MultiHopQA(dspy.Module):
    """Answer questions that require multiple retrieval steps."""

    def __init__(self, num_hops=3, passages_per_hop=3):
        super().__init__()
        self.num_hops = num_hops
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_query = dspy.ChainOfThought(
            "context, question -> search_query"
        )
        self.generate_answer = dspy.ChainOfThought(
            "context, question -> answer"
        )

    def forward(self, question):
        context = []
        for hop in range(self.num_hops):
            # Generate a search query based on current context
            if context:
                query_result = self.generate_query(
                    context="\n".join(context),
                    question=question,
                )
                query = query_result.search_query
            else:
                query = question
            # Retrieve passages
            passages = self.retrieve(query=query).passages
            context.extend(passages)
        # Generate final answer from accumulated context
        return self.generate_answer(
            context="\n".join(context),
            question=question,
        )

# Example: a question requiring multiple hops
mhqa = MultiHopQA(num_hops=2)
result = mhqa(
    question="Who was the president of the US when the "
    "creator of Python was born?"
)
# Hop 1: "When was the creator of Python born?"
#   -> Guido van Rossum born Jan 31, 1956
# Hop 2: "Who was the US president in January 1956?"
#   -> Dwight D. Eisenhower
print(result.answer)  # "Dwight D. Eisenhower"
Each hop adds a retrieval call, and every hop after the first adds an LLM call for query generation (the first hop queries with the question itself). A 3-hop pipeline therefore makes at least 6 calls per question (2 query generations, 3 retrievals, 1 answer generation). Set num_hops to the minimum needed for your task. Most questions need only 1 or 2 hops.
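To budget those calls for other configurations, the count implied by the forward method above (query generation runs one fewer time than the number of hops, since the first hop reuses the question) can be written as a tiny helper:

```python
def min_pipeline_calls(num_hops: int) -> int:
    """Minimum LLM + retrieval calls per question for the MultiHopQA
    pipeline above: (num_hops - 1) query generations, num_hops
    retrievals, and one final answer generation."""
    query_generations = num_hops - 1  # first hop reuses the question
    retrievals = num_hops
    answer_generation = 1
    return query_generations + retrievals + answer_generation

print(min_pipeline_calls(2))  # 4
print(min_pipeline_calls(3))  # 6
```

This is a lower bound: assertion retries or failed generations add further calls on top.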
7. Combining Patterns: Production Pipeline
Advanced DSPy applications combine all these patterns: typed outputs for structured data, assertions for safety, multi-hop retrieval for knowledge, and optimization for quality.
class FactCheckResult(BaseModel):
    claim: str
    verdict: Literal["true", "false", "partially true", "unverifiable"]
    evidence: list[str] = Field(min_length=1)
    confidence: float = Field(ge=0.0, le=1.0)

class FactChecker(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.check = dspy.TypedChainOfThought(
            dspy.Signature(
                "claim, evidence -> result: FactCheckResult",
                "Fact-check a claim against retrieved evidence.",
            )
        )

    def forward(self, claim):
        # Retrieve evidence
        passages = self.retrieve(query=claim).passages
        # Assert we found some evidence
        dspy.Assert(
            len(passages) > 0,
            "Must retrieve at least one passage as evidence.",
        )
        # Fact-check with typed output
        result = self.check(
            claim=claim,
            evidence="\n".join(passages),
        )
        # Suggest high confidence
        dspy.Suggest(
            result.result.confidence > 0.5,
            "Confidence should be above 0.5 for reliable verdicts.",
        )
        return result

# Optimize the fact-checker (fact_check_metric and trainset defined elsewhere)
from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot(metric=fact_check_metric)
compiled_checker = optimizer.compile(FactChecker(), trainset=trainset)
compiled_checker.save("fact_checker_v1.json")
8. When to Use Advanced Patterns
Each advanced pattern addresses a specific need. Use this guide to decide which patterns to adopt.
- Assertions: When outputs must satisfy hard business rules (format, length, content restrictions). Add them early; they catch errors before users see them.
- Suggestions: When you want to nudge quality without blocking execution. Useful for non-critical preferences like verbosity or style.
- TypedPredictor: When downstream code consumes the LLM output programmatically. Pydantic validation prevents parsing errors and schema violations.
- Multi-hop retrieval: When single-step retrieval cannot answer the question. Common in complex knowledge tasks, fact-checking, and research synthesis.
Start simple with dspy.Predict or ChainOfThought. Add assertions when you discover failure modes. Upgrade to TypedPredictor when you need structured output. Layer in multi-hop retrieval when your retrieval coverage is insufficient. Each pattern adds complexity, so adopt them incrementally based on measured need.