Building Conversational AI with LLMs and Agents
Appendix Q: DSPy: Programmatic Prompt Optimization

Advanced Patterns: Assertions, Typed Predictors, and Multi-Hop

Big Picture

Once you have mastered DSPy's core modules and optimizers, advanced patterns unlock new capabilities: runtime constraints with dspy.Assert and dspy.Suggest, type-safe outputs with TypedPredictor, and multi-hop retrieval for complex knowledge-intensive tasks. These patterns handle the edge cases and quality requirements that separate prototypes from production systems.

1. dspy.Assert: Hard Constraints on Outputs

Assertions enforce invariants on LLM outputs at runtime. When an assertion fails, DSPy automatically retries the generation (with feedback about the failure) up to a configurable number of times. Think of them as runtime guardrails.

import dspy

class SafeAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.generate(question=question)

        # Hard constraint: answer must not be empty
        dspy.Assert(
            len(result.answer.strip()) > 0,
            "The answer must not be empty.",
        )

        # Hard constraint: answer must be under 200 words
        dspy.Assert(
            len(result.answer.split()) < 200,
            "The answer must be under 200 words.",
        )

        return result

# If an assertion fails, DSPy retries with the error message
# included in the prompt, giving the LLM a chance to self-correct.
qa = SafeAnswer()
result = qa(question="Explain photosynthesis")

When an assertion fails, DSPy appends the failure message to the prompt and retries. This self-correction mechanism is remarkably effective: the LLM sees its previous output alongside the constraint it violated and typically produces a compliant response on the retry.
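The retry-with-feedback loop behind this mechanism can be sketched in plain Python. Everything below (generate_with_retries, toy_generate, toy_check) is a hypothetical stand-in to illustrate the control flow, not DSPy API:

```python
def generate_with_retries(generate, check, max_backtracks=3):
    """Assert-style backtracking: regenerate, feeding back the failure
    message, until the constraint passes or retries run out."""
    feedback = None
    answer = None
    for _ in range(max_backtracks + 1):
        answer = generate(feedback)  # feedback is folded into the prompt
        ok, message = check(answer)
        if ok:
            return answer
        feedback = message           # shown to the model on the next try
    return answer                    # last response, constraint still unmet

# Toy stand-ins: a "model" that shortens its answer once it sees feedback
def toy_generate(feedback):
    return "short answer" if feedback else "x " * 300

def toy_check(answer):
    return len(answer.split()) < 200, "The answer must be under 200 words."
```

On the first pass `toy_generate` produces a 300-word answer, the check fails, and the failure message becomes the feedback that steers the second attempt.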

2. dspy.Suggest: Soft Constraints

While Assert is a hard constraint (failure triggers a retry), Suggest is a soft constraint (failure logs a warning but does not block execution). Use suggestions for quality preferences that are desirable but not critical.

class DetailedAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.generate(question=question)

        # Hard: must have content
        dspy.Assert(
            len(result.answer.strip()) > 0,
            "Answer cannot be empty.",
        )

        # Soft: prefer detailed answers
        dspy.Suggest(
            len(result.answer.split()) > 20,
            "Answer should be at least 20 words for thoroughness.",
        )

        # Soft: prefer answers with examples
        dspy.Suggest(
            "for example" in result.answer.lower()
            or "such as" in result.answer.lower(),
            "Answer should include a concrete example.",
        )

        return result

Key Insight

Assertions and suggestions serve different roles during optimization and inference. During optimization, the optimizer uses assertion/suggestion feedback to select better few-shot examples and instructions. During inference, assertions enforce runtime safety while suggestions improve quality without blocking. This dual behavior is a key feature of DSPy's constraint system.

3. Activating Assertions

Assertions must be explicitly activated using a context manager or by wrapping your module. This design prevents assertions from interfering during optimization unless you want them to.

# Activate assertions with backtracking (retries on failure)
import functools

from dspy.primitives.assertions import assert_transform_module, backtrack_handler

# Wrap the module to enable assertion handling; max_backtracks (configured
# on the handler via functools.partial) caps the retry attempts
qa_with_assertions = assert_transform_module(
    DetailedAnswer(),
    functools.partial(backtrack_handler, max_backtracks=3),
)

# Now assertions are active
result = qa_with_assertions(question="What is machine learning?")

# Without activation, assertions are ignored (no retries)
qa_raw = DetailedAnswer()
result = qa_raw(question="What is machine learning?")  # No assertion enforcement

The max_backtracks parameter limits retries to prevent infinite loops. If the LLM cannot satisfy all assertions within the allowed attempts, the last response is returned along with a warning.

4. TypedPredictor: Structured Output with Pydantic

TypedPredictor uses Pydantic models to enforce structured output from LLM calls. Instead of parsing free-text responses, you define a schema and DSPy ensures the output conforms to it.

from pydantic import BaseModel, Field
from typing import Literal

class MovieReview(BaseModel):
    title: str = Field(description="Title of the movie")
    rating: int = Field(ge=1, le=10, description="Rating from 1 to 10")
    genre: Literal["action", "comedy", "drama", "horror", "sci-fi"] = Field(
        description="Primary genre"
    )
    pros: list[str] = Field(description="List of positive aspects")
    cons: list[str] = Field(description="List of negative aspects")

class ReviewAnalyzer(dspy.Signature):
    """Analyze a movie review and extract structured information."""
    review_text: str = dspy.InputField()
    analysis: MovieReview = dspy.OutputField()

analyzer = dspy.TypedPredictor(ReviewAnalyzer)
result = analyzer(
    review_text="Inception is a mind-bending thriller. The visual effects "
    "are stunning and the plot keeps you guessing. However, "
    "the emotional depth could be stronger."
)

# result.analysis is a validated MovieReview instance
print(result.analysis.title)   # "Inception"
print(result.analysis.rating)  # 8
print(result.analysis.genre)   # "sci-fi"
print(result.analysis.pros)    # ["Stunning visual effects", "Engaging plot"]
print(result.analysis.cons)    # ["Could have stronger emotional depth"]

Tip

Use Pydantic's validation features aggressively. Constraints like ge=1, le=10 for ratings, Literal for categorical fields, and min_length for lists are all enforced automatically. If the LLM's output violates a constraint, TypedPredictor retries with the validation error, similar to how assertions work.
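Under the hood, this amounts to checking each field against its constraint and surfacing violations as feedback. Here is a stdlib-only sketch of the checks the MovieReview schema delegates to Pydantic (validate_review is a hypothetical helper, not DSPy or Pydantic API):

```python
GENRES = {"action", "comedy", "drama", "horror", "sci-fi"}

def validate_review(data):
    """Return violation messages for a MovieReview-shaped dict.
    An empty list means the data would pass validation."""
    errors = []
    if not (1 <= data.get("rating", 0) <= 10):
        errors.append("rating must be between 1 and 10")
    if data.get("genre") not in GENRES:
        errors.append("genre must be one of " + ", ".join(sorted(GENRES)))
    if not data.get("pros"):
        errors.append("pros must be a non-empty list")
    return errors
```

TypedPredictor's retry loop sends these messages back to the LLM, which is why precise Field descriptions and tight constraints pay off: they become the correction signal.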

5. TypedChainOfThought

Combining typed outputs with chain-of-thought reasoning gives you both structured results and interpretable reasoning traces.

class DiagnosticResult(BaseModel):
    diagnosis: str = Field(description="The identified issue")
    severity: Literal["low", "medium", "high", "critical"] = Field(
        description="Severity level"
    )
    recommended_actions: list[str] = Field(
        description="Ordered list of recommended actions"
    )

class SystemDiagnostic(dspy.Signature):
    """Diagnose a system issue from error logs."""
    error_log: str = dspy.InputField()
    result: DiagnosticResult = dspy.OutputField()

# TypedChainOfThought adds reasoning before the structured output
diagnostician = dspy.TypedChainOfThought(SystemDiagnostic)
result = diagnostician(
    error_log="ERROR 2026-04-04 OOM killed process PID 1234, RSS 32GB"
)

print(result.reasoning)   # "The error shows an out-of-memory kill..."
print(result.result.severity)  # "high"
print(result.result.recommended_actions)

6. Multi-Hop Retrieval

Some questions cannot be answered from a single retrieval step. Multi-hop retrieval iteratively searches for information, using each retrieved passage to refine subsequent queries.

class MultiHopQA(dspy.Module):
    """Answer questions that require multiple retrieval steps."""

    def __init__(self, num_hops=3, passages_per_hop=3):
        super().__init__()
        self.num_hops = num_hops
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_query = dspy.ChainOfThought(
            "context, question -> search_query"
        )
        self.generate_answer = dspy.ChainOfThought(
            "context, question -> answer"
        )

    def forward(self, question):
        context = []

        for hop in range(self.num_hops):
            # Generate a search query based on current context
            if context:
                query_result = self.generate_query(
                    context="\n".join(context),
                    question=question,
                )
                query = query_result.search_query
            else:
                query = question

            # Retrieve passages
            passages = self.retrieve(query=query).passages
            context.extend(passages)

        # Generate final answer from accumulated context
        return self.generate_answer(
            context="\n".join(context),
            question=question,
        )

# Example: a question requiring multiple hops
mhqa = MultiHopQA(num_hops=2)
result = mhqa(
    question="Who was the president of the US when the "
    "creator of Python was born?"
)
# Hop 1: "When was the creator of Python born?"
#   -> Guido van Rossum born Jan 31, 1956
# Hop 2: "Who was the US president in January 1956?"
#   -> Dwight D. Eisenhower
print(result.answer)  # "Dwight D. Eisenhower"
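Successive hops often retrieve overlapping passages, which wastes context window. An order-preserving dedup helper (a hypothetical addition, not part of DSPy) can be applied before joining the context:

```python
def dedup_passages(passages):
    """Drop duplicate passages while preserving retrieval order."""
    seen = set()
    unique = []
    for passage in passages:
        if passage not in seen:
            seen.add(passage)
            unique.append(passage)
    return unique

# In MultiHopQA.forward, context.extend(passages) could become:
# context = dedup_passages(context + passages)
```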

Warning

Each hop adds a retrieval call, and every hop after the first adds an LLM call for query generation (the first hop reuses the question as the query). A 3-hop pipeline as written therefore makes at least 6 calls per question (2 query generations, 3 retrievals, 1 answer generation). Set num_hops to the minimum needed for your task. Most questions need only 1 or 2 hops.
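The per-question call count for the MultiHopQA implementation above works out as follows (call_count is an illustrative helper, not DSPy API):

```python
def call_count(num_hops):
    """Calls made per question by MultiHopQA.forward as written above."""
    query_generations = max(num_hops - 1, 0)  # first hop reuses the question
    retrievals = num_hops                     # one dspy.Retrieve call per hop
    answer_generation = 1                     # final ChainOfThought call
    return query_generations + retrievals + answer_generation
```

So doubling num_hops roughly doubles latency and cost, which is why the hop count should be tuned against measured accuracy gains.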

7. Combining Patterns: Production Pipeline

Advanced DSPy applications combine all these patterns: typed outputs for structured data, assertions for safety, multi-hop retrieval for knowledge, and optimization for quality.

class FactCheckResult(BaseModel):
    claim: str
    verdict: Literal["true", "false", "partially true", "unverifiable"]
    evidence: list[str] = Field(min_length=1)
    confidence: float = Field(ge=0.0, le=1.0)

class FactChecker(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.check = dspy.TypedChainOfThought(
            dspy.Signature(
                "claim, evidence -> result: FactCheckResult",
                "Fact-check a claim against retrieved evidence.",
            )
        )

    def forward(self, claim):
        # Retrieve evidence
        passages = self.retrieve(query=claim).passages

        # Assert we found some evidence
        dspy.Assert(
            len(passages) > 0,
            "Must retrieve at least one passage as evidence.",
        )

        # Fact-check with typed output
        result = self.check(
            claim=claim,
            evidence="\n".join(passages),
        )

        # Suggest high confidence
        dspy.Suggest(
            result.result.confidence > 0.5,
            "Confidence should be above 0.5 for reliable verdicts.",
        )

        return result

# Optimize the fact-checker (assumes fact_check_metric and trainset
# are defined, as in the earlier sections on optimizers)
from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot(metric=fact_check_metric)
compiled_checker = optimizer.compile(FactChecker(), trainset=trainset)
compiled_checker.save("fact_checker_v1.json")

8. When to Use Advanced Patterns

Each advanced pattern addresses a specific need. Use this guide to decide which patterns to adopt.

Start simple with dspy.Predict or ChainOfThought. Add assertions when you discover failure modes. Upgrade to TypedPredictor when you need structured output. Layer in multi-hop retrieval when your retrieval coverage is insufficient. Each pattern adds complexity, so adopt them incrementally based on measured need.