DSPy ships with a library of built-in modules that implement well-known prompting strategies: chain-of-thought reasoning, ReAct-style tool use, retrieval-augmented generation, and program-of-thought execution. These modules are drop-in replacements for dspy.Predict that add structured reasoning patterns without requiring you to manually engineer the prompts. This section covers each built-in module, when to use it, and how to customize its behavior.
1. dspy.Predict: The Foundation
Before exploring the advanced modules, recall that dspy.Predict is the simplest module. It takes a signature and generates outputs directly, with no intermediate reasoning steps. Think of it as a single LLM call with structured input and output.
import dspy
# Simple prediction: no reasoning, just input -> output
classify = dspy.Predict("email_text -> category: str")
result = classify(email_text="Your order has been shipped!")
print(result.category) # "shipping notification"
dspy.Predict is appropriate for simple tasks where the model can produce a correct answer in a single step: classification, extraction, formatting, and translation. For tasks that require reasoning, use one of the modules below.
2. dspy.ChainOfThought: Step-by-Step Reasoning
Chain-of-thought (CoT) prompting improves accuracy on complex tasks by asking the model to show its work before producing a final answer. DSPy's ChainOfThought module automates this: it adds a rationale field to the output and instructs the model to think step by step.
class MathProblem(dspy.Signature):
    """Solve the given math word problem."""

    problem: str = dspy.InputField()
    answer: float = dspy.OutputField()
# Without CoT: the model jumps directly to an answer
basic = dspy.Predict(MathProblem)
# With CoT: the model reasons before answering
cot = dspy.ChainOfThought(MathProblem)
result = cot(problem="A store sells apples at $2 each. If a customer buys "
                     "3 apples and pays with a $10 bill, how much change do they get?")
print(result.rationale) # "3 apples * $2 = $6. $10 - $6 = $4."
print(result.answer) # 4.0
The rationale field is automatically appended to the signature. You do not need to define it yourself. The generated prompt instructs the model to produce reasoning before the answer, which consistently improves performance on arithmetic, logic, and multi-step problems.
Chain-of-thought is not free. It increases output token usage (and therefore cost and latency) because the model generates the reasoning trace in addition to the answer. For simple tasks where accuracy is already high, dspy.Predict is more cost-effective. Reserve CoT for tasks where reasoning genuinely helps.
3. dspy.ChainOfThoughtWithHint
Sometimes the model needs a nudge in the right direction. ChainOfThoughtWithHint accepts an optional hint that is included in the prompt to guide the reasoning process.
cot_hint = dspy.ChainOfThoughtWithHint("question -> answer")
result = cot_hint(
    question="What is the time complexity of binary search?",
    hint="Think about how the search space is divided at each step.",
)
print(result.rationale)
# "At each step, binary search divides the search space in half.
# Starting with n elements: n, n/2, n/4, ... 1.
# The number of steps is log2(n)."
print(result.answer) # "O(log n)"
Hints are particularly useful during the development phase when you know the general approach but want the model to articulate it. After optimization (covered in Section Q.3), the optimizer may replace or remove hints as it discovers better prompting strategies.
4. dspy.ReAct: Reasoning with Tool Use
The ReAct (Reasoning and Acting) module implements an interleaved reasoning-action loop. The model reasons about what to do, executes a tool, observes the result, and repeats until it has enough information to answer.
# Define tools as Python functions
def search_wikipedia(query: str) -> str:
    """Search Wikipedia for a topic and return a summary."""
    # In practice, call the Wikipedia API
    return f"Wikipedia summary for: {query}"

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    return str(eval(expression))  # Simplified; use a safe evaluator
# Create a ReAct agent with tools
react = dspy.ReAct(
    "question -> answer",
    tools=[search_wikipedia, calculate],
    max_iters=5,
)
result = react(
    question="What is the population of France divided by the area in km2?"
)
# The agent will:
# Thought: I need to find France's population and area
# Action: search_wikipedia("France population")
# Observation: France has ~68 million people
# Action: search_wikipedia("France area")
# Observation: France covers 551,695 km2
# Action: calculate("68000000 / 551695")
# Observation: 123.25
# Answer: ~123 people per km2
print(result.answer)
Keep tool functions simple and well-documented. DSPy uses the function name and docstring to tell the LLM what each tool does. A function named f with no docstring will confuse the model. Name it search_wikipedia and write a clear one-line description.
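The eval-based calculate tool above is unsafe on untrusted model output, since eval will happily execute arbitrary Python. One way to harden it, sketched here with the Python standard library's ast module (the helper name safe_calculate is illustrative, not part of DSPy), is to walk the parsed expression and permit only whitelisted arithmetic operators:

```python
import ast
import operator

# Whitelist of arithmetic operations the evaluator will accept
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def _eval_node(node):
    # Numbers pass through; anything else must be a whitelisted operation
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("Unsupported expression")

def safe_calculate(expression: str) -> str:
    """Evaluate a basic arithmetic expression without eval/exec."""
    tree = ast.parse(expression, mode="eval")
    return str(_eval_node(tree.body))
```

Passing safe_calculate instead of calculate in the tools list keeps the agent's arithmetic ability while rejecting expressions like function calls or attribute access outright.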
5. dspy.Retrieve: Vector-Based Retrieval
The Retrieve module integrates with retrieval models to fetch relevant passages from a corpus. It is the building block for RAG (Retrieval-Augmented Generation) pipelines in DSPy.
# Configure a retrieval model
colbert = dspy.ColBERTv2(url="http://localhost:8893/api/search")
dspy.configure(lm=lm, rm=colbert)
# Use Retrieve in a module
retrieve = dspy.Retrieve(k=5) # Fetch top 5 passages
result = retrieve(query="How do transformers process sequences?")
for i, passage in enumerate(result.passages):
    print(f"[{i+1}] {passage[:100]}...")
DSPy supports multiple retrieval backends beyond ColBERTv2. You can use Pinecone, Weaviate, Qdrant, or any custom retriever that implements the retrieval model interface.
# Using Pinecone as the retrieval backend
import dspy
from dspy.retrieve.pinecone_rm import PineconeRM
retriever = PineconeRM(
    pinecone_index_name="my-docs",
    pinecone_api_key="...",
    k=5,
)
dspy.configure(lm=lm, rm=retriever)
# Now dspy.Retrieve() uses Pinecone under the hood
6. dspy.ProgramOfThought: Code-Based Reasoning
For tasks that benefit from computation (math, data analysis, symbolic reasoning), ProgramOfThought asks the model to write and execute Python code rather than reasoning in natural language.
pot = dspy.ProgramOfThought("question -> answer")
result = pot(
    question="What is the sum of all prime numbers less than 50?"
)
# The model generates code:
# primes = [n for n in range(2, 50) if all(n%i != 0 for i in range(2, int(n**0.5)+1))]
# answer = sum(primes)
print(result.answer) # 328
ProgramOfThought is especially effective for quantitative tasks where chain-of-thought reasoning might introduce arithmetic errors. The model's code is executed in a sandboxed environment, and the result is returned as the answer.
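The program the model generates above can be verified independently. The same computation as a standalone function, using trial division as in the generated list comprehension:

```python
def sum_primes_below(limit: int) -> int:
    """Sum all primes less than limit via trial division."""
    def is_prime(n: int) -> bool:
        # Check divisors up to sqrt(n); 2 is prime (empty range)
        return n >= 2 and all(n % i != 0 for i in range(2, int(n**0.5) + 1))
    return sum(n for n in range(2, limit) if is_prime(n))

print(sum_primes_below(50))  # 328
```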
7. dspy.MultiChainComparison
This module generates multiple chain-of-thought reasoning paths and then selects the best one. It is a form of self-consistency, where the model votes across multiple reasoning traces to find the most reliable answer.
multi = dspy.MultiChainComparison(
    "question -> answer",
    M=5,  # Generate 5 reasoning chains
)
result = multi(
    question="Is a whale a fish or a mammal? Explain why."
)
# Internally, DSPy generates 5 CoT traces, then asks the model
# to compare them and select/synthesize the best answer.
print(result.answer)
print(result.rationale)
MultiChainComparison multiplies your LLM costs by the number of chains (M). For M=5, you are making at least 6 LLM calls per invocation (5 chains plus 1 comparison). Use it selectively for high-stakes tasks where accuracy matters more than latency or cost.
8. Choosing the Right Module
The following guidelines help you select the appropriate built-in module for your task.
- dspy.Predict: Simple classification, extraction, or formatting tasks. One LLM call, lowest cost.
- dspy.ChainOfThought: Tasks requiring reasoning: math, logic, multi-step inference. Adds a rationale field.
- dspy.ReAct: Tasks requiring external information or tool use. Implements a reasoning-action loop.
- dspy.Retrieve: RAG pipelines where you need to fetch relevant documents before generating.
- dspy.ProgramOfThought: Quantitative or computational tasks where code execution is more reliable than verbal reasoning.
- dspy.MultiChainComparison: High-stakes tasks where you want self-consistency across multiple reasoning paths.
These modules can be freely composed. A RAG pipeline might use Retrieve for fetching documents, ChainOfThought for reasoning about them, and Predict for formatting the final output. The composition happens in your dspy.Module.forward method, exactly as shown in Section Q.1.