DSPy (Declarative Self-improving Python) reimagines how we program with LLMs. Instead of hand-crafting prompts, you declare what a module should do (its input/output signature) and let DSPy's optimizers figure out how to prompt the LLM effectively. The framework treats prompts as parameters to be learned, not strings to be engineered. This section introduces DSPy's core abstractions: signatures, modules, and the forward method that ties them together.
1. Installing and Configuring DSPy
DSPy is a Python-first framework distributed on PyPI. After installation, you configure a language model that serves as the backend for all DSPy operations.
# Install DSPy
pip install dspy
# Basic configuration
import dspy
# Configure the default language model
lm = dspy.LM("openai/gpt-4o-mini", api_key="sk-...")
dspy.configure(lm=lm)
# Or use a local model via Ollama
lm = dspy.LM("ollama_chat/llama3.2", api_base="http://localhost:11434")
dspy.configure(lm=lm)
The dspy.configure() call sets the default LM globally. Every DSPy module uses this LM unless explicitly overridden. You can reconfigure at any time, which makes it easy to swap models for experimentation.
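For a temporary override, DSPy also provides a scoped context manager, dspy.context, which restores the previous default when the block exits. A minimal sketch (the model names are placeholders, not recommendations):

```python
import dspy

# Global default for most calls
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Temporarily switch to a stronger model for one block of work
with dspy.context(lm=dspy.LM("openai/gpt-4o")):
    ...  # modules invoked here use gpt-4o

# Outside the block, the global default (gpt-4o-mini) applies again
```

This pattern is useful when one expensive step (say, final synthesis) benefits from a larger model while the rest of the pipeline runs on a cheaper one.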
2. Signatures: Declaring Input and Output
A signature is DSPy's way of declaring what a language model call should accomplish. It specifies the input fields and output fields, along with a natural language description of the task. DSPy uses this declaration to construct prompts automatically.
import dspy
# Inline signature (shorthand)
# "question -> answer" means: given a question, produce an answer
predict = dspy.Predict("question -> answer")
result = predict(question="What is the capital of France?")
print(result.answer) # "Paris"
# You can also annotate field types inline
predict = dspy.Predict("question: str -> answer: str")
The inline signature format uses arrows (->) to separate inputs from outputs. Each field name becomes both a prompt variable and an attribute on the result object.
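To make the shorthand concrete, here is a small hypothetical parser showing how a string like "context, question -> answer" decomposes into input and output field names. This mirrors the format only, not DSPy's actual implementation:

```python
def parse_signature(spec: str) -> tuple[list[str], list[str]]:
    """Split an inline signature into input and output field names.

    Each field may carry an optional type annotation, e.g. "question: str".
    """
    inputs_part, outputs_part = spec.split("->")

    def fields(part: str) -> list[str]:
        # Drop type annotations and surrounding whitespace
        return [f.split(":")[0].strip() for f in part.split(",") if f.strip()]

    return fields(inputs_part), fields(outputs_part)

print(parse_signature("context, question -> answer"))
# (['context', 'question'], ['answer'])
```

Note that both the untyped form ("question -> answer") and the typed form ("question: str -> answer: str") reduce to the same field names.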
3. Class-Based Signatures
For more control, define signatures as classes. Class-based signatures let you add descriptions, type hints, and constraints to each field.
class SentimentAnalysis(dspy.Signature):
"""Classify the sentiment of a product review."""
review: str = dspy.InputField(desc="A product review from a customer")
sentiment: str = dspy.OutputField(
desc="One of: positive, negative, neutral"
)
confidence: float = dspy.OutputField(
desc="Confidence score between 0.0 and 1.0"
)
# Use the signature with a predictor
classifier = dspy.Predict(SentimentAnalysis)
result = classifier(review="This laptop is amazing, best purchase ever!")
print(result.sentiment) # "positive"
print(result.confidence) # 0.95
The docstring on the signature class becomes the task description in the generated prompt. Write it carefully: DSPy uses it as the primary instruction to the LLM. A clear, specific docstring is more effective than a vague one, just as a clear function docstring helps human developers.
4. Building Modules with dspy.Module
A module in DSPy is a reusable component that encapsulates one or more LLM calls. Modules compose just like PyTorch's nn.Module: you define sub-modules in __init__ and wire them together in forward.
class QuestionAnswerer(dspy.Module):
"""Answers questions with a chain-of-thought reasoning step."""
def __init__(self):
super().__init__()
self.generate_answer = dspy.ChainOfThought(
"context, question -> answer"
)
def forward(self, context: str, question: str) -> dspy.Prediction:
return self.generate_answer(context=context, question=question)
# Use the module
qa = QuestionAnswerer()
result = qa(
context="Python was created by Guido van Rossum in 1991.",
question="Who created Python?",
)
print(result.answer) # "Guido van Rossum"
print(result.reasoning) # The chain-of-thought reasoning
The forward method defines the module's execution logic. DSPy calls it when you invoke the module as a function. The method receives keyword arguments and returns a dspy.Prediction object containing all output fields.
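The call-to-forward dispatch follows the same pattern as PyTorch. A stripped-down illustration of the idea, not DSPy's real base class:

```python
class Module:
    """Minimal sketch: invoking the module as a function dispatches to forward()."""

    def __call__(self, **kwargs):
        return self.forward(**kwargs)


class Echo(Module):
    def forward(self, text: str) -> str:
        # Stand-in for an LLM call
        return text.upper()


echo = Echo()
print(echo(text="hello"))  # HELLO
```

Because subclasses only define forward, the base class can wrap every invocation uniformly, which is how frameworks like DSPy hook in tracing and optimization around your logic.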
5. Composing Modules
The real power of DSPy emerges when you compose modules into pipelines. Each module handles one step, and the pipeline wires them together.
class RAGPipeline(dspy.Module):
"""Retrieve relevant context, then answer the question."""
def __init__(self, num_passages=3):
super().__init__()
self.retrieve = dspy.Retrieve(k=num_passages)
self.answer = dspy.ChainOfThought("context, question -> answer")
def forward(self, question: str) -> dspy.Prediction:
# Step 1: Retrieve relevant passages
passages = self.retrieve(question).passages
# Step 2: Generate answer using retrieved context
context = "\n".join(passages)
return self.answer(context=context, question=question)
# Configure a retrieval model (e.g., ColBERTv2)
colbert = dspy.ColBERTv2(url="http://localhost:8893/api/search")
dspy.configure(lm=lm, rm=colbert)
rag = RAGPipeline(num_passages=5)
result = rag(question="What are the benefits of retrieval augmented generation?")
print(result.answer)
This composability is DSPy's key differentiator from prompt engineering. Instead of embedding retrieval logic inside a prompt template, you express it as modular, testable Python code. The LLM calls are abstracted behind signature declarations.
Keep modules small and focused. A module that retrieves, reasons, and formats in a single forward method is hard to optimize and debug. Split it into separate modules (Retriever, Reasoner, Formatter) and compose them. DSPy's optimizers work best when they can tune each module independently.
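The recommended split can be sketched with plain callables standing in for LLM-backed modules; the stub bodies below are placeholders for what would be separate DSPy modules:

```python
class Retriever:
    def __call__(self, question: str) -> list[str]:
        # Placeholder: a real module would query a retrieval model
        return [f"passage about: {question}"]


class Reasoner:
    def __call__(self, context: str, question: str) -> str:
        # Placeholder: a real module would be a ChainOfThought call
        return f"answer derived from {len(context)} chars of context"


class Formatter:
    def __call__(self, answer: str) -> str:
        return f"Answer: {answer}"


class Pipeline:
    def __init__(self):
        self.retrieve = Retriever()
        self.reason = Reasoner()
        self.format = Formatter()

    def __call__(self, question: str) -> str:
        context = "\n".join(self.retrieve(question))
        return self.format(self.reason(context=context, question=question))


print(Pipeline()("What is DSPy?"))
```

Each stage is independently testable with stub inputs, and an optimizer can later tune one stage's prompts without disturbing the others.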
6. Predictions and Field Access
Every DSPy module returns a Prediction object. This is a dictionary-like container that provides attribute access to all output fields, plus metadata about the generation.
class Summarizer(dspy.Signature):
"""Summarize a document into key points."""
document: str = dspy.InputField()
summary: str = dspy.OutputField(desc="3-5 bullet point summary")
key_topics: list[str] = dspy.OutputField(desc="List of main topics covered")
summarize = dspy.Predict(Summarizer)
result = summarize(document="Large language models are neural networks...")
# Access output fields
print(result.summary)
print(result.key_topics)
# Access the full completion for debugging
print(result.completions) # Raw LLM output
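The attribute-access behavior can be modeled with a small dictionary-backed container. This is an illustrative stand-in, not dspy.Prediction itself:

```python
class Prediction:
    """Sketch of a dict-like result that exposes output fields as attributes."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)

    def keys(self):
        return self._fields.keys()


result = Prediction(summary="- LLMs are neural networks", key_topics=["LLMs"])
print(result.summary)           # - LLMs are neural networks
print(list(result.keys()))      # ['summary', 'key_topics']
```

Accessing a field that the LLM did not produce raises AttributeError, which is usually the first signal that a signature's output fields and your downstream code have drifted apart.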
7. The Declarative Philosophy
DSPy inverts the traditional LLM programming model. In conventional prompt engineering, you specify how the model should behave (through carefully worded instructions). In DSPy, you specify what the model should produce (through signatures) and let the framework optimize the how automatically.
This shift has practical consequences. When you need better performance, you do not rewrite prompts. Instead, you provide training examples and run an optimizer (covered in Section Q.3). The optimizer adjusts the prompts, selects few-shot examples, and tunes the instructions, all without changing your module code.
DSPy's declarative approach requires a mindset shift. If you find yourself writing long prompt strings inside a DSPy module, you are fighting the framework. Trust the signatures and let the optimizers do the prompt engineering. Your job is to define clear input/output contracts and provide good training examples.
8. A Complete Example: Multi-Step Reasoning
The following example builds a module that breaks a complex question into sub-questions, answers each one, and synthesizes a final response.
class Decompose(dspy.Signature):
"""Break a complex question into simpler sub-questions."""
question: str = dspy.InputField()
sub_questions: list[str] = dspy.OutputField(
desc="2-4 simpler questions that help answer the main question"
)
class Synthesize(dspy.Signature):
"""Combine sub-answers into a comprehensive final answer."""
question: str = dspy.InputField()
sub_answers: str = dspy.InputField(desc="Answers to sub-questions")
final_answer: str = dspy.OutputField()
class MultiStepQA(dspy.Module):
def __init__(self):
super().__init__()
self.decompose = dspy.Predict(Decompose)
self.sub_answer = dspy.ChainOfThought("question -> answer")
self.synthesize = dspy.Predict(Synthesize)
def forward(self, question: str) -> dspy.Prediction:
# Decompose into sub-questions
decomp = self.decompose(question=question)
# Answer each sub-question
answers = []
for sq in decomp.sub_questions:
ans = self.sub_answer(question=sq)
answers.append(f"Q: {sq}\nA: {ans.answer}")
# Synthesize
combined = "\n\n".join(answers)
return self.synthesize(
question=question,
sub_answers=combined,
)
qa = MultiStepQA()
result = qa(question="How does climate change affect global food security?")
print(result.final_answer)
This module makes multiple LLM calls per invocation. In Section Q.3, you will learn how DSPy's optimizers can automatically find the best prompts and few-shot examples for each of these sub-modules, dramatically improving the quality of the final output.