Section 27.4: Custom Tool Design: Validation, Error Handling & Security

"A tool without validation is just a fancy way to fail in production."
Pip, Defensively Programmed AI Agent

Big Picture

The quality of your tools determines the quality of your agent. A model decides whether and how to call a tool based entirely on its name, description, and parameter schema. Poorly designed tools cause incorrect calls, wasted tokens on retries, and security vulnerabilities. This section covers the principles of tool design (atomicity, defensive validation, informative errors), security hardening against prompt injection through tool inputs, and practical patterns for building tools that agents can use reliably in production. The function calling mechanics from Section 27.1 provide the interface layer; this section covers the implementation layer.

Prerequisites

This section builds on agent foundations from Chapter 26 and LLM API basics from Chapter 11.

A humorous scene showing one robot struggling with a tangled Swiss Army knife while a second robot happily uses a well-organized toolbox with clearly separated compartments — **Figure 27.4.1**: Good tool design versus bad. A monolithic multi-purpose tool confuses the agent, while atomic, well-labeled tools with clear boundaries make selection straightforward.

27.4.1 Principles of Tool Design

Fun Fact

Anthropic's internal style guide for tool descriptions reads more like fiction-writing advice than software docs: "give the model a reason to use this tool, not a permission slip." Teams who treat tool descriptions like JSDoc end up with agents that never call their best tools because the description sounds boring even to the model.

Two cartoon agent robots side by side. Left robot wrestles with a giant Swiss-army knife labelled manage_database with 14 blades unfolded in confusing directions. Right robot calmly picks one of four neatly labelled hammers create_record, get_record, update_record, delete_record from a compartmented toolbox — **Figure 27.4.2**: One multi-purpose tool gives the model a complicated interface to misuse. Four atomic tools each name themselves. The right-hand agent makes the same call in half the tokens, with half the retries.

A tool is only as good as its interface. The model decides whether and how to call your tool based entirely on its name, description, and parameter schema. Poorly designed tools lead to incorrect calls, wasted tokens on retries, and frustrated users. Well-designed tools guide the model toward correct usage through clear naming, comprehensive descriptions, strict input validation, and informative error messages.

The first principle is atomicity: each tool should do one thing well. A tool called manage_database that handles creates, reads, updates, and deletes through a single "action" parameter forces the model to understand a complex interface. Four separate tools (create_record, get_record, update_record, delete_record) are clearer and less error-prone. The model can select the right tool based on the user's intent without parsing a multi-purpose interface.

The second principle is defensive design: assume the model will provide invalid input and handle it gracefully. Validate all parameters against expected types, ranges, and formats. Return clear error messages that explain what went wrong and what the correct input should look like. The model can learn from error messages and self-correct, but only if the error message is informative. "Invalid input" is useless; "Expected 'date' in ISO 8601 format (YYYY-MM-DD), received '12/25/2025'" enables self-correction.

Key Insight

Include examples in your tool descriptions. Instead of just "A SQL query to execute", write "A read-only SQL query to execute against the analytics database. Example: SELECT customer_id, SUM(amount) FROM orders WHERE date > '2025-01-01' GROUP BY customer_id LIMIT 100." Models that see examples in the description produce more accurate tool calls because they have a concrete template to follow.

Input Validation Patterns

This snippet validates and sanitizes tool inputs before execution to prevent injection attacks and malformed data.

from pydantic import BaseModel, Field, field_validator
from typing import Optional
from datetime import date
class SearchOrdersInput(BaseModel):
    """Search customer orders with filters."""
    customer_id: str = Field(
        description="Customer ID in format 'CUST-XXXXX'",
        pattern=r"^CUST-\d{5}$",
        )
    start_date: Optional[date] = Field(
        default=None,
        description="Start date for the search range (ISO 8601: YYYY-MM-DD)",
        )
    end_date: Optional[date] = Field(
        default=None,
        description="End date for the search range (ISO 8601: YYYY-MM-DD)",
        )
    status: Optional[str] = Field(
        default=None,
        description="Order status filter",
        enum=["pending", "processing", "shipped", "delivered", "cancelled"],
        )
    limit: int = Field(
        default=20,
        ge=1,
        le=100,
        description="Maximum number of results to return (1 to 100, default 20)",
        )
    @validator("end_date")
    def end_after_start(cls, v, values):
        if v and values.get("start_date") and v < values["start_date"]:
            raise ValueError("end_date must be after start_date")
            return v
        def search_orders_tool(arguments: dict) -> str:
            """Execute the search with validated inputs."""
            try:
                params = SearchOrdersInput(**arguments)
            except Exception as e:
                return f"Validation error: {str(e)}. Please fix the input and try again."
                results = database.search_orders(
                    customer_id=params.customer_id,
                    start_date=params.start_date,
                    end_date=params.end_date,
                    status=params.status,
                    limit=params.limit,
                    )
                return format_results(results)

Code Fragment 27.4.1a: This snippet uses Pydantic models with Field validators to define structured tool input and output schemas, including parameter constraints like ge=0 for amounts and regex patterns for account IDs. The @validator decorators enforce business rules (e.g., transfer amount limits) before the tool function executes.

27.4.2 Error Handling and Recovery

Tools fail. APIs return 500 errors, databases time out, rate limits are hit, authentication tokens expire. How your tool handles these failures determines whether the agent can recover gracefully or gets stuck in an error loop. The key principle is: return actionable error information, not stack traces. The model needs to understand what happened and what it can do about it.

Implement a structured error response format that includes the error type (transient vs. permanent), a human-readable explanation, and suggested next steps. Transient errors (timeout, rate limit) should suggest waiting and retrying. Permanent errors (invalid API key, resource not found) should suggest alternative approaches. Include the original parameters in the error response so the model can modify and retry without re-deriving them from the conversation history.

Rate limiting requires special attention in Section 26.1 because agents can make many tool calls in rapid succession. Implement client-side rate limiting in your tool wrapper rather than relying on the agent to pace itself. When a rate limit is hit, return a clear message with the retry-after time rather than an opaque error. Some teams implement automatic backoff within the tool, sleeping and retrying transparently before returning an error to the model.

Think of tool error handling as a conversation with a junior developer. When something goes wrong, you would not dump a stack trace and walk away. You would say: "The database connection timed out after 5 seconds. This usually means the database is under heavy load. You could try again in a moment, or use the cached results from the last successful query." This is exactly what your tool should return. The model is your junior developer: smart enough to handle problems when given clear guidance, but unable to extract meaning from raw error dumps. Good error messages turn tool failures from dead ends into recoverable situations.

Warning

Never return raw API responses or stack traces as tool results. A 2,000-character stack trace consumes context window tokens without helping the model reason about the error. Wrap all tool implementations in error handlers that produce concise, structured error messages. Log the full error details server-side for debugging, and return only what the model needs to decide its next action.

27.4.3 Security Considerations

Tool access is the primary attack surface for agent systems. A prompt injection that tricks the model into calling delete_all_records or send_email with attacker-controlled content can cause real damage. Defense requires multiple layers: input validation (already covered), output filtering (check tool results before returning them to the model), permission scoping (tools should have minimum necessary access), and action classification (distinguish read-only operations from state-changing ones).

Implement a permission model for your tools. Read-only tools (search, lookup, describe) can be called freely. Write tools (create, update) should require confirmation for sensitive operations. Destructive tools (delete, revoke) should require explicit human approval in the agent loop. This graduated permission model limits the blast radius of prompt injection attacks without crippling the agent's ability to act. See Section 49.1 for a deeper treatment of agent safety.

Real-World Scenario: Permission Tiers for a DevOps Agent

Who: A site reliability engineering (SRE) team at an e-commerce platform managing 40 Kubernetes microservices.

Situation: The team built a DevOps agent that could diagnose and remediate production incidents. During a weekend traffic spike, the agent correctly identified a memory-starved pod and attempted to fix it by scaling the deployment, but it also scaled down a healthy service that shared a resource quota.

Problem: The agent had uniform permissions across all Kubernetes operations. A single overly broad tool definition let the agent perform read, scale, restart, and delete operations with no distinction between safe and dangerous actions.

Decision: The team implemented a four-tier permission model: Tier 1 (auto-approve) for read-only operations like checking logs and listing pods; Tier 2 (log and proceed) for low-risk writes like restarting individual pods and scaling up; Tier 3 (require human approval via PagerDuty) for scaling down, modifying network policies, and rolling back deployments; Tier 4 (never allow) for deleting namespaces, modifying RBAC, and direct database access.

Result: Over the next quarter, the agent autonomously resolved 62% of Tier 1 and Tier 2 incidents. Zero unauthorized destructive actions occurred, compared to two near-misses in the month before tiering was added.

Lesson: Graduated permission tiers let agents act quickly on safe operations while ensuring that dangerous actions always require human judgment.

Key Takeaways

Always validate tool inputs server-side; LLMs can hallucinate parameter values or produce adversarial inputs.
Good tool design for LLMs means descriptive names, clear parameter descriptions, constrained types, and informative errors.
Error messages should tell the LLM how to fix the call, not just report the failure.

Self-Check

Q1: Why is input validation important for custom tools, even when the LLM generates the arguments?

Show Answer

LLMs can hallucinate parameter values, exceed expected ranges, or produce inputs that cause injection attacks. Input validation acts as a safety net, preventing malformed or malicious arguments from reaching downstream systems regardless of the source.

Q2: What are the key principles of designing tools that are easy for LLMs to use correctly?

Show Answer

Tools should have descriptive names, clear parameter descriptions, constrained input types (enums over free text), sensible defaults, and informative error messages. The LLM sees only the schema and descriptions, so clarity in these fields directly determines tool use accuracy.

Exercises

Exercise 21.4.1: Tool Naming Conventions Conceptual

Why is it important that tool names be descriptive and follow consistent naming conventions? Give an example of a poorly named tool and its improved version, explaining why the improvement helps the LLM.

Answer Sketch

LLMs use tool names and descriptions to decide which tool to call. A tool named fn1 gives the model no information. Renamed to search_customer_orders with a clear description, the model can match user intents to tools reliably. Consistent naming (verb_noun pattern) also helps the model distinguish between similar tools like search_orders vs. create_order.

Exercise 21.4.2: Input Validation Layer Coding

Write a Python decorator @validate_tool_input that validates tool arguments against a JSON schema before execution. If validation fails, return a structured error message that helps the LLM correct its call.

Answer Sketch

Use the jsonschema library. The decorator receives the schema as a parameter, validates the incoming arguments, and on failure returns a message like: 'Validation error: field "email" must be a valid email address. You provided: "not-an-email". Please retry with a valid email.' This feedback loop helps the LLM self-correct.

Exercise 21.4.3: Graceful Error Messages Conceptual

Compare two approaches to returning tool errors to an LLM: (a) returning raw stack traces and (b) returning structured error messages with suggested fixes. Which approach leads to better agent behavior, and why?

Answer Sketch

Structured error messages are far better. Raw stack traces contain implementation details (file paths, line numbers) that are meaningless to the LLM and waste tokens. A structured message like 'Error: user_id 12345 not found. Suggestion: verify the user_id with the search_users tool before retrying' gives the model actionable information to recover. This reduces retry loops and improves task completion rates.

Exercise 21.4.4: SQL Injection Prevention Coding

An agent has a query_database tool. Write a safe implementation that uses parameterized queries and prevents the LLM from executing arbitrary SQL. Include an allowlist of permitted query patterns.

Answer Sketch

Define allowed query templates (e.g., 'SELECT {columns} FROM orders WHERE customer_id = ?'). The tool parses the LLM's intent, matches it to a template, extracts parameters, and executes the parameterized query. Reject any input that does not match an allowed template. Never pass raw LLM output directly to a SQL engine.

Exercise 21.4.5: Rate Limiting Tools Conceptual

Why should agent tools implement rate limiting? Describe a scenario where an agent without rate-limited tools could cause problems, and propose a solution.

Answer Sketch

An agent in a loop might call an external API hundreds of times if it misinterprets the task or gets stuck. Without rate limiting, this could exhaust API quotas, incur large costs, or trigger IP blocks. Solution: implement per-tool rate limits (e.g., max 10 calls per minute) with a token bucket algorithm. When the limit is reached, return a message telling the agent to wait or use cached results.

What Comes Next

In the next section, Retrieval as a Tool Call, we look at retrieval through the tool-use lens of this chapter: schema design, when the agent decides to retrieve, and structured error handling. The full agentic-RAG architecture lives in Section 32.3.

Further Reading

Schick, T., Dwivedi-Yu, J., Dessi, R., et al. (2023). "Toolformer: Language Models Can Teach Themselves to Use Tools." NeurIPS 2023. Demonstrates the importance of clean tool interfaces for LLM tool use, as models learn to call tools based on clear API signatures and documentation.

Patil, S.G., Zhang, T., Wang, X., et al. (2023). "Gorilla: Large Language Model Connected with Massive APIs." arXiv preprint. Shows how tool documentation quality directly impacts API call accuracy, motivating careful schema design with descriptive names and comprehensive parameter descriptions.

Anthropic (2024). "Tool Use (Function Calling)." Anthropic Documentation. Practical guide covering tool schema best practices, error handling patterns, and security considerations for production tool implementations.

Greshake, K., Abdelnabi, S., Mishra, S., et al. (2023). "Not What You've Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection." AISec 2023. Demonstrates how tool outputs can contain adversarial content that hijacks agent behavior, motivating input sanitization and output validation in tool design.

OWASP (2024). "OWASP Top 10 for LLM Applications." Industry-standard security guidelines covering excessive agency, insecure output handling, and other vulnerabilities relevant to tool-using LLM systems.

Ruan, Y., Dong, H., Wang, A., et al. (2024). "Identifying the Risks of LM Agents with an LM-Emulated Sandbox." ICLR 2024. Proposes ToolEmu, a framework that emulates tool execution to identify safety risks in agent tool use before deployment, informing defensive tool design practices.