Opening thesis
You will build a wrapper around the Anthropic API that validates every response against a JSON schema, automatically retries with the validation error appended to the prompt when validation fails, and surfaces a hard error after a bounded number of failed attempts. An agent with strict schema validation retries until it produces valid output, which a human reviewing free-form output would catch by reading every response and re-prompting manually. By the end, your downstream code can rely on the agent producing structured output that matches the contract every time.
Before
You ask the agent to extract structured data from a customer email: name, order number, issue category, urgency. The agent returns prose. You add a system prompt that says "return JSON." The agent returns JSON wrapped in markdown fences. You strip the fences. The agent returns JSON with a hallucinated extra field called "notes" that your downstream code does not expect. You add stricter prompts. The agent returns JSON with urgency: "high" instead of urgency: 5 because the prompt was unclear. You add examples. The agent occasionally returns valid JSON, occasionally returns malformed JSON, occasionally returns JSON with the right shape but wrong types. Your downstream code crashes one in five times. You add try-except blocks and fallback values. The fallbacks corrupt your data pipeline silently. The root problem: you are asking the agent for structure but accepting whatever it produces.
Architecture
The system has three components: a JSON schema definition for the expected output, a validator that checks responses against the schema, and a retry loop that re-prompts the agent with the schema and the validation error if the response is invalid. Validation happens before the response leaves the wrapper. Downstream code only ever sees output that has passed validation.
- Caller invokes Validated agent wrapper with query and JSON schema
- Validated agent wrapper builds a prompt that includes the schema and instructs JSON-only output
- Anthropic API returns a response
- JSON parser attempts to parse the response
- If parse fails: Retry composer builds a new prompt with the parse error
- If parse succeeds: Schema validator checks the structure
- If validation fails: Retry composer builds a new prompt with the validation error
- If validation succeeds: Validated output returns to Caller
- After N failed retries: Hard failure raises an error
Step-by-step implementation
Step 1: Install dependencies
You need the Anthropic SDK and the jsonschema library for validation.
pip install anthropic jsonschema
export ANTHROPIC_API_KEY="sk-ant-..."
Step 2: Define an example schema
Schemas use the standard JSON Schema format. The schema below describes a customer issue extraction task with four required fields and explicit types and enums.
# schemas.py
CUSTOMER_ISSUE_SCHEMA = {
"type": "object",
"properties": {
"customer_name": {"type": "string"},
"order_number": {"type": "string", "pattern": "^ORD-[0-9]{6}$"},
"category": {
"type": "string",
"enum": ["billing", "shipping", "product", "account", "other"],
},
"urgency": {"type": "integer", "minimum": 1, "maximum": 5},
},
"required": ["customer_name", "order_number", "category", "urgency"],
"additionalProperties": False,
}
The additionalProperties: false clause is the constraint that catches hallucinated extra fields.
Step 3: Build the prompt composer
The composer takes a query and a schema and constructs a system prompt that includes the schema and explicit formatting rules. The schema embedded in the prompt acts as both contract and example.
# prompt_composer.py
import json
def build_system_prompt(schema: dict) -> str:
return f"""You produce structured JSON output matching the schema below. No prose, no markdown fences, no commentary.
Schema:
{json.dumps(schema, indent=2)}
Rules:
- Return ONLY a single JSON object that validates against the schema.
- Do not add fields that are not in the schema.
- Do not omit required fields.
- Match the types exactly. If the schema says "integer", return a number, not a string.
- For enum fields, use only values from the enum list."""
Step 4: Build the JSON parser with cleanup
LLMs sometimes wrap JSON in markdown fences despite instructions. The parser strips fences before parsing, then attempts to extract a JSON object even if there is leading or trailing whitespace.
# json_parser.py
import json
import re
def parse_response(text: str):
cleaned = text.strip()
# Strip markdown fences if present
if cleaned.startswith("```"):
cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned)
cleaned = re.sub(r"\s*```$", "", cleaned)
# If there is leading/trailing prose, try to extract the JSON object
match = re.search(r"\{.*\}", cleaned, re.DOTALL)
if match:
cleaned = match.group(0)
return json.loads(cleaned)
Step 5: Build the schema validator
The validator uses the jsonschema library to check parsed JSON against the schema. If validation fails, it returns the error message in a form the retry loop can use as feedback.
# schema_validator.py
from jsonschema import validate, ValidationError
class ValidationFailed(Exception):
def __init__(self, message: str, path: str):
super().__init__(message)
self.message = message
self.path = path
def validate_against_schema(data, schema: dict) -> None:
try:
validate(instance=data, schema=schema)
except ValidationError as e:
path = ".".join(str(p) for p in e.path) if e.path else "root"
raise ValidationFailed(message=e.message, path=path)
Step 6: Build the validated agent wrapper
The wrapper combines the prompt composer, the JSON parser, the schema validator, and a retry loop. On parse or validation failure, it composes a new prompt that includes the original query, the previous response, and the specific error, then retries up to a maximum number of attempts.
# validated_agent.py
import json
import anthropic
from prompt_composer import build_system_prompt
from json_parser import parse_response
from schema_validator import validate_against_schema, ValidationFailed
client = anthropic.Anthropic()
class AgentValidationError(Exception):
pass
def query_with_schema(query: str, schema: dict, max_attempts: int = 3) -> dict:
system = build_system_prompt(schema)
user_message = query
last_response_text = None
last_error = None
for attempt in range(1, max_attempts + 1):
if attempt > 1 and last_response_text is not None:
user_message = (
f"{query}\n\n"
f"Your previous attempt:\n{last_response_text}\n\n"
f"Validation error: {last_error}\n\n"
f"Return a corrected JSON object that matches the schema."
)
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": user_message}],
)
text = response.content[0].text
last_response_text = text
try:
parsed = parse_response(text)
except json.JSONDecodeError as e:
last_error = f"JSON parse failure: {e}"
continue
try:
validate_against_schema(parsed, schema)
return parsed
except ValidationFailed as e:
last_error = f"Schema validation failed at '{e.path}': {e.message}"
continue
raise AgentValidationError(
f"Agent failed to produce valid output after {max_attempts} attempts. "
f"Last error: {last_error}. Last response: {last_response_text}"
)
Step 7: Use the validated wrapper in your application
Call query_with_schema with a query and a schema. The function returns a parsed and validated dict, ready for downstream code.
# main.py
from validated_agent import query_with_schema, AgentValidationError
from schemas import CUSTOMER_ISSUE_SCHEMA
email_text = """
Hi there, my name is Sarah Chen and I placed order ORD-348201 last week.
The package arrived today but the wrong product was inside. I need this
resolved by Friday because it's a gift. Please help.
"""
try:
result = query_with_schema(
query=f"Extract the structured customer issue from this email:\n\n{email_text}",
schema=CUSTOMER_ISSUE_SCHEMA,
)
print(f"Customer: {result['customer_name']}")
print(f"Order: {result['order_number']}")
print(f"Category: {result['category']}")
print(f"Urgency: {result['urgency']}")
except AgentValidationError as e:
print(f"Failed to extract: {e}")
Step 8: Add retry telemetry
Track how often each schema requires retries. Schemas that need frequent retries indicate either a too-strict schema or a poorly-worded prompt. Either way, the data tells you where to improve.
# telemetry.py
import json
import os
from datetime import datetime
LOG_PATH = os.environ.get("VALIDATION_LOG", "validation_telemetry.jsonl")
def log_attempt(schema_name: str, attempt: int, success: bool, error: str = None):
entry = {
"timestamp": datetime.utcnow().isoformat() + "Z",
"schema_name": schema_name,
"attempt": attempt,
"success": success,
}
if error:
entry["error"] = error
with open(LOG_PATH, "a") as f:
f.write(json.dumps(entry) + "\n")
Update validated_agent.py to call log_attempt after each attempt, passing a schema_name parameter through query_with_schema.
Breakage
Skip the schema validator. Trust the agent to return JSON because the prompt says to return JSON. Most of the time, it works. Then a customer email arrives with formatting that confuses the agent. The agent returns JSON with urgency: "very high" instead of an integer. Your downstream code expects an integer and crashes. You add a try-except. The except block logs the failure and substitutes a default urgency of 3. Your support team gets a flood of urgency-3 tickets that should have been urgency-5. The data is wrong. The pipeline keeps running. You only notice when the customer who was about to churn gets de-prioritized for a week.
- Caller sends query to Unvalidated wrapper
- Unvalidated wrapper calls Anthropic API
- Anthropic API returns text response of varying quality
- Downstream parser parses successfully sometimes, fails sometimes
- Try-except fallback substitutes default values on failure
- Corrupted pipeline cannot distinguish real data from fallbacks
- Bad decisions made on corrupted data
The fix
The fix is the validated wrapper from Step 6. The retry loop is the mechanism: when the agent produces output that fails validation, the wrapper does not return the bad output and does not silently substitute defaults. It re-prompts the agent with the specific validation error and the previous response, and the agent corrects itself. The critical part of the loop is isolated below.
# The retry mechanism from validated_agent.py
def query_with_schema(query: str, schema: dict, max_attempts: int = 3) -> dict:
system = build_system_prompt(schema)
user_message = query
last_response_text = None
last_error = None
for attempt in range(1, max_attempts + 1):
# On retry: include the previous response and the validation error.
# The agent learns from its own mistake within the same call.
if attempt > 1 and last_response_text is not None:
user_message = (
f"{query}\n\n"
f"Your previous attempt:\n{last_response_text}\n\n"
f"Validation error: {last_error}\n\n"
f"Return a corrected JSON object that matches the schema."
)
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": user_message}],
)
text = response.content[0].text
last_response_text = text
try:
parsed = parse_response(text)
validate_against_schema(parsed, schema)
return parsed # Only return validated output
except (json.JSONDecodeError, ValidationFailed) as e:
last_error = str(e)
# Hard failure after max attempts: never return invalid output.
raise AgentValidationError(
f"Failed after {max_attempts} attempts. Last error: {last_error}."
)
The loop never returns invalid output. Either the agent converges to valid output within max_attempts, or the wrapper raises an error. There is no silent fallback. Downstream code only ever receives data that matches the schema.
Fixed state
- Caller invokes wrapper with query and schema
- Prompt composer builds system prompt with schema embedded
- Anthropic API returns response
- JSON parser attempts to parse the response
- If parse fails: Retry composer builds new prompt and loop continues
- If parse succeeds: Schema validator runs
- If validation fails: Retry composer builds new prompt with the specific error
- If validation succeeds: Validated output returns to Caller
- After N failed attempts: Hard failure raised
- Every attempt logged to Telemetry log
After
You call query_with_schema with a customer email and the issue extraction schema. The agent returns JSON. The validator checks it. Valid. Your downstream code receives a dict with the right shape, the right types, and no extra fields. You call it a thousand times across a day. The retry counter shows 940 first-attempt successes, 55 second-attempt successes, 5 third-attempt successes, and zero hard failures. Your downstream pipeline never crashes. Your data quality is consistent because every record passes the same structural check. You add a new schema for ticket prioritization. You add another for refund eligibility scoring. Each one is reliable from day one because the wrapper enforces the contract. The agent is fast and structured. Your code stops needing defensive parsing.
Takeaway
The pattern is contract enforcement at the boundary. Define the contract as a schema. Validate every output against the schema. Retry with the error as feedback when validation fails. Never silently substitute. Apply this to any LLM output that downstream code depends on for structure: extraction, classification, decision-making, tool selection. The retry loop is what turns a probabilistic system into a reliable one. The schema is what makes the contract explicit. Together, they let you treat agent output as if it were the output of a well-typed function.