Structured Output Validator
Build a type-safe LLM output validation system that converts unpredictable model responses into reliable, schema-validated JSON.
Blueprint Information Only
This project's code implementation will come later in the curriculum. However, the complete architecture blueprint, functional specifications, core modules, milestones, and interview design explanations are fully active and available below to aid in your study.
Project Overview
Structured Output Validator is a production-style AI engineering project that teaches how to safely consume LLM responses inside real software systems. LLMs naturally generate free-form text, but production applications often need strict JSON structures. This project shows how to request structured output, parse raw model responses, validate them using schemas, handle malformed responses, retry failed generations, and return reliable typed data to the frontend.
Problem Statement
LLMs are excellent at generating natural language, but real applications cannot depend on raw text responses. A backend service, fraud detector, resume matcher, SEO analyzer, product insight dashboard, or agent workflow usually needs predictable structured data. This project solves the problem of converting unstructured LLM completions into schema-safe JSON using validation models, retry policies, repair prompts, and frontend inspection tools.
"Yes, this looks like a scam because the message asks for urgent payment."
{
  "classification": "scam",
  "confidence": 0.87,
  "risk_factors": ["Urgency", "Payment request", "Unknown sender"],
  "safe_action": "Do not click the link"
}
What You Will Build
1. Prompt Input Playground
User can enter raw text, email, message, review, resume content, or any sample input for parsing.
2. Schema Selector
User can select predefined schemas including Scam Detection, Product Review Insight, Resume Match, or Generic JSON Extraction.
3. LLM Response Generator
Backend routes prompts to the LLM provider, injecting schema instructions to enforce structured completions.
4. JSON Validator
Backend parses the raw text completion and validates it against the selected Pydantic model.
5. Error Inspector
Provides full debugging views displaying missing fields, wrong data types, invalid enum selections, and out-of-range confidence scores.
6. Retry & Repair Engine
Retries or repairs bad completions using a feedback loop that sends errors back to the model for correction.
7. Validated Output Viewer
Displays final valid JSON, validation logs, raw model output text, and total repair iteration attempts.
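The schema-instruction step in item 3 can be sketched as follows. This is a minimal illustration, assuming Pydantic v2 (`model_json_schema`); the function name `build_prompt` and the prompt wording are hypothetical, not part of the project code.

```python
import json
from typing import Literal

from pydantic import BaseModel, Field


class ScamDetectionOutput(BaseModel):
    classification: Literal["scam", "safe", "suspicious"]
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str


def build_prompt(user_input: str, schema: type[BaseModel]) -> str:
    """Return a prompt that embeds the JSON Schema of the target model."""
    # model_json_schema() is the Pydantic v2 call that exports a JSON Schema dict.
    schema_json = json.dumps(schema.model_json_schema(), indent=2)
    return (
        "Analyze the input and respond with a single JSON object only.\n"
        "Do not add explanations or markdown fences.\n"
        f"The object must satisfy this JSON Schema:\n{schema_json}\n\n"
        f"Input:\n{user_input}"
    )


prompt = build_prompt(
    "Your account will be blocked. Click this link.", ScamDetectionOutput
)
print(prompt)
```

Deriving the instructions from the model keeps the prompt and the validator in sync: adding a field to the schema changes both at once.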
Concepts You Will Learn
1. Why LLM Output Is Unreliable
Models are probabilistic and do not guarantee syntax validity. They can return extra text, unescaped characters, or hallucinated fields.
2. Prompt JSON vs JSON Mode vs Function Calling
Compare baseline prompt constraints, model JSON mode enforcements, and native tool-calling parameter mappings.
3. Schema-Based Validation
Define schemas that raw outputs must satisfy, so that downstream checks operate on structured data instead of free text.
4. Pydantic Models
Use Pydantic, a widely used third-party Python validation library, to declare strong types, enums, value bounds, and custom validators.
5. Type Safety in AI APIs
Define and enforce strict TypeScript/Python contracts, converting raw inference responses into validated objects.
6. Retry Policies
Construct client-side retry rules to handle network drops, timeouts, and temporary validation failures.
7. Repair Prompts
Build correction prompts dynamically, asking the model to repair malformed JSON based on the parsed validation error messages.
8. Guardrails
Add safety checks that block toxic or malicious outputs before they reach the database.
9. Production AI Reliability
Implement strategies like schema fallback defaults and circuit breakers to guarantee runtime stability.
10. Structured Outputs for Agents & RAG
How structured outputs enable reliable agent tool calls, decision routing, and semantic data extractions.
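Concepts 6 and 7 (retry policies and repair prompts) can be combined into one feedback loop. The sketch below assumes Pydantic v2; `generate_with_repair`, `summarize_errors`, and `fake_llm` are hypothetical names, and `fake_llm` stands in for a real provider call.

```python
from typing import Callable, Literal, Optional

from pydantic import BaseModel, Field, ValidationError


class ScamDetectionOutput(BaseModel):
    classification: Literal["scam", "safe", "suspicious"]
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str


def summarize_errors(exc: ValidationError) -> str:
    """Turn Pydantic errors into short 'field: message' lines for the model."""
    lines = []
    for err in exc.errors():
        path = ".".join(str(part) for part in err["loc"]) or "<root>"
        lines.append(f"{path}: {err['msg']}")
    return "\n".join(lines)


def generate_with_repair(
    prompt: str,
    call_llm: Callable[[str], str],  # hypothetical provider wrapper
    max_retries: int = 2,
) -> tuple[Optional[ScamDetectionOutput], int, str]:
    """Return (validated output or None, retries used, last raw output)."""
    current_prompt = prompt
    raw = ""
    for attempt in range(max_retries + 1):
        raw = call_llm(current_prompt)
        try:
            return ScamDetectionOutput.model_validate_json(raw), attempt, raw
        except ValidationError as exc:
            # Feed the validation errors back so the next attempt can fix them.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was rejected:\n{raw}\n"
                f"Validation errors:\n{summarize_errors(exc)}\n"
                "Reply again with corrected JSON only."
            )
    return None, max_retries, raw


# Stand-in for a real model: the first reply is out of range, the second is valid.
_replies = iter([
    '{"classification": "scam", "confidence": 120, "risk_factors": [], "safe_action": "Ignore"}',
    '{"classification": "scam", "confidence": 0.9, "risk_factors": [], "safe_action": "Ignore"}',
])


def fake_llm(_prompt: str) -> str:
    return next(_replies)


result, retries, last_raw = generate_with_repair("Classify this message.", fake_llm)
print(result, retries)
```

Capping the loop with `max_retries` bounds both latency and token cost; a real implementation would also handle network errors separately from validation failures.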
Project Requirements
Functional Requirements
| ID | Requirement Description |
|---|---|
| FR-01 | User can enter an input prompt/message |
| FR-02 | User can select a predefined output schema |
| FR-03 | System sends request to LLM with structured output instruction |
| FR-04 | System receives raw LLM response |
| FR-05 | System parses raw response into JSON |
| FR-06 | System validates response against Pydantic schema |
| FR-07 | System shows valid JSON output if validation passes |
| FR-08 | System shows validation errors if validation fails |
| FR-09 | System retries failed output generation based on retry policy |
| FR-10 | System attempts repair for malformed JSON |
| FR-11 | System displays retry count, validation status, and error reason |
| FR-12 | User can compare raw response vs validated response |
Non-Functional Requirements
| ID | Requirement Description |
|---|---|
| NFR-01 | API response should be fast for small inputs |
| NFR-02 | Validation errors should be readable |
| NFR-03 | System should not expose API keys to frontend |
| NFR-04 | Frontend should handle loading, success, and error states |
| NFR-05 | Backend should log validation failures |
| NFR-06 | System should be extensible for new schemas |
| NFR-07 | UI should work well on desktop and mobile |
| NFR-08 | Code should be interview-ready and easy to explain |
| NFR-09 | Backend should handle invalid user input safely |
| NFR-10 | Project should support future LLM provider changes |
System Architecture
The frontend never talks directly to the LLM provider. It sends the user input and selected schema to the FastAPI backend. The backend builds the prompt, calls the LLM, parses the response, validates it using Pydantic, and returns either a valid typed JSON response or detailed validation errors.
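The last step of that flow, returning either typed data or detailed errors, might look like the sketch below. It assumes Pydantic v2; `SCHEMA_REGISTRY`, `validate_raw_output`, and the `ReviewInsight` fields are illustrative assumptions, while the response keys mirror the API contract in this document.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ScamDetectionOutput(BaseModel):
    classification: Literal["scam", "safe", "suspicious"]
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str


class ReviewInsight(BaseModel):
    # Hypothetical fields, shown only to illustrate a second schema.
    sentiment: Literal["positive", "neutral", "negative"]
    summary: str


# Adding a schema means adding one entry here.
SCHEMA_REGISTRY: dict[str, type[BaseModel]] = {
    "scam_detection": ScamDetectionOutput,
    "product_review": ReviewInsight,
}


def validate_raw_output(schema_type: str, raw_output: str, retry_count: int = 0) -> dict:
    """Validate raw JSON text and build the response envelope."""
    model_cls = SCHEMA_REGISTRY.get(schema_type)
    if model_cls is None:
        raise KeyError(f"unknown schema_type: {schema_type}")
    try:
        parsed = model_cls.model_validate_json(raw_output)
    except ValidationError as exc:
        return {
            "status": "invalid",
            "schema_type": schema_type,
            "retry_count": retry_count,
            "errors": [
                {"field": ".".join(str(p) for p in err["loc"]), "message": err["msg"]}
                for err in exc.errors()
            ],
            "raw_output": raw_output,
        }
    return {
        "status": "valid",
        "schema_type": schema_type,
        "retry_count": retry_count,
        "data": parsed.model_dump(),
    }


ok = validate_raw_output(
    "scam_detection",
    '{"classification": "scam", "confidence": 0.92, '
    '"risk_factors": ["Urgency"], "safe_action": "Do not click the link"}',
)
bad = validate_raw_output(
    "scam_detection", '{ "classification": "scam", "confidence": 120 }', retry_count=2
)
print(ok["status"], bad["status"])
```

Keeping the registry as a plain dictionary is one simple way to stay extensible for new schemas without touching the validation logic.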
Backend Design
apps/api
├── main.py
├── routes
│   └── validate.py
├── schemas
│   ├── scam_detection.py
│   ├── product_review.py
│   ├── resume_match.py
│   └── generic_extraction.py
├── services
│   ├── llm_client.py
│   ├── prompt_builder.py
│   ├── output_parser.py
│   ├── validator.py
│   └── repair_engine.py
├── core
│   ├── config.py
│   └── errors.py
└── tests
    ├── test_validator.py
    └── test_repair_engine.py
main.py: Application entrypoint initializing the FastAPI instance, routes, and middleware.
routes/validate.py: BFF routes processing requests, mapping configurations, and running validations.
schemas/: Pydantic models folder containing the ScamDetectionOutput, ReviewInsight, and ResumeMatch models.
services/llm_client.py: API proxy wrapper directing queries to OpenAI or an alternative LLM API.
services/repair_engine.py: Self-correcting repair loop requesting model updates with error context logs.
Frontend Design
apps/web
├── app
│   └── structured-output
│       └── page.tsx
├── components
│   ├── PromptInput.tsx
│   ├── SchemaSelector.tsx
│   ├── ValidationResult.tsx
│   ├── JsonViewer.tsx
│   ├── ErrorInspector.tsx
│   └── RetryTimeline.tsx
├── lib
│   └── api.ts
└── types
    └── structured-output.ts
page.tsx: Syllabus details container embedding layout panels.
PromptInput.tsx: User input form handling long text payloads.
SchemaSelector.tsx: Dropdown element switching active validation types.
ErrorInspector.tsx: Debugging display listing error logs and paths.
RetryTimeline.tsx: Logs display tracing repair operations.
API Contract
Request:
{
  "input": "Your account will be blocked. Click this link immediately to verify payment.",
  "schema_type": "scam_detection",
  "max_retries": 2
}
Valid response:
{
  "status": "valid",
  "schema_type": "scam_detection",
  "retry_count": 0,
  "data": {
    "classification": "scam",
    "confidence": 0.92,
    "risk_factors": [
      "Urgency",
      "Suspicious link",
      "Account threat"
    ],
    "safe_action": "Do not click the link"
  }
}
Invalid response:
{
  "status": "invalid",
  "schema_type": "scam_detection",
  "retry_count": 2,
  "errors": [
    {
      "field": "confidence",
      "message": "Value must be between 0 and 1"
    }
  ],
  "raw_output": "{ \"classification\": \"scam\", \"confidence\": 120 }"
}
Example Pydantic Model
Pydantic Model Schema
from typing import Literal

from pydantic import BaseModel, Field


class ScamDetectionOutput(BaseModel):
    # Only these three labels are accepted.
    classification: Literal["scam", "safe", "suspicious"]
    # Confidence must lie in the closed interval [0, 1].
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str

This schema ensures that arbitrary or invalid data from the LLM is rejected before it reaches the rest of the application. The confidence field must stay between 0 and 1, and classification must match one of the allowed values.
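To see the model reject bad data, the following sketch validates one good and one bad payload. It assumes Pydantic v2 (`model_validate` and `ValidationError.errors()`); the payloads are made up for illustration.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ScamDetectionOutput(BaseModel):
    classification: Literal["scam", "safe", "suspicious"]
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str


# A payload that satisfies every constraint.
valid = ScamDetectionOutput.model_validate(
    {
        "classification": "scam",
        "confidence": 0.87,
        "risk_factors": ["Urgency", "Payment request"],
        "safe_action": "Do not click the link",
    }
)

# A payload with a bad enum value, an out-of-range score, and a missing field.
failures: dict[str, str] = {}
try:
    ScamDetectionOutput.model_validate(
        {"classification": "dangerous", "confidence": 120, "risk_factors": ["Urgency"]}
    )
except ValidationError as exc:
    # Map each failing field name to Pydantic's machine-readable error type.
    failures = {str(err["loc"][0]): err["type"] for err in exc.errors()}

print(valid.classification, failures)
```

Pydantic reports all failing fields in one `ValidationError` rather than stopping at the first, which is what makes a field-by-field error inspector practical.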
Validation Lifecycle
Step 1: User submits input prompt
Step 2: Backend selects schema config
Step 3: Prompt builder injects schema instructions
Step 4: LLM generates response completion
Step 5: Parser extracts the JSON object from the raw response text
Step 6: Pydantic validates fields, types, and constraints
Step 7: If valid, return typed response
Step 8: If invalid, retry or repair using the validation errors
Step 9: Return final validation status to frontend
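Step 5 is worth a closer look, because models often wrap the JSON in extra prose. One possible approach, using only the standard library, scans for the first position where a JSON object decodes cleanly. The function name `extract_json_object` is a hypothetical helper, not part of the project code.

```python
import json


def extract_json_object(raw: str) -> dict:
    """Return the first JSON object embedded in raw text, or raise ValueError."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            value, _end = decoder.raw_decode(raw, start)
        except json.JSONDecodeError:
            pass  # this brace did not start valid JSON; try the next one
        else:
            if isinstance(value, dict):
                return value
        start = raw.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


raw_reply = (
    'Sure, here is the analysis: {"classification": "scam", "confidence": 0.87} '
    "Hope that helps!"
)
extracted = extract_json_object(raw_reply)
print(extracted)
```

This handles leading and trailing text but not truncated or otherwise broken JSON; those cases still need a retry or a repair prompt.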
Common Failure Cases
Malformed JSON
The model returns extra text before or after the JSON object, so the response cannot be parsed as-is.
Missing Required Field
The model fails to output a required property defined in the schema.
Wrong Data Type
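The model returns confidence as a string (e.g. '0.9') instead of a float. Note that Pydantic's default (lax) mode coerces numeric strings like this to float, so rejecting them requires strict mode.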
Invalid Enum Value
The model returns an invalid enum value (e.g., 'dangerous') instead of 'scam', 'safe', or 'suspicious'.
Hallucinated Fields
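The model adds extra fields not expected by the Pydantic schema. By default Pydantic silently ignores unknown fields, so rejecting them requires configuring the model with extra="forbid".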
Invalid Range Constraints
The model returns confidence as 120 instead of a value between 0 and 1.
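Two of these failure cases (wrong data type and hallucinated fields) are not caught by a default Pydantic model. The sketch below contrasts default behaviour with a stricter configuration, assuming Pydantic v2 (`ConfigDict` with `strict` and `extra`); the class names `LaxOutput` and `StrictOutput` and the helper `failure_types` are hypothetical.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class LaxOutput(BaseModel):
    """Default behaviour: coerces compatible types and ignores extra fields."""

    classification: Literal["scam", "safe", "suspicious"]
    confidence: float = Field(ge=0, le=1)
    risk_factors: list[str]
    safe_action: str


class StrictOutput(LaxOutput):
    """Same fields, but no type coercion and no unknown keys."""

    model_config = ConfigDict(strict=True, extra="forbid")


def failure_types(model_cls: type[BaseModel], payload: dict) -> list[str]:
    """Return Pydantic error type codes for a payload, or [] if it is valid."""
    try:
        model_cls.model_validate(payload)
    except ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []


base = {
    "classification": "scam",
    "confidence": 0.87,
    "risk_factors": ["Urgency"],
    "safe_action": "Do not click the link",
}
# String confidence plus a field the schema does not define.
sloppy = {**base, "confidence": "0.9", "explanation": "Looks urgent"}

lax_result = LaxOutput.model_validate(sloppy)  # accepted: coerced and trimmed
strict_failures = failure_types(StrictOutput, sloppy)
print(lax_result.confidence, strict_failures)
```

Whether to use strict mode is a design choice: lax coercion tolerates harmless formatting drift, while strict mode surfaces every deviation so the error inspector and repair loop can act on it.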
Interview Explanation
How to present this project:
“In production AI systems, we cannot directly trust raw LLM text. I built a structured output validation layer where the backend defines strict schemas using Pydantic. The LLM response is parsed, validated, and either accepted, repaired, retried, or rejected. This makes the AI system safer, type-safe, easier to debug, and suitable for downstream automation.”
Possible Interview Questions
- Why is raw LLM output risky in production?
- What is the difference between JSON mode and schema validation?
- How does Pydantic help in AI applications?
- How would you handle malformed JSON from an LLM?
- What retry strategy would you use for failed validation?
- How do structured outputs help agents and RAG pipelines?
- How would you monitor validation failures in production?
- What is the difference between validation and guardrails?
- How would you design this for multiple schema types?
- How would you safely expose this system to the frontend?
After Building This Project, You Can Explain
- Why raw LLM output is risky in software systems
- Why structured output is needed and how to design it
- How schema validation works at the backend API layer
- How Pydantic validates LLM responses in Python
- How retry and repair logic improves AI reliability
- How frontend interfaces inspect validation success/failure
- How to explain this project during technical AI interviews
Build Milestones
Milestone 1: Static UI Blueprint
Build the page layout, input panel, schema selector, and output viewer.
Milestone 2: Backend API Setup
Create FastAPI endpoint routing logic and request/response schemas.
Milestone 3: Pydantic Validation
Add schemas for scam detection, review insights, resume match, and generic JSON extraction.
Milestone 4: LLM Integration
Connect backend service wrappers to OpenAI or any compatible LLM endpoint.
Milestone 5: Retry and Repair Engine
Implement retry logic and iterative repair prompt generation.
Milestone 6: Frontend Integration
Hook UI components to backend endpoints and display valid/invalid states.
Milestone 7: Production Documentation
Add README guidelines, architecture flows, and deploy configurations.
Future Improvements
- Add multiple LLM provider support (Anthropic, Gemini, local models)
- Add streaming output validation handling chunked tokens in flight
- Add schema generation from natural language input descriptions
- Add validation analytics dashboard monitoring latency and accuracy
- Add saved validation history tracking logs
- Add user-defined custom schemas sandbox
- Add LangChain / Instructor / Guardrails comparison reference documentation
- Add agent tool-call validation support mapping parameters
- Add observability telemetry tracking validation failure rates
- Add cost tracking for retry attempts
Key Skills
- Structured LLM output design
- JSON parsing
- Schema validation
- Pydantic models
- Retry policies
- Repair prompts
- AI API design
- Frontend error inspection
- Production AI reliability
Interview Value
- Demonstrates the ability to turn probabilistic AI outputs into predictable, type-safe structures, a skill expected in senior system design interviews.