Validators¶
Validate LLM output for correctness and data quality.
Overview¶
Validators check extracted data against rules and can be composed into chains.
from strutex import SchemaValidator, SumValidator, ValidationChain

chain = ValidationChain([
    SchemaValidator(),
    SumValidator(tolerance=0.01),
])
result = chain.validate(data, schema)
if not result.valid:
    print(result.issues)
Built-in Validators¶
SchemaValidator¶
Ensures output structure matches expected schema.
from strutex import SchemaValidator, Object, String, Number, Array

schema = Object(properties={
    "invoice_number": String,
    "total": Number,
    "items": Array(items=Object(properties={
        "amount": Number
    }))
})
validator = SchemaValidator()
result = validator.validate(data, schema)
Checks:
- Required fields are present
- Field types match (string, number, boolean, array, object)
- Nested objects are validated recursively (issues are reported with the full path, e.g., items.0.amount)
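The recursive check with dotted paths can be illustrated without the library. The following is a minimal sketch, not strutex's implementation: the plain-dict schema format and the `check` helper are hypothetical stand-ins for `Object`, `Array`, and `SchemaValidator`.

```python
from typing import Any, List

# Hypothetical, simplified schema format (not strutex's): a dict describes an
# object, a one-element list describes an array, and a Python type is a leaf.
def check(data: Any, schema: Any, path: str = "") -> List[str]:
    """Return issues, each prefixed with the dotted path of the offending field."""
    issues: List[str] = []
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [f"{path or '<root>'}: expected object"]
        for key, sub_schema in schema.items():
            child = f"{path}.{key}" if path else key
            if key not in data:
                issues.append(f"{child}: missing required field")
            else:
                issues.extend(check(data[key], sub_schema, child))
    elif isinstance(schema, list):
        if not isinstance(data, list):
            return [f"{path or '<root>'}: expected array"]
        for index, item in enumerate(data):
            child = f"{path}.{index}" if path else str(index)
            issues.extend(check(item, schema[0], child))
    elif not isinstance(data, schema):
        issues.append(f"{path or '<root>'}: expected {schema.__name__}")
    return issues

schema = {"invoice_number": str, "total": float, "items": [{"amount": float}]}
data = {"invoice_number": "INV-1", "total": 30.0, "items": [{"amount": "ten"}]}
issues = check(data, schema)
print(issues)
```

Each recursive call extends the path by one segment, which is how an error deep inside a list of objects ends up reported as `items.0.amount`.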
SumValidator¶
Verifies line items sum to stated total.
from strutex import SumValidator
validator = SumValidator(
    items_field="line_items",
    amount_field="price",
    total_field="grand_total",
    tolerance=0.01
)
result = validator.validate({
    "line_items": [{"price": 10.00}, {"price": 20.00}],
    "grand_total": 30.00
})
# result.valid == True
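The reason for a tolerance is that floating-point sums rarely compare equal exactly. Below is a library-free sketch of the comparison; the `sums_match` helper is hypothetical, and whether strutex uses `<=` or `<` against the tolerance is an assumption.

```python
from typing import Any, Dict

def sums_match(
    data: Dict[str, Any],
    items_field: str = "items",
    amount_field: str = "amount",
    total_field: str = "total",
    tolerance: float = 0.01,
) -> bool:
    """Return True when the item amounts add up to the total within tolerance."""
    computed = sum(item[amount_field] for item in data[items_field])
    # Assumption: an inclusive comparison; the library's exact rule is not shown.
    return abs(computed - data[total_field]) <= tolerance

invoice = {"line_items": [{"price": 0.1}, {"price": 0.2}], "grand_total": 0.3}

exact_equal = (0.1 + 0.2 == 0.3)  # exact float comparison fails here
within_tolerance = sums_match(invoice, "line_items", "price", "grand_total")
print(exact_equal, within_tolerance)
```

For currency values, a tolerance of 0.01 absorbs rounding noise while still catching a total that is off by a cent or more.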
DateValidator¶
Validates date formats and ranges.
from strutex import DateValidator
validator = DateValidator(
    date_fields=["invoice_date", "due_date"],
    min_year=2020,
    max_year=2030
)
result = validator.validate({
    "invoice_date": "2024-01-15",
    "due_date": "2024-02-15"
})
Accepted formats: ISO (YYYY-MM-DD), European (DD.MM.YYYY), US (MM/DD/YYYY)
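Trying several `strptime` patterns in turn is one common way to accept multiple date formats. The sketch below is an illustration using only the standard library; the `parse_date` helper and the `FORMATS` tuple are hypothetical and are not taken from strutex.

```python
from datetime import datetime
from typing import Optional, Sequence

# strptime patterns for the three formats listed above (an assumed mapping).
FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y")

def parse_date(
    value: str,
    formats: Sequence[str] = FORMATS,
    min_year: int = 1900,
    max_year: int = 2100,
) -> Optional[str]:
    """Return the date normalized to ISO format, or None if it is invalid."""
    for fmt in formats:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # this pattern did not match; try the next one
        if min_year <= parsed.year <= max_year:
            return parsed.date().isoformat()
        return None  # parsed correctly but the year is out of range
    return None  # no pattern matched

print(parse_date("15.01.2024"))
```

Because the three patterns use different separators (`-`, `.`, `/`), a given string can match at most one of them, so the order of the patterns does not affect the result.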
Validation Chains¶
Compose multiple validators:
from strutex import ValidationChain, SchemaValidator, SumValidator, DateValidator

chain = ValidationChain([
    SchemaValidator(strict=True),
    SumValidator(tolerance=0.01),
    DateValidator(),
], strict=True)  # Stop on first failure
result = chain.validate(data, schema)
print(result.valid)   # True/False
print(result.issues)  # List of error messages
print(result.data)    # Possibly modified data
Modes:
- strict=True: Stop on first failure
- strict=False: Collect all issues
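The difference between the two modes can be sketched with plain functions. This is an illustration only: `run_chain` and the two checks are hypothetical and stand in for `ValidationChain` and real validators.

```python
from typing import Any, Callable, Dict, List, Tuple

# A check takes the data and returns a list of issue messages (empty if valid).
Check = Callable[[Dict[str, Any]], List[str]]

def run_chain(
    checks: List[Check], data: Dict[str, Any], strict: bool = True
) -> Tuple[bool, List[str]]:
    """Run checks in order; in strict mode stop at the first one that fails."""
    issues: List[str] = []
    for check in checks:
        found = check(data)
        issues.extend(found)
        if found and strict:
            break  # strict: later checks never run
    return (not issues, issues)

def has_total(data: Dict[str, Any]) -> List[str]:
    return [] if "total" in data else ["missing total"]

def has_date(data: Dict[str, Any]) -> List[str]:
    return [] if "date" in data else ["missing date"]

strict_result = run_chain([has_total, has_date], {}, strict=True)
lenient_result = run_chain([has_total, has_date], {}, strict=False)
print(strict_result)
print(lenient_result)
```

Strict mode suits pipelines where later checks depend on earlier ones passing; lenient mode is more useful when the full list of problems should be reported at once.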
Creating Custom Validators¶
from strutex.plugins import Validator, ValidationResult

class EmailValidator(Validator, name="email"):
    priority = 50

    # source_text mirrors the validate() signature documented in the API reference
    def validate(self, data, schema=None, source_text=None):
        issues = []
        email = data.get("email", "")
        if email and "@" not in email:
            issues.append(f"Invalid email: {email}")
        return ValidationResult(
            valid=len(issues) == 0,
            data=data,
            issues=issues
        )
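The `name="email"` keyword in the class statement is standard Python syntax for passing arguments to a base class at subclass creation. How strutex consumes it is not shown on this page; the sketch below demonstrates one generic way a base class can register subclasses by name using `__init_subclass__`. The `Plugin` base class and its `registry` are hypothetical.

```python
from typing import Dict, Optional, Type

class Plugin:
    """Hypothetical base class that records named subclasses in a registry."""
    registry: Dict[str, Type["Plugin"]] = {}

    def __init_subclass__(cls, name: Optional[str] = None, **kwargs):
        # Called automatically whenever a subclass is defined.
        super().__init_subclass__(**kwargs)
        if name is not None:
            Plugin.registry[name] = cls

class EmailCheck(Plugin, name="email"):
    pass

class PhoneCheck(Plugin, name="phone"):
    pass

# Subclasses can now be looked up by the name given in the class statement.
found = Plugin.registry["email"]
print(found.__name__)
```

With this pattern, defining the subclass is enough to make it discoverable; no separate registration call is required.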
API Reference¶
SchemaValidator(strict: bool = False)¶
Bases: Validator
Validates that extracted data matches the expected schema structure.
Checks:
- Required fields are present
- Field types match (string, number, boolean, array, object)
- Nested objects are validated recursively

| ATTRIBUTE | DESCRIPTION |
|---|---|
| strict | If True, fail on extra fields not in schema |
Initialize the schema validator.
| PARAMETER | DESCRIPTION |
|---|---|
| strict | If True, reject data with fields not in schema. TYPE: bool |
Source code in strutex/validators/schema.py
validate(data: Dict[str, Any], schema: Optional[Schema] = None, source_text: Optional[str] = None) -> ValidationResult¶
Validate data against a schema.
| PARAMETER | DESCRIPTION |
|---|---|
| data | The extracted data to validate. TYPE: Dict[str, Any] |
| schema | The expected schema structure. TYPE: Optional[Schema] |

| RETURNS | DESCRIPTION |
|---|---|
| ValidationResult | ValidationResult with validation status and any issues |
Source code in strutex/validators/schema.py
SumValidator(items_field: str = 'items', amount_field: str = 'amount', total_field: str = 'total', tolerance: float = 0.01, strict: bool = False)¶
Bases: Validator
Validates that line item amounts sum to the stated total.
Common use case: Invoice validation where item totals should match the invoice total.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| items_field | Field name containing the list of items |
| amount_field | Field name in each item containing the amount |
| total_field | Field name containing the expected total |
| tolerance | Acceptable difference (for floating point comparison) |
| strict | If True, fail when required fields are missing |
Initialize the sum validator.
| PARAMETER | DESCRIPTION |
|---|---|
| items_field | Name of the field containing line items. TYPE: str |
| amount_field | Name of the amount field in each item. TYPE: str |
| total_field | Name of the total field. TYPE: str |
| tolerance | Maximum acceptable difference. TYPE: float |
| strict | If True, fail validation when items or total are missing. TYPE: bool |
Source code in strutex/validators/sum.py
validate(data: Dict[str, Any], schema: Any = None, source_text: Optional[str] = None) -> ValidationResult¶
Validate that line items sum to the total.
| PARAMETER | DESCRIPTION |
|---|---|
| data | The extracted data to validate. TYPE: Dict[str, Any] |
| schema | Not used by this validator. TYPE: Any |

| RETURNS | DESCRIPTION |
|---|---|
| ValidationResult | ValidationResult indicating if sums match |
Source code in strutex/validators/sum.py
DateValidator(date_fields: Optional[List[str]] = None, formats: Optional[List[str]] = None, min_year: int = 1900, max_year: int = 2100)¶
Bases: Validator
Validates date fields for format and range.
Checks:
- Date strings match expected formats
- Dates are within acceptable range
- Optional normalization to ISO format

| ATTRIBUTE | DESCRIPTION |
|---|---|
| date_fields | List of field names to validate |
| formats | Accepted date formats (strptime patterns) |
| min_date | Minimum acceptable date |
| max_date | Maximum acceptable date |
Initialize the date validator.
| PARAMETER | DESCRIPTION |
|---|---|
| date_fields | Field names to validate (None = auto-detect). TYPE: Optional[List[str]] |
| formats | Accepted date formats. TYPE: Optional[List[str]] |
| min_year | Minimum acceptable year. TYPE: int |
| max_year | Maximum acceptable year. TYPE: int |
Source code in strutex/validators/date.py
validate(data: Dict[str, Any], schema: Any = None, source_text: Optional[str] = None) -> ValidationResult¶
Validate date fields in the data.
| PARAMETER | DESCRIPTION |
|---|---|
| data | The extracted data to validate. TYPE: Dict[str, Any] |
| schema | Not used by this validator. TYPE: Any |

| RETURNS | DESCRIPTION |
|---|---|
| ValidationResult | ValidationResult with validation status |
Source code in strutex/validators/date.py
ValidationChain(validators: List[Validator], strict: bool = True)¶
Composes multiple validators into a sequential chain.
Validators run in order. If any validator fails (in strict mode), the chain stops and returns the failure. In lenient mode, all validators run and issues are collected.
Example
Initialize the validation chain.
| PARAMETER | DESCRIPTION |
|---|---|
| validators | List of validators to run in order. TYPE: List[Validator] |
| strict | If True, stop on first failure. If False, collect all issues. TYPE: bool |
Source code in strutex/validators/chain.py
add(validator: Validator) -> ValidationChain¶
Add a validator to the chain.
| PARAMETER | DESCRIPTION |
|---|---|
| validator | The validator to add. TYPE: Validator |

| RETURNS | DESCRIPTION |
|---|---|
| ValidationChain | Self for method chaining |
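Returning `self` from `add` is what allows several calls to be strung together in one expression. The sketch below shows the pattern in isolation; the `Chain` class is a hypothetical stand-in and strings take the place of validator instances.

```python
from typing import List

class Chain:
    """Hypothetical stand-in illustrating an add() method that returns self."""

    def __init__(self) -> None:
        self.steps: List[str] = []

    def add(self, step: str) -> "Chain":
        self.steps.append(step)
        return self  # returning self enables chain.add(a).add(b)

chain = Chain().add("schema").add("sum").add("date")
print(chain.steps)
```

Every call in the expression operates on the same object, so the steps accumulate in the order they were added.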
validate(data: Dict[str, Any], schema: Any = None, source_text: Optional[str] = None) -> ValidationResult¶
Run all validators in the chain.
| PARAMETER | DESCRIPTION |
|---|---|
| data | The data to validate. TYPE: Dict[str, Any] |
| schema | Optional schema to pass to validators. TYPE: Any |

| RETURNS | DESCRIPTION |
|---|---|
| ValidationResult | Combined ValidationResult from all validators |
Source code in strutex/validators/chain.py