Core validation types for the Term data quality library.
This module provides the fundamental types for defining and executing data validation suites, including checks, constraints, and results.
§Overview
The core module contains the essential building blocks for data validation:
- ValidationSuite: A collection of checks to run against your data
- Check: A named group of related constraints with a severity level
- Constraint: Individual validation rules (implemented in the constraints module)
- Level: Severity levels for checks (Error, Warning, Info)
- ValidationResult: Results from running a validation suite
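The relationship between these building blocks can be sketched with plain Rust types. The structs, fields, and the helpers `sample_suite` and `constraints_at_level` below are hypothetical stand-ins written for illustration; they are not the term_guard definitions, which carry more state and are built through builders.

```rust
// Hypothetical stand-ins for illustration only; NOT the term_guard types.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Error,
    Warning,
    Info,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Constraint {
    name: String,
}

#[allow(dead_code)]
#[derive(Debug)]
struct Check {
    name: String,
    level: Level,
    constraints: Vec<Constraint>,
}

#[allow(dead_code)]
#[derive(Debug)]
struct ValidationSuite {
    name: String,
    checks: Vec<Check>,
}

/// Count the constraints that belong to checks of the given severity level.
fn constraints_at_level(suite: &ValidationSuite, level: Level) -> usize {
    suite
        .checks
        .iter()
        .filter(|check| check.level == level)
        .map(|check| check.constraints.len())
        .sum()
}

/// Build a small suite: one Error check and one Warning check, two constraints each.
fn sample_suite() -> ValidationSuite {
    let constraint = |name: &str| Constraint { name: name.to_string() };
    ValidationSuite {
        name: "customer_validation".to_string(),
        checks: vec![
            Check {
                name: "critical_fields".to_string(),
                level: Level::Error,
                constraints: vec![constraint("Constraint 1"), constraint("Constraint 2")],
            },
            Check {
                name: "data_quality".to_string(),
                level: Level::Warning,
                constraints: vec![constraint("Constraint 3"), constraint("Constraint 4")],
            },
        ],
    }
}

fn main() {
    let suite = sample_suite();
    println!(
        "{} has {} error-level constraints",
        suite.name,
        constraints_at_level(&suite, Level::Error)
    );
}
```

The point of the sketch is only the ownership shape: a suite owns checks, and each check owns its constraints and carries a single severity level that applies to all of them.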
§Architecture
ValidationSuite
├── Check (Level: Error)
│   ├── Constraint 1
│   └── Constraint 2
└── Check (Level: Warning)
    ├── Constraint 3
    └── Constraint 4
§Example
use term_guard::core::{ValidationSuite, Check, Level, ValidationResult};
use term_guard::core::builder_extensions::{CompletenessOptions, StatisticalOptions};
use term_guard::constraints::{Assertion, FormatType, FormatOptions};
use datafusion::prelude::*;

// Build a validation suite using the unified API
let suite = ValidationSuite::builder("customer_validation")
    .description("Validate customer data quality")
    .check(
        Check::builder("critical_fields")
            .level(Level::Error)
            .description("Critical fields must be valid")
            // Unified API for completeness
            .completeness("customer_id", CompletenessOptions::full().into_constraint_options())
            .completeness("email", CompletenessOptions::threshold(0.99).into_constraint_options())
            // Convenience method for primary key validation
            .primary_key(vec!["customer_id"])
            .build()
    )
    .check(
        Check::builder("data_quality")
            .level(Level::Warning)
            // Format validation API
            .has_format("email", FormatType::Email, 0.95, FormatOptions::default())
            // Combined statistics in one query
            .statistics(
                "age",
                StatisticalOptions::new()
                    .min(Assertion::GreaterThanOrEqual(18.0))
                    .max(Assertion::LessThan(120.0))
            )?
            .build()
    )
    .build();

// Create context and register data
let ctx = SessionContext::new();
// ... register your customer table ...

// Run validation
let results = suite.run(&ctx).await?;

// Process results
match results {
    ValidationResult::Success { report, .. } => {
        println!("All validations passed!");
        for issue in &report.issues {
            if issue.level == Level::Warning {
                println!("Warning: {}", issue.message);
            }
        }
    }
    ValidationResult::Failure { report } => {
        println!("Validation failures:");
        for issue in &report.issues {
            if issue.level == Level::Error {
                println!("Error in '{}': {}", issue.check_name, issue.message);
            }
        }
    }
}
§Constraint Status
Each constraint evaluation returns a status:
- Success: The constraint passed
- Failure: The constraint failed
- Skipped: The constraint was skipped (e.g., no data)
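One way to reason about the three statuses is to tally them across a check. The sketch below defines its own local `ConstraintStatus` enum that mirrors the names above; it is not the term_guard type. The `StatusCounts` struct, the `tally` and `check_passed` helpers, and the policy that a skipped constraint does not count as a failure are all assumptions made for illustration, so the library's own aggregation rules may differ.

```rust
// Local stand-in mirroring the three status names; NOT the term_guard enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConstraintStatus {
    Success,
    Failure,
    Skipped,
}

// Hypothetical summary type for this sketch.
#[derive(Debug, Default, PartialEq, Eq)]
struct StatusCounts {
    success: usize,
    failure: usize,
    skipped: usize,
}

/// Count how many constraints ended in each status.
fn tally(statuses: &[ConstraintStatus]) -> StatusCounts {
    let mut counts = StatusCounts::default();
    for status in statuses {
        match status {
            ConstraintStatus::Success => counts.success += 1,
            ConstraintStatus::Failure => counts.failure += 1,
            ConstraintStatus::Skipped => counts.skipped += 1,
        }
    }
    counts
}

/// Assumed policy: a check passes when no constraint failed.
/// Skipped constraints are treated as neutral here.
fn check_passed(statuses: &[ConstraintStatus]) -> bool {
    tally(statuses).failure == 0
}

fn main() {
    let statuses = [
        ConstraintStatus::Success,
        ConstraintStatus::Skipped,
        ConstraintStatus::Failure,
    ];
    println!("{:?}, passed = {}", tally(&statuses), check_passed(&statuses));
}
```

Whether Skipped should be neutral or treated as a warning is a policy decision; a stricter pipeline might reject any run in which critical constraints were skipped because no data arrived.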
§Performance Considerations
- Constraints within a check may be optimized to run in a single query
- Use the with_optimizer(true) option on ValidationSuite for best performance
- Group related constraints in the same Check when possible
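To see why grouping constraints can help, consider that several constraints over the same table often reduce to aggregates that fit in one SELECT, so the table is scanned once instead of once per constraint. The sketch below illustrates only that idea. The `Aggregate` struct and `combined_query` function are hypothetical and do not describe how Term's optimizer is actually implemented.

```rust
/// Hypothetical description of one aggregate that a constraint needs.
struct Aggregate {
    alias: &'static str,
    expr: &'static str,
}

/// Merge all aggregates into a single SELECT over one table,
/// so the data is scanned once rather than once per constraint.
fn combined_query(table: &str, aggregates: &[Aggregate]) -> String {
    let select_list: Vec<String> = aggregates
        .iter()
        .map(|agg| format!("{} AS {}", agg.expr, agg.alias))
        .collect();
    format!("SELECT {} FROM {}", select_list.join(", "), table)
}

fn main() {
    // Three constraints (row count, email completeness, minimum age)
    // expressed as aggregates over the same table.
    let aggregates = [
        Aggregate { alias: "total", expr: "COUNT(*)" },
        Aggregate { alias: "non_null_email", expr: "COUNT(email)" },
        Aggregate { alias: "min_age", expr: "MIN(age)" },
    ];
    println!("{}", combined_query("customers", &aggregates));
}
```

Constraints that target different tables or need different filters cannot be merged this simply, which is one reason to keep related constraints together in the same Check.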
Re-exports§
pub use validation_context::current_validation_context;
pub use validation_context::ValidationContext;
pub use validation_context::CURRENT_CONTEXT;
Modules§
- builder_extensions: Extended builder API for unified constraints.
- validation_context: Validation context for passing runtime information to constraints.
Structs§
- CacheStats: Cache statistics for monitoring.
- Check: A validation check containing one or more constraints.
- CheckBuilder: Builder for constructing Check instances.
- ConstraintMetadata: Metadata associated with a constraint.
- ConstraintOptions: Common options for constraint configuration.
- ConstraintResult: The result of evaluating a constraint.
- DebugContext: Debug context for validation execution.
- DebugInfo: Debug information collected during validation execution.
- DebugSummary: Summary of debug information.
- ErrorReport: Detailed error report for failed validations.
- LogicalResult: Result of evaluating a logical expression.
- MultiSourceValidator: Multi-source validation engine for cross-table data validation.
- MultiTableCheck: Fluent builder for multi-table validation checks.
- TermContext: A managed DataFusion context for Term validation operations.
- TermContextConfig: Configuration for creating a TermContext.
- UnifiedCompletenessBase: Base implementation for unified completeness-style constraints.
- ValidationIssue: A detailed validation issue found during checks.
- ValidationMetrics: Metrics collected during validation.
- ValidationReport: A validation report containing all issues found.
- ValidationSuite: A collection of validation checks to be run together.
- ValidationSuiteBuilder: Builder for constructing ValidationSuite instances.
Enums§
- ColumnSpec: Specification for columns in a constraint.
- ConstraintStatus: The status of a constraint evaluation.
- DebugLevel: Debug level for validation execution.
- Level: The severity level of a validation check.
- LogicalOperator: Logical operators for combining multiple boolean results.
- ValidationResult: The result of running a validation suite.
Traits§
- CheckMultiTableExt: Extension trait for Check to provide fluent multi-table methods.
- Constraint: A validation constraint that can be evaluated against data.
- ConstraintOptionsBuilder: Builder pattern for constraint options.
- UnifiedConstraint: Base trait for unified constraints.
- ValidationResultDebugExt: Extension trait for ValidationResult to add debug information.