Module data

Comprehensive data validation system with schema validation and constraint enforcement.

Comprehensive Data Validation System

Production-grade data validation system for SciRS2 Core providing schema validation, constraint enforcement, and data integrity checks for scientific computing applications in regulated environments.

§Features

  • JSON Schema validation with scientific extensions
  • Constraint-based validation (range, format, pattern)
  • Composite constraints with logical operators (AND, OR, NOT, IF-THEN)
  • Data integrity verification with checksums
  • Type safety validation for numeric data
  • Custom validation rules and plugins
  • Performance-optimized validation pipelines
  • Integration with ndarray for array validation
  • Support for complex nested data structures
  • Validation caching for repeated validations
  • Detailed error reporting with context
  • ConstraintBuilder for fluent constraint composition
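As an illustration of the checksum-based integrity check listed above, here is a minimal standard-library sketch. It does not use or reproduce the SciRS2 API: the function names `fnv1a64`, `checksum_f64`, and `verify_integrity` are hypothetical, and FNV-1a is chosen only because it is short and deterministic, not because the library is known to use it.

```rust
/// FNV-1a 64-bit hash over raw bytes (illustrative choice of checksum).
fn fnv1a64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    let mut hash = OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Checksum of a numeric buffer, hashing each value's little-endian bytes.
fn checksum_f64(values: &[f64]) -> u64 {
    let mut bytes = Vec::with_capacity(values.len() * 8);
    for v in values {
        bytes.extend_from_slice(&v.to_le_bytes());
    }
    fnv1a64(&bytes)
}

/// Returns true when the buffer still matches a previously recorded checksum.
fn verify_integrity(values: &[f64], expected: u64) -> bool {
    checksum_f64(values) == expected
}
```

A caller would record `checksum_f64(&data)` when the data is produced and call `verify_integrity(&data, recorded)` before using it. Because the hash works on bit patterns, `0.0` and `-0.0` produce different checksums.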

§Example

use scirs2_core::validation::data::{Validator, ValidationSchema, ValidationConfig, DataType, Constraint};
use ::ndarray::Array2;

// Create a validation schema
let schema = ValidationSchema::new()
    .require_field("name", DataType::String)
    .require_field("age", DataType::Integer)
    .add_constraint("age", Constraint::Range { min: 0.0, max: 150.0 })
    .require_field("data", DataType::Array(Box::new(DataType::Float64)));

let config = ValidationConfig::default();
let validator = Validator::new(config)?;

// For JSON validation (when serde feature is enabled)

{
    let data = serde_json::json!({
        "name": "Test Dataset",
        "age": 25,
        "data": [[1.0, 2.0], [3.0, 4.0]]
    });

    let result = validator.validate(&data, &schema)?;
    if result.is_valid() {
        println!("Data is valid!");
    } else {
        println!("Validation errors: {:#?}", result.errors());
    }
}
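To make the idea behind the example concrete, the following standard-library sketch shows what checking required fields, types, and a range constraint could look like. It is not the SciRS2 implementation: `Value`, `Kind`, `FieldRule`, and `validate_record` are hypothetical names introduced for illustration, and the range is assumed to be inclusive at both ends.

```rust
use std::collections::HashMap;

/// Hypothetical value type standing in for parsed input data.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    FloatArray(Vec<f64>),
}

/// Hypothetical type tag used by the rules below.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Str,
    Int,
    FloatArray,
}

impl Value {
    fn kind(&self) -> Kind {
        match self {
            Value::Str(_) => Kind::Str,
            Value::Int(_) => Kind::Int,
            Value::FloatArray(_) => Kind::FloatArray,
        }
    }
}

/// One required field with an expected type and an optional inclusive range.
struct FieldRule {
    name: &'static str,
    kind: Kind,
    range: Option<(f64, f64)>,
}

/// Collects one message per violated rule; an empty result means "valid".
fn validate_record(record: &HashMap<String, Value>, rules: &[FieldRule]) -> Vec<String> {
    let mut errors = Vec::new();
    for rule in rules {
        match record.get(rule.name) {
            None => errors.push(format!("missing required field `{}`", rule.name)),
            Some(value) if value.kind() != rule.kind => errors.push(format!(
                "field `{}`: expected {:?}, found {:?}",
                rule.name,
                rule.kind,
                value.kind()
            )),
            Some(Value::Int(n)) => {
                // Range checks apply only to integer fields in this sketch.
                if let Some((min, max)) = rule.range {
                    let x = *n as f64;
                    if x < min || x > max {
                        errors.push(format!(
                            "field `{}`: {} is outside [{}, {}]",
                            rule.name, n, min, max
                        ));
                    }
                }
            }
            Some(_) => {}
        }
    }
    errors
}
```

Returning a list of messages rather than a single `bool` mirrors the "detailed error reporting" goal: every violated rule is reported in one pass.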

§Using Composite Constraints

The validation system supports composite constraints built from logical operators:

use scirs2_core::validation::data::{Constraint, ConstraintBuilder, ValidationSchema, DataType};

// Create complex constraints using the builder
let age_constraint = ConstraintBuilder::new()
    .range(18.0, 65.0)
    .not_null()
    .and();

// Use logical operators for conditional validation
let email_or_phone = Constraint::Or(vec![
    Constraint::Pattern(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$".to_string()),
    Constraint::Pattern(r"^\+?[1-9]\d{1,14}$".to_string()),
]);

// Conditional constraints: if age > 18, require consent field
let consent_constraint = Constraint::if_then(
    Constraint::Range { min: 18.0, max: f64::INFINITY },
    Constraint::NotNull,
    None
);

let schema = ValidationSchema::new()
    .require_field("age", DataType::Integer)
    .add_constraint("age", age_constraint)
    .require_field("contact", DataType::String)
    .add_constraint("contact", email_or_phone);
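The logical operators above can be pictured as a small expression tree that is evaluated recursively. The sketch below is a standalone illustration over a single `f64` value, not the library's `Constraint` type: the `Rule` enum and the `holds` function are hypothetical, ranges are assumed to be inclusive, and an IF-THEN rule without an else branch is assumed to pass when its condition is false.

```rust
/// Hypothetical constraint tree over a single numeric value.
#[derive(Debug, Clone)]
enum Rule {
    Range { min: f64, max: f64 },
    Not(Box<Rule>),
    And(Vec<Rule>),
    Or(Vec<Rule>),
    IfThen {
        condition: Box<Rule>,
        then_rule: Box<Rule>,
        else_rule: Option<Box<Rule>>,
    },
}

/// Recursively evaluates a rule against `x`.
fn holds(rule: &Rule, x: f64) -> bool {
    match rule {
        // Inclusive bounds; NaN fails every range check.
        Rule::Range { min, max } => *min <= x && x <= *max,
        Rule::Not(inner) => !holds(inner, x),
        // `all` stops at the first failing rule, `any` at the first passing one.
        Rule::And(rules) => rules.iter().all(|r| holds(r, x)),
        Rule::Or(rules) => rules.iter().any(|r| holds(r, x)),
        Rule::IfThen { condition, then_rule, else_rule } => {
            if holds(condition, x) {
                holds(then_rule, x)
            } else {
                // Without an else branch the rule passes vacuously.
                else_rule.as_ref().map_or(true, |r| holds(r, x))
            }
        }
    }
}
```

Because `Iterator::all` and `Iterator::any` short-circuit, the `And` and `Or` branches evaluate only as many child rules as needed. Note that an empty `And` passes and an empty `Or` fails in this sketch.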

§Performance Features

The validation system includes several performance optimizations:

  • Validation Caching: Results are cached for repeated validations with configurable TTL
  • Parallel Validation: Array elements can be validated in parallel when enabled
  • Early Exit: Validation stops at first error when configured for fail-fast mode
  • Lazy Evaluation: Composite constraints evaluate only as needed
  • Memory Efficiency: Streaming validation for large datasets

use scirs2_core::validation::data::ValidationConfig;

let mut config = ValidationConfig::default();
config.strict_mode = true; // Fail fast on first error
config.enable_caching = true; // Enable result caching
config.cache_size_limit = 1000; // Cache up to 1000 results
config.enable_parallel_validation = true; // Parallel array validation
config.performance_mode = true; // Optimize for speed
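To show how fail-fast mode and a size-limited result cache can interact, here is a self-contained sketch that uses only the standard library. It is an assumption-laden illustration, not the library's internals: `SketchConfig` and `CachingRangeValidator` are hypothetical types, the cache is keyed on the exact bit patterns of the input, and TTL-based expiry is left out for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the two settings this sketch models.
struct SketchConfig {
    fail_fast: bool,
    cache_size_limit: usize,
}

/// Range validator that remembers results for inputs it has already seen.
struct CachingRangeValidator {
    config: SketchConfig,
    min: f64,
    max: f64,
    /// Maps the bit patterns of an input slice to its out-of-range indices.
    cache: HashMap<Vec<u64>, Vec<usize>>,
    cache_hits: usize,
}

impl CachingRangeValidator {
    fn new(config: SketchConfig, min: f64, max: f64) -> Self {
        Self { config, min, max, cache: HashMap::new(), cache_hits: 0 }
    }

    /// Returns the indices of elements outside `[min, max]`.
    /// In fail-fast mode only the first offending index is reported.
    fn validate(&mut self, values: &[f64]) -> Vec<usize> {
        let key: Vec<u64> = values.iter().map(|v| v.to_bits()).collect();
        if let Some(cached) = self.cache.get(&key) {
            self.cache_hits += 1;
            return cached.clone();
        }

        let mut bad = Vec::new();
        for (i, &v) in values.iter().enumerate() {
            // Written as a negation so that NaN counts as out of range.
            if !(self.min <= v && v <= self.max) {
                bad.push(i);
                if self.config.fail_fast {
                    break;
                }
            }
        }

        // Stop inserting once the cache is full; no eviction in this sketch.
        if self.cache.len() < self.config.cache_size_limit {
            self.cache.insert(key, bad.clone());
        }
        bad
    }
}
```

With `fail_fast: true`, validating `[0.5, 2.0, 3.0]` against `[0.0, 1.0]` reports only index 1, and a second call with the same input is served from the cache. With `fail_fast: false` both offending indices are reported. A production cache would also need an eviction policy and expiry, which this sketch omits.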

§Re-exports

pub use config::ErrorSeverity;
pub use config::QualityIssueType;
pub use config::ValidationConfig;
pub use config::ValidationErrorType;
pub use schema::DataType;
pub use schema::FieldDefinition;
pub use schema::ValidationSchema;
pub use constraints::ArrayValidationConstraints;
pub use constraints::Constraint;
pub use constraints::ConstraintBuilder;
pub use constraints::ElementValidatorFn;
pub use constraints::ShapeConstraints;
pub use constraints::SparseFormat;
pub use constraints::StatisticalConstraints;
pub use constraints::TimeConstraints;
pub use errors::ValidationError;
pub use errors::ValidationResult;
pub use errors::ValidationStats;
pub use quality::DataQualityReport;
pub use quality::QualityAnalyzer;
pub use quality::QualityIssue;
pub use quality::QualityMetrics;
pub use quality::StatisticalSummary;
pub use array_validation::ArrayValidator;
pub use validator::ValidationRule;
pub use validator::Validator;

§Modules

  • array_validation: Array validation functionality for ndarray types
  • config: Configuration types for data validation
  • constraints: Constraint types and validation logic
  • errors: Error types and validation results
  • quality: Data quality assessment and reporting
  • schema: Schema definition and types for data validation
  • validator: Main validator implementation

§Type Aliases

Array1
Array2