datasynth-eval 3.1.1

Evaluation framework for synthetic financial data quality and coherence

Overview

datasynth-eval provides automated quality assessment for generated data:

  • Statistical Evaluation: Benford's Law compliance, distribution analysis
  • Coherence Checking: Balance verification, document chain integrity
  • Intercompany Validation: IC matching and elimination verification
  • Uniqueness Analysis: Duplicate detection across datasets
  • Banking/AML Analysis: KYC, typology, velocity, network, and lifecycle checks
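
To make the Benford's Law check above concrete, here is a minimal standalone sketch of a first-digit test. It does not use or reproduce datasynth-eval's internals: the function names are hypothetical, amounts are assumed to be integer cents, and the mean absolute deviation (MAD) statistic is one common choice, not necessarily the one behind `benford_score`.

```rust
/// Leading (most significant) decimal digit of an amount, or `None` for zero.
pub fn leading_digit(amount_cents: u64) -> Option<usize> {
    if amount_cents == 0 {
        return None;
    }
    let mut n = amount_cents;
    while n >= 10 {
        n /= 10;
    }
    Some(n as usize)
}

/// Benford's expected probability for leading digit `d` in 1..=9: log10(1 + 1/d).
pub fn benford_expected(d: usize) -> f64 {
    (1.0 + 1.0 / d as f64).log10()
}

/// Mean absolute deviation between observed first-digit frequencies and
/// Benford's expected frequencies. Zero amounts are skipped.
/// Returns `None` when there is no non-zero amount to analyze.
pub fn benford_mad(amounts_cents: &[u64]) -> Option<f64> {
    let mut counts = [0usize; 10]; // index 0 is unused
    let mut total = 0usize;
    for &amount in amounts_cents {
        if let Some(d) = leading_digit(amount) {
            counts[d] += 1;
            total += 1;
        }
    }
    if total == 0 {
        return None;
    }
    let sum_abs_dev: f64 = (1..=9)
        .map(|d| (counts[d] as f64 / total as f64 - benford_expected(d)).abs())
        .sum();
    Some(sum_abs_dev / 9.0)
}
```

A lower MAD means the observed first digits sit closer to the Benford distribution. Any pass/fail threshold is a policy choice and is deliberately left out of this sketch.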

Evaluation Categories

| Category | Description |
|----------|-------------|
| Statistical | Benford's Law, amount distributions, temporal patterns |
| Coherence | Trial balance, subledger reconciliation, FX consistency |
| Intercompany | IC matching rates, elimination completeness |
| Uniqueness | Document ID collisions, duplicate transaction detection |
| Banking/AML | 10 dedicated analyzers covering KYC, typologies, velocity, networks, lifecycle |
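
As an illustration of the Coherence and Uniqueness categories, the sketch below checks that each document's debits and credits net to zero and reports IDs that occur more than once. `JournalLine` and both function names are hypothetical stand-ins, not datasynth types, and amounts are assumed to be integer cents to avoid floating-point rounding.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical journal line used only for this illustration.
#[derive(Debug, Clone)]
pub struct JournalLine {
    pub document_id: String,
    pub debit_cents: i64,
    pub credit_cents: i64,
}

/// Returns the sorted IDs of documents whose debits and credits do not balance.
pub fn unbalanced_documents(lines: &[JournalLine]) -> Vec<String> {
    // Net position (debits minus credits) per document.
    let mut net: HashMap<&str, i64> = HashMap::new();
    for line in lines {
        *net.entry(line.document_id.as_str()).or_insert(0) +=
            line.debit_cents - line.credit_cents;
    }
    let mut unbalanced: Vec<String> = net
        .into_iter()
        .filter(|(_, balance)| *balance != 0)
        .map(|(id, _)| id.to_string())
        .collect();
    unbalanced.sort();
    unbalanced
}

/// Returns the sorted IDs that appear more than once, each listed a single time.
pub fn duplicate_ids(ids: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut duplicates = HashSet::new();
    for &id in ids {
        // `insert` returns false when the value was already present.
        if !seen.insert(id) {
            duplicates.insert(id);
        }
    }
    let mut out: Vec<String> = duplicates.into_iter().map(str::to_string).collect();
    out.sort();
    out
}
```

Both functions return sorted vectors so that results are deterministic despite the unordered hash containers used internally.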

Banking Analyzers (banking/)

| Analyzer | What it validates |
|----------|-------------------|
| KycCompletenessAnalyzer | Core KYC field coverage (name, DOB, ID, risk rating, beneficial owners) |
| AmlDetectabilityAnalyzer | Typology coverage + scenario case_id coherence |
| CrossLayerCoherenceAnalyzer | Payment↔BankTransaction referential integrity, fraud propagation rate |
| VelocityQualityAnalyzer | Rolling-window ordering invariants (1h≤24h≤7d≤30d), z-score calibration |
| FalsePositiveAnalyzer | FP rate bounds, label mutual exclusivity, reason coverage |
| DeviceFingerprintAnalyzer | Power-law device distribution, single-device dominance, trust calibration |
| SanctionsScreeningAnalyzer | Low-risk Clear rate, high-risk match rate, PEP name variations |
| SophisticationAnalyzer | Sophistication level diversity, context-appropriate skew |
| LifecycleAnalyzer | Phase diversity, progression rate, event-driven transition rate |
| NetworkStructureAnalyzer | Power-law topology (hub ratio ≥2.5× avg), role diversity |
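
Two of the invariants in the table can be sketched in a few lines. The example below is not the analyzers' implementation. `VelocityWindows`, its field names, and both helper functions are hypothetical. Reading "hub ratio" as the maximum node degree divided by the average degree is also an assumption.

```rust
/// Hypothetical rolling-window transaction counts for one account.
#[derive(Debug, Clone, Copy)]
pub struct VelocityWindows {
    pub txn_count_1h: u32,
    pub txn_count_24h: u32,
    pub txn_count_7d: u32,
    pub txn_count_30d: u32,
}

/// A wider rolling window contains every narrower one, so cumulative
/// counts must be non-decreasing: 1h <= 24h <= 7d <= 30d.
pub fn windows_are_ordered(w: &VelocityWindows) -> bool {
    w.txn_count_1h <= w.txn_count_24h
        && w.txn_count_24h <= w.txn_count_7d
        && w.txn_count_7d <= w.txn_count_30d
}

/// Ratio of the largest node degree to the average degree.
/// Returns `None` for an empty graph or a graph without edges.
pub fn hub_ratio(degrees: &[u32]) -> Option<f64> {
    let max = *degrees.iter().max()?;
    let total: u64 = degrees.iter().map(|&d| u64::from(d)).sum();
    if total == 0 {
        return None;
    }
    let avg = total as f64 / degrees.len() as f64;
    Some(f64::from(max) / avg)
}

/// True when the graph has at least one hub at or above `min_ratio`
/// (for example 2.5, the bound quoted in the table).
pub fn has_hub(degrees: &[u32], min_ratio: f64) -> bool {
    hub_ratio(degrees).map_or(false, |ratio| ratio >= min_ratio)
}
```

For instance, degrees of `[8, 1, 1, 1, 1]` have an average of 2.4, so the hub ratio is 8 / 2.4, which clears the 2.5 bound, while a uniform graph such as `[2, 2, 2, 2]` has a ratio of 1.0 and does not.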

Usage

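use datasynth_eval::{Evaluator, EvaluationConfig};

// Build an evaluator with the default configuration.
let evaluator = Evaluator::new(EvaluationConfig::default());
// `generated_data` is the dataset produced by an earlier generation step.
// `?` requires the enclosing function to return a compatible `Result`.
let report = evaluator.evaluate(&generated_data)?;

// Scale the Benford score to a percentage for display.
println!("Benford compliance: {:.2}%", report.benford_score * 100.0);
println!("Balance coherence: {}", report.balance_check.passed);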

Evaluation Report

Each evaluation produces a report that includes:

  • Pass/Fail Status: Overall and per-category
  • Scores: Numerical scores for statistical measures
  • Warnings: Potential issues that don't fail validation
  • Details: Specific findings and recommendations
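
The four report elements above could be modeled roughly as follows. This is a hypothetical shape for illustration only: `Report`, `CategoryResult`, and their fields are not the crate's actual types. Treating the overall status as "every category passed" is also an assumption.

```rust
/// Hypothetical per-category result mirroring the four report elements.
#[derive(Debug, Clone, PartialEq)]
pub struct CategoryResult {
    pub category: String,
    pub passed: bool,
    /// Numerical score, present only for statistical measures.
    pub score: Option<f64>,
    /// Potential issues that do not fail validation.
    pub warnings: Vec<String>,
    /// Specific findings and recommendations.
    pub details: Vec<String>,
}

/// Hypothetical report holding one result per evaluation category.
#[derive(Debug, Clone, PartialEq)]
pub struct Report {
    pub categories: Vec<CategoryResult>,
}

impl Report {
    /// Overall status: passes only if every category passed.
    /// Warnings never affect the outcome. An empty report passes trivially.
    pub fn passed(&self) -> bool {
        self.categories.iter().all(|c| c.passed)
    }

    /// Names of the categories that failed, in report order.
    pub fn failed_categories(&self) -> Vec<&str> {
        self.categories
            .iter()
            .filter(|c| !c.passed)
            .map(|c| c.category.as_str())
            .collect()
    }

    /// Total number of warnings across all categories.
    pub fn warning_count(&self) -> usize {
        self.categories.iter().map(|c| c.warnings.len()).sum()
    }
}
```

Keeping warnings separate from the `passed` flag lets a pipeline fail only on hard violations while still surfacing softer findings for review.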

License

Apache-2.0 - See LICENSE for details.