par-validator

Parallel validation for payment / trading-style payloads: run string and identifier checks on the CPU with Rayon, and pack fixed-point numerics (ranges, GST, FX conversion, simple interest) into one WebGPU compute dispatch per batch.

The crate is built around rust-key-paths for type-safe field access and wgpu for portable GPU execution (Vulkan, Metal, DX12, etc.).

Features

Layer	What	How
Strings / IDs	BIC-shaped data, IBAN-like lengths, UETR, product codes	`RuleBuilder` + `#[derive(Kp)]` + Rayon
Numerics	Range, %, tax in bps, FX ×1000 rate, interest	`GpuNumericEngine` + WGSL compute

Mandatory rules run in order and short-circuit on the first failure.
Other rules for the same key path run in parallel inside RuleBuilder::apply.
GPU: all values are i32 scaled ×100 end-to-end in the shader (no f32/f64 on the boundary), avoiding nondeterministic floats in validation math.

Quick start

cd par-validator
cargo run --example hybrid_gpu

Release builds are recommended for large batches:

cargo run --release --example fintech_rayon_nested
cargo run --release --example fintech_gpu_batch
cargo run --release --example fintech_hybrid_batch

Examples

Example	Focus
`hybrid_gpu`	Small walkthrough: Rayon + GPU together
`fintech_rayon_nested`	~4k nested ISO-20022–style transfers → flatten to key paths → CPU-only validation
`fintech_gpu_batch`	~12k numeric rules (2k legs × 6 checks) in one GPU `run`
`fintech_hybrid_batch`	~1k trade legs: Rayon on strings, single GPU batch for notionals / tax / FX

The nested Rayon example keeps a deep domain struct (BankParty, charges, FX block) and projects it to a flat #[derive(Kp)] view for validation. That matches how rust-key-paths KpType integrates with this crate today; composed .then() chains use a different internal representation and are better handled via flattening or dynamic key paths upstream.

Benchmarks

Criterion benchmarks live in par-validator/benches/throughput.rs.

cd par-validator
cargo bench --bench throughput

Results (reference machine)

Benchmark	What it measures	Typical time
`rayon_cpu_4096_transfers_x12_builders`	4 096 flat transfers × 12 [`RuleBuilder`] runs (Rayon over rows + inner `par_iter` in `apply`)	~2.86 ms
`wgpu_gpu/12288_numeric_rules_one_dispatch`	12 288 `NumericRule` rows in one `GpuNumericEngine::run`	~1.53 ms

Testing machine: MacBook Air M1, Apple Silicon (arm64), macOS, Rust 1.85, cargo bench release profile. Figures are Criterion medians from a single run; variance and outliers are normal—re-run locally for your hardware.

Library layout

RuleBuilder — CPU validation keyed by KpType.
gpu_numeric — NumericRule / NumericOutput, GpuNumericEngine::run, WGSL shader (portable wide integer math for Metal).

Full API docs:

cargo doc --open -p par-validator

Dependencies

rayon, rust-key-paths, wgpu, bytemuck, pollster (examples / blocking on async GPU init).

Dev-only: key-paths-derive for #[derive(Kp)] in examples.

License

See the repository LICENSE file.

par-validator 0.1.0