par-validator
Parallel validation for payment / trading-style payloads: run string and identifier checks on the CPU with Rayon, and pack fixed-point numerics (ranges, GST, FX conversion, simple interest) into one WebGPU compute dispatch per batch.
The crate is built around rust-key-paths for type-safe field access and wgpu for portable GPU execution (Vulkan, Metal, DX12, etc.).
Beginner’s guide
This section is for you if you are new to the project or to Rust crates that mix CPU parallelism and GPU compute.
What you should already know
- Rust basics:
cargo,struct,enum,impl, and reading example binaries underexamples/. - Optional: A little exposure to shaders or WebGPU helps for the numeric path, but you can treat
GpuNumericEngineas a black box at first.
What this crate does (in one minute)
- String-style validation — You point at a field on a struct (via key paths from
#[derive(Kp)]), attach rule functions, and callapply. Mandatory rules run one after another; if one fails, the rest of that builder’s mandatory chain is skipped. Other rules for the same field run in parallel with Rayon. - Numeric validation and math — You fill a
Vec<NumericRule>with fixed-point integers (amounts scaled ×100). The GPU runs many rules in one dispatch and returns aVec<NumericOutput>(error code + optional calculated value).
So: texty things → CPU + Rayon, lots of fixed-point number rules → GPU batch.
Prerequisites
- Rust (stable, recent; edition 2024 is declared in
Cargo.toml— use a toolchain that supports it). - A working GPU stack for examples that call
GpuNumericEngine(e.g. Metal on Apple Silicon, Vulkan on many Linux setups, DX12 on Windows). If the GPU path fails, check drivers / OS support for wgpu.
First run (step by step)
-
Clone the repo and enter the crate directory (the one that contains
Cargo.toml): -
Optional — smallest sample (CPU only, no GPU):
-
Run the small end-to-end demo (strings on the CPU, numbers on the GPU):
-
When that works, try a CPU-only large batch:
-
Then a GPU-heavy batch:
Where to read the code (suggested order)
| Order | File | Why |
|---|---|---|
| 1 | examples/basics.rs |
Minimal Rule / builder only (no GPU) |
| 2 | examples/hybrid_gpu.rs |
Short story: Rule + GpuNumericEngine::run |
| 3 | src/builder.rs |
Rule: key path + bool predicates + apply |
| 4 | src/gpu_numeric.rs |
Rules, outputs, WGSL shader |
| 5 | examples/fintech_hybrid_batch.rs |
Bigger payload, both layers |
Concepts cheat sheet
| Term | Meaning |
|---|---|
Key path (Kp) |
Compile-time accessor to a field, e.g. MyStruct::field_name(), from key-paths-derive. |
Rule |
Holds a key path + mandatory / parallel fn(Option<&V>) -> bool rules, each with an error E; apply() returns Vec<E> for failures. |
| Mandatory rule | Runs first, in order; first failure stops the rest of the mandatory list for that builder. |
NumericRule |
One row: value ×100, rule kind, two integer parameters (meaning depends on kind). |
GpuNumericEngine::run |
Uploads all rules, runs compute shader once, downloads results (blocking). |
Flattening nested structs
Your business model can be deeply nested (parties, legs, charges). This crate’s Rule::new expects a KpType from plain #[derive(Kp)] fields. A common pattern (see fintech_rayon_nested) is a flat “view” struct that mirrors nested data for validation only.
If something goes wrong
cdtopar-validator— Commands assume the crate root next tosrc/andexamples/.- GPU panic / “no adapter” — Environment may lack a supported backend; try another machine or update OS/GPU drivers.
- Slower than expected — Use
cargo run --releasefor large examples and benchmarks.
Features
| Layer | What | How |
|---|---|---|
| Strings / IDs | BIC-shaped data, IBAN-like lengths, UETR, product codes | Rule + #[derive(Kp)] + Rayon |
| Numerics | Range, %, tax in bps, FX ×1000 rate, interest | GpuNumericEngine + WGSL compute |
- Mandatory rules run in order and short-circuit on the first failure.
- Other rules for the same key path run in parallel inside
Rule::apply. - GPU: all values are i32 scaled ×100 end-to-end in the shader (no
f32/f64on the boundary), avoiding nondeterministic floats in validation math.
Quick start
Basic example (CPU only)
The smallest program uses Rule on a #[derive(Kp)] struct — no GPU, no async, so it is easy to copy into your own crate.
Source: examples/basics.rs. Core idea:
use Kp;
use Rule;
let p = Payment ;
let results = new
.with_root
.mandatory_rule
.rule
.apply;
mandatory_ruleruns first; if it fails, you get a single-elementVecand parallel rules are skipped.ruleadds checks that run in parallel with each other insideapply().
Release builds are recommended for large batches:
Examples
| Example | Focus |
|---|---|
basics |
Starter: one field, mandatory + parallel rules, no GPU |
hybrid_gpu |
Small walkthrough: Rayon + GPU together |
fintech_rayon_nested |
~4k nested ISO-20022–style transfers → flatten to key paths → CPU-only validation |
fintech_gpu_batch |
~12k numeric rules (2k legs × 6 checks) in one GPU run |
fintech_hybrid_batch |
~1k trade legs: Rayon on strings, single GPU batch for notionals / tax / FX |
The nested Rayon example keeps a deep domain struct (BankParty, charges, FX block) and projects it to a flat #[derive(Kp)] view for validation. That matches how rust-key-paths KpType integrates with this crate today; composed .then() chains use a different internal representation and are better handled via flattening or dynamic key paths upstream.
Benchmarks
Criterion benchmarks live in par-validator/benches/throughput.rs.
Run only the large GPU case (handy on a Windows/Linux NVIDIA box with Vulkan or DX12):
Results (reference machine)
| Benchmark | What it measures | Typical time (M1 Air) |
|---|---|---|
rayon_cpu_4096_transfers_x12_rules |
4 096 flat transfers × 12 Rule runs (Rayon over rows + inner par_iter in apply) |
~2.86 ms |
wgpu_gpu/12288_numeric_rules_one_dispatch |
12 288 NumericRule rows in one GpuNumericEngine::run |
~1.53 ms |
wgpu_gpu_nvidia/nvidia_gpu_98304_numeric_rules_one_dispatch |
98 304 rules (16 384 legs × 6), one dispatch — stress size for discrete GPUs | ~3.44 ms |
Testing machine: MacBook Air M1, Apple Silicon (arm64), macOS, Rust 1.85, cargo bench release profile. Figures are Criterion medians from a single run; variance and outliers are normal—re-run locally for your hardware.
NVIDIA: On a PC with an NVIDIA GPU, install current drivers, use the same cargo bench --bench throughput -- nvidia_gpu command, and compare the reported time to the M1 row above (wgpu will typically use Vulkan or DX12).
Library layout
builder::Rule— CPU validation keyed byKpType(pub useaspar_validator::Rule).gpu_numeric—NumericRule/NumericOutput,GpuNumericEngine::run, WGSL shader (portable wide integer math for Metal).
Full API docs:
Dependencies
rayon,rust-key-paths,wgpu,bytemuck,pollster(examples / blocking on async GPU init).
Dev-only: key-paths-derive for #[derive(Kp)] in examples.
License
See the repository LICENSE file.