Module validator_pattern

Expand description

Validator-Pattern Verification (Issue #76)

This module implements the certifying-algorithm verification pattern that CompCert uses for NP-hard passes (register allocation, instruction scheduling). Rather than verifying the selector — which is heuristic, large, and changes frequently — we verify a small validator that consumes the selector’s output and confirms each concrete selection is semantically equivalent to the WASM op it replaces.

See docs/validator-pattern.md for the architecture rationale.

§Status

The prototype landed by PR #113 covered only WasmOp::I32Add. This module extends the coverage to the full i32 + i64 arithmetic / logic / shift / comparison surface (issue #76). The pieces are:

CertifiedSelection — the per-op output of the selector
Validator — the trait the validator implements
Z3ArmValidator — a Z3-backed validator

§How the validator works

For each WASM op the validator builds a Z3 query that asserts the negation of equivalence between the WASM op’s reference semantics and the ARM instruction sequence the selector emitted:

assert ¬( wasm_reference(inputs) == arm_lowering(inputs) )
check-sat

unsat means no input distinguishes the two — the selection is certified. sat means Z3 found a concrete counterexample — the selection is rejected.

§Self-contained semantics

The validator builds both the WASM reference and the ARM lowering semantics directly in this module. It deliberately does not delegate to crate::WasmSemantics / crate::ArmSemantics for i64: those encoders model i64 with truncated 32-bit semantics (see wasm_semantics.rs — every i64 op is a 32-bit stand-in), which is precisely the gap #76 closes. A validator that trusted a truncated model would certify wrong lowerings.

Instead, every formula here is written in terms of 32-bit Z3 bitvectors — the same width the ARM machine actually has. i64 values are carried as a (lo, hi) pair of 32-bit bitvectors, exactly the register pair the ARM lowering uses. A reviewer can check each formula against the textbook 32-bit-limb algorithm for the corresponding 64-bit operation.

§i64 register-pair modeling

synth lowers a WASM i64 onto two ARM registers: lo (bits 0..=31) and hi (bits 32..=63). The validator never builds a 64-bit bitvector for the arithmetic/logic/shift surface. Instead, for each i64 op it builds the 64-bit result as a pair of 32-bit limbs, with carry / borrow propagated explicitly between the limbs:

add: lo = a_lo + b_lo, carry = (lo <u a_lo), hi = a_hi + b_hi + carry.
sub: lo = a_lo - b_lo, borrow = (a_lo <u b_lo), hi = a_hi - b_hi - borrow.
mul: schoolbook a_lo*b_lo plus the carry-out of that partial product plus the cross terms a_lo*b_hi + a_hi*b_lo (mod 2^32).
logic (and/or/xor): limb-wise, no interaction between limbs.
shifts: case-split on whether the (mod-64) shift amount is < 32 or >= 32, with the cross-limb bit transfer made explicit.
comparisons: hi decides; lo breaks ties — written exactly so a reviewer sees the lexicographic structure.

Because the ARM lowering for an i64 op is itself a sequence of 32-bit ARM instructions, the validator compares the selector’s emitted sequence limb-for-limb against this reference, and Z3 proves the two agree for all 2^64 inputs.

§Scope

Covered: i32 and i64 add / sub / mul / and / or / xor / shl / shr_s / shr_u / rotl / rotr; i32 and i64 eq / ne / lt_{s,u} / le_{s,u} / gt_{s,u} / ge_{s,u}; i32 and i64 eqz; i32 div_s / div_u.

Scoped out, with reasons (see docs/validator-pattern.md):

i32 rem_s / rem_u — the ARM lowering is SDIV/UDIV followed by MLS, i.e. the identity r = a - (a / b) * b. Proving this equals WASM’s bvsrem / bvurem for all 2^64 input pairs requires Z3 to reason about a symbolic 32-bit multiply (a / b) * b, which bit-blasts past any practical solver budget (no convergence in 5 minutes). div_s and div_u certify in well under a second — the divide instruction maps straight onto bvsdiv / bvudiv with no multiply — but remainder crosses the tractability line. Deferred until the validator can discharge the MLS identity with a multiplier-aware tactic rather than raw bit-blasting.
i64 div_s / div_u / rem_s / rem_u — synth lowers 64-bit division to a runtime-library call (__aeabi_ldivmod and friends); there is no fixed 32-bit ARM instruction sequence to symbolically execute. Modeling a fictitious native 64-bit divide would certify a lowering the compiler never emits. These are deferred until the validator can reason about helper-call summaries.
Clz / Ctz / Popcnt (i32 and i64) — bit-counting. The 32-bit ARM CLZ is available, but Ctz/Popcnt are lowered to multi-instruction bit-twiddling and i64.clz/ctz/popcnt combine two limbs conditionally. wasm_semantics.rs already has binary-search encoders for these; wiring them through the validator’s ARM-side executor is follow-up work and is not part of the #76 arithmetic/logic/shift/comparison core.
Extend8S / Extend16S (and i64 in-place extends) — sign-extension ops; the ARM SXTB/SXTH lowering is straightforward but, like the bit-counting ops, sits outside the #76 binary-op core and is deferred to keep this change focused and reviewable.

Structs§

CertifiedSelection: A concrete selection produced by the instruction selector.
Witness: A witness recording the validator’s acceptance of a selection.
Z3ArmValidator: Z3-backed validator for the WASM → ARM selection pattern.

Enums§

SolverResultKind: Solver result kind (mirrors crate::solver::CheckOutcome).
ValidationError: Validation failure reasons.

Traits§

Validator: A backend-agnostic validator for CertifiedSelection.