Module fixed

Q16.16 signed fixed-point arithmetic.

Why fixed-point: the CPU reference path and the CUDA kernels must produce byte-identical outputs cell-for-cell so the hash chain in the case file is the same regardless of whether evidence was produced on the host or the device. Floating point makes byte-equivalence fragile across compilers, drivers, fused-multiply-add behavior, and reduction order. Q16.16 with explicit rounding sidesteps all of that.

Why Q16.16 specifically: 16.16 strikes a balance between dynamic range (the integer half of an i32 spans -32_768 to 32_767) and precision (steps of 1/65_536 in the fractional half) that fits this workload: latencies in milliseconds, error-rate fractions, and smoothed-residual magnitudes all sit comfortably inside that band once inputs are clamped at the boundary.
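The layout described above can be sketched as follows. This is a minimal illustration, not the module's actual definition: the tuple-struct layout Q16(pub i32) and the helpers from_int_clamped, floor_int, and frac_bits are hypothetical names introduced here.

```rust
/// Hypothetical stand-in for the module's Q16 type (assumed layout:
/// one i32 holding 16 integer bits above 16 fractional bits).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Q16(pub i32);

impl Q16 {
    /// Raw representation of 1.0.
    pub const ONE: Q16 = Q16(1 << 16);

    /// Convert an integer, clamping it to the representable integer range
    /// first (hypothetical helper modelling "clamped at the boundary").
    pub fn from_int_clamped(v: i64) -> Q16 {
        let clamped = v.clamp(-32_768, 32_767) as i32;
        Q16(clamped << 16)
    }

    /// Integer part, rounded toward negative infinity (arithmetic shift).
    pub fn floor_int(self) -> i32 {
        self.0 >> 16
    }

    /// The 16 fractional bits, in units of 1/65_536.
    pub fn frac_bits(self) -> u32 {
        (self.0 as u32) & 0xFFFF
    }
}
```

Under these assumptions, a latency of 250 ms is stored as the raw value 250 << 16, and an out-of-range input such as 100_000 is pinned to the largest integer value, 32_767.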

Why these specific operations: sat_add, sat_sub, sat_mul, sat_div, and abs are the only operations the downstream pipeline needs. No FMA, no SIMD intrinsics. The same scalar code must run on the CPU and inside a CUDA kernel, so the implementation is restricted to plain integer math that both backends can express identically.
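A rough sketch of how the non-multiplying operations could be written with plain integer math only. The tuple-struct layout and the saturate helper are hypothetical. The behaviour of sat_div (truncation toward zero, and saturating by the numerator's sign on a zero divisor) is an assumption made for this example; the text above does not specify it.

```rust
/// Hypothetical stand-in for the module's Q16 type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Q16(pub i32);

/// Clamp a widened i64 intermediate back into the i32 raw range.
/// (Hypothetical helper; widen-then-clamp translates directly to CUDA C.)
fn saturate(wide: i64) -> Q16 {
    Q16(wide.clamp(i32::MIN as i64, i32::MAX as i64) as i32)
}

impl Q16 {
    pub fn sat_add(self, rhs: Q16) -> Q16 {
        saturate(self.0 as i64 + rhs.0 as i64)
    }

    pub fn sat_sub(self, rhs: Q16) -> Q16 {
        saturate(self.0 as i64 - rhs.0 as i64)
    }

    /// |i32::MIN| does not fit in i32, so it saturates to i32::MAX.
    pub fn abs(self) -> Q16 {
        saturate((self.0 as i64).abs())
    }

    /// ASSUMPTION: quotient truncates toward zero; a zero divisor
    /// saturates by the sign of the numerator (0 / 0 yields 0).
    pub fn sat_div(self, rhs: Q16) -> Q16 {
        if rhs.0 == 0 {
            return match self.0 {
                n if n > 0 => Q16(i32::MAX),
                n if n < 0 => Q16(i32::MIN),
                _ => Q16(0),
            };
        }
        // Pre-shift the numerator so the quotient keeps 16 fractional bits.
        saturate(((self.0 as i64) << 16) / rhs.0 as i64)
    }
}
```

Widening to i64 before clamping keeps every intermediate exact, so no step depends on wrapping or platform-specific overflow behaviour.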

Rounding rule: multiplication widens to i64, then applies round-half-to-even (banker’s rounding) at bit 15 before shifting right by 16. This rule is symmetric, deterministic across architectures, and reduces the bias that round-half-up would inject into long EWMA recurrences.
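The rounding rule can be illustrated with the sketch below. It is one way to realise the described behaviour, not necessarily the module's code; the tuple-struct layout is hypothetical, and the final saturation step is assumed from the sat_ prefix.

```rust
/// Hypothetical stand-in for the module's Q16 type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Q16(pub i32);

impl Q16 {
    /// Multiply with round-half-to-even, then saturate to the i32 range.
    pub fn sat_mul(self, rhs: Q16) -> Q16 {
        // The i64 product of two i32 values cannot overflow; it carries
        // 32 fractional bits.
        let product = self.0 as i64 * rhs.0 as i64;

        // An arithmetic shift floors toward negative infinity, so the
        // discarded low 16 bits are always a non-negative remainder.
        let mut shifted = product >> 16;
        let rem = product & 0xFFFF;
        let half = 0x8000; // bit 15: exactly half of one output step

        // Round up when above half, or on an exact tie if the kept
        // value is odd (so ties land on the even neighbour).
        if rem > half || (rem == half && (shifted & 1) == 1) {
            shifted += 1;
        }

        Q16(shifted.clamp(i32::MIN as i64, i32::MAX as i64) as i32)
    }
}
```

With raw operands, Q16(1).sat_mul(Q16(0x8000)) is an exact tie between 0 and 1 and yields Q16(0), while Q16(3).sat_mul(Q16(0x8000)) ties between 1 and 2 and yields Q16(2). The negated inputs give Q16(0) and Q16(-2), which shows the symmetry around zero.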

Structs

Q16: Signed Q16.16 fixed-point value.
