//! Floating-point helpers that the engine routes through to give
//! consumers a single switch for cross-host determinism.
//!
//! By default these are zero-cost wrappers: `fma` lowers to
//! `f64::mul_add`, which on hardware-FMA targets is one rounded
//! operation. With the `deterministic-fp` feature flag set, `fma`
//! becomes `(a * b) + c` (two rounded operations), which is slightly
//! less precise but bit-identical across every target without relying
//! on the host's libm or hardware FMA. This is required for consumers
//! running lockstep across heterogeneous hosts (native + wasm32,
//! browser + worker, etc.) who need byte-equal simulation outputs.
//!
//! All `f64::mul_add` call sites in the engine should go through
//! `fma` so the feature flag covers the whole hot path.
/// Fused multiply-add: returns `(a * b) + c`.
///
/// The default build calls `f64::mul_add`, which uses hardware FMA
/// where available: one rounded operation, slightly more precise.
/// With the `deterministic-fp` feature flag, this expands to a
/// separate multiply and add: two rounded operations, bit-identical
/// across all hosts and toolchains.
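// A minimal sketch of the default-build item that the docs above
// describe. The signature, the `#[cfg]` gate, and `#[inline]` are
// assumptions; only the `f64::mul_add` lowering is stated in the docs.
#[cfg(not(feature = "deterministic-fp"))]
#[inline]
pub fn fma(a: f64, b: f64, c: f64) -> f64 {
    // Single rounding: the product is not rounded before the add.
    a.mul_add(b, c)
}
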
/// `deterministic-fp` build of [`fma`]: two rounded operations,
/// bit-identical across all targets. The `clippy::suboptimal_flops`
/// allow is deliberate: clippy would suggest `mul_add`, which is
/// exactly what this feature flag avoids.
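// A minimal sketch of the `deterministic-fp` item that the docs above
// describe. The signature, the `#[cfg]` gate, and `#[inline]` are
// assumptions; the clippy allow and the `(a * b) + c` body come from
// the docs.
#[cfg(feature = "deterministic-fp")]
#[inline]
#[allow(clippy::suboptimal_flops)]
pub fn fma(a: f64, b: f64, c: f64) -> f64 {
    // Two roundings: the product is rounded to f64, then the sum is.
    (a * b) + c
}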