Outcome of comparing a measured GPU throughput against the target. The
only way to construct one is Self::from_measurement, so a verdict can
never assert a target that was not actually established by a measurement.
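
A minimal sketch of how such a constructor-gated verdict could look, assuming a hypothetical `ThroughputVerdict` struct with private fields and a `from_measurement` signature invented here for illustration (the real type's fields and arguments are not shown in this section). Because the fields are private to the module, code outside it can only obtain a value through the constructor:

```rust
mod verdict {
    /// Hypothetical verdict type; the fields are private, so callers outside
    /// this module cannot build one by hand.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct ThroughputVerdict {
        measured_rows_per_sec: f64,
        target_rows_per_sec: f64,
    }

    impl ThroughputVerdict {
        /// Assumed signature: builds a verdict from a raw timing sample.
        /// Returns `None` when the sample cannot yield a meaningful rate.
        pub fn from_measurement(
            rows: u64,
            elapsed_secs: f64,
            target_rows_per_sec: f64,
        ) -> Option<Self> {
            // Reject zero, negative, NaN, or infinite durations and
            // non-positive or NaN targets.
            if !(elapsed_secs > 0.0) || !elapsed_secs.is_finite() {
                return None;
            }
            if !(target_rows_per_sec > 0.0) {
                return None;
            }
            Some(Self {
                measured_rows_per_sec: rows as f64 / elapsed_secs,
                target_rows_per_sec,
            })
        }

        /// True when the measured rate reaches the target.
        pub fn meets_target(&self) -> bool {
            self.measured_rows_per_sec >= self.target_rows_per_sec
        }

        /// The rate derived from the measurement, in rows per second.
        pub fn measured_rows_per_sec(&self) -> f64 {
            self.measured_rows_per_sec
        }
    }
}
```

The design choice illustrated is the usual Rust one: invariants are enforced by visibility rather than by runtime checks at every use site.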
Inputs to [should_use_gpu_pirls_loop]. Each field comes from data the
CPU PIRLS entry has on hand before it touches the eigendecomposition
engine, so the admission check itself is allocation-free and can short-
circuit before any heavy work happens.
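
As an illustration of an allocation-free admission check, the sketch below uses a small `Copy` struct and a predicate that only compares scalars. The struct name, its fields, the thresholds, and the `_sketch` function are all assumptions made for this example; they are not the actual inputs or policy of `should_use_gpu_pirls_loop`:

```rust
/// Hypothetical admission inputs: plain scalars only, so the struct is
/// `Copy` and building it performs no heap allocation.
#[derive(Debug, Clone, Copy)]
pub struct PirlsAdmissionInputs {
    /// Number of design rows (assumed field).
    pub n_rows: usize,
    /// Number of coefficients (assumed field).
    pub n_coeffs: usize,
    /// Whether a usable device was detected (assumed field).
    pub device_available: bool,
}

/// Illustrative thresholds; the real cut-offs are not given in this section.
pub const MIN_ROWS: usize = 50_000;
pub const MIN_COEFFS: usize = 64;

/// Sketch of an admission predicate. `&&` short-circuits, so the cheapest
/// disqualifier (no device) is tested first and nothing heavy runs.
pub fn should_use_gpu_pirls_loop_sketch(inputs: PirlsAdmissionInputs) -> bool {
    inputs.device_available
        && inputs.n_rows >= MIN_ROWS
        && inputs.n_coeffs >= MIN_COEFFS
}
```

A caller would build the struct from values it already holds and branch on the result before touching the eigendecomposition engine.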
Inputs to [should_run_reml_outer_on_device], the admission predicate
for routing the outer REML BFGS-over-ρ loop onto a fully device-resident
driver (rather than the host orchestrator, which returns to the host on every step).
The aspirational single-GPU design-row throughput the #1412 decision gate is
supposed to establish for the LLM-shape batched-Cholesky + tile-GEMM fit
pipeline: 100 000 design rows processed per wall-clock second per device.
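
To make the target concrete, the following sketch works through the arithmetic it implies. Only the figure of 100 000 rows per second per device comes from the text above; the constant name and both helper functions are hypothetical:

```rust
/// Target rate from the text: design rows per wall-clock second per device.
/// The constant name is an assumption made for this example.
pub const TARGET_ROWS_PER_SEC_PER_DEVICE: f64 = 100_000.0;

/// Wall-clock budget, in seconds, that one device has for `rows` design
/// rows if it is to hit the target exactly.
pub fn time_budget_secs(rows: u64) -> f64 {
    rows as f64 / TARGET_ROWS_PER_SEC_PER_DEVICE
}

/// Observed per-device rate for a timed run; `None` for a non-positive or
/// non-finite duration.
pub fn observed_rows_per_sec(rows: u64, elapsed_secs: f64) -> Option<f64> {
    if elapsed_secs > 0.0 && elapsed_secs.is_finite() {
        Some(rows as f64 / elapsed_secs)
    } else {
        None
    }
}
```

Under this reading, a fit over 1 000 000 design rows has a 10-second budget on a single device, and a run that processes 300 000 rows in 4 seconds (75 000 rows per second) falls short of the target.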