pub fn iterative_refinement_cholesky_solve(
hessian: ArrayView2<'_, f64>,
rhs: ArrayView2<'_, f64>,
need_logdet: bool,
) -> Result<(Array2<f64>, f64, Option<RefinementOutcome>), String>Expand description
Solve A x = b with fp32 Cholesky factorization + fp64-residual iterative
refinement, automatically falling back to fp64 when the policy rejects the
attempt or when the fp32 path fails / diverges.
The p threshold and maximum step count come from [GpuDispatchPolicy]
constants — there is no user-facing knob. The decision path is:
policy.iterative_refinement_should_attempt(p)→falseor multi-column RHS: skip to the fp64 Cholesky path.- Attempt fp32 POTRF + up to
REFINEMENT_MAX_STEPSresidual-correction steps. Falls back to fp64 on:- fp32 POTRF info ≠ 0 (A is not SPD at f32 precision),
- non-monotone residual (κ(A)·u_fp32 ≥ 1 regime).
- On fp32 success the logdet is computed from the fp64 Cholesky factor —
BUT only when
need_logdetis true. The fp64 POTRF is an O(p³) factorization that fully negates the mixed-precision speedup (the whole point is to do the expensive factor in fp32), so a caller that only needs the solution (e.g. the PIRLS Newton direction solve, which discards the logdet) passesneed_logdet = falseand the redundant fp64 POTRF is skipped entirely — the returned logdet isNaNin that case. The solution is always full-fp64-accurate via the residual refinement regardless.
Returns (solution, logdet, Some(RefinementOutcome)) when the fp32 path
succeeded, or (solution, logdet, None) on the fp64 fallback. When
need_logdet is false and the fp32 path succeeds, the logdet field is NaN.