Function iterative_refinement_cholesky_solve

Source

pub fn iterative_refinement_cholesky_solve(
    hessian: ArrayView2<'_, f64>,
    rhs: ArrayView2<'_, f64>,
    need_logdet: bool,
) -> Result<(Array2<f64>, f64, Option<RefinementOutcome>), String>

Expand description

Solve A x = b with fp32 Cholesky factorization + fp64-residual iterative refinement, automatically falling back to fp64 when the policy rejects the attempt or when the fp32 path fails / diverges.

The p threshold and maximum step count come from [GpuDispatchPolicy] constants — there is no user-facing knob. The decision path is:

policy.iterative_refinement_should_attempt(p) → false or multi-column RHS: skip to the fp64 Cholesky path.
Attempt fp32 POTRF + up to REFINEMENT_MAX_STEPS residual-correction steps. Falls back to fp64 on:
- fp32 POTRF info ≠ 0 (A is not SPD at f32 precision),
- non-monotone residual (κ(A)·u_fp32 ≥ 1 regime).
On fp32 success the logdet is computed from the fp64 Cholesky factor — BUT only when need_logdet is true. The fp64 POTRF is an O(p³) factorization that fully negates the mixed-precision speedup (the whole point is to do the expensive factor in fp32), so a caller that only needs the solution (e.g. the PIRLS Newton direction solve, which discards the logdet) passes need_logdet = false and the redundant fp64 POTRF is skipped entirely — the returned logdet is NaN in that case. The solution is always full-fp64-accurate via the residual refinement regardless.

Returns (solution, logdet, Some(RefinementOutcome)) when the fp32 path succeeded, or (solution, logdet, None) on the fp64 fallback. When need_logdet is false and the fp32 path succeeds, the logdet field is NaN.

iterative_refinement_cholesky_solve

Function iterative_refinement_cholesky_solve Copy item path

Function iterative_refinement_cholesky_solve