
Module solver


cuSOLVER-backed dense solver kernels for the GPU HAL.

This module owns the CUDA solver functionality shared by GPU linear algebra dispatch and higher-level solver code. These entry points never fall back to a CPU solve: when CUDA support is unavailable, they report an error.

Structs

RefinementOutcome
Outcome reported by iterative_refinement_cholesky_solve.

Functions

check_deferred_potrf_info
Download the POTRF deferred info scalar and return an error if non-zero.
check_deferred_potrs_info
Download the POTRS deferred info scalar and return an error if non-zero.
cholesky_logdet_from_col_major
cholesky_lower_gpu
cholesky_solve_gpu
cholesky_solve_only_gpu
Solution-only mixed-precision solve. Behaves like cholesky_solve_gpu but skips the redundant fp64 POTRF when the fp32 factorization plus refinement path succeeds, because the caller does not consume the log-determinant. This path delivers the full mixed-precision speedup (the expensive O(p³) factorization stays in fp32) for the PIRLS Newton direction solve, where the logdet is discarded. Iterative refinement brings the solution to full fp64 accuracy.
context_and_stream
iterative_refinement_cholesky_solve
Solve A x = b using an fp32 Cholesky factorization with fp64-residual iterative refinement, automatically falling back to fp64 when the policy rejects the attempt or when the fp32 path fails or diverges.
pinned_htod
potrf_in_place
potrf_in_place_reuse
POTRF factorization using pre-allocated workspace and info buffers.
potrf_query_lwork
Query the cuSOLVER POTRF workspace size for a p×p matrix.
potrs_in_place
potrs_in_place_reuse
POTRS triangular solve using a pre-allocated info buffer.
solver_backend_status
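
The mixed-precision idea behind iterative_refinement_cholesky_solve and cholesky_solve_only_gpu can be illustrated with a small CPU-only sketch. It is not this module's implementation and does not call cuSOLVER. The function names (cholesky_f32_sketch, chol_solve_f32_sketch, refine_solve_sketch), the row-major layout, the max-norm stopping test, and the max_iters and tol parameters are all assumptions made for illustration. The sketch factors once in f32, computes residuals in f64, and reuses the f32 factor for each correction. Returning None marks the points where a caller might fall back to an fp64 solve.

```rust
/// Hypothetical helper: f32 Cholesky of a row-major n x n SPD matrix.
/// Returns the lower-triangular factor L, or None if a pivot is not positive.
fn cholesky_f32_sketch(a: &[f32], n: usize) -> Option<Vec<f32>> {
    let mut l = vec![0.0f32; n * n];
    for j in 0..n {
        let mut d = a[j * n + j];
        for k in 0..j {
            d -= l[j * n + k] * l[j * n + k];
        }
        if !(d > 0.0) || !d.is_finite() {
            return None; // not positive definite in f32
        }
        let djj = d.sqrt();
        l[j * n + j] = djj;
        for i in (j + 1)..n {
            let mut s = a[i * n + j];
            for k in 0..j {
                s -= l[i * n + k] * l[j * n + k];
            }
            l[i * n + j] = s / djj;
        }
    }
    Some(l)
}

/// Hypothetical helper: solve L L^T x = r in f32 by forward and back substitution.
fn chol_solve_f32_sketch(l: &[f32], n: usize, r: &[f32]) -> Vec<f32> {
    let mut y = r.to_vec();
    for i in 0..n {
        let mut s = y[i];
        for k in 0..i {
            s -= l[i * n + k] * y[k];
        }
        y[i] = s / l[i * n + i];
    }
    for i in (0..n).rev() {
        let mut s = y[i];
        for k in (i + 1)..n {
            s -= l[k * n + i] * y[k];
        }
        y[i] = s / l[i * n + i];
    }
    y
}

/// Hypothetical driver: f32 factor + f64-residual refinement.
/// Returns the solution and the number of correction steps taken,
/// or None when the f32 path fails or does not converge.
fn refine_solve_sketch(
    a: &[f64],
    b: &[f64],
    n: usize,
    max_iters: usize,
    tol: f64,
) -> Option<(Vec<f64>, usize)> {
    let a32: Vec<f32> = a.iter().map(|&v| v as f32).collect();
    let l = cholesky_f32_sketch(&a32, n)?;
    let b32: Vec<f32> = b.iter().map(|&v| v as f32).collect();
    let mut x: Vec<f64> = chol_solve_f32_sketch(&l, n, &b32)
        .iter()
        .map(|&v| v as f64)
        .collect();
    let b_norm = b.iter().fold(0.0f64, |m, v| m.max(v.abs()));
    for it in 0..=max_iters {
        // Residual r = b - A x, accumulated in f64.
        let r: Vec<f64> = (0..n)
            .map(|i| b[i] - (0..n).map(|j| a[i * n + j] * x[j]).sum::<f64>())
            .collect();
        let r_norm = r.iter().fold(0.0f64, |m, v| m.max(v.abs()));
        if r_norm <= tol * b_norm {
            return Some((x, it));
        }
        if it == max_iters || !r_norm.is_finite() {
            break;
        }
        // Correction solve reuses the cheap f32 factor.
        let r32: Vec<f32> = r.iter().map(|&v| v as f32).collect();
        let d = chol_solve_f32_sketch(&l, n, &r32);
        for i in 0..n {
            x[i] += d[i] as f64;
        }
    }
    None
}
```

In this sketch the O(n³) factorization runs once in f32 and each refinement step costs only O(n²), which is the trade-off the descriptions above refer to. A caller that also needs the log-determinant would still need a factor accurate enough for that purpose, which is why a solution-only variant can skip the extra fp64 factorization.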