Re-exports§
pub use cpu_traits::MatrixLocation;pub use device::GpuDeviceInfo;pub use device_runtime::GpuRuntime;pub use gpu_error::GpuError;pub use memory::DeviceBuffer;pub use memory::DeviceCsrMatrix;pub use memory::DeviceMatrix;pub use memory::DeviceVector;pub use policy::GpuDispatchPolicy;pub use policy::GpuMixedPrecisionPolicy;pub use pool::balanced_partition;pub use pool::scatter_batched;pub use profile::GpuExecutionTelemetry;pub use profile::KernelStat;pub use profile::KernelStatsSnapshot;
Modules§
- backend_
probe - Shared CUDA backend-probe contract for every cudarc-backed module under
src/gpu/*. - blas
- Device BLAS surface for the cudarc-backed dense kernels.
- calibration
- cpu_
traits - device
- device_
cache - Shared host-side scaffolding for every cudarc-backed module under
src/gpu/*andsrc/solver/gpu/*. - device_
runtime - driver
- Shared CUDA driver presence/loading helpers used by every cuBLAS / cuSPARSE / cuSOLVER routing module.
- gpu_
error - Typed error for the
src/gpu/*modules. - kernels
- Domain-specific GPU kernels live in their owning algorithm crates.
- linalg_
dispatch - Automatic GPU dispatch shim for dense linear algebra hot kernels.
- memory
- numerics_
device - Shared device-side probit numerics for NVRTC kernels.
- numerics_
host - Host-side scalar special functions shared by the CPU parity references of the GPU backends.
- policy
- pool
- Multi-GPU device pool.
- profile
- solver
- cuSOLVER-backed dense solver kernels for the GPU HAL.
Macros§
- gpu_
bail return Err(GpuError::DriverCallFailed { reason: format!(...) }).- gpu_err
- Build a
GpuError::DriverCallFailed { reason: format!(...) }value.
Structs§
- GpuDecision
- A backend-selection decision for a single hot kernel.
Enums§
- Cuda
Backend Status - GpuEligibility
- Joint eligibility state for a GPU kernel at the call site.
- GpuKernel
- GpuMode
- Fail-closed GPU residency mode (issue #1017).
- GpuPolicy
- User-facing GPU backend policy.
Functions§
- configure_
global_ policy - Configure the process-wide policy before solver kernels are selected.
If a previous explicit configuration already set the policy, the first
value wins so concurrent fits cannot race policy changes. Reads of
global_policy()never claim the slot, so the very first explicit configuration always sticks even if dispatch code observed the defaultAutobeforehand. - cuda_
selected - True when direct solver GPU entry points should be attempted.
- decide
- Decide whether a GPU kernel may run. This is deliberately conservative:
with no compiled vendor backend,
autoreturns CPU fallback andforcereturns an error at the call site throughGpuDecision::require_supported. - global_
policy - gpu_
mode - Read the process-wide GPU residency mode. Defaults to
GpuMode::Autowithout claiming the slot, mirroringglobal_policyso an incidental read never locks the mode against a later explicitset_gpu_mode. - log_
backend_ inventory_ once - Emit the roadmap-visible kernels at startup/debug time without affecting numerical execution. This keeps backend coverage auditable as real device kernels are added incrementally.
- set_
gpu_ mode - Configure the process-wide GPU residency mode. First-writer-wins so concurrent fits cannot race the contract; a redundant late call is ignored.
- try_
cholesky_ batched_ lower_ inplace - try_
cholesky_ lower_ inplace - try_
fast_ ab - try_
fast_ ab_ broadcast_ b_ batched - try_
fast_ abt_ strided_ batched - try_
fast_ atb_ on_ ordinal - try_
fast_ atv - try_
fast_ av - try_
solve_ lower_ triangular_ matrix - try_
solve_ upper_ triangular_ matrix