Crate gam_gpu

Re-exports§

pub use cpu_traits::MatrixLocation;
pub use device::GpuDeviceInfo;
pub use device_runtime::GpuRuntime;
pub use gpu_error::GpuError;
pub use memory::DeviceBuffer;
pub use memory::DeviceCsrMatrix;
pub use memory::DeviceMatrix;
pub use memory::DeviceVector;
pub use policy::GpuDispatchPolicy;
pub use policy::GpuMixedPrecisionPolicy;
pub use pool::balanced_partition;
pub use pool::scatter_batched;
pub use profile::GpuExecutionTelemetry;
pub use profile::KernelStat;
pub use profile::KernelStatsSnapshot;

Modules§

backend_probe
Shared CUDA backend-probe contract for every cudarc-backed module under src/gpu/*.
blas
Device BLAS surface for the cudarc-backed dense kernels.
calibration
cpu_traits
device
device_cache
Shared host-side scaffolding for every cudarc-backed module under src/gpu/* and src/solver/gpu/*.
device_runtime
driver
Shared CUDA driver presence/loading helpers used by every cuBLAS / cuSPARSE / cuSOLVER routing module.
gpu_error
Typed error for the src/gpu/* modules.
kernels
Domain-specific GPU kernels live in their owning algorithm crates.
linalg_dispatch
Automatic GPU dispatch shim for dense linear algebra hot kernels.
memory
numerics_device
Shared device-side probit numerics for NVRTC kernels.
numerics_host
Host-side scalar special functions shared by the CPU parity references of the GPU backends.
policy
pool
Multi-GPU device pool.
profile
solver
cuSOLVER-backed dense solver kernels for the GPU HAL.

Macros§

gpu_bail
Return early from the enclosing function with Err(GpuError::DriverCallFailed { reason: format!(...) }).
gpu_err
Build a GpuError::DriverCallFailed { reason: format!(...) } value.
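
The two macros above can be pictured with a minimal sketch like the one below. It is not the crate's source: the GpuError enum here is a stand-in that copies only the DriverCallFailed { reason } shape from the descriptions, the macro bodies are an assumption about how such macros are usually written, and check_leading_dim is a hypothetical caller added for illustration.

```rust
// Stand-in for the crate's error type (assumption: only the
// DriverCallFailed { reason } shape is taken from the docs above).
#[derive(Debug, PartialEq)]
pub enum GpuError {
    DriverCallFailed { reason: String },
}

// Build an error value from format!-style arguments.
macro_rules! gpu_err {
    ($($arg:tt)*) => {
        GpuError::DriverCallFailed { reason: format!($($arg)*) }
    };
}

// Return early from the enclosing function with that error.
macro_rules! gpu_bail {
    ($($arg:tt)*) => {
        return Err(gpu_err!($($arg)*))
    };
}

// Hypothetical caller: rejects a non-positive leading dimension.
pub fn check_leading_dim(ld: i32) -> Result<i32, GpuError> {
    if ld <= 0 {
        gpu_bail!("invalid leading dimension: {}", ld);
    }
    Ok(ld)
}
```

Splitting the pair this way lets gpu_err be used wherever a value is needed (for example inside map_err), while gpu_bail covers the common early-return case.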

Structs§

GpuDecision
A backend-selection decision for a single hot kernel.

Enums§

CudaBackendStatus
GpuEligibility
Joint eligibility state for a GPU kernel at the call site.
GpuKernel
GpuMode
Fail-closed GPU residency mode (issue #1017).
GpuPolicy
User-facing GPU backend policy.

Functions§

configure_global_policy
Configure the process-wide policy before solver kernels are selected. The first explicit configuration wins, so concurrent fits cannot race policy changes. Reads through global_policy() never claim the slot, so that first explicit configuration always sticks even if dispatch code has already observed the default Auto.
cuda_selected
True when direct solver GPU entry points should be attempted.
decide
Decide whether a GPU kernel may run. This is deliberately conservative: with no compiled vendor backend, the auto policy falls back to the CPU, and the force policy surfaces an error at the call site through GpuDecision::require_supported.
global_policy
gpu_mode
Read the process-wide GPU residency mode. Defaults to GpuMode::Auto without claiming the slot, mirroring global_policy so an incidental read never locks the mode against a later explicit set_gpu_mode.
log_backend_inventory_once
Emit the roadmap-visible kernels at startup/debug time without affecting numerical execution. This keeps backend coverage auditable as real device kernels are added incrementally.
set_gpu_mode
Configure the process-wide GPU residency mode. The first writer wins, so concurrent fits cannot race on the mode; a redundant later call is ignored.
try_cholesky_batched_lower_inplace
try_cholesky_lower_inplace
try_fast_ab
try_fast_ab_broadcast_b_batched
try_fast_abt_strided_batched
try_fast_atb_on_ordinal
try_fast_atv
try_fast_av
try_solve_lower_triangular_matrix
try_solve_upper_triangular_matrix
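
The first-writer-wins behaviour described for configure_global_policy and set_gpu_mode, and the conservative rule described for decide, can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's implementation: Policy, PolicySlot, and Decision are hypothetical stand-ins for GpuPolicy, the process-wide slot, and GpuDecision, the Force variant name is inferred from the word "force" in the description, and the real decide almost certainly weighs more inputs than a single backend_compiled flag.

```rust
use std::sync::OnceLock;

/// Hypothetical policy variants; the listing confirms only an Auto default.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Policy {
    Auto,
    Force,
}

/// First-writer-wins slot whose reads never claim it.
pub struct PolicySlot {
    slot: OnceLock<Policy>,
}

impl PolicySlot {
    pub const fn new() -> Self {
        Self { slot: OnceLock::new() }
    }

    /// Returns the explicit value if one was set, else Auto.
    /// Uses `get`, not `get_or_init`, so a read leaves the slot empty.
    pub fn read(&self) -> Policy {
        self.slot.get().copied().unwrap_or(Policy::Auto)
    }

    /// Returns true if this call claimed the slot, false if an earlier
    /// configuration already won and this one was ignored.
    pub fn configure(&self, policy: Policy) -> bool {
        self.slot.set(policy).is_ok()
    }
}

/// Hypothetical decision shape: where to run, plus whether a forced GPU
/// request could not be honoured.
#[derive(Debug, PartialEq, Eq)]
pub struct Decision {
    pub use_gpu: bool,
    pub forced_unsupported: bool,
}

impl Decision {
    /// Errors only when the GPU was forced but cannot be used.
    pub fn require_supported(&self) -> Result<(), String> {
        if self.forced_unsupported {
            Err("GPU forced but no vendor backend is compiled in".to_string())
        } else {
            Ok(())
        }
    }
}

/// Conservative rule: without a compiled backend, Auto falls back to the
/// CPU and Force is flagged so the call site can fail.
pub fn decide(policy: Policy, backend_compiled: bool) -> Decision {
    Decision {
        use_gpu: backend_compiled,
        forced_unsupported: policy == Policy::Force && !backend_compiled,
    }
}
```

Reading through OnceLock::get rather than get_or_init is what keeps an incidental read from locking in the default; only an explicit configure call can fill the slot.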