Crate gam_gpu

Re-exports

pub use cpu_traits::MatrixLocation;
pub use device::GpuDeviceInfo;
pub use device_runtime::GpuRuntime;
pub use gpu_error::GpuError;
pub use memory::DeviceBuffer;
pub use memory::DeviceCsrMatrix;
pub use memory::DeviceMatrix;
pub use memory::DeviceVector;
pub use policy::GpuDispatchPolicy;
pub use policy::GpuMixedPrecisionPolicy;
pub use pool::balanced_partition;
pub use pool::scatter_batched;
pub use profile::GpuExecutionTelemetry;
pub use profile::KernelStat;
pub use profile::KernelStatsSnapshot;

Modules

backend_probe: Shared CUDA backend-probe contract for every cudarc-backed module under src/gpu/*.
blas: Device BLAS surface for the cudarc-backed dense kernels.
calibration
cpu_traits
device
device_cache: Shared host-side scaffolding for every cudarc-backed module under src/gpu/* and src/solver/gpu/*.
device_runtime
driver: Shared CUDA driver presence/loading helpers used by every cuBLAS / cuSPARSE / cuSOLVER routing module.
gpu_error: Typed error for the src/gpu/* modules.
kernels: Domain-specific GPU kernels live in their owning algorithm crates.
linalg_dispatch: Automatic GPU dispatch shim for dense linear algebra hot kernels.
memory
numerics_device: Shared device-side probit numerics for NVRTC kernels.
numerics_host: Host-side scalar special functions shared by the CPU parity references of the GPU backends.
policy
pool: Multi-GPU device pool.
profile
solver: cuSOLVER-backed dense solver kernels for the GPU HAL.

Macros

gpu_bail: Return early with Err(GpuError::DriverCallFailed { reason: format!(...) }).
gpu_err: Build a GpuError::DriverCallFailed { reason: format!(...) } value.
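
The two macros above follow the familiar "build an error" / "bail with an error" pairing. The sketch below shows one way such a pair could be written; it is not the crate's actual source. The GpuError enum here is a local stand-in that models only the DriverCallFailed variant named above, and check_status is a hypothetical caller added for illustration.

```rust
// Local stand-in for the crate's GpuError (assumption: only the
// DriverCallFailed { reason } variant named in the docs is modeled).
#[derive(Debug, PartialEq)]
pub enum GpuError {
    DriverCallFailed { reason: String },
}

// Sketch of gpu_err!: forwards its arguments to format! and wraps the
// resulting String in GpuError::DriverCallFailed.
macro_rules! gpu_err {
    ($($arg:tt)*) => {
        GpuError::DriverCallFailed { reason: format!($($arg)*) }
    };
}

// Sketch of gpu_bail!: builds the same error and returns it from the
// enclosing function, which must return Result<_, GpuError>.
macro_rules! gpu_bail {
    ($($arg:tt)*) => {
        return Err(gpu_err!($($arg)*))
    };
}

// Hypothetical caller: treats any non-zero status code as a failure.
fn check_status(code: i32) -> Result<(), GpuError> {
    if code != 0 {
        gpu_bail!("driver call returned status {}", code);
    }
    Ok(())
}

fn main() {
    assert!(check_status(0).is_ok());
    let err = check_status(7).unwrap_err();
    println!("{err:?}");
}
```

Under this reading, gpu_err suits places that need the error as a value (for example inside map_err), while gpu_bail suits guard clauses that exit immediately.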

Structs

GpuDecision: A backend-selection decision for a single hot kernel.

Enums

CudaBackendStatus
GpuEligibility: Joint eligibility state for a GPU kernel at the call site.
GpuKernel
GpuMode: Fail-closed GPU residency mode (issue #1017).
GpuPolicy: User-facing GPU backend policy.
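
The description of decide under Functions says that, with no compiled vendor backend, auto falls back to the CPU while force produces an error through GpuDecision::require_supported. The sketch below illustrates that behavior under stated assumptions: the Auto and Force variant names, the Backend enum, the fields of GpuDecision, the String error type, and the signature of decide are all hypothetical, since the real definitions are not shown on this page.

```rust
// Assumed variant names; the real GpuPolicy may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum GpuPolicy {
    Auto,
    Force,
}

// Hypothetical execution target, introduced only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Cpu,
    Gpu,
}

// Hypothetical shape of a decision for one kernel.
#[derive(Debug, Clone, Copy, PartialEq)]
struct GpuDecision {
    backend: Backend,
    policy: GpuPolicy,
    backend_compiled: bool,
}

impl GpuDecision {
    // Fails only when the GPU was forced but cannot be used; otherwise
    // reports the backend that was chosen.
    fn require_supported(&self) -> Result<Backend, String> {
        if self.policy == GpuPolicy::Force && !self.backend_compiled {
            Err("GPU was forced but no vendor backend is compiled in".to_string())
        } else {
            Ok(self.backend)
        }
    }
}

// Conservative rule: the GPU is selected only when a backend exists.
fn decide(policy: GpuPolicy, backend_compiled: bool) -> GpuDecision {
    let backend = if backend_compiled { Backend::Gpu } else { Backend::Cpu };
    GpuDecision { backend, policy, backend_compiled }
}

fn main() {
    // Auto without a backend: silent CPU fallback.
    println!("{:?}", decide(GpuPolicy::Auto, false).require_supported());
    // Force without a backend: error surfaced at the call site.
    println!("{:?}", decide(GpuPolicy::Force, false).require_supported());
}
```

Deferring the error to require_supported keeps the decision itself infallible, so callers that accept a CPU fallback never have to handle an error path.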

Functions

configure_global_policy: Configure the process-wide policy before solver kernels are selected. If an earlier explicit configuration already set the policy, the first value wins, so concurrent fits cannot race on policy changes. Reads through global_policy() never claim the slot, so the first explicit configuration always sticks, even if dispatch code observed the default Auto beforehand.
cuda_selected: True when direct solver GPU entry points should be attempted.
decide: Decide whether a GPU kernel may run. This is deliberately conservative: with no compiled vendor backend, the auto policy falls back to the CPU, and the force policy produces an error at the call site through GpuDecision::require_supported.
global_policy
gpu_mode: Read the process-wide GPU residency mode. Defaults to GpuMode::Auto without claiming the slot, mirroring global_policy so an incidental read never locks the mode against a later explicit set_gpu_mode.
log_backend_inventory_once: Emit the roadmap-visible kernels at startup/debug time without affecting numerical execution. This keeps backend coverage auditable as real device kernels are added incrementally.
set_gpu_mode: Configure the process-wide GPU residency mode. The first writer wins, so concurrent fits cannot race on the contract; a redundant later call is ignored.
try_cholesky_batched_lower_inplace
try_cholesky_lower_inplace
try_fast_ab
try_fast_ab_broadcast_b_batched
try_fast_abt_strided_batched
try_fast_atb_on_ordinal
try_fast_atv
try_fast_av
try_solve_lower_triangular_matrix
try_solve_upper_triangular_matrix
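
The first-writer-wins behavior described for configure_global_policy and global_policy (and mirrored by set_gpu_mode and gpu_mode) can be sketched with std::sync::OnceLock. This is a minimal illustration, not the crate's implementation: the storage mechanism, the GpuPolicy variant names other than Auto, and the bool return value of configure_global_policy are assumptions.

```rust
use std::sync::OnceLock;

// Assumed variants; only the default Auto is named in the docs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GpuPolicy {
    Auto,
    Force,
}

// Process-wide slot. It stays empty until the first explicit configuration.
static POLICY: OnceLock<GpuPolicy> = OnceLock::new();

// First writer wins: OnceLock::set succeeds only while the slot is empty.
// Returns true if this call claimed the slot (hypothetical return value).
fn configure_global_policy(policy: GpuPolicy) -> bool {
    POLICY.set(policy).is_ok()
}

// Reading uses get(), which never initializes the cell, so observing the
// default does not lock out a later explicit configuration.
fn global_policy() -> GpuPolicy {
    POLICY.get().copied().unwrap_or(GpuPolicy::Auto)
}

fn main() {
    // An early read sees the default and leaves the slot unclaimed.
    assert_eq!(global_policy(), GpuPolicy::Auto);
    // The first explicit configuration sticks.
    assert!(configure_global_policy(GpuPolicy::Force));
    // A later call is ignored.
    assert!(!configure_global_policy(GpuPolicy::Auto));
    assert_eq!(global_policy(), GpuPolicy::Force);
}
```

The important detail is that the read path avoids get_or_init: initializing on read would let an incidental early read freeze the default before the caller had a chance to configure it.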
