Crate oxionnx_cuda

§oxionnx-cuda

CUDA-accelerated dispatch for ONNX ops via the OxiCUDA GPU stack.

This crate provides:

CudaContext: a wrapper around a CUDA device context and a DNN handle, constructed lazily via CudaContext::try_new.
CudaError: the error type returned by the CUDA dispatch layer.
try_cuda_dispatch: the top-level dispatch function, called from oxionnx::session::run_sequential_inner when the cuda feature is enabled.

§Dispatch flow

CUDA (highest priority)
  ├─ try_cuda_dispatch → Ok(Some(results))   ← GPU handled it
  └─ try_cuda_dispatch → Ok(None)            ← fall back to wgpu / CPU
wgpu GPU dispatch
CPU dispatch
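
The priority chain above can be sketched in a few lines of plain Rust. This is an illustration only, not the crate's code: DispatchError, Backend, Attempt and dispatch_node are hypothetical names, the backends are passed in as closures, and it is an assumption that the wgpu tier declines a node with Ok(None) the same way the CUDA tier does.

```rust
/// Hypothetical stand-in for a backend error (not the crate's CudaError).
#[derive(Debug, PartialEq)]
pub struct DispatchError(pub String);

/// Which tier ended up producing the outputs.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Backend {
    Cuda,
    Wgpu,
    Cpu,
}

/// Ok(Some(..)) = the backend handled the node,
/// Ok(None)     = the backend declined, try the next tier,
/// Err(..)      = the backend failed.
pub type Attempt = Result<Option<Vec<f32>>, DispatchError>;

/// Try CUDA first, then wgpu, then CPU (assumed to always handle the node).
pub fn dispatch_node(
    cuda: impl FnOnce() -> Attempt,
    wgpu: impl FnOnce() -> Attempt,
    cpu: impl FnOnce() -> Result<Vec<f32>, DispatchError>,
) -> Result<(Backend, Vec<f32>), DispatchError> {
    // `?` propagates a backend failure instead of silently falling through.
    if let Some(out) = cuda()? {
        return Ok((Backend::Cuda, out));
    }
    if let Some(out) = wgpu()? {
        return Ok((Backend::Wgpu, out));
    }
    Ok((Backend::Cpu, cpu()?))
}
```

Returning Result<Option<_>, _> keeps "this backend does not support the node" (Ok(None)) distinct from "this backend tried and failed" (Err), which is the distinction the diagram relies on.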

§Graceful degradation

On any CUDA error during dispatch, try_cuda_dispatch returns Err(...), which the caller maps to OnnxError::Internal. If CUDA is not available at session build time, CudaContext::try_new() returns None and no CUDA dispatch is attempted.
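
A minimal sketch of the two behaviours described here, using mock types rather than the crate's real ones. MockCudaContext, MockCudaError, the local OnnxError enum and run_node are all hypothetical; in particular, the mock try_new takes a bool to simulate device availability, whereas the real CudaContext::try_new() takes no arguments, and the error message format is invented.

```rust
/// Hypothetical stand-in for the crate's CudaError.
#[derive(Debug, PartialEq)]
pub struct MockCudaError(pub String);

/// Local stand-in for the caller's error type; only the variant named in
/// the text is modelled, and its String payload is an assumption.
#[derive(Debug, PartialEq)]
pub enum OnnxError {
    Internal(String),
}

/// Hypothetical context; the real one wraps a device context and DNN handle.
pub struct MockCudaContext;

impl MockCudaContext {
    /// Returns None when no usable device is present (simulated by a flag).
    pub fn try_new(device_available: bool) -> Option<Self> {
        if device_available {
            Some(MockCudaContext)
        } else {
            None
        }
    }
}

/// Run one node: skip CUDA entirely without a context, and surface CUDA
/// failures as OnnxError::Internal rather than retrying on the CPU.
pub fn run_node(
    ctx: Option<&MockCudaContext>,
    cuda_op: impl FnOnce(&MockCudaContext) -> Result<Option<Vec<f32>>, MockCudaError>,
    cpu_op: impl FnOnce() -> Vec<f32>,
) -> Result<Vec<f32>, OnnxError> {
    if let Some(ctx) = ctx {
        let attempt = cuda_op(ctx)
            .map_err(|e| OnnxError::Internal(format!("CUDA dispatch failed: {}", e.0)))?;
        if let Some(out) = attempt {
            return Ok(out);
        }
    }
    // No context, or CUDA declined the node: fall back to the CPU path.
    Ok(cpu_op())
}
```

With this shape, a machine without CUDA never enters the CUDA branch, while a machine with CUDA reports a failing kernel as an error instead of silently producing a CPU result.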

Re-exports§

pub use context::CudaContext;
pub use error::CudaDispatchError as CudaError;

Modules§

context: CUDA context wrapper for oxionnx-cuda.
conv: CUDA-accelerated 2-D convolution dispatch.
elementwise: CUDA-accelerated elementwise operator dispatch.
error: Error types for CUDA-accelerated ONNX dispatch.
matmul: CUDA-accelerated MatMul / Gemm dispatch.
reduce: CUDA-accelerated ReduceSum / ReduceMax dispatch.
softmax: CUDA-accelerated Softmax dispatch.

Functions§

try_cuda_dispatch: Attempt to dispatch a single ONNX node to the CUDA backend.
