Crate oxionnx_cuda

§oxionnx-cuda

CUDA-accelerated dispatch for ONNX ops via the OxiCUDA GPU stack.

This crate provides:

  • CudaContext — a wrapper around a CUDA device context + DNN handle, constructed lazily via CudaContext::try_new.
  • CudaError — error type returned by the CUDA dispatch layer.
  • try_cuda_dispatch — the top-level dispatch function called from oxionnx::session::run_sequential_inner when the cuda feature is enabled.

§Dispatch flow

CUDA (highest priority)
  └─ try_cuda_dispatch
       ├─ Ok(Some(results))   ← GPU handled it
       └─ Ok(None)            ← fall back to wgpu / CPU
wgpu GPU dispatch
CPU dispatch
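
The priority chain above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's implementation: `Outputs`, `BackendError`, `Backend`, `Attempt`, and `dispatch_node` are hypothetical names, and the assumption that the wgpu step uses the same `Ok(Some(..))` / `Ok(None)` convention as `try_cuda_dispatch` is ours, not something this page states.

```rust
/// Hypothetical stand-in for the tensors produced by one node.
type Outputs = Vec<f32>;

/// Hypothetical stand-in for a backend failure.
#[derive(Debug, PartialEq)]
struct BackendError(String);

/// Records which backend ended up handling the node.
#[derive(Debug, PartialEq)]
enum Backend {
    Cuda,
    Wgpu,
    Cpu,
}

/// Ok(Some(..)) = handled, Ok(None) = declined (try the next backend),
/// Err(..) = the backend attempted the op and failed.
type Attempt = Result<Option<Outputs>, BackendError>;

/// Try CUDA first, then wgpu, then fall back to the CPU.
/// Each backend is a closure so that lower-priority backends
/// only run when the higher-priority ones decline.
fn dispatch_node(
    try_cuda: impl FnOnce() -> Attempt,
    try_wgpu: impl FnOnce() -> Attempt,
    run_cpu: impl FnOnce() -> Outputs,
) -> Result<(Backend, Outputs), BackendError> {
    if let Some(out) = try_cuda()? {
        return Ok((Backend::Cuda, out));
    }
    if let Some(out) = try_wgpu()? {
        return Ok((Backend::Wgpu, out));
    }
    Ok((Backend::Cpu, run_cpu()))
}
```

In this sketch an `Err` from a higher-priority backend is propagated with `?` and does not trigger a fallback; only `Ok(None)` moves on to the next backend.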

§Graceful degradation

If any CUDA error occurs during dispatch, try_cuda_dispatch returns Err(...), which the caller maps to OnnxError::Internal. If CUDA is not available at session build time, CudaContext::try_new() returns None and no CUDA dispatch is attempted.
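
The two cases described here (no context available versus an error during dispatch) can be illustrated with a small self-contained sketch. All names below (`SketchContext`, `SketchCudaError`, `SketchOnnxError`, `run_node`) are hypothetical stand-ins; the real `CudaContext::try_new` takes whatever arguments the crate defines, and the payload of `OnnxError::Internal` is assumed to be a string purely for illustration.

```rust
/// Hypothetical stand-in for the caller-side error type.
#[derive(Debug, PartialEq)]
enum SketchOnnxError {
    Internal(String),
}

/// Hypothetical stand-in for the CUDA dispatch error.
#[derive(Debug, PartialEq)]
struct SketchCudaError(String);

/// Hypothetical stand-in for the CUDA context wrapper.
struct SketchContext;

impl SketchContext {
    /// Returns None when no device is available, mirroring the
    /// "try_new() returns None" behaviour described in the text.
    /// The `device_present` flag is an assumption for this sketch.
    fn try_new(device_present: bool) -> Option<Self> {
        if device_present {
            Some(SketchContext)
        } else {
            None
        }
    }

    /// Pretend dispatch: handles "Relu", fails on "Fail",
    /// and declines every other op with Ok(None).
    fn dispatch(&self, op: &str) -> Result<Option<Vec<f32>>, SketchCudaError> {
        match op {
            "Relu" => Ok(Some(vec![0.0, 1.0])),
            "Fail" => Err(SketchCudaError("kernel launch failed".to_string())),
            _ => Ok(None),
        }
    }
}

/// Caller side: skip CUDA entirely when there is no context, and map
/// any dispatch error to the caller's internal error variant.
fn run_node(
    ctx: Option<&SketchContext>,
    op: &str,
) -> Result<Option<Vec<f32>>, SketchOnnxError> {
    match ctx {
        None => Ok(None), // no CUDA: nothing is attempted
        Some(ctx) => ctx
            .dispatch(op)
            .map_err(|e| SketchOnnxError::Internal(e.0)),
    }
}
```

With no context, `run_node` returns `Ok(None)` without touching the GPU path; with a context, an unsupported op also yields `Ok(None)`, while a failing op surfaces as `SketchOnnxError::Internal`.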

Re-exports§

pub use context::CudaContext;
pub use error::CudaDispatchError as CudaError;

Modules§

context
CUDA context wrapper for oxionnx-cuda.
conv
CUDA-accelerated 2-D convolution dispatch.
elementwise
CUDA-accelerated elementwise operator dispatch.
error
Error types for CUDA-accelerated ONNX dispatch.
matmul
CUDA-accelerated MatMul / Gemm dispatch.
reduce
CUDA-accelerated ReduceSum / ReduceMax dispatch.
softmax
CUDA-accelerated Softmax dispatch.

Functions§

try_cuda_dispatch
Attempt to dispatch a single ONNX node to the CUDA backend.