§oxionnx-cuda
CUDA-accelerated dispatch for ONNX ops via the OxiCUDA GPU stack.
This crate provides:
- CudaContext: a wrapper around a CUDA device context + DNN handle, constructed lazily via CudaContext::try_new.
- CudaError: error type returned by the CUDA dispatch layer.
- try_cuda_dispatch: the top-level dispatch function called from oxionnx::session::run_sequential_inner when the cuda feature is enabled.
§Dispatch flow
CUDA (highest priority)
  └─ try_cuda_dispatch → Ok(Some(results))   ← GPU handled it
                       └─ Ok(None)           ← fall back to wgpu / CPU
wgpu GPU dispatch
CPU dispatch
§Graceful degradation
If a CUDA error occurs during dispatch, try_cuda_dispatch returns Err(...),
which the caller maps to OnnxError::Internal. If CUDA is not available at
session build time, CudaContext::try_new() returns None and no CUDA
dispatch is attempted.
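The fallback rules above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: every type and function in it (the `Tensor` alias, the `cuda_available` flag, the string-based op matching, `cpu_dispatch`, `run_node`) is a hypothetical stand-in that only shares names with the real items, and the wgpu tier is reduced to a comment.

```rust
// Illustrative stand-ins only: none of this is the real oxionnx / oxionnx-cuda API.

/// Hypothetical tensor placeholder.
type Tensor = Vec<f32>;

/// Stand-in for the CUDA dispatch error type.
#[derive(Debug)]
struct CudaError(String);

/// Stand-in for the caller's error type with an `Internal` variant.
#[derive(Debug)]
enum OnnxError {
    Internal(String),
}

/// Stand-in for the CUDA context. The `cuda_available` flag is an assumption
/// used here to simulate a machine with or without a usable CUDA device.
struct CudaContext;

impl CudaContext {
    fn try_new(cuda_available: bool) -> Option<Self> {
        if cuda_available { Some(CudaContext) } else { None }
    }
}

/// Stand-in dispatch function with the three outcomes described above:
/// Ok(Some(..)) = GPU handled it, Ok(None) = not handled, Err(..) = CUDA failure.
fn try_cuda_dispatch(
    _ctx: &CudaContext,
    op: &str,
    input: &Tensor,
) -> Result<Option<Vec<Tensor>>, CudaError> {
    match op {
        "Relu" => Ok(Some(vec![input.iter().map(|&x| x.max(0.0)).collect()])),
        "Broken" => Err(CudaError("simulated kernel failure".to_string())),
        _ => Ok(None),
    }
}

/// Hypothetical CPU fallback; it just echoes the input.
fn cpu_dispatch(_op: &str, input: &Tensor) -> Vec<Tensor> {
    vec![input.clone()]
}

/// Hypothetical caller: tries CUDA first, then falls through to the CPU path.
/// Returns the name of the backend that produced the result.
fn run_node(
    ctx: Option<&CudaContext>,
    op: &str,
    input: &Tensor,
) -> Result<(&'static str, Vec<Tensor>), OnnxError> {
    if let Some(ctx) = ctx {
        match try_cuda_dispatch(ctx, op, input) {
            Ok(Some(results)) => return Ok(("cuda", results)),
            Ok(None) => {} // not handled on CUDA: fall through
            Err(e) => return Err(OnnxError::Internal(e.0)),
        }
    }
    // A wgpu attempt would sit here in the real cascade (omitted in this sketch).
    Ok(("cpu", cpu_dispatch(op, input)))
}
```

In this sketch, calling `run_node(None, ...)` models the case where no context could be built, so the CUDA branch is skipped entirely. A CUDA error is surfaced to the caller rather than silently retried on another backend, which matches the behavior described above.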
Re-exports§
pub use context::CudaContext;
pub use error::CudaDispatchError as CudaError;
Modules§
- context
- CUDA context wrapper for oxionnx-cuda.
- conv
- CUDA-accelerated 2-D convolution dispatch.
- elementwise
- CUDA-accelerated elementwise operator dispatch.
- error
- Error types for CUDA-accelerated ONNX dispatch.
- matmul
- CUDA-accelerated MatMul / Gemm dispatch.
- reduce
- CUDA-accelerated ReduceSum / ReduceMax dispatch.
- softmax
- CUDA-accelerated Softmax dispatch.
Functions§
- try_cuda_dispatch
- Attempt to dispatch a single ONNX node to the CUDA backend.