oxicuda 0.1.0

OxiCUDA - Pure Rust CUDA replacement for the COOLJAPAN ecosystem (95% performance target)
oxicuda-0.1.0 has been yanked.

oxicuda

Pure Rust CUDA replacement for the COOLJAPAN ecosystem.

Part of the OxiCUDA project.

Overview

oxicuda is the umbrella crate that re-exports all OxiCUDA sub-crates behind feature flags. It gives applications that need GPU compute a single dependency entry point. The CUDA Toolkit does not need to be installed: the driver library, libcuda.so (or nvcuda.dll on Windows), is loaded dynamically at runtime.
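
To make the runtime-loading idea concrete, here is a minimal sketch of how a loader might pick the driver library name per platform. It uses only the standard library and does not load anything. Both function names are hypothetical and are not part of the oxicuda API. The file names are the ones given above, and treating every other platform as unsupported is an assumption.

```rust
/// Returns the CUDA driver library file name to look for at runtime,
/// or `None` on platforms for which this README names no library.
/// Hypothetical helper for illustration; not an oxicuda function.
pub fn driver_library_name() -> Option<&'static str> {
    if cfg!(target_os = "windows") {
        Some("nvcuda.dll")
    } else if cfg!(target_os = "linux") {
        Some("libcuda.so")
    } else {
        // Assumption: no driver library is available elsewhere.
        None
    }
}

/// Builds a diagnostic message for the case where loading fails.
/// Illustrative only; the real crate's error type may differ.
pub fn missing_driver_message() -> String {
    match driver_library_name() {
        Some(name) => format!("could not load {}; is the NVIDIA driver installed?", name),
        None => String::from("no CUDA driver library is known for this platform"),
    }
}
```

A real loader would pass the returned name to the platform's dynamic-loading facility and resolve driver entry points from the resulting handle.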

The core crates (driver, memory, launch) are enabled by default. Higher-level libraries -- BLAS, DNN, FFT, sparse, solver, and random number generation -- are opt-in via feature flags. Enable the full feature to turn on every optional feature at once.
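
As an illustration, a dependency entry that opts into some of these libraries might look like the fragment below. The feature names come from this README. The version string is the one shown on this page, which is marked as yanked, so treat it as a placeholder and substitute a published version.

```toml
[dependencies]
# Default features only (driver, memory, launch):
# oxicuda = "0.1.0"

# Defaults plus selected opt-in libraries:
oxicuda = { version = "0.1.0", features = ["blas", "fft"] }

# Everything:
# oxicuda = { version = "0.1.0", features = ["full"] }
```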

A prelude module and init() function provide convenient imports and one-call CUDA driver initialization.

Architecture

                    oxicuda (umbrella)
     +---------+---------+---------+---------+
     |         |         |         |         |
  driver   memory    launch      ptx    autotune
     |         |         |         |         |
     +----+----+---------+---------+---------+
          |
   +------+------+------+------+------+
   |      |      |      |      |      |
  blas   dnn    fft   sparse solver  rand

Quick Start

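use oxicuda::prelude::*;

fn main() -> CudaResult<()> {
    // Initialize the CUDA driver once, before any other call.
    oxicuda::init()?;

    // Select the first GPU, then create a shared context and a stream on it.
    // The stream is not used further in this minimal example.
    let device = Device::get(0)?;
    let ctx = std::sync::Arc::new(Context::new(&device)?);
    let stream = Stream::new(&ctx)?;

    // Allocate 1024 f32 elements on the device and upload host data into them.
    let mut buf = DeviceBuffer::<f32>::alloc(1024)?;
    let host = vec![1.0f32; 1024];
    buf.copy_from_host(&host)?;

    Ok(())
}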

Feature Flags

Feature   Description                      Default
driver    CUDA driver API wrapper          Yes
memory    GPU memory management            Yes
launch    Kernel launch infrastructure     Yes
ptx       PTX code generation DSL          No
autotune  Autotuner engine (implies ptx)   No
blas      cuBLAS equivalent                No
dnn       cuDNN equivalent (implies blas)  No
fft       cuFFT equivalent                 No
sparse    cuSPARSE equivalent              No
solver    cuSOLVER equivalent              No
rand      cuRAND equivalent                No
pool      Stream-ordered memory pool       No
full      Enable all optional features     No
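
The "implies" notes in the feature table mean that enabling one feature also turns on another. The sketch below models that resolution with a small worklist. It is a standalone illustration, not the crate's actual manifest logic. The function names are hypothetical, and expanding full to every non-default row of the table (including pool) is an assumption.

```rust
use std::collections::BTreeSet;

/// Features enabled by default, per the feature table.
const DEFAULTS: [&str; 3] = ["driver", "memory", "launch"];

/// Direct implications as listed in the feature table.
/// Assumption: `full` expands to every non-default feature.
fn implied(feature: &str) -> &'static [&'static str] {
    match feature {
        "autotune" => &["ptx"],
        "dnn" => &["blas"],
        "full" => &[
            "ptx", "autotune", "blas", "dnn", "fft", "sparse", "solver", "rand", "pool",
        ],
        _ => &[],
    }
}

/// Computes the set of features that end up enabled for a request.
pub fn resolve(requested: &[&'static str], use_defaults: bool) -> BTreeSet<&'static str> {
    let mut enabled = BTreeSet::new();
    let mut pending: Vec<&'static str> = requested.to_vec();
    if use_defaults {
        pending.extend(DEFAULTS.iter().copied());
    }
    // Worklist: each newly enabled feature queues the features it implies.
    while let Some(feature) = pending.pop() {
        if enabled.insert(feature) {
            pending.extend(implied(feature).iter().copied());
        }
    }
    enabled
}
```

For example, requesting dnn with defaults yields blas, dnn, driver, launch, and memory, while requesting autotune without defaults yields only autotune and ptx.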

Sub-crates

Crate             Volume  Description
oxicuda-driver    Vol.1   CUDA driver API bindings
oxicuda-memory    Vol.1   Device, pinned, unified memory
oxicuda-launch    Vol.1   Kernel launch and grid configuration
oxicuda-ptx       Vol.2   PTX code generation DSL
oxicuda-autotune  Vol.2   Autotuner for kernel parameters
oxicuda-blas      Vol.3   Dense linear algebra (GEMM, etc.)
oxicuda-dnn       Vol.4   Deep learning primitives
oxicuda-fft       Vol.5   Fast Fourier Transform
oxicuda-sparse    Vol.5   Sparse matrix operations
oxicuda-solver    Vol.5   Matrix decompositions and solvers
oxicuda-rand      Vol.5   Random number generation

License

Apache-2.0 -- (C) 2026 COOLJAPAN OU (Team KitaSan)