oxicuda-memory
Type-safe GPU memory management with Rust ownership semantics.
Part of the OxiCUDA project.
Overview
oxicuda-memory provides safe, RAII-based wrappers around CUDA memory
allocation and transfer operations. Every buffer type owns its GPU (or
pinned-host) allocation and automatically frees it on Drop, preventing
leaks without requiring manual cleanup.
The crate enforces compile-time type safety through generics (T: Copy)
and validates sizes at runtime, returning CudaError::InvalidValue for
mismatches rather than panicking. Drop implementations log errors via
tracing::warn instead of panicking, ensuring safe teardown even when
the CUDA context has already been destroyed.
The copy module provides freestanding transfer functions that mirror the
CUDA driver cuMemcpy* family (copy_htod, copy_dtoh, copy_dtod)
with both synchronous and async variants. For convenience, DeviceBuffer
also exposes methods like copy_from_host() and copy_to_host() directly.
Modules
| Module | Description |
|---|---|
device_buffer |
DeviceBuffer<T> (VRAM) and DeviceSlice<T> (sub-range) |
host_buffer |
PinnedBuffer<T> -- page-locked host memory for fast DMA |
unified |
UnifiedBuffer<T> -- CUDA managed memory (host+device) |
zero_copy |
MappedBuffer<T> -- zero-copy host-mapped memory |
copy |
Freestanding copy_htod, copy_dtoh, copy_dtod helpers |
pool |
MemoryPool -- stream-ordered allocation (behind pool feature) |
Quick Start
use *;
use *;
init?;
let dev = get?;
let _ctx = new?;
// Allocate a device buffer and upload host data.
let host_data = vec!;
let mut gpu_buf = from_slice?;
// Download results back to the host.
let mut result = vec!;
gpu_buf.copy_to_host?;
# Ok::
Buffer Types
| Type | Location | Description |
|---|---|---|
DeviceBuffer<T> |
Device (VRAM) | Primary GPU-side buffer |
DeviceSlice<T> |
Device (VRAM) | Borrowed sub-range of a device buffer |
PinnedBuffer<T> |
Host (pinned) | Page-locked host memory for fast DMA |
UnifiedBuffer<T> |
Unified/managed | Accessible from both host and device |
MappedBuffer<T> |
Host-mapped | Zero-copy host-mapped device-accessible memory |
MemoryPool |
Device pool | Stream-ordered allocation (CUDA 11.2+) |
Features
| Feature | Description |
|---|---|
pool |
Enable stream-ordered memory pool (CUDA 11.2+) |
gpu-tests |
Enable integration tests that require a real GPU |
Platform Support
| Platform | Status |
|---|---|
| Linux | Full support (NVIDIA driver 525+) |
| Windows | Full support (NVIDIA driver 525+) |
| macOS | Compile only (UnsupportedPlatform at runtime) |
License
Apache-2.0 -- (C) 2026 COOLJAPAN OU (Team KitaSan)