Module pool

Stream-ordered memory pool for efficient async allocation.

Requires CUDA 11.2+ driver. Gated behind the pool feature.

Stream-ordered memory pools order allocation and deallocation relative to other operations on a CUDA stream. This lets the driver reuse memory more aggressively and avoids the synchronisation that conventional cuMemAlloc / cuMemFree calls would otherwise require.

§Implementation note

This implementation provides a practical fallback pool that reuses freed allocations by size and uses cuMemAlloc_v2 / cuMemFree_v2 under the hood. It keeps the same API surface as a stream-ordered pool, but does not yet expose native CUDA mempool handles.
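The "reuses freed allocations by size" strategy can be illustrated with a small host-only sketch. This is not the crate's implementation: SizeBucketPool, FakeDevicePtr and the fresh_allocs counter are hypothetical names, and a plain integer stands in for a device pointer where a real pool would call cuMemAlloc_v2.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a device pointer (a real pool would store a CUdeviceptr).
type FakeDevicePtr = u64;

/// Minimal size-bucketed pool: freed blocks are cached by byte size and
/// handed back to later requests of exactly the same size.
#[derive(Default)]
struct SizeBucketPool {
    /// Freed blocks, keyed by their size in bytes.
    cached: HashMap<usize, Vec<FakeDevicePtr>>,
    /// Bump counter used to hand out distinct fake addresses.
    next_addr: FakeDevicePtr,
    /// Number of requests that could not be served from the cache.
    fresh_allocs: usize,
}

impl SizeBucketPool {
    fn alloc(&mut self, bytes: usize) -> FakeDevicePtr {
        // Fast path: reuse a previously freed block of the same size.
        if let Some(ptr) = self.cached.get_mut(&bytes).and_then(|v| v.pop()) {
            return ptr;
        }
        // Cache miss: a real pool would call cuMemAlloc_v2 here (assumption).
        self.fresh_allocs += 1;
        self.next_addr += bytes as u64 + 1;
        self.next_addr
    }

    fn free(&mut self, ptr: FakeDevicePtr, bytes: usize) {
        // Keep the block for reuse instead of returning it to the driver.
        self.cached.entry(bytes).or_default().push(ptr);
    }
}
```

With this layout, allocating 4096 bytes, freeing the block, and allocating 4096 bytes again returns the same pointer without a second driver call. A request for a different size misses the cache and triggers a fresh allocation.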

§API

let pool = MemoryPool::new(device)?;
let buf = PooledBuffer::<f32>::alloc_async(&pool, 1024, &stream)?;
// … use buf in kernels on `stream` …
// buf is freed asynchronously when dropped (enqueued on the pool's stream).
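
The free-on-drop behaviour in the snippet above follows the usual Rust RAII pattern. The sketch below shows one way a buffer handle might return its block to a pool in its Drop implementation. ToyPool and ToyBuffer are hypothetical, host-only types, not the crate's MemoryPool or PooledBuffer, and no stream ordering is modelled.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical pool that only tracks which block ids are available for reuse.
#[derive(Default)]
struct ToyPool {
    cached: RefCell<Vec<u32>>,
    next_id: Cell<u32>,
}

/// Hypothetical buffer handle; it returns its block to the pool when dropped.
struct ToyBuffer {
    id: u32,
    pool: Rc<ToyPool>,
}

impl ToyBuffer {
    fn alloc(pool: &Rc<ToyPool>) -> ToyBuffer {
        // Prefer a cached block; otherwise mint a new id.
        let reused = pool.cached.borrow_mut().pop();
        let id = reused.unwrap_or_else(|| {
            let id = pool.next_id.get() + 1;
            pool.next_id.set(id);
            id
        });
        ToyBuffer { id, pool: Rc::clone(pool) }
    }
}

impl Drop for ToyBuffer {
    fn drop(&mut self) {
        // A real implementation would enqueue the free on a stream (assumption);
        // here the block simply goes back into the cache.
        self.pool.cached.borrow_mut().push(self.id);
    }
}
```

Because the handle owns a reference to its pool, callers never free explicitly: dropping the buffer, or letting it go out of scope, is enough to make the block available to the next allocation.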

Structs§

MemoryPool
A stream-ordered memory pool (CUDA 11.2+).
NativeMemoryPool
Thin wrapper around the CUDA driver’s stream-ordered memory pool (cuMemPoolCreate / cuMemPoolDestroy).
NativeMemoryPoolProps
Configuration for a NativeMemoryPool.
PoolStats
Statistics for a MemoryPool.
PooledBuffer
A device buffer allocated from a MemoryPool.