pub struct CudaMemPool { /* private fields */ }
Safe wrapper around a CUDA memory pool.
The pool amortizes allocation overhead by maintaining a reservoir of device memory. Allocations are fast sub-allocations from this reservoir, and frees return memory to the pool rather than the OS (until the release threshold is exceeded).
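The reservoir behaviour described above can be pictured with a small bookkeeping model. This is a simplified sketch, not the CUDA driver's algorithm: `ToyPool`, its fields, and `trim` are hypothetical names introduced only for illustration, and the model tracks byte counts rather than real device memory.

```rust
/// Toy bookkeeping model of a pooled allocator.
/// Hypothetical: it only counts bytes and is not how the driver is implemented.
#[derive(Debug)]
pub struct ToyPool {
    /// Bytes currently held by the pool (the reservoir).
    pub reserved: usize,
    /// Bytes currently handed out to callers.
    pub used: usize,
    /// Bytes the pool may keep holding while idle.
    pub release_threshold: usize,
    /// How many times the pool had to obtain memory from outside (slow path).
    pub grow_count: usize,
}

impl ToyPool {
    /// Create a pool and pre-allocate `reserve_size` bytes to warm it.
    pub fn new(reserve_size: usize, release_threshold: usize) -> Self {
        ToyPool {
            reserved: reserve_size,
            used: 0,
            release_threshold,
            grow_count: if reserve_size > 0 { 1 } else { 0 },
        }
    }

    /// Fast path: sub-allocate from the reservoir. Slow path: grow it first.
    pub fn alloc(&mut self, size: usize) {
        let needed = self.used + size;
        if needed > self.reserved {
            self.reserved = needed;
            self.grow_count += 1;
        }
        self.used = needed;
    }

    /// Return bytes to the reservoir; `reserved` does not shrink here.
    pub fn free(&mut self, size: usize) {
        self.used = self.used.saturating_sub(size);
    }

    /// Release unused bytes above the threshold (modelled as an explicit step).
    pub fn trim(&mut self) {
        let floor = self.release_threshold.max(self.used);
        if self.reserved > floor {
            self.reserved = floor;
        }
    }
}
```

In this model, an allocation that fits in the warmed reservoir never increments `grow_count`, and a free leaves `reserved` unchanged until `trim` runs with more memory held than the threshold allows.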
§Thread Safety
This type uses internal locking to serialize host-side calls to CUDA driver APIs.
cuMemAllocFromPoolAsync is not host-thread reentrant, so concurrent calls from
multiple threads must be serialized. The GPU-side operations remain asynchronous
and stream-ordered.
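The locking pattern described here can be sketched with the standard library alone. The example below is an illustration under assumptions: `driver_alloc` is a hypothetical stand-in for a non-reentrant entry point, and `SerializedPool` is not the actual layout of `CudaMemPool`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a driver call that must not run concurrently.
/// It bumps a cursor and returns the previous value as the "device pointer".
fn driver_alloc(next_addr: &mut u64, size: u64) -> u64 {
    let ptr = *next_addr;
    *next_addr += size;
    ptr
}

/// Hypothetical pool wrapper: the mutex serializes host-side entry.
pub struct SerializedPool {
    state: Mutex<u64>,
}

impl SerializedPool {
    pub fn new(base: u64) -> Self {
        SerializedPool { state: Mutex::new(base) }
    }

    /// Takes `&self`, so it can be shared across threads; only one thread
    /// at a time is inside `driver_alloc`.
    pub fn alloc(&self, size: u64) -> u64 {
        let mut next = self.state.lock().unwrap();
        driver_alloc(&mut *next, size)
    }
}

/// Allocate `size` bytes from `threads` threads and return the sorted pointers.
pub fn alloc_from_threads(threads: usize, size: u64) -> Vec<u64> {
    let pool = Arc::new(SerializedPool::new(0x1000));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let pool = Arc::clone(&pool);
            thread::spawn(move || pool.alloc(size))
        })
        .collect();
    let mut ptrs: Vec<u64> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    ptrs.sort();
    ptrs
}
```

Because every call goes through the lock, the returned pointers never overlap regardless of how the threads interleave. The lock covers only the host-side call; it says nothing about when the GPU-side work completes.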
Use CudaMemPoolBuilder for configurable pool creation with pre-allocation.
Implementations§
impl CudaMemPool
pub fn builder( context: Arc<CudaContext>, reserve_size: usize, ) -> CudaMemPoolBuilder
Create a builder for a new CUDA memory pool.
§Arguments
context - CUDA context for the device
reserve_size - Number of bytes to pre-allocate to warm the pool
pub fn alloc_async(&self, size: usize, stream: &CudaStream) -> Result<u64>
Allocate memory from the pool asynchronously.
This is the safe variant that takes a &CudaStream reference, ensuring
the stream is valid for the duration of the call.
The allocation is stream-ordered; the memory is available for use after all preceding operations on the stream complete.
§Host Serialization
This method acquires an internal mutex because cuMemAllocFromPoolAsync
is not host-thread reentrant. The allocation itself is stream-ordered on
the GPU side.
§Arguments
size - Size in bytes to allocate
stream - CUDA stream for async ordering
§Returns
Device pointer to the allocated memory
pub unsafe fn alloc_async_raw( &self, size: usize, stream: CUstream, ) -> Result<u64>
Allocate memory from the pool asynchronously (raw stream handle variant).
This is the unsafe variant for use when you have a raw CUstream handle
from sources other than cudarc’s CudaStream.
§Host Serialization
This method acquires an internal mutex because cuMemAllocFromPoolAsync
is not host-thread reentrant.
§Arguments
size - Size in bytes to allocate
stream - Raw CUDA stream handle for async ordering
§Returns
Device pointer to the allocated memory
§Safety
The caller must ensure that stream is a valid CUDA stream handle that
will remain valid for the duration of this call.
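The relationship between a safe, reference-taking method and its raw-handle counterpart can be sketched as follows. All types here (`RawStream`, `Stream`, `Pool`) are hypothetical stand-ins, not the real `CUstream`, `CudaStream`, or `CudaMemPool`. The point is only that borrowing an owning wrapper lets the safe method discharge the "handle stays valid for the call" obligation on the caller's behalf.

```rust
use std::cell::Cell;
use std::ffi::c_void;

/// Hypothetical raw handle type, standing in for a driver stream handle.
pub type RawStream = *mut c_void;

/// Hypothetical owning wrapper: while a `&Stream` exists, the handle is live.
pub struct Stream {
    raw: RawStream,
}

impl Stream {
    pub fn new() -> Self {
        // A boxed byte stands in for driver-side stream state.
        Stream { raw: Box::into_raw(Box::new(0u8)) as RawStream }
    }
}

impl Drop for Stream {
    fn drop(&mut self) {
        // SAFETY: `raw` came from `Box::into_raw` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.raw as *mut u8)) };
    }
}

/// Hypothetical pool that hands out increasing addresses.
pub struct Pool {
    next: Cell<u64>,
}

impl Pool {
    pub fn new(base: u64) -> Self {
        Pool { next: Cell::new(base) }
    }

    /// Raw variant.
    ///
    /// # Safety
    /// `stream` must be a valid handle for the duration of the call.
    pub unsafe fn alloc_raw(&self, size: u64, stream: RawStream) -> Result<u64, String> {
        if stream.is_null() {
            return Err("null stream handle".to_string());
        }
        let ptr = self.next.get();
        self.next.set(ptr + size);
        Ok(ptr)
    }

    /// Safe variant: the borrow guarantees the stream outlives the call.
    pub fn alloc(&self, size: u64, stream: &Stream) -> Result<u64, String> {
        // SAFETY: `stream` is borrowed, so its handle cannot be dropped here.
        unsafe { self.alloc_raw(size, stream.raw) }
    }
}
```

With this shape, callers that already hold an owning stream wrapper never need an `unsafe` block; the raw variant is reserved for handles obtained elsewhere, where only the caller can vouch for validity.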
pub fn free_async(&self, ptr: u64, stream: &CudaStream) -> Result<()>
Free memory back to the pool asynchronously.
This is the safe variant that takes a &CudaStream reference.
The memory is returned to the pool’s reservoir (not the OS) and can be reused by subsequent allocations. The free is stream-ordered.
§Arguments
ptr - Device pointer previously allocated from this pool
stream - CUDA stream for async ordering
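To make "stream-ordered" concrete, the sketch below models a stream as a FIFO queue of operations. It is a toy model, not real CUDA: `ToyStream`, `Op`, and `synchronize` are hypothetical names, and the "kernel" does nothing. It shows that enqueueing returns immediately, and that a free enqueued after a kernel takes effect only once the kernel's turn has passed.

```rust
use std::collections::VecDeque;

/// Hypothetical operations that can be enqueued on the toy stream.
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Alloc(usize),
    Kernel(&'static str),
    Free(usize),
}

/// Toy stream: operations run strictly in the order they were enqueued.
pub struct ToyStream {
    queue: VecDeque<Op>,
}

impl ToyStream {
    pub fn new() -> Self {
        ToyStream { queue: VecDeque::new() }
    }

    /// Record the operation and return immediately (the "async" part).
    pub fn enqueue(&mut self, op: Op) {
        self.queue.push_back(op);
    }

    /// Run all queued operations in FIFO order and return the number of
    /// bytes in use after each one.
    pub fn synchronize(&mut self) -> Vec<usize> {
        let mut in_use = 0usize;
        let mut trace = Vec::new();
        while let Some(op) = self.queue.pop_front() {
            match op {
                Op::Alloc(n) => in_use += n,
                Op::Free(n) => in_use = in_use.saturating_sub(n),
                Op::Kernel(_) => {} // work that reads or writes the allocation
            }
            trace.push(in_use);
        }
        trace
    }
}
```

Enqueueing `Alloc(256)`, `Kernel("fill")`, `Free(256)` and then synchronizing yields a trace in which the bytes are still in use while the kernel runs and are returned only afterwards. This is why allocating, using, and freeing on the same stream needs no extra host-side synchronization.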
pub unsafe fn free_async_raw(&self, ptr: u64, stream: CUstream) -> Result<()>
Free memory back to the pool asynchronously (raw stream handle variant).
This is the unsafe variant for use when you have a raw CUstream handle.
The memory is returned to the pool’s reservoir (not the OS) and can be reused by subsequent allocations. The free is stream-ordered.
§Arguments
ptr - Device pointer previously allocated from this pool
stream - Raw CUDA stream handle for async ordering
§Safety
The caller must ensure that:
ptr is a valid device pointer previously allocated from this pool
stream is a valid CUDA stream handle