Device and host memory management.
Implements the CUDA Runtime memory API:
- cudaMalloc / cudaFree
- cudaMallocHost / cudaFreeHost (pinned host memory)
- cudaMallocManaged (unified memory)
- cudaMallocPitch (pitched 2-D allocation)
- cudaMemcpy / cudaMemcpyAsync
- cudaMemset / cudaMemsetAsync
- cudaMemGetInfo
All memory addresses returned for device allocations are represented as
DevicePtr, a newtype around u64 that matches the driver API’s
CUdeviceptr.
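To illustrate the newtype pattern described above, here is a minimal sketch of a u64 wrapper. It is a hypothetical stand-in, not the crate's definition: the derives, the NULL constant, is_null, and checked_byte_add are assumptions made for illustration, since this page does not show the methods of the real DevicePtr.

```rust
/// Hypothetical stand-in for the crate's `DevicePtr`.
/// `repr(transparent)` gives it the same layout as a bare u64, which is
/// what lets a wrapper like this be passed where a 64-bit `CUdeviceptr`
/// is expected.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct DevicePtr(pub u64);

impl DevicePtr {
    /// The null device address (assumed helper, not from the source).
    pub const NULL: DevicePtr = DevicePtr(0);

    /// True if this is the null device address.
    pub fn is_null(self) -> bool {
        self.0 == 0
    }

    /// Address `bytes` past this one, or `None` if the u64 would overflow.
    pub fn checked_byte_add(self, bytes: u64) -> Option<DevicePtr> {
        self.0.checked_add(bytes).map(DevicePtr)
    }
}
```

Wrapping the address in its own type keeps device addresses from being mixed up with host pointers or plain integers, while the transparent layout keeps the wrapper free at the FFI boundary.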
Structs§
- DevicePtr: Opaque CUDA device-memory address (mirrors CUdeviceptr).
Enums§
- MemAttachFlags: Flags for cudaMallocManaged.
- MemcpyKind: Direction of a cudaMemcpy transfer.
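As a rough sketch of how these two enums could map onto the CUDA Runtime constants, the block below mirrors cudaMemcpyKind and the two attach flags that cudaMallocManaged accepts. The numeric values follow the CUDA Runtime headers; the variant names and the repr choices are assumptions, since this page does not list the crate's actual variants.

```rust
/// Hypothetical mirror of `cudaMemcpyKind`. Variant names are assumed;
/// the discriminants are the values defined by the CUDA Runtime.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MemcpyKind {
    HostToHost = 0,
    HostToDevice = 1,
    DeviceToHost = 2,
    DeviceToDevice = 3,
    /// Direction inferred from the pointer values (needs unified addressing).
    Default = 4,
}

/// Hypothetical mirror of the flags accepted by `cudaMallocManaged`
/// (`cudaMemAttachGlobal` and `cudaMemAttachHost`).
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MemAttachFlags {
    /// Memory is accessible from any stream on any device.
    Global = 0x01,
    /// Memory is initially accessible from the host only.
    Host = 0x02,
}
```

Fixing the discriminants to the C values means a variant can be cast with `as` when calling into the runtime, with no lookup table in between.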
Functions§
- free: Free device memory previously allocated with malloc.
- free_host ⚠: Free page-locked host memory previously allocated with malloc_host.
- malloc: Allocate size bytes of device memory.
- malloc_host: Allocate size bytes of pinned (page-locked) host memory.
- malloc_managed: Allocate unified managed memory accessible from both CPU and GPU.
- malloc_pitch: Allocate pitched device memory for 2-D arrays.
- mem_get_info: Returns (free_bytes, total_bytes) for the current device’s global memory.
- memcpy ⚠: Synchronously copy count bytes between memory regions.
- memcpy_async ⚠: Asynchronously copy count bytes on stream.
- memcpy_d2d: Copy between two device allocations.
- memcpy_d2h: Copy device memory to a host slice.
- memcpy_h2d: Copy a slice of host data to a device allocation.
- memset: Set count bytes of device memory starting at ptr to value.
- memset32: Set device memory to a 32-bit value pattern.
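Pitched allocations pad each row to an aligned width, so element addresses have to be computed from the pitch rather than from the logical row width. The sketch below shows that host-side arithmetic, following the addressing rule documented for cudaMallocPitch (row start = base + row * pitch). The helper names pitched_offset and pitched_size are hypothetical and not part of this module; the return shape of malloc_pitch is not shown on this page, so the pitch is simply taken as an input.

```rust
/// Byte offset of element (row, col) inside a pitched allocation.
/// `pitch` is the row stride in bytes reported by the allocator,
/// `elem_size` is the size of one element in bytes.
/// Returns `None` if the arithmetic would overflow `usize`.
pub fn pitched_offset(pitch: usize, row: usize, col: usize, elem_size: usize) -> Option<usize> {
    let row_start = row.checked_mul(pitch)?; // start of the row
    let in_row = col.checked_mul(elem_size)?; // position within the row
    row_start.checked_add(in_row)
}

/// Total bytes occupied by `height` rows of a pitched allocation.
/// Returns `None` on overflow.
pub fn pitched_size(pitch: usize, height: usize) -> Option<usize> {
    pitch.checked_mul(height)
}
```

For example, with a 512-byte pitch and 4-byte elements, element (2, 3) sits 2 * 512 + 3 * 4 = 1036 bytes past the base address, even if the logical row is much narrower than 512 bytes.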