Module memory


Device and host memory management.

Implements the CUDA Runtime memory API:

  • cudaMalloc / cudaFree
  • cudaMallocHost / cudaFreeHost (pinned host memory)
  • cudaMallocManaged (unified memory)
  • cudaMallocPitch (pitched 2-D allocation)
  • cudaMemcpy / cudaMemcpyAsync
  • cudaMemset / cudaMemsetAsync
  • cudaMemGetInfo

All memory addresses returned for device allocations are represented as DevicePtr, a newtype around u64 that matches the driver API’s CUdeviceptr.
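As an illustration of the newtype pattern described above, here is a minimal sketch of what such a wrapper could look like. It is a stand-in written for this example, not the module's actual definition: the derives, the `NULL` constant, and the `is_null` and `checked_byte_add` helpers are all assumptions.

```rust
/// Hypothetical stand-in for the module's `DevicePtr`; the real type may
/// expose different fields, derives, and methods.
#[repr(transparent)] // same layout as the wrapped u64, like CUdeviceptr on 64-bit targets
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct DevicePtr(pub u64);

impl DevicePtr {
    /// The null device address (assumed helper, not confirmed by the docs).
    pub const NULL: DevicePtr = DevicePtr(0);

    /// Returns true if this is the null device address.
    pub fn is_null(self) -> bool {
        self.0 == 0
    }

    /// Returns the address `bytes` past `self`, or `None` if the
    /// addition would overflow the 64-bit address.
    pub fn checked_byte_add(self, bytes: u64) -> Option<DevicePtr> {
        self.0.checked_add(bytes).map(DevicePtr)
    }
}
```

Wrapping the raw integer in its own type keeps device addresses from being mixed up with host pointers or plain byte counts at compile time, while `#[repr(transparent)]` keeps the value layout-compatible with a bare `u64`.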

Structs§

DevicePtr
Opaque CUDA device-memory address (mirrors CUdeviceptr).

Enums§

MemAttachFlags
Flags for cudaMallocManaged.
MemcpyKind
Direction of a cudaMemcpy transfer.
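For orientation, the two enums presumably correspond to the CUDA Runtime's `cudaMemcpyKind` enumeration and `cudaMemAttach*` flags. The sketch below mirrors the numeric values from the CUDA headers; the variant names and the `#[repr(u32)]` choice are assumptions and may not match this module's definitions.

```rust
/// Hypothetical mirror of `cudaMemcpyKind`; variant names are assumed.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MemcpyKind {
    HostToHost = 0,     // cudaMemcpyHostToHost
    HostToDevice = 1,   // cudaMemcpyHostToDevice
    DeviceToHost = 2,   // cudaMemcpyDeviceToHost
    DeviceToDevice = 3, // cudaMemcpyDeviceToDevice
    Default = 4,        // cudaMemcpyDefault: direction inferred from the pointers
}

/// Hypothetical mirror of the flags accepted by `cudaMallocManaged`;
/// variant names are assumed.
#[repr(u32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MemAttachFlags {
    Global = 0x01, // cudaMemAttachGlobal: accessible from any stream on any device
    Host = 0x02,   // cudaMemAttachHost: not accessible from device streams until attached
}
```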

Functions§

free
Free device memory previously allocated with malloc.
free_host
Free page-locked host memory previously allocated with malloc_host.
malloc
Allocate size bytes of device memory.
malloc_host
Allocate size bytes of pinned (page-locked) host memory.
malloc_managed
Allocate unified managed memory accessible from both CPU and GPU.
malloc_pitch
Allocate pitched device memory for 2-D arrays.
mem_get_info
Return (free_bytes, total_bytes) for the current device’s global memory.
memcpy
Synchronously copy count bytes between memory regions.
memcpy_async
Asynchronously copy count bytes on stream.
memcpy_d2d
Copy between two device allocations.
memcpy_d2h
Copy device memory to a host slice.
memcpy_h2d
Copy a slice of host data to a device allocation.
memset
Set count bytes of device memory starting at ptr to value.
memset32
Fill device memory with a repeated 32-bit value.
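
A note on malloc_pitch from the list above: a pitched allocation pads each row to a driver-chosen pitch, so element addresses must be computed from the pitch rather than from the logical row width. The helper below is a self-contained sketch of that arithmetic, following the addressing rule in the CUDA documentation for `cudaMallocPitch`. `pitched_offset` is a hypothetical name and is not part of this module.

```rust
/// Byte offset of element (`row`, `col`) within a pitched 2-D allocation.
///
/// `pitch` is the row stride in bytes as reported by the allocator and
/// `elem_size` is the size of one element in bytes. Returns `None` if the
/// arithmetic would overflow `usize`.
///
/// Hypothetical helper for illustration; not part of the documented module.
pub fn pitched_offset(pitch: usize, row: usize, col: usize, elem_size: usize) -> Option<usize> {
    // Rows start every `pitch` bytes, which may exceed width * elem_size.
    let row_start = row.checked_mul(pitch)?;
    // Within a row, elements are tightly packed.
    let within_row = col.checked_mul(elem_size)?;
    row_start.checked_add(within_row)
}
```

For example, with a made-up pitch of 512 bytes and 4-byte elements, `pitched_offset(512, 2, 3, 4)` gives `Some(1036)`, that is 2 × 512 + 3 × 4. Adding that offset to the base address of the allocation yields the element's device address.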