Module tensor_map

Hopper Tensor Memory Accelerator (TMA) descriptors.

CUDA 12.0 introduced cuTensorMapEncodeTiled and cuTensorMapEncodeIm2col, which produce CUtensorMap descriptors. TMA instructions in kernels consume these descriptors to asynchronously move multi-dimensional tiles between global and shared memory. Executing TMA instructions requires Hopper or newer hardware (SM 9.0+), but the descriptor encoding itself is pure host code and works on any device.

See the TensorMap builder for a typed wrapper around cuTensorMapEncodeTiled.
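Because the encoding is host-only, its inputs can be prepared and validated without a Hopper GPU. The sketch below computes the byte strides that cuTensorMapEncodeTiled expects for a densely packed tensor. It is illustrative only: `packed_global_strides` and `StrideError` are hypothetical names, not part of this module. The checks assume the constraints described in the CUDA driver documentation: rank between 1 and 5, dimensions listed innermost first, and one stride for each dimension except the outermost, each a multiple of 16 bytes and below 2^40.

```rust
/// Hypothetical error type for this sketch; not part of this module's API.
#[derive(Debug, PartialEq)]
pub enum StrideError {
    /// Rank must be between 1 and 5 (assumed driver limit).
    BadRank(usize),
    /// Every dimension must be non-zero.
    ZeroDim(usize),
    /// Each stride must be a multiple of 16 bytes and below 2^40.
    BadStride { dim: usize, stride_bytes: u64 },
}

/// Byte strides for a densely packed tensor whose dimensions are listed
/// innermost first. Returns `rank - 1` strides: entry `i` is the distance
/// in bytes between consecutive indices of dimension `i + 1`.
pub fn packed_global_strides(
    dims_innermost_first: &[u64],
    elem_size_bytes: u64,
) -> Result<Vec<u64>, StrideError> {
    let rank = dims_innermost_first.len();
    if rank == 0 || rank > 5 {
        return Err(StrideError::BadRank(rank));
    }

    let mut strides = Vec::with_capacity(rank - 1);
    let mut stride = elem_size_bytes;
    for (i, &dim) in dims_innermost_first.iter().enumerate() {
        if dim == 0 {
            return Err(StrideError::ZeroDim(i));
        }
        if i + 1 == rank {
            break; // the outermost dimension has no stride entry
        }
        // Treat arithmetic overflow as an out-of-range stride.
        stride = stride.checked_mul(dim).unwrap_or(u64::MAX);
        if stride % 16 != 0 || stride >= (1u64 << 40) {
            return Err(StrideError::BadStride { dim: i, stride_bytes: stride });
        }
        strides.push(stride);
    }
    Ok(strides)
}
```

For a 64 x 32 x 8 tensor of 2-byte elements (innermost dimension first), this yields strides of 128 and 4096 bytes. A 3 x 4 tensor of 4-byte elements is rejected because its 12-byte row stride is not a multiple of 16; such a tensor would need padded rows.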

Modules

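DataType
CUtensorMapDataType: element type encoding for TMA descriptors.
Interleave
CUtensorMapInterleave: interleaved layout mode of the tensor in global memory.
L2Promotion
CUtensorMapL2promotion: L2 promotion size, used as a cache prefetch hint.
OOBFill
CUtensorMapFloatOOBfill: fill behavior for out-of-bounds elements.
Swizzle
CUtensorMapSwizzle: swizzle pattern applied to the tile in shared memory.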

Structs

TensorMap
A 128-byte Hopper TMA descriptor. Pass it to a kernel as a const __grid_constant__ parameter of type CUtensorMap for use with TMA instructions.
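The 128-byte size matches the CUDA driver's opaque CUtensorMap, which cuda.h declares as sixteen 64-bit words with 64-byte alignment. A minimal sketch of an FFI-compatible mirror follows. `RawTensorMap` is a hypothetical name used only for illustration, and the actual definition of TensorMap in this crate may differ.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical mirror of the driver's opaque `CUtensorMap`
/// (16 x u64, 64-byte aligned). Illustrative only.
#[repr(C, align(64))]
#[derive(Clone, Copy)]
pub struct RawTensorMap {
    pub opaque: [u64; 16],
}

// Compile-time layout checks: a mismatch fails the build instead of
// silently corrupting the descriptor when it crosses the FFI boundary.
const _: () = assert!(size_of::<RawTensorMap>() == 128);
const _: () = assert!(align_of::<RawTensorMap>() == 64);
```

Checking the layout at compile time is a cheap safeguard, since the descriptor is copied byte for byte into the kernel's parameter space.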