pub struct TmaEncodeTiledParams {
    pub data_type: CuTensorMapDataType,
    pub num_dims: u32,
    pub global_dims: [u64; 5],
    pub global_strides: [u64; 4],
    pub box_dims: [u32; 5],
    pub element_strides: [u32; 5],
    pub interleave: CuTensorMapInterleave,
    pub swizzle: CuTensorMapSwizzle,
    pub l2_promotion: CuTensorMapL2Promotion,
    pub oob_fill: CuTensorMapFloatOobFill,
}
All parameters required by cuTensorMapEncodeTiled.
Produced by TmaDescriptorBuilder. Pass these to the driver function pointer available in DriverApi::cu_tensor_map_encode_tiled.
§Dimension ordering
CUDA TMA uses column-major ordering for dimensions: global_dims[0]
is the innermost (fastest-varying) dimension. For a row-major matrix
of shape R × C, set global_dims[0] = C (cols) and
global_dims[1] = R (rows).
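The reordering described above can be sketched as a small helper. This is an illustration only, not part of this crate: row_major_to_tma_dims is a hypothetical name, the tensor is assumed to be densely packed, and padding unused dimension slots with 1 and unused stride slots with 0 is an assumption rather than documented behavior of TmaDescriptorBuilder.

```rust
/// Hypothetical helper (not part of this crate): converts a row-major shape,
/// listed outermost-first, into the innermost-first layout used by
/// `global_dims` and `global_strides`.
///
/// Assumes a densely packed tensor with 1 to 5 dimensions.
fn row_major_to_tma_dims(shape: &[u64], elem_bytes: u64) -> ([u64; 5], [u64; 4], u32) {
    assert!(
        (1..=5).contains(&shape.len()),
        "TMA tensors have 1 to 5 dimensions"
    );

    // Assumption: unused dimension slots are padded with 1 and unused
    // stride slots with 0. The real builder may choose differently.
    let mut global_dims = [1u64; 5];
    let mut global_strides = [0u64; 4];

    // The last row-major axis varies fastest, so it becomes dimension 0.
    for (i, &extent) in shape.iter().rev().enumerate() {
        global_dims[i] = extent;
    }

    // The innermost stride is omitted, so global_strides[i] is the byte
    // stride of dimension i + 1: the element size times the product of
    // all extents from dimension 0 through dimension i.
    let mut stride = elem_bytes;
    for i in 0..shape.len() - 1 {
        stride *= global_dims[i];
        global_strides[i] = stride;
    }

    (global_dims, global_strides, shape.len() as u32)
}
```

For a row-major 128 × 256 matrix of 4-byte elements, row_major_to_tma_dims(&[128, 256], 4) gives global_dims[0] = 256, global_dims[1] = 128, and global_strides[0] = 1024 bytes (one full row).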
Fields§
data_type: CuTensorMapDataType
Element data type.
num_dims: u32
Number of tensor dimensions (1–5).
global_dims: [u64; 5]
Size of each global tensor dimension (innermost first).
global_strides: [u64; 4]
Byte stride between elements in outer dimensions (innermost stride omitted).
box_dims: [u32; 5]
Size of each tile dimension (must fit in shared memory).
element_strides: [u32; 5]
Element stride within each tile dimension (typically all-ones).
interleave: CuTensorMapInterleave
Interleave mode.
swizzle: CuTensorMapSwizzle
Swizzle mode.
l2_promotion: CuTensorMapL2Promotion
L2 promotion hint.
oob_fill: CuTensorMapFloatOobFill
Out-of-bounds fill mode for float elements.
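The requirement that box_dims fit in shared memory can be checked before encoding. The sketch below is hypothetical and not part of this crate: tile_bytes and its smem_budget parameter are names invented for illustration, and the per-dimension limit of 256 reflects my reading of the CUDA driver documentation for cuTensorMapEncodeTiled, so verify it against the driver version you target.

```rust
/// Hypothetical check (not part of this crate): returns the number of bytes
/// one tile occupies, or an error describing which limit was exceeded.
///
/// `smem_budget` is the caller's shared-memory budget in bytes.
fn tile_bytes(
    box_dims: &[u32; 5],
    num_dims: u32,
    elem_bytes: u64,
    smem_budget: u64,
) -> Result<u64, String> {
    if !(1..=5).contains(&num_dims) {
        return Err(format!("num_dims must be 1..=5, got {num_dims}"));
    }

    let mut bytes = elem_bytes;
    for (i, &extent) in box_dims.iter().take(num_dims as usize).enumerate() {
        // Assumption: each box dimension must be non-zero and at most 256.
        if extent == 0 || extent > 256 {
            return Err(format!("box_dims[{i}] = {extent} is outside 1..=256"));
        }
        bytes *= u64::from(extent);
    }

    if bytes > smem_budget {
        return Err(format!(
            "tile needs {bytes} bytes but only {smem_budget} are budgeted"
        ));
    }
    Ok(bytes)
}
```

With a 48 KiB budget, a 64 × 32 tile of 4-byte elements passes the check at 8192 bytes, while a 256 × 64 tile of the same element type is rejected because it needs 65536 bytes.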
Trait Implementations§
impl Clone for TmaEncodeTiledParams
fn clone(&self) -> TmaEncodeTiledParams
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.