pub fn tma_store_4d<E: CubePrimitive>(
src: &Slice<Line<E>>,
dst: &mut TensorMap<E>,
w: i32,
z: i32,
y: i32,
x: i32,
)
Copies a tile from the shared-memory `src` to the global-memory `dst` at the
provided offsets (`w`, `z`, `y`, `x`). This call should be combined with
[memcpy_async_tensor_commit] and [memcpy_async_tensor_wait_read].
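
To make the role of the four offsets concrete, here is a minimal CPU-side sketch of what a 4D tile store does. It is an illustrative model only, not the library's implementation and not GPU code. The function name `store_tile_4d`, the row-major layout, the `[w, z, y, x]` outermost-to-innermost ordering, the `f32` element type, and the rule that out-of-range elements are skipped are all assumptions made for this example.

```rust
/// Hypothetical CPU model of a 4D tile store (assumed semantics, see above).
///
/// `src` is a row-major tile of shape `tile_shape`; `dst` is a row-major
/// tensor of shape `dst_shape`. `offset` is `[w, z, y, x]`, the position in
/// `dst` where the first element of the tile lands.
fn store_tile_4d(
    src: &[f32],
    tile_shape: [usize; 4],
    dst: &mut [f32],
    dst_shape: [usize; 4],
    offset: [i32; 4],
) {
    let [tw, tz, ty, tx] = tile_shape;
    for iw in 0..tw {
        for iz in 0..tz {
            for iy in 0..ty {
                for ix in 0..tx {
                    let tile_idx = [iw, iz, iy, ix];
                    let mut dst_index = 0usize;
                    let mut in_bounds = true;
                    for d in 0..4 {
                        // Destination coordinate = offset + position in tile.
                        let c = offset[d] as i64 + tile_idx[d] as i64;
                        if c < 0 || c >= dst_shape[d] as i64 {
                            // Assumption: elements outside `dst` are skipped.
                            in_bounds = false;
                            break;
                        }
                        // Accumulate the row-major linear index.
                        dst_index = dst_index * dst_shape[d] + c as usize;
                    }
                    if in_bounds {
                        let src_index = ((iw * tz + iz) * ty + iy) * tx + ix;
                        dst[dst_index] = src[src_index];
                    }
                }
            }
        }
    }
}
```

In this model, storing a `1 x 1 x 2 x 2` tile into a `1 x 1 x 4 x 4` tensor with offsets `w = 0, z = 0, y = 1, x = 2` writes rows 1 and 2, columns 2 and 3 of the destination and leaves every other element unchanged.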