pub struct Swizzle { /* private fields */ }
Swizzling strategy for a buffer. See the following docs from cutlass:
    0bxxxxxxxxxxxxxxxYYYxxxxxxxZZZxxxx
                                  ^--^  MBase is the number of least-sig bits to keep constant
                     ^-^       ^-^      BBits is the number of bits in the mask
                       ^---------^      SShift is the distance to shift the YYY mask (pos shifts YYY to the right, neg shifts YYY to the left)
§Example
Given:
    0bxxxxxxxxxxxxxxxxYYxxxxxxxxxZZxxx
the result is:
    0bxxxxxxxxxxxxxxxxYYxxxxxxxxxAAxxx
where AA = ZZ xor YY
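The bit manipulation described above can be sketched in plain Rust. This is a minimal stand-alone illustration of the cutlass-style pattern, not this crate's implementation: `swizzle_bits` is a hypothetical helper name, and it works on a raw integer offset rather than on a `Swizzle` value.

```rust
/// Hypothetical helper (not part of `Swizzle`): XORs the `bits`-wide YYY field
/// into the ZZZ field, keeping the lowest `base` bits constant.
fn swizzle_bits(offset: u32, bits: u32, base: u32, shift: i32) -> u32 {
    let bit_mask = (1u32 << bits) - 1;
    // For a positive shift, YYY sits `shift` bits above ZZZ (which starts at
    // bit `base`). For a negative shift, YYY starts at `base` and ZZZ sits
    // `|shift|` bits above it.
    let yyy_mask = bit_mask << (base + shift.max(0) as u32);
    let selected = offset & yyy_mask;
    // Move the selected YYY bits onto the ZZZ position.
    let moved = if shift >= 0 {
        selected >> shift as u32
    } else {
        selected << shift.unsigned_abs()
    };
    offset ^ moved
}
```

Reading the example layout as two mask bits, three constant low bits, and YY sitting eleven bits above ZZ, `swizzle_bits(0x8008, 2, 3, 11)` gives `0x8018`: YY = 10 and ZZ = 01, so AA = 11. Because YYY itself is never modified, applying the function twice returns the original offset.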
Some newer features, as well as cutlass in places, use a different terminology of span and
atom. For shared memory swizzle specifically, the parameters map as follows:
- bits = log2(span / atom), or the number of atoms within one span, converted to address bits
- base = log2(atom), the size of the atom, converted to address bits
- shift = log2(all_banks_bytes / atom), or the total number of atoms in all 32 shared memory banks, converted to address bits
For example:
- 32-byte span with a 16-byte atom = [1, 4, 3]
- 128-byte span with a 32-byte atom = [2, 5, 2]
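The mapping from span and atom to swizzle parameters can be written out as a small function. This is only a sketch: `params_from_span_atom` and `ALL_BANKS_BYTES` are hypothetical names, and the bank width of 4 bytes (giving 128 bytes across the 32 banks) is an assumption that the text above does not state.

```rust
/// Total bytes across all shared memory banks. Assumption: 32 banks of
/// 4 bytes each; the 4-byte width is not stated in the docs above.
const ALL_BANKS_BYTES: u32 = 32 * 4;

/// Hypothetical helper: maps a span/atom pair (both in bytes) to
/// `[bits, base, shift]`.
fn params_from_span_atom(span: u32, atom: u32) -> [i32; 3] {
    assert!(span.is_power_of_two() && atom.is_power_of_two());
    assert!(atom <= span && atom <= ALL_BANKS_BYTES);
    // For a power of two, trailing_zeros() equals log2.
    let bits = (span / atom).trailing_zeros() as i32;
    let base = atom.trailing_zeros() as i32;
    let shift = (ALL_BANKS_BYTES / atom).trailing_zeros() as i32;
    [bits, base, shift]
}
```

Under these assumptions, `params_from_span_atom(32, 16)` returns `[1, 4, 3]`, matching the first example above.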
Implementations§
impl Swizzle
pub fn new(bits: u32, base: u32, shift: i32) -> Self
Create a new swizzle with comptime parameters
pub fn apply(&self, offset: u32, type_size: u32) -> u32
Apply the swizzle to a coordinate with a given item size. type_size is the size of the full type,
including the line size; use the type_size helper for lines.
offset should be given in lines from the start of the buffer, and the buffer should be
aligned to repeats_after. This works around the fact that retrieving the actual address of an
offset is not currently supported.
If you're using absolute/unlined indices, pass E::type_size() instead of the full line size.
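One way to picture the role of `type_size` is the sketch below. It is an assumption about the behaviour, not the crate's code: it scales the offset to bytes, swizzles the byte offset, and scales back. `swizzle_bytes` and `apply_sketch` are hypothetical names, and only positive shifts are handled.

```rust
/// Hypothetical stand-in for the bit pattern (positive `shift` only).
fn swizzle_bytes(byte_offset: u32, bits: u32, base: u32, shift: u32) -> u32 {
    let yyy_mask = ((1u32 << bits) - 1) << (base + shift);
    byte_offset ^ ((byte_offset & yyy_mask) >> shift)
}

/// Assumed behaviour, not the crate's implementation: convert the offset to
/// bytes, swizzle it, then convert back to units of `type_size`.
fn apply_sketch(offset: u32, type_size: u32, bits: u32, base: u32, shift: u32) -> u32 {
    // Assumes `type_size` is a power of two no larger than the atom
    // (1 << base), so the swizzle never touches bits inside one item and the
    // division below is exact.
    swizzle_bytes(offset * type_size, bits, base, shift) / type_size
}
```

With parameters `[3, 4, 3]`, a 16-byte line at offset 8 lands at line 9 (`apply_sketch(8, 16, 3, 4, 3)`), while the same byte position addressed in unlined 4-byte elements, offset 32, lands at element 36 (`apply_sketch(32, 4, 3, 4, 3)`). Both correspond to byte 128 moving to byte 144.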
pub fn repeats_after(&self) -> u32
After how many elements this pattern repeats. Can be used to align the buffer (i.e. smem) so offsets match addresses.
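The reason alignment matters can be shown on the raw bit pattern. The sketch below uses hypothetical helpers (`swizzle_bytes`, `pattern_period_bytes`) and works in bytes with a positive shift; it does not claim that `repeats_after` returns this exact value or uses these units. The XOR term only depends on address bits below `bits + base + shift`, so swizzling an offset and swizzling the full address agree whenever the buffer starts on a multiple of that period.

```rust
/// Hypothetical stand-in for the bit pattern (positive `shift` only).
fn swizzle_bytes(byte_offset: u32, bits: u32, base: u32, shift: u32) -> u32 {
    let yyy_mask = ((1u32 << bits) - 1) << (base + shift);
    byte_offset ^ ((byte_offset & yyy_mask) >> shift)
}

/// Period of the XOR pattern in bytes: the YYY field ends just below bit
/// `bits + base + shift`, so higher address bits never affect the result.
fn pattern_period_bytes(bits: u32, base: u32, shift: u32) -> u32 {
    1u32 << (bits + base + shift)
}

/// True if swizzling the offset gives the same location as swizzling the
/// full address `buffer_start + offset`.
fn offset_matches_address(buffer_start: u32, offset: u32, bits: u32, base: u32, shift: u32) -> bool {
    swizzle_bytes(buffer_start + offset, bits, base, shift)
        == buffer_start + swizzle_bytes(offset, bits, base, shift)
}
```

For parameters `[1, 4, 3]` the period is 256 bytes. A buffer starting at address `0x100` satisfies `offset_matches_address(0x100, 0x80, 1, 4, 3)`, while a buffer starting at `0x80` does not even for offset 0, since swizzling the address `0x80` yields `0x90`.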