pub struct Swizzle { /* private fields */ }
Swizzling strategy for a buffer. See the following docs from cutlass:
0bxxxxxxxxxxxxxxxYYYxxxxxxxZZZxxxx
                              ^--^ MBase is the number of least-sig bits to keep constant
                 ^-^       ^-^     BBits is the number of bits in the mask
                   ^---------^     SShift is the distance to shift the YYY mask
                                   (pos shifts YYY to the right, neg shifts YYY to the left)
§Example
Given:
0bxxxxxxxxxxxxxxxxYYxxxxxxxxxZZxxx
the result is:
0bxxxxxxxxxxxxxxxxYYxxxxxxxxxAAxxx where AA = ZZ xor YY
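The bit manipulation described above can be sketched in plain Rust. This is only an illustration of the diagram, not the crate's implementation: `swizzle_bits` is a hypothetical free function, and it assumes the YYY and ZZZ fields do not overlap.

```rust
/// Hypothetical standalone helper mirroring the bit diagram;
/// it is not the crate's `Swizzle::apply`.
fn swizzle_bits(offset: u32, bits: u32, base: u32, shift: i32) -> u32 {
    // `bits` consecutive ones, e.g. 0b11 for bits = 2.
    let bit_mask = (1u32 << bits) - 1;
    // For a positive shift, ZZZ starts at `base` and YYY sits `shift` bits above it.
    // For a negative shift, YYY starts at `base` and ZZZ sits above it.
    let yyy_mask = bit_mask << (base + shift.max(0) as u32);
    let yyy = offset & yyy_mask;
    // Move YYY onto the ZZZ position, then xor it in: AA = ZZ xor YY.
    let aligned = if shift >= 0 {
        yyy >> shift as u32
    } else {
        yyy << shift.unsigned_abs()
    };
    offset ^ aligned
}
```

Reading the example layout as ZZ at bits 3-4 and YY at bits 14-15 (so bits = 2, base = 3, shift = 11), an input with YY = 01 and ZZ = 10 comes out with AA = 11 and YY unchanged.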
Some newer features, as well as cutlass in places, use a different terminology of span and
atom. For shared memory swizzle specifically, the parameters map as follows:
- bits = log2(span / atom), or the number of atoms within one span, converted to address bits
- base = log2(atom), the size of the atom, converted to address bits
- shift = log2(all_banks_bytes / atom), or the total number of atoms in all 32 shared memory banks, converted to address bits
For example:
- 32-byte span with a 16-byte atom = [1, 4, 3]
- 128-byte span with a 32-byte atom = [2, 5, 2]
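The mapping from span and atom sizes to swizzle parameters can be worked through with a small helper. This is a sketch under stated assumptions: `swizzle_params` is a hypothetical function, not part of the crate, and `all_banks_bytes` is passed in explicitly (128 here, assuming 32 banks of 4 bytes each).

```rust
/// Hypothetical helper: derive [bits, base, shift] from sizes given in bytes.
/// All sizes are assumed to be powers of two.
fn swizzle_params(span_bytes: u32, atom_bytes: u32, all_banks_bytes: u32) -> [u32; 3] {
    assert!(span_bytes.is_power_of_two());
    assert!(atom_bytes.is_power_of_two());
    assert!(all_banks_bytes.is_power_of_two());
    assert!(atom_bytes <= span_bytes && atom_bytes <= all_banks_bytes);

    // For a power of two, trailing_zeros() equals log2.
    let bits = (span_bytes / atom_bytes).trailing_zeros(); // atoms per span
    let base = atom_bytes.trailing_zeros(); // atom size
    let shift = (all_banks_bytes / atom_bytes).trailing_zeros(); // atoms across all banks
    [bits, base, shift]
}
```

With `all_banks_bytes = 128`, a 32-byte span with a 16-byte atom gives `[1, 4, 3]`, and a 128-byte span with a 16-byte atom gives `[3, 4, 3]`.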
Implementations§
impl Swizzle
pub fn new(bits: u32, base: u32, shift: i32) -> Self
Create a new swizzle with comptime parameters
pub fn apply(&self, offset: u32, type_size: usize) -> u32
Apply the swizzle to a coordinate with a given item size. type_size is the size of the full type,
including line size. Use the type_size helper for lines.
offset should be in terms of lines from the start of the buffer, and the buffer should be
aligned to repeats_after. This is to work around the fact we don’t currently support
retrieving the actual address of an offset.
If you’re using absolute/unlined indices, pass E::type_size() instead of the full line size.
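One plausible reading of the offset handling described above is: convert the item offset to a byte offset, swizzle the byte offset, then convert back. The sketch below illustrates that reading only. `swizzle_bytes` and `apply_sketch` are hypothetical names, the swizzle parameters are passed explicitly instead of being stored in a `Swizzle`, only positive shifts are handled, and the buffer is assumed to start at an address aligned to the pattern.

```rust
/// Cutlass-style xor swizzle on a byte offset (positive shift only, for brevity).
fn swizzle_bytes(addr: u32, bits: u32, base: u32, shift: u32) -> u32 {
    let yyy_mask = ((1u32 << bits) - 1) << (base + shift);
    addr ^ ((addr & yyy_mask) >> shift)
}

/// Hypothetical stand-in for `apply`: `offset` counts items of `type_size` bytes
/// from the start of the buffer.
fn apply_sketch(offset: u32, type_size: usize, bits: u32, base: u32, shift: u32) -> u32 {
    let type_size = type_size as u32;
    // Item index -> byte offset.
    let addr = offset * type_size;
    let swizzled = swizzle_bytes(addr, bits, base, shift);
    // Byte offset -> item index. Exact when type_size is a power of two no
    // larger than 2^base, because the swizzle leaves the low `base` bits alone.
    swizzled / type_size
}
```

With parameters `[3, 4, 3]` and 16-byte lines, line 8 (byte 128) maps to line 9 (byte 144). With 4-byte unlined items, item 32 (also byte 128) maps to item 36.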
pub fn repeats_after(&self) -> u32
The number of elements after which this pattern repeats. Can be used to align the buffer (e.g. shared memory) so offsets match addresses.
pub fn __expand_new( scope: &mut Scope, bits: u32, base: u32, shift: i32, ) -> <Self as CubeType>::ExpandType
pub fn __expand_none(scope: &mut Scope) -> <Self as CubeType>::ExpandType
pub fn __expand_apply( scope: &mut Scope, this: <Self as CubeType>::ExpandType, offset: <u32 as CubeType>::ExpandType, type_size: usize, ) -> <u32 as CubeType>::ExpandType
pub fn __expand_repeats_after( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ) -> u32
Trait Implementations§
impl CubeType for Swizzle
type ExpandType = SwizzleExpand
fn into_mut(scope: &mut Scope, expand: Self::ExpandType) -> Self::ExpandType
impl LaunchArg for Swizzle
type RuntimeArg<'a, R: Runtime> = SwizzleLaunch<'a, R>
type CompilationArg = SwizzleCompilationArg
fn compilation_arg<'a, R: Runtime>( runtime_arg: &Self::RuntimeArg<'a, R>, ) -> Self::CompilationArg
fn expand( arg: &Self::CompilationArg, builder: &mut KernelBuilder, ) -> <Self as CubeType>::ExpandType
fn expand_output( arg: &Self::CompilationArg, builder: &mut KernelBuilder, ) -> <Self as CubeType>::ExpandType