pub struct Swizzle { /* private fields */ }
Swizzling strategy for a buffer. See the following docs from cutlass:
0bxxxxxxxxxxxxxxxYYYxxxxxxxZZZxxxx
                              ^--^ MBase is the number of least-sig bits to keep constant
                 ^-^       ^-^     BBits is the number of bits in the mask
                   ^---------^     SShift is the distance to shift the YYY mask
                                   (pos shifts YYY to the right, neg shifts YYY to the left)
§Example
Given:
0bxxxxxxxxxxxxxxxxYYxxxxxxxxxZZxxx
the result is:
0bxxxxxxxxxxxxxxxxYYxxxxxxxxxAAxxx where AA = ZZ xor YY
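The bit manipulation described above can be sketched as a free function. This is a minimal stand-alone model of the cutlass-style formula, not the code behind `Swizzle`; the name `swizzle_offset` is hypothetical, and it assumes `bits` is smaller than 32 and that the masks fit in a `u32`.

```rust
/// Hypothetical stand-alone model of the cutlass-style swizzle.
/// `bits`, `base` and `shift` play the roles of BBits, MBase and SShift.
/// This is an illustration, not the implementation used by `Swizzle`.
pub fn swizzle_offset(offset: u32, bits: u32, base: u32, shift: i32) -> u32 {
    // `bits` consecutive ones, e.g. bits = 3 gives 0b111.
    let bit_mask = (1u32 << bits) - 1;
    if shift >= 0 {
        // ZZZ starts at bit `base`; YYY sits `shift` bits above it.
        let yyy_mask = bit_mask << (base + shift as u32);
        // Move YYY down onto ZZZ and xor them: ZZZ ^= YYY.
        offset ^ ((offset & yyy_mask) >> shift as u32)
    } else {
        // Negative shift: YYY starts at bit `base` and ZZZ sits above it.
        let yyy_mask = bit_mask << base;
        // Move YYY up onto ZZZ and xor them.
        offset ^ ((offset & yyy_mask) << shift.unsigned_abs())
    }
}
```

With `bits = 2`, `base = 3` and `shift = 11` (the layout in the example), an offset with only the low `Y` bit set comes back with the low `Z` bit set as well, and applying the function a second time restores the original offset.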
Some newer features, as well as cutlass in places, use a different terminology of span and
atom. For shared memory swizzle specifically, the parameters map as follows:
- bits=log2(span / atom), or the number of atoms within one span, converted to address bits
- base=log2(atom), the size of the atom, converted to address bits
- shift=log2(all_banks_bytes / atom), or the total number of atoms in all 32 shared memory banks, converted to address bits
For example:
- 32-byte span with a 16-byte atom = [1, 4, 3]
- 128-byte span with a 32-byte atom = [3, 5, 2]
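The three formulas above translate directly into integer arithmetic. The helper below is a hypothetical sketch (`swizzle_params` is not part of this crate); it assumes all sizes are in bytes, are powers of two, and that `atom <= span`. The value of `all_banks_bytes` is left to the caller; 128 (32 banks of 4 bytes each) is an assumption that matches the first example above.

```rust
/// Hypothetical helper: derives `(bits, base, shift)` from span and atom sizes in bytes.
/// Assumes every argument is a power of two and `atom <= span`.
pub fn swizzle_params(span: u32, atom: u32, all_banks_bytes: u32) -> (u32, u32, i32) {
    // Number of atoms in one span, as a bit count.
    let bits = (span / atom).ilog2();
    // Size of one atom, as a bit count.
    let base = atom.ilog2();
    // Number of atoms across all shared memory banks, as a bit count.
    let shift = (all_banks_bytes / atom).ilog2() as i32;
    (bits, base, shift)
}
```

For instance, `swizzle_params(32, 16, 128)` evaluates the formulas for a 32-byte span with a 16-byte atom, and the resulting tuple can be passed on to `Swizzle::new(bits, base, shift)`.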
Implementations§
impl Swizzle
pub fn new(bits: u32, base: u32, shift: i32) -> Self
Create a new swizzle from the comptime parameters bits, base and shift.
pub fn apply(&self, offset: u32, type_size: u32) -> u32
Apply the swizzle to a coordinate with a given item size. type_size is the size of the full type,
including the line size; use the type_size helper for lines.
offset should be expressed in lines from the start of the buffer, and the buffer should be
aligned to repeats_after. This works around the fact that we don’t currently support
retrieving the actual address of an offset.
If you’re using absolute/unlined indices, pass E::type_size() instead of the full line size.
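One plausible reading of the paragraphs above is that the line offset is scaled to an address-like byte offset, swizzled, and scaled back. The sketch below models that reading only; it is an assumption about the behaviour, not the crate's implementation. The function name is hypothetical, `type_size` is assumed to be in bytes and a power of two no larger than `1 << base`, and `shift` is assumed non-negative.

```rust
/// Hypothetical model of applying a swizzle to an offset counted in lines.
/// Not the crate's code: it assumes `type_size` is in bytes, is a power of two,
/// and is at most `1 << base`, and that the buffer starts at an aligned address.
pub fn apply_to_line_offset(
    offset: u32,
    type_size: u32,
    bits: u32,
    base: u32,
    shift: u32,
) -> u32 {
    // Convert the line offset into a byte offset from the start of the buffer.
    let byte_offset = offset * type_size;
    // Select the YYY bits and xor them onto the ZZZ bits.
    let yyy_mask = ((1u32 << bits) - 1) << (base + shift);
    let swizzled = byte_offset ^ ((byte_offset & yyy_mask) >> shift);
    // Only bits at or above `base` change, so the division is exact.
    swizzled / type_size
}
```

Under these assumptions, passing the full line size (for example 16 for a line of four 4-byte values) swizzles whole lines, while passing the element size (4 in the same example) swizzles absolute element indices.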
pub fn repeats_after(&self) -> u32
The number of elements after which this pattern repeats. Can be used to align the buffer (e.g. shared memory) so that offsets match addresses.
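To see why alignment to the repeat period makes offsets behave like addresses, consider the cutlass-style formula with a non-negative shift: it only reads bits below `bits + base + shift`, so adding any multiple of `1 << (bits + base + shift)` leaves the pattern unchanged. The sketch below illustrates this in address units (bytes); all three function names are hypothetical, and whether `repeats_after` returns this exact value or a value scaled to elements is not stated here.

```rust
/// Hypothetical model of the swizzle for a non-negative shift (not the crate's code).
pub fn swizzle_offset(offset: u32, bits: u32, base: u32, shift: u32) -> u32 {
    let yyy_mask = ((1u32 << bits) - 1) << (base + shift);
    offset ^ ((offset & yyy_mask) >> shift)
}

/// Period of the model above, in the same units as `offset`.
/// The highest bit the swizzle reads is bit `bits + base + shift - 1`.
pub fn pattern_period(bits: u32, base: u32, shift: u32) -> u32 {
    1u32 << (bits + base + shift)
}

/// Rounds `offset` up to the next multiple of `period` (assumes `period > 0`).
pub fn align_up(offset: u32, period: u32) -> u32 {
    (offset + period - 1) / period * period
}
```

If a buffer starts at a multiple of the period, swizzling an offset relative to the buffer start gives the same result as swizzling the absolute address and subtracting the start.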
pub fn __expand_new(scope: &mut Scope, bits: u32, base: u32, shift: i32) -> <Self as CubeType>::ExpandType
pub fn __expand_none(scope: &mut Scope) -> <Self as CubeType>::ExpandType
pub fn __expand_apply(scope: &mut Scope, this: <Self as CubeType>::ExpandType, offset: <u32 as CubeType>::ExpandType, type_size: u32) -> <u32 as CubeType>::ExpandType
pub fn __expand_repeats_after(scope: &mut Scope, this: <Self as CubeType>::ExpandType) -> u32
Trait Implementations§
impl CubeType for Swizzle
type ExpandType = SwizzleExpand
fn into_mut(scope: &mut Scope, expand: Self::ExpandType) -> Self::ExpandType
impl LaunchArg for Swizzle
type RuntimeArg<'a, R: Runtime> = SwizzleLaunch<'a, R>
type CompilationArg = SwizzleCompilationArg
fn compilation_arg<'a, R: Runtime>(runtime_arg: &Self::RuntimeArg<'a, R>) -> Self::CompilationArg
fn expand(arg: &Self::CompilationArg, builder: &mut KernelBuilder) -> <Self as CubeType>::ExpandType
fn expand_output(arg: &Self::CompilationArg, builder: &mut KernelBuilder) -> <Self as CubeType>::ExpandType
impl Copy for Swizzle
Auto Trait Implementations§
impl Freeze for Swizzle
impl RefUnwindSafe for Swizzle
impl Send for Swizzle
impl Sync for Swizzle
impl Unpin for Swizzle
impl UnwindSafe for Swizzle
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
unsafe fn clone_to_uninit(&self, dest: *mut u8)