pub enum WmmaInstruction<D: Dialect> {
    Fill {
        frag: Variable<D>,
        value: Variable<D>,
    },
    Load {
        frag: Variable<D>,
        value: Variable<D>,
        offset: Variable<D>,
        stride: Variable<D>,
        layout: Option<FragmentLayout<D>>,
    },
    Execute {
        frag_a: Variable<D>,
        frag_b: Variable<D>,
        frag_c: Variable<D>,
        frag_d: Variable<D>,
        warp_size: u32,
    },
    ExecuteManual {
        shape: MmaShape<D>,
        frag_a: Variable<D>,
        frag_b: Variable<D>,
        frag_c: Variable<D>,
        frag_d: Variable<D>,
    },
    ExecuteScaled {
        shape: MmaShape<D>,
        frag_a: Variable<D>,
        frag_b: Variable<D>,
        frag_c: Variable<D>,
        frag_d: Variable<D>,
        scales_a: Variable<D>,
        scales_b: Variable<D>,
        scales_factor: u32,
    },
    Store {
        output: Variable<D>,
        frag: Variable<D>,
        stride: Variable<D>,
        offset: Variable<D>,
        layout: FragmentLayout<D>,
    },
    LdMatrix {
        output: Variable<D>,
        buffer: Variable<D>,
        offset: Variable<D>,
        line_size: Option<u32>,
        factor: u32,
        transpose: bool,
    },
    StMatrix {
        registers: Variable<D>,
        buffer: Variable<D>,
        offset: Variable<D>,
        line_size: Option<u32>,
        factor: u32,
        transpose: bool,
    },
    Cast {
        input: Variable<D>,
        output: Variable<D>,
    },
}
Warp Matrix-Multiply and Accumulate Instruction.
Variants
Fill
Fill the fragment with the value.
Load
Load the value into the fragment, starting at the given offset and using the given stride.
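As a rough illustration of what offset and stride mean for a load, the sketch below copies a tile out of a flat buffer on the CPU. It is not the code this crate emits: `load_tile`, its parameters, and the row-major assumption are hypothetical and chosen only for the example.

```rust
/// Copies a `rows x cols` tile out of a flat buffer (hypothetical helper).
///
/// Assumptions for this sketch: the buffer is row-major, `offset` is the
/// index of the tile's first element, and `stride` is the number of
/// elements between the starts of two consecutive rows.
fn load_tile(buf: &[f32], offset: usize, stride: usize, rows: usize, cols: usize) -> Vec<f32> {
    let mut tile = Vec::with_capacity(rows * cols);
    for r in 0..rows {
        // Each row of the tile starts `stride` elements after the previous one.
        let start = offset + r * stride;
        tile.extend_from_slice(&buf[start..start + cols]);
    }
    tile
}
```

For a 4 x 4 row-major buffer, a stride of 4 and an offset of 5 select the 2 x 2 tile whose top-left corner is at row 1, column 1.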
Fields
layout: Option<FragmentLayout<D>>
Execute
Executes D = A * B + C.
To implement a matmul, set D = C, which gives C += A * B.
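The arithmetic can be spelled out with a scalar CPU reference. This is only a sketch of the semantics, not how the instruction is lowered: `mma_reference` and `accumulate_in_place` are hypothetical names, and the matrices are assumed to be row-major `f32` slices.

```rust
/// Scalar reference for D = A * B + C (hypothetical helper).
/// A is m x k, B is k x n, C and D are m x n, all row-major.
fn mma_reference(a: &[f32], b: &[f32], c: &[f32], m: usize, n: usize, k: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    let mut d = c.to_vec(); // D starts as a copy of C
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            d[i * n + j] += acc; // D[i][j] = C[i][j] + sum_p A[i][p] * B[p][j]
        }
    }
    d
}

/// The matmul case: the accumulator plays the role of both C and D,
/// so each call performs C += A * B.
fn accumulate_in_place(acc: &mut Vec<f32>, a: &[f32], b: &[f32], m: usize, n: usize, k: usize) {
    let d = mma_reference(a, b, acc, m, n, k);
    *acc = d;
}
```

Calling `accumulate_in_place` once per tile along the k dimension builds up the full product in the accumulator, which is the D = C pattern described above.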
ExecuteManual
Executes D = A * B + C using manually managed registers.
To implement a matmul, set D = C, which gives C += A * B.
Takes a sequence of registers for the inputs and returns an array of registers for the
output. PTX requires output registers to be non-overlapping, so an array is used to guarantee
that, and any destructuring of it is handled internally.
ExecuteScaled
Executes D = A * B + C using manually managed registers. Compared with ExecuteManual, this
variant additionally takes scales_a, scales_b, and scales_factor.
To implement a matmul, set D = C, which gives C += A * B.
Takes a sequence of registers for the inputs and returns an array of registers for the
output. PTX requires output registers to be non-overlapping, so an array is used to guarantee
that, and any destructuring of it is handled internally.
Store
Store the fragment in an output variable following the stride and the layout.
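To make the role of the stride and the layout concrete, here is a small CPU sketch of a tile store. It does not reflect the crate's generated code: the `Layout` enum, `store_tile`, and the assumption that the tile itself is held row-major are all hypothetical stand-ins for illustration.

```rust
/// Hypothetical stand-in for a fragment layout in this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Layout {
    RowMajor,
    ColMajor,
}

/// Writes a `rows x cols` tile into a flat output buffer (hypothetical helper).
///
/// `offset` is the index of the first written element. `stride` is the
/// distance in elements between consecutive rows (RowMajor) or between
/// consecutive columns (ColMajor). The tile is assumed to be row-major.
fn store_tile(
    out: &mut [f32],
    tile: &[f32],
    offset: usize,
    stride: usize,
    rows: usize,
    cols: usize,
    layout: Layout,
) {
    assert_eq!(tile.len(), rows * cols);
    for r in 0..rows {
        for c in 0..cols {
            // The layout decides which index advances by `stride`.
            let idx = match layout {
                Layout::RowMajor => offset + r * stride + c,
                Layout::ColMajor => offset + c * stride + r,
            };
            out[idx] = tile[r * cols + c];
        }
    }
}
```

With the same tile, offset, and stride, switching the layout transposes where the elements land in the output buffer.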
Fields
layout: FragmentLayout<D>
LdMatrix
Load a part of a fragment into registers, either 1, 2, or 4 at once.
StMatrix
Store a part of a fragment into shared memory (smem), either 1, 2, or 4 at once.
Cast
Cast the input into the output.
Trait Implementations
impl<D: Clone + Dialect> Clone for WmmaInstruction<D>
fn clone(&self) -> WmmaInstruction<D>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl<D: Dialect> Display for WmmaInstruction<D>
impl<D: Dialect> StructuralPartialEq for WmmaInstruction<D>
Auto Trait Implementations
impl<D> Freeze for WmmaInstruction<D>
impl<D> RefUnwindSafe for WmmaInstruction<D>
where
    D: RefUnwindSafe,
impl<D> Send for WmmaInstruction<D>
impl<D> Sync for WmmaInstruction<D>
impl<D> Unpin for WmmaInstruction<D>
where
    D: Unpin,
impl<D> UnwindSafe for WmmaInstruction<D>
where
    D: UnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.