pub struct MmaDefinition<A: CubeType, B: CubeType, CD: CubeType> { /* private fields */ }
Defines a matrix multiplication operation, including the input and output types and the shape.
Implementations§
impl<A: CubePrimitive, B: CubePrimitive, CD: CubePrimitive> MmaDefinition<A, B, CD>
pub fn new(m: u32, n: u32, k: u32) -> Self
Creates a new matrix definition for use with the manual matrix-multiply-and-accumulate (MMA) function.
You must declare the shape used for execution. The shape of each individual matrix is determined by its MatrixIdent:
- MatrixIdent::A Shape => (M, K)
- MatrixIdent::B Shape => (K, N)
- MatrixIdent::Accumulator Shape => (M, N)
Not all shapes are supported, and the permitted shapes depend on the element type.
The layout for manual MMA is determined by the runtime and must be handled manually.
Use line_layout to check the correct data layout for each element.
Refer to the NVIDIA documentation.
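As a rough illustration of the shape rules listed above, the following plain-Rust sketch derives the (rows, cols) of each matrix from (m, n, k). The `Ident` enum and the `shape` function are hypothetical stand-ins written for this example; they are not part of the library and do not run inside a kernel.

```rust
/// Hypothetical stand-in for `MatrixIdent`; not the library type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ident {
    A,
    B,
    Accumulator,
}

/// Returns (rows, cols) of the matrix named by `ident` for an MMA of shape (m, n, k).
fn shape(ident: Ident, m: u32, n: u32, k: u32) -> (u32, u32) {
    match ident {
        Ident::A => (m, k),
        Ident::B => (k, n),
        Ident::Accumulator => (m, n),
    }
}

fn main() {
    // Example shape only; whether it is supported depends on the element type.
    let (m, n, k) = (16, 8, 16);
    for ident in [Ident::A, Ident::B, Ident::Accumulator] {
        println!("{:?}: {:?}", ident, shape(ident, m, n, k));
    }
}
```

Note that A and B share the K dimension, which is why only three numbers are needed to describe all three matrices.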
pub fn new_scaled<S: CubePrimitive>( m: u32, n: u32, k: u32, scale_factor: u32, ) -> Self
Creates a new matrix definition for use with the manual scaled matrix-multiply-and-accumulate (MMA) function (see execute_scaled).
You must declare the shape used for execution. The shape of each individual matrix is determined by its MatrixIdent:
- MatrixIdent::A Shape => (M, K)
- MatrixIdent::B Shape => (K, N)
- MatrixIdent::Accumulator Shape => (M, N)
Not all shapes are supported, and the permitted shapes depend on the element type.
The layout for manual MMA is determined by the runtime and must be handled manually.
Use line_layout to check the correct data layout for each element.
Refer to the NVIDIA documentation.
pub fn num_elems(&self, ident: MatrixIdent) -> u32
Returns the number of elements in the matrix.
pub fn elems_per_lane(&self, ident: MatrixIdent) -> u32
Returns the number of elements handled by each lane. These elements should be packed into Lines of size
line_size, using the layout given by line_layout.
§Note
“Lane” here refers to the unit relative to a plane, to distinguish it from a unit relative to a cube.
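To make the per-lane count concrete, here is a minimal sketch that assumes the matrix elements are split evenly across a plane of 32 lanes. Both the plane size and the even split are assumptions made for illustration; the value returned by elems_per_lane on a given runtime is the authority. `ASSUMED_PLANE_SIZE` and `elems_per_lane_sketch` are hypothetical names.

```rust
/// Assumed number of lanes in a plane (32 is typical for NVIDIA warps;
/// other runtimes may use a different value).
const ASSUMED_PLANE_SIZE: u32 = 32;

/// Elements each lane would hold if a `rows x cols` matrix were split
/// evenly across the plane. Returns `None` when the split is uneven.
fn elems_per_lane_sketch(rows: u32, cols: u32) -> Option<u32> {
    let total = rows * cols;
    if total % ASSUMED_PLANE_SIZE == 0 {
        Some(total / ASSUMED_PLANE_SIZE)
    } else {
        None
    }
}

fn main() {
    // Shape m = 16, n = 8, k = 16: A is 16x16, B is 16x8, the accumulator is 16x8.
    println!("A: {:?}", elems_per_lane_sketch(16, 16));
    println!("B: {:?}", elems_per_lane_sketch(16, 8));
    println!("Accumulator: {:?}", elems_per_lane_sketch(16, 8));
}
```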
pub fn lines_per_lane(&self, ident: MatrixIdent) -> u32
Returns the number of lines of size line_size with layout line_layout per lane.
§Note
“Lane” here refers to the unit relative to a plane, to distinguish it from a unit relative to a cube.
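The relationship between elements and lines can be sketched on the host as simple chunking. The helper below is hypothetical and assumes the number of elements per lane is a multiple of the line size; the values 8 and 2 are example inputs, not values guaranteed by any runtime.

```rust
/// Splits the elements owned by one lane into lines of `line_size` elements.
/// Assumes `elems.len()` is a multiple of `line_size` and `line_size > 0`.
fn pack_into_lines(elems: &[f32], line_size: usize) -> Vec<Vec<f32>> {
    elems.chunks(line_size).map(|chunk| chunk.to_vec()).collect()
}

fn main() {
    // Assumed values: 8 elements per lane, packed 2 per line.
    let elems: Vec<f32> = (0..8).map(|i| i as f32).collect();
    let lines = pack_into_lines(&elems, 2);
    // The number of lines per lane is then elements per lane / line size.
    println!("{} lines: {:?}", lines.len(), lines);
}
```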
pub fn line_layout(&self, ident: MatrixIdent) -> MatrixLayout
Returns the layout of each line in this matrix (row-major or column-major).
pub fn line_size(&self, ident: MatrixIdent) -> u32
Returns the number of elements in each line passed to the execute function.
pub fn position_of_nth( &self, lane_id: u32, elem_idx: u32, ident: MatrixIdent, ) -> (u32, u32)
Returns the coordinates of the nth element handled by the lane with the given lane_id.
Each lane holds elems_per_lane elements, in chunks of line_size.
The result is (row_idx, col_idx).
§Note
“Lane” here refers to the unit relative to a plane, to distinguish it from a unit relative to a cube.
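For intuition about what such a mapping can look like, the sketch below models the accumulator fragment layout that NVIDIA's PTX documentation describes for the m16n8k16 shape (16x8 accumulator, 32 lanes, 4 elements per lane). It is an assumption chosen for illustration: the mapping returned by position_of_nth depends on the runtime, the shape, and the element type, and `accumulator_position_sketch` is a hypothetical host-side function.

```rust
/// Illustrative mapping only, modelled on the 16x8 accumulator fragment
/// layout in NVIDIA's PTX documentation (32 lanes, 4 elements per lane).
fn accumulator_position_sketch(lane_id: u32, elem_idx: u32) -> (u32, u32) {
    let group_id = lane_id / 4; // 8 groups of 4 lanes
    let lane_in_group = lane_id % 4;
    // Elements 0 and 1 sit in the upper 8 rows, elements 2 and 3 in the lower 8.
    let row = group_id + 8 * (elem_idx / 2);
    // Each lane owns two adjacent columns.
    let col = lane_in_group * 2 + (elem_idx % 2);
    (row, col)
}

fn main() {
    // Check that 32 lanes x 4 elements cover each of the 16 x 8 cells exactly once.
    let mut seen = [[false; 8]; 16];
    for lane_id in 0..32 {
        for elem_idx in 0..4 {
            let (row, col) = accumulator_position_sketch(lane_id, elem_idx);
            assert!(!seen[row as usize][col as usize], "cell visited twice");
            seen[row as usize][col as usize] = true;
        }
    }
    assert!(seen.iter().flatten().all(|&s| s));
    println!("lane 5, element 3 -> {:?}", accumulator_position_sketch(5, 3));
}
```

A coverage check like the one in `main` is a cheap way to validate any hand-written layout assumption before using it to load or store data.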
pub fn scales_index(&self, lane_id: u32, ident: MatrixIdent) -> u32
Index of the scales for this thread, along the non-major dimension of the matrix.
Each thread loads all scales in the major direction into a single Line.
pub fn scales_count(&self) -> u32
Number of scales in each line (not the line size!). Line size may include padding bytes.
pub fn scales_line_size(&self) -> u32
Line size for the scale factors. May be larger than the total number of scales.
pub fn execute( &self, registers_a: &Sequence<Line<A>>, registers_b: &Sequence<Line<B>>, registers_c: &Sequence<Line<CD>>, ) -> Array<Line<CD>>
Executes a low-level MMA operation with manually managed registers. The register layout
and index mapping can be retrieved from this MmaDefinition.
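When validating a kernel built on this operation, a host-side reference of the underlying arithmetic, D = A × B + C, can be useful. The sketch below is such a reference in plain Rust over row-major slices. It says nothing about register layout, and `mma_reference` is a hypothetical helper rather than a library function.

```rust
/// CPU reference for D = A * B + C with row-major slices.
/// A is m x k, B is k x n, C and the result are m x n.
fn mma_reference(a: &[f32], b: &[f32], c: &[f32], m: usize, n: usize, k: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    // Start from the accumulator and add each product term.
    let mut d = c.to_vec();
    for row in 0..m {
        for col in 0..n {
            for i in 0..k {
                d[row * n + col] += a[row * k + i] * b[i * n + col];
            }
        }
    }
    d
}

fn main() {
    // 2x2 example: A = [[1, 2], [3, 4]], B = identity, C = all ones.
    let a = [1.0, 2.0, 3.0, 4.0];
    let b = [1.0, 0.0, 0.0, 1.0];
    let c = [1.0; 4];
    println!("{:?}", mma_reference(&a, &b, &c, 2, 2, 2));
}
```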
pub fn execute_scaled<S: CubePrimitive>( &self, registers_a: &Sequence<Line<A>>, registers_b: &Sequence<Line<B>>, registers_c: &Sequence<Line<CD>>, scales_a: Line<S>, scales_b: Line<S>, ) -> Array<Line<CD>>
Executes a low-level block-scaled MMA operation with manually managed registers. The register
layout and index mapping can be retrieved from this MmaDefinition.
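The arithmetic behind block scaling can be sketched on the host as follows. This reference assumes one scale per row of A and one per column of B for every block of `block` consecutive K indices. How `block` relates to the scale_factor argument is not stated here, so treat that relationship, and the hypothetical `scaled_mma_reference` helper itself, as assumptions.

```rust
/// CPU reference for a block-scaled multiply-accumulate over row-major slices.
/// scales_a is m x (k / block); scales_b is (k / block) x n.
fn scaled_mma_reference(
    a: &[f32],
    b: &[f32],
    c: &[f32],
    scales_a: &[f32],
    scales_b: &[f32],
    (m, n, k): (usize, usize, usize),
    block: usize,
) -> Vec<f32> {
    assert!(block > 0 && k % block == 0);
    let blocks = k / block;
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    assert_eq!(scales_a.len(), m * blocks);
    assert_eq!(scales_b.len(), blocks * n);
    let mut d = c.to_vec();
    for row in 0..m {
        for col in 0..n {
            for i in 0..k {
                // Every `block` consecutive K indices share one scale per operand.
                let blk = i / block;
                let sa = scales_a[row * blocks + blk];
                let sb = scales_b[blk * n + col];
                d[row * n + col] += (a[row * k + i] * sa) * (b[i * n + col] * sb);
            }
        }
    }
    d
}

fn main() {
    // 1x1 output, k = 4, two blocks of two elements each.
    let ones = [1.0; 4];
    let d = scaled_mma_reference(&ones, &ones, &[0.0], &[2.0, 3.0], &[1.0, 10.0], (1, 1, 4), 2);
    println!("{:?}", d);
}
```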
pub fn __expand_new( scope: &mut Scope, m: u32, n: u32, k: u32, ) -> <Self as CubeType>::ExpandType
pub fn __expand_new_scaled<S: CubePrimitive>( scope: &mut Scope, m: u32, n: u32, k: u32, scale_factor: u32, ) -> <Self as CubeType>::ExpandType
pub fn __expand_num_elems( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ident: MatrixIdent, ) -> u32
pub fn __expand_elems_per_lane( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ident: MatrixIdent, ) -> u32
pub fn __expand_lines_per_lane( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ident: MatrixIdent, ) -> u32
pub fn __expand_line_layout( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ident: MatrixIdent, ) -> MatrixLayout
pub fn __expand_line_size( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ident: MatrixIdent, ) -> u32
pub fn __expand_position_of_nth( scope: &mut Scope, this: <Self as CubeType>::ExpandType, lane_id: <u32 as CubeType>::ExpandType, elem_idx: <u32 as CubeType>::ExpandType, ident: MatrixIdent, ) -> <(u32, u32) as CubeType>::ExpandType
pub fn __expand_scales_index( scope: &mut Scope, this: <Self as CubeType>::ExpandType, lane_id: <u32 as CubeType>::ExpandType, ident: MatrixIdent, ) -> <u32 as CubeType>::ExpandType
pub fn __expand_scales_count( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ) -> u32
pub fn __expand_scales_line_size( scope: &mut Scope, this: <Self as CubeType>::ExpandType, ) -> u32
pub fn __expand_execute( scope: &mut Scope, this: <Self as CubeType>::ExpandType, registers_a: <Sequence<Line<A>> as CubeType>::ExpandType, registers_b: <Sequence<Line<B>> as CubeType>::ExpandType, registers_c: <Sequence<Line<CD>> as CubeType>::ExpandType, ) -> <Array<Line<CD>> as CubeType>::ExpandType
pub fn __expand_execute_scaled<S: CubePrimitive>( scope: &mut Scope, this: <Self as CubeType>::ExpandType, registers_a: <Sequence<Line<A>> as CubeType>::ExpandType, registers_b: <Sequence<Line<B>> as CubeType>::ExpandType, registers_c: <Sequence<Line<CD>> as CubeType>::ExpandType, scales_a: <Line<S> as CubeType>::ExpandType, scales_b: <Line<S> as CubeType>::ExpandType, ) -> <Array<Line<CD>> as CubeType>::ExpandType
Trait Implementations§
impl<A: Clone + CubeType, B: Clone + CubeType, CD: Clone + CubeType> Clone for MmaDefinition<A, B, CD>
fn clone(&self) -> MmaDefinition<A, B, CD>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.