cubecl_core::frontend

Module cmma

This module exposes cooperative matrix-multiply and accumulate operations.

Most of these functions are effectively unsafe: they mutate their inputs even when those inputs are passed by shared reference.

§Example

This is a basic 16x16x16 matrix multiplication example.

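#[cube(launch)]
pub fn example(lhs: &Array<F16>, rhs: &Array<F16>, out: &mut Array<F32>) {
    // Declare the three operands of the 16x16x16 operation.
    // The three `16` arguments are the tile dimensions.
    let a = cmma::Matrix::<F16>::new(
        cmma::MatrixIdent::A,
        16,
        16,
        16,
        cmma::MatrixLayout::RowMajor,
    );
    let b = cmma::Matrix::<F16>::new(
        cmma::MatrixIdent::B,
        16,
        16,
        16,
        cmma::MatrixLayout::ColMajor,
    );
    // The accumulator is declared without a layout; one is given at store time.
    let c = cmma::Matrix::<F32>::new(
        cmma::MatrixIdent::Accumulator,
        16,
        16,
        16,
        cmma::MatrixLayout::Undefined,
    );
    // Zero the accumulator, then load both inputs with a stride of 16.
    // Note that `c`, `a` and `b` are mutated through shared references.
    cmma::fill::<F32>(&c, F32::new(0.0));
    cmma::load::<F16>(&a, lhs.as_slice(), u32::new(16));
    cmma::load::<F16>(&b, rhs.as_slice(), u32::new(16));

    // Multiply `a` by `b` and accumulate into `c`, which is passed twice.
    cmma::execute::<F16, F16, F32, F32>(&a, &b, &c, &c);

    // Write the accumulator to `out` in row-major order with a stride of 16.
    cmma::store::<F32>(
        out.as_slice_mut(),
        &c,
        u32::new(16),
        cmma::MatrixLayout::RowMajor,
    );
}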
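
To make the result of the kernel above easier to check, the following is a plain CPU sketch of what it should compute for one tile. It is not part of this module. The function name `reference_mma` and the constant `TILE` are hypothetical, `f32` stands in for `F16` because stable Rust has no half-precision type, and the indexing assumes that the stride counts elements between consecutive rows (row-major) or consecutive columns (column-major).

```rust
/// Tile edge length used by the example above.
const TILE: usize = 16;

/// CPU reference for one 16x16x16 multiply-accumulate with a zeroed accumulator.
/// `lhs` is read row-major, `rhs` column-major, and `out` is written row-major,
/// all with a stride of 16 elements, mirroring the arguments of the kernel.
fn reference_mma(lhs: &[f32], rhs: &[f32], out: &mut [f32]) {
    let stride = TILE; // assumption: stride is measured in elements
    for row in 0..TILE {
        for col in 0..TILE {
            // The accumulator starts at zero, like the `fill` call in the kernel.
            let mut acc = 0.0f32;
            for k in 0..TILE {
                let a = lhs[row * stride + k]; // row-major element (row, k)
                let b = rhs[col * stride + k]; // column-major element (k, col)
                acc += a * b;
            }
            out[row * stride + col] = acc; // row-major store
        }
    }
}
```

Feeding the same input buffers to this function and to the kernel gives a baseline to compare against, within the tolerance expected from half-precision inputs.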

Functions§

  • Execute the matrix-multiply and accumulate operation on the given matrices.
  • Fill the matrix with the provided value.
  • Load the matrix with the provided array using the stride.
  • Load the matrix with the provided array using the stride with an explicit layout. Explicit layouts are required when loading accumulators.
  • Store the matrix in the given array following the given stride and layout.
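
The load and store descriptions above both rely on a stride and a layout. As a rough illustration of how those two parameters can address a tile inside a larger flat buffer, here is a CPU-side sketch. The names `Layout`, `flat_index` and `load_tile` are hypothetical and are not part of this module, and the sketch assumes that the stride is the number of elements between consecutive rows (row-major) or consecutive columns (column-major).

```rust
/// Hypothetical stand-in for the row-major / column-major layouts on this page.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Layout {
    RowMajor,
    ColMajor,
}

/// Flat index of element (row, col) for a given stride, measured in elements.
fn flat_index(row: usize, col: usize, stride: usize, layout: Layout) -> usize {
    match layout {
        Layout::RowMajor => row * stride + col,
        Layout::ColMajor => col * stride + row,
    }
}

/// Copy a `rows x cols` tile out of `src`, returning it in row-major order.
fn load_tile(src: &[f32], rows: usize, cols: usize, stride: usize, layout: Layout) -> Vec<f32> {
    let mut tile = Vec::with_capacity(rows * cols);
    for row in 0..rows {
        for col in 0..cols {
            tile.push(src[flat_index(row, col, stride, layout)]);
        }
    }
    tile
}
```

Keeping the stride separate from the tile size is what lets a tile be read from the middle of a wider matrix: under these assumptions, a 2x2 tile taken from a buffer with stride 8 reads elements 0, 1, 8 and 9 in row-major order.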