Module naive

Naive matmul kernel implementation: a non-cooperative matmul without tiling, which can be very fast on small matrices.

Each local unit will compute a single element of the output matrix.
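The one-unit-per-output-element mapping can be sketched on the CPU in plain Rust. This is only an illustration under stated assumptions, not the kernel in this module: the names `compute_element` and `naive_matmul` are hypothetical, the matrices are assumed to be row-major `f32` slices, and a sequential loop stands in for the parallel units.

```rust
/// Work done by a single (hypothetical) unit: one dot product that
/// yields the output element at (row, col).
/// Assumes row-major storage: `lhs` is m x k, `rhs` is k x n.
fn compute_element(lhs: &[f32], rhs: &[f32], row: usize, col: usize, k: usize, n: usize) -> f32 {
    let mut acc = 0.0f32;
    for i in 0..k {
        acc += lhs[row * k + i] * rhs[i * n + col];
    }
    acc
}

/// CPU stand-in for a launch: every index in `0..m * n` plays the role
/// of one unit and writes exactly one element of the output.
fn naive_matmul(lhs: &[f32], rhs: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(lhs.len(), m * k, "lhs must hold m * k elements");
    assert_eq!(rhs.len(), k * n, "rhs must hold k * n elements");

    let mut out = vec![0.0f32; m * n];
    for unit in 0..m * n {
        // Map the flat unit index to its output coordinates.
        let (row, col) = (unit / n, unit % n);
        out[unit] = compute_element(lhs, rhs, row, col, k, n);
    }
    out
}
```

No unit shares intermediate results with another, which is what "non-cooperative" refers to here: every unit re-reads a full row of the left-hand side and a full column of the right-hand side. That redundancy is cheap for small matrices and is the cost that tiled kernels are designed to avoid.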

Functions

launch
launch_ref
Matrix multiplication using a memory-coalescing algorithm with custom cube dimensions.