microgemm
General matrix multiplication with custom configuration in Rust.
Supports no_std and no_alloc environments.
The implementation is based on the BLIS microkernel approach.
Install
Usage
The Kernel trait is the main abstraction of microgemm.
You can implement it yourself or use one of the kernels provided out of the box.
gemm
See also the no_alloc example for usage without Vec.
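For orientation, the operation conventionally called gemm is `c <- alpha * a * b + beta * c`. The function below is a minimal reference sketch of that operation on row-major slices. It does not use microgemm: the name `gemm_reference` and its signature are assumptions made for illustration, and the crate's own gemm entry point and matrix types may differ.

```rust
/// Reference GEMM on row-major slices: c <- alpha * a * b + beta * c.
/// `a` is m x k, `b` is k x n, `c` is m x n.
/// Hypothetical helper for illustration; not part of microgemm.
fn gemm_reference(
    m: usize,
    k: usize,
    n: usize,
    alpha: f32,
    a: &[f32],
    b: &[f32],
    beta: f32,
    c: &mut [f32],
) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for i in 0..m {
        for j in 0..n {
            // Inner product of row i of `a` with column j of `b`.
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

A naive triple loop like this is useful as a correctness oracle when testing an optimized kernel, since both should agree up to floating-point rounding.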
Implemented Kernels
| Name | Scalar Types | Target |
|---|---|---|
| GenericKernelNxN (N: 2, 4, 8, 16, 32) | T: Copy + Zero + One + Mul + Add | Any |
| NeonKernel4x4 | f32 | aarch64 with the neon target feature |
| NeonKernel8x8 | f32 | aarch64 with the neon target feature |
Custom Kernel Implementation
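To illustrate the general idea behind a custom microkernel in the BLIS style, here is a self-contained sketch. Everything in it is hypothetical: the `MicroKernel` trait, the `Scalar2x2` type, and the `matmul_with` driver are names invented for this example and are not microgemm's `Kernel` trait or API. The sketch assumes the kernel computes one small MR x NR tile from packed panels, while a driver handles packing and edge tiles. It also uses `Vec` for brevity, so it is not a no_alloc example.

```rust
/// Hypothetical trait for illustration only; NOT microgemm's `Kernel`.
trait MicroKernel {
    const MR: usize; // rows of the register tile
    const NR: usize; // columns of the register tile

    /// Accumulates one MR x NR tile:
    /// acc[i * NR + j] += sum over p of a[p * MR + i] * b[p * NR + j].
    /// `a` and `b` are packed panels of depth `kc`.
    fn tile(&self, kc: usize, a: &[f32], b: &[f32], acc: &mut [f32]);
}

/// A plain scalar 2x2 kernel (hypothetical).
struct Scalar2x2;

impl MicroKernel for Scalar2x2 {
    const MR: usize = 2;
    const NR: usize = 2;

    fn tile(&self, kc: usize, a: &[f32], b: &[f32], acc: &mut [f32]) {
        for p in 0..kc {
            for i in 0..Self::MR {
                for j in 0..Self::NR {
                    acc[i * Self::NR + j] += a[p * Self::MR + i] * b[p * Self::NR + j];
                }
            }
        }
    }
}

/// Driver: c <- a * b + c on row-major slices, one MR x NR tile at a time.
/// Edge tiles are zero-padded during packing, so the kernel never sees them.
fn matmul_with<K: MicroKernel>(
    kernel: &K,
    m: usize,
    k: usize,
    n: usize,
    a: &[f32],
    b: &[f32],
    c: &mut [f32],
) {
    let (mr, nr) = (K::MR, K::NR);
    let mut a_pack = vec![0.0f32; k * mr];
    let mut b_pack = vec![0.0f32; k * nr];
    let mut acc = vec![0.0f32; mr * nr];
    for i0 in (0..m).step_by(mr) {
        // Pack MR rows of `a` so that each step p reads MR contiguous values.
        for p in 0..k {
            for i in 0..mr {
                a_pack[p * mr + i] = if i0 + i < m { a[(i0 + i) * k + p] } else { 0.0 };
            }
        }
        for j0 in (0..n).step_by(nr) {
            // Pack NR columns of `b` the same way.
            for p in 0..k {
                for j in 0..nr {
                    b_pack[p * nr + j] = if j0 + j < n { b[p * n + j0 + j] } else { 0.0 };
                }
            }
            acc.fill(0.0);
            kernel.tile(k, &a_pack, &b_pack, &mut acc);
            // Write back only the in-bounds part of the tile.
            for i in 0..mr.min(m - i0) {
                for j in 0..nr.min(n - j0) {
                    c[(i0 + i) * n + j0 + j] += acc[i * nr + j];
                }
            }
        }
    }
}
```

Keeping the tile computation behind a trait means a SIMD version can replace the scalar loops without touching the driver. A full BLIS-style driver would additionally block the loops for cache reuse, which this sketch omits.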
Benchmarks
All benchmarks run on a single thread and multiply square n × n matrices.
f32
PackSizes { mc: n, kc: n, nc: n }
aarch64 (M1)
| n | NeonKernel8x8 | faer | matrixmultiply |
|---|---|---|---|
| 128 | 75.5µs | 242.6µs | 46.2µs |
| 256 | 466.3µs | 3.2ms | 518.2µs |
| 512 | 3ms | 15.9ms | 2.7ms |
| 1024 | 23.9ms | 128.4ms | 22ms |
| 2048 | 191ms | 1s | 182.8ms |
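The timings can be converted to throughput for easier comparison across sizes. The helper below is a small sketch, not part of microgemm. It assumes each reported time covers one full n × n product and uses the conventional count of 2n³ floating-point operations (one multiply and one add per inner-product term). The name `gflops` is hypothetical.

```rust
/// Throughput of one n x n x n matrix product in GFLOP/s,
/// counting 2 * n^3 floating-point operations.
/// Hypothetical helper for reading the benchmark table.
fn gflops(n: u64, seconds: f64) -> f64 {
    let flops = 2.0 * (n as f64).powi(3);
    flops / seconds / 1e9
}
```

For example, `gflops(1024, 23.9e-3)` gives about 89.9 GFLOP/s for the NeonKernel8x8 entry at n = 1024.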
License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.