microgemm
General matrix multiplication with custom configuration in Rust.
Supports no_std and no_alloc environments.
The implementation is based on the BLIS microkernel approach.
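To make the BLIS idea concrete: GEMM is split into cache-sized blocks, and the innermost tile update is delegated to a small microkernel. The following is a minimal sketch of that loop structure, not microgemm's actual code. The names `blocked_gemm`, `microkernel`, `MR` and `NR`, the row-major layout, and the tile sizes are assumptions made for illustration, and the packing of A and B blocks that BLIS-style implementations normally perform is omitted.

```rust
/// Register-tile sizes of the toy microkernel (illustrative values only).
const MR: usize = 4;
const NR: usize = 4;

/// Updates the tile C[i0..i0+mr, j0..j0+nr] with the product of
/// A[i0..i0+mr, p0..p0+kc] and B[p0..p0+kc, j0..j0+nr] (row-major).
/// `mr` and `nr` may be smaller than MR and NR at the matrix edges.
#[allow(clippy::too_many_arguments)]
fn microkernel(
    a: &[f32], b: &[f32], c: &mut [f32],
    k: usize, n: usize,
    i0: usize, j0: usize, p0: usize,
    mr: usize, nr: usize, kc: usize,
) {
    // Accumulate in a small local tile, standing in for SIMD registers.
    let mut acc = [[0.0f32; NR]; MR];
    for p in p0..p0 + kc {
        for i in 0..mr {
            let a_ip = a[(i0 + i) * k + p];
            for j in 0..nr {
                acc[i][j] += a_ip * b[p * n + j0 + j];
            }
        }
    }
    // Write the accumulated tile back into C.
    for i in 0..mr {
        for j in 0..nr {
            c[(i0 + i) * n + j0 + j] += acc[i][j];
        }
    }
}

/// C (m x n) += A (m x k) * B (k x n), blocked by (mc, kc, nc).
#[allow(clippy::too_many_arguments)]
pub fn blocked_gemm(
    m: usize, k: usize, n: usize,
    a: &[f32], b: &[f32], c: &mut [f32],
    mc: usize, kc: usize, nc: usize,
) {
    assert!(mc > 0 && kc > 0 && nc > 0, "block sizes must be non-zero");
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for jc in (0..n).step_by(nc) {
        let nb = nc.min(n - jc); // columns in this block of B and C
        for pc in (0..k).step_by(kc) {
            let kb = kc.min(k - pc); // depth of this block
            for ic in (0..m).step_by(mc) {
                let mb = mc.min(m - ic); // rows in this block of A and C
                // Inside one block, walk over MR x NR tiles.
                for jr in (0..nb).step_by(NR) {
                    let nr = NR.min(nb - jr);
                    for ir in (0..mb).step_by(MR) {
                        let mr = MR.min(mb - ir);
                        microkernel(a, b, c, k, n, ic + ir, jc + jr, pc, mr, nr, kb);
                    }
                }
            }
        }
    }
}
```

The three outer loops pick blocks sized for the cache hierarchy, and only the microkernel needs to be specialized per target, which is the part a kernel abstraction isolates.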
Usage
The Kernel trait is the main abstraction of microgemm.
You can implement it yourself or use kernels that are already provided out of the box.
Implemented Kernels
| Name | Scalar Types | Target |
|---|---|---|
| GenericNxNKernel (N: 2, 4, 8, 16, 32) | T: Copy + Zero + One + Mul + Add | Any |
| NeonKernel | f32 | AArch64 and target feature neon |
| WasmSimd128Kernel | f32 | wasm32 and target feature simd128 |
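The scalar bounds listed for the generic kernels suggest that a square tile update needs little more than copying, addition, multiplication and an additive identity. Here is a rough sketch of such an update under those bounds. It is not the crate's implementation: `generic_tile` and the panel layout are hypothetical, and the local `Zero` trait is only a stand-in for a trait such as `num_traits::Zero`. The `One` bound from the table is not needed here and is left out.

```rust
use core::ops::{Add, Mul};

/// Local stand-in for an additive-identity trait (assumption, see above).
pub trait Zero {
    fn zero() -> Self;
}

impl Zero for f32 {
    fn zero() -> Self {
        0.0
    }
}

impl Zero for i32 {
    fn zero() -> Self {
        0
    }
}

/// Hypothetical N x N tile update: c += A * B.
///
/// `a` holds an N x k panel stored column by column (`a[p * N + i]`),
/// `b` holds a k x N panel stored row by row (`b[p * N + j]`).
pub fn generic_tile<T, const N: usize>(k: usize, a: &[T], b: &[T], c: &mut [[T; N]; N])
where
    T: Copy + Zero + Mul<Output = T> + Add<Output = T>,
{
    assert!(a.len() >= N * k && b.len() >= N * k);
    // The accumulator lives on the stack, so no allocation is required.
    let mut acc = [[T::zero(); N]; N];
    for p in 0..k {
        for i in 0..N {
            for j in 0..N {
                acc[i][j] = acc[i][j] + a[p * N + i] * b[p * N + j];
            }
        }
    }
    for i in 0..N {
        for j in 0..N {
            c[i][j] = c[i][j] + acc[i][j];
        }
    }
}
```

Because the tile size is a const generic and only `core` is used, a function of this shape works for any scalar type satisfying the bounds and does not depend on `std` or an allocator.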
gemm
use microgemm as mg;
use Kernel as _;
See also the no_alloc example for usage without Vec.
Custom Kernel Implementation
use ;
;
Benchmarks
All benchmarks use square n × n matrices
with pack_sizes == PackSizes { mc: n, kc: n, nc: n }.
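For orientation, a naive baseline of the kind such comparisons usually include can be as simple as a triple loop compiled by rustc. The sketch below is an assumption about what such a baseline might look like, not the code used for the published numbers. `naive_gemm` and `time_naive` are hypothetical names, and the timing loop is far cruder than a proper benchmark harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Naive row-major product of two n x n matrices: c += a * b.
pub fn naive_gemm(n: usize, a: &[f32], b: &[f32], c: &mut [f32]) {
    assert!(a.len() == n * n && b.len() == n * n && c.len() == n * n);
    for i in 0..n {
        for j in 0..n {
            let mut sum = 0.0f32;
            for p in 0..n {
                sum += a[i * n + p] * b[p * n + j];
            }
            c[i * n + j] += sum;
        }
    }
}

/// Average wall-clock time of `reps` runs of `naive_gemm` on n x n inputs.
pub fn time_naive(n: usize, reps: u32) -> Duration {
    assert!(reps > 0, "at least one repetition is required");
    let a = vec![1.0f32; n * n];
    let b = vec![2.0f32; n * n];
    let mut c = vec![0.0f32; n * n];
    let start = Instant::now();
    for _ in 0..reps {
        c.fill(0.0);
        naive_gemm(n, black_box(&a), black_box(&b), &mut c);
        // Keep the optimizer from discarding the result.
        black_box(&c);
    }
    start.elapsed() / reps
}
```

Calling `time_naive(256, 10)` in a release build gives a rough per-run figure. Absolute numbers depend heavily on the machine and compiler flags.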
AArch64 (M1)
f32
| n | NeonKernel | Generic4x4 | Generic8x8 | naive(rustc) |
|---|---|---|---|---|
| 32 | 10.7µs | 13.9µs | 12.7µs | 53.2µs |
| 64 | 50.6µs | 73µs | 62.7µs | 307.7µs |
| 128 | 257.5µs | 482.8µs | 379.8µs | 2.5ms |
| 256 | 1ms | 2ms | 1.3ms | 9.5ms |
| 512 | 3.4ms | 8.4ms | 6ms | 94.5ms |
| 1024 | 25ms | 66.4ms | 46.4ms | 882.7ms |
License
Licensed under either of Apache License, Version 2.0 or MIT license at your option.