CPU kernels — one sub-module per family.
Modules§
- attention
- Scaled dot-product attention — Flash-Edge (tiled online-softmax) implementation.
- conv2d
- 2-D convolution via Im2Col + GEMM.
- elementwise
- Element-wise CPU kernels — arithmetic, activations, and mathematical ops.
- layernorm
- LayerNorm and RMSNorm kernels.
- matmul
- Matrix multiplication kernels using the matrixmultiply crate.
- quant
- Quantized weight storage and on-the-fly dequantizing dot-products.
- reduce
- Reduction kernels: sum, mean, max, min.
- rope
- Rotary Position Embedding (RoPE) kernel.
- softmax
- Numerically stable softmax kernel.
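
As an illustration of what "numerically stable" usually means for the softmax entry above, here is a minimal sketch of the max-subtraction technique. The function name `softmax_stable` and its slice-in, `Vec`-out signature are assumptions made for this example; they are not taken from the softmax module, whose actual API may differ.

```rust
/// Hypothetical helper (not this crate's API): softmax over a 1-D slice.
///
/// Subtracting the maximum before exponentiating keeps every exponent
/// at or below zero, so `exp` cannot overflow for large finite inputs.
pub fn softmax_stable(input: &[f32]) -> Vec<f32> {
    if input.is_empty() {
        return Vec::new();
    }

    // Largest element; after the shift the largest exponent is exp(0) = 1.
    let max = input.iter().copied().fold(f32::NEG_INFINITY, f32::max);

    // Shifted exponentials. The result is mathematically identical to the
    // unshifted softmax because the factor exp(-max) cancels in the ratio.
    let exps: Vec<f32> = input.iter().map(|&x| (x - max).exp()).collect();

    // The sum is at least 1 (the max element contributes exp(0)),
    // so the division below is safe for finite inputs.
    let sum: f32 = exps.iter().sum();

    exps.into_iter().map(|e| e / sum).collect()
}
```

With a naive implementation, inputs such as `[1000.0, 1000.0]` overflow to infinity in `exp` and produce NaN; with the shift, the same inputs yield equal finite probabilities. A tiled online-softmax, as mentioned for the attention module, applies the same idea incrementally by tracking a running maximum and rescaling the running sum.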