metaltile-std 0.1.0

MetalTile kernel standard library — benchmark metadata and type definitions
//! Copyright 2026 0xClandestine, Ekryski, TheTom, Ambisphaeric
//! SPDX-License-Identifier: Apache-2.0
//! Steel implicit-GEMM 2D conv — metal/steel/conv/kernels/steel_conv.metal
//!
//! 2D convolution via implicit GEMM (im2col × filter matrix):
//!   implicit_gemm_conv_2d_{dtype}_bm{M}_bn{N}_bk{K}_wm{wm}_wn{wn}_channel_{c}_filter_{f}
//!   Block shapes (bm×bn): 32×8, 64×8, 32×32, 32×64, 64×32, 64×64 (bk = 16 for all)
//!   Channel variants: l (general), 1/2/3/4 (small fixed channel count)
//!   Filter variants: s (small/separable), l (large)
//!   Dtypes: float32, float16, bfloat16
//!
//! NOT YET IMPLEMENTED in the `#[kernel]` DSL:
//!   Implicit GEMM convolution unfolds the input patch neighbourhood
//!   into a virtual matrix via the `MLXConvParams` descriptor, then
//!   runs tiled GEMM over the unfolded layout. This requires simdgroup
//!   matrix ops (same blocker as steel_gemm_fused) plus the im2col
//!   index arithmetic driven by `ImplicitGemmConv2DParams`. The DSL
//!   has neither simdgroup matmul nor im2col primitives.
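The im2col index arithmetic mentioned above can be sketched on the host side as follows. This is only an illustration under stated assumptions: `Conv2dShape` and its field names are hypothetical stand-ins, not the layout of `MLXConvParams` or `ImplicitGemmConv2DParams`, and the sketch assumes a single image in channels-last layout with one stride and one padding value shared by both spatial axes.

```rust
/// Hypothetical, simplified convolution descriptor (an assumption for this
/// sketch; not the real `MLXConvParams` layout).
#[derive(Clone, Copy, Debug)]
struct Conv2dShape {
    in_h: usize,
    in_w: usize,
    channels: usize,
    kernel_h: usize,
    kernel_w: usize,
    stride: usize,
    pad: usize,
}

impl Conv2dShape {
    /// Output height of the convolution.
    fn out_h(&self) -> usize {
        (self.in_h + 2 * self.pad - self.kernel_h) / self.stride + 1
    }

    /// Output width of the convolution.
    fn out_w(&self) -> usize {
        (self.in_w + 2 * self.pad - self.kernel_w) / self.stride + 1
    }

    /// Reduction (K) dimension of the virtual im2col matrix.
    fn gemm_k(&self) -> usize {
        self.kernel_h * self.kernel_w * self.channels
    }

    /// Maps an element (row, col) of the virtual im2col matrix to the input
    /// coordinate (h, w, c) it reads. Returns `None` when the element falls
    /// in the zero-padding region. Rows enumerate output pixels; columns
    /// enumerate (kernel_h, kernel_w, channel) with the channel fastest.
    fn im2col_index(&self, row: usize, col: usize) -> Option<(usize, usize, usize)> {
        let (oh, ow) = (row / self.out_w(), row % self.out_w());
        let c = col % self.channels;
        let kw = (col / self.channels) % self.kernel_w;
        let kh = col / (self.channels * self.kernel_w);
        // checked_sub yields None when the tap lies above or left of the input.
        let ih = (oh * self.stride + kh).checked_sub(self.pad)?;
        let iw = (ow * self.stride + kw).checked_sub(self.pad)?;
        (ih < self.in_h && iw < self.in_w).then_some((ih, iw, c))
    }
}
```

In an implicit-GEMM kernel the unfolded matrix is never materialised. A mapping of this kind would be evaluated per element while loading each bm×bk tile, which is the piece the DSL currently has no primitive for.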