Module tensor_tiled

L2-cache-friendly tiled matrix multiplication engine.

Provides a tiled matmul implementation that works on sub-blocks (tiles) sized to fit within the L2 cache, improving data locality for large matrices.

§Determinism

  • Tile iteration order is deterministic (row-major over tiles).
  • Summation within each tile follows the same fixed accumulation order on every run.
  • Same inputs → bit-identical outputs on the same platform.
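The ordering guarantees above can be illustrated with a minimal sketch. This is not the crate's implementation: the free function tiled_matmul_sketch, its parameter names, and the row-major slice layout are all assumptions made for illustration. It visits tiles in a fixed row-major order and accumulates each output element in ascending depth order, so repeated runs on the same inputs perform the same floating-point operations in the same sequence.

```rust
/// Hypothetical helper, not part of the documented API.
/// Multiplies row-major `a` (m×k) by row-major `b` (k×n) and returns a
/// row-major m×n result, processing square tiles of side `tile`.
pub fn tiled_matmul_sketch(
    a: &[f64],
    b: &[f64],
    m: usize,
    k: usize,
    n: usize,
    tile: usize,
) -> Vec<f64> {
    assert!(tile > 0, "tile size must be non-zero");
    assert_eq!(a.len(), m * k, "`a` must hold m*k elements");
    assert_eq!(b.len(), k * n, "`b` must hold k*n elements");

    let mut c = vec![0.0; m * n];

    // Fixed tile order: tile rows, then tile columns, then depth tiles.
    for i0 in (0..m).step_by(tile) {
        let i1 = (i0 + tile).min(m);
        for j0 in (0..n).step_by(tile) {
            let j1 = (j0 + tile).min(n);
            for p0 in (0..k).step_by(tile) {
                let p1 = (p0 + tile).min(k);
                // Fixed order inside the tile: each c[i][j] receives its
                // products in ascending `p`, with no reordering.
                for i in i0..i1 {
                    for p in p0..p1 {
                        let a_ip = a[i * k + p];
                        for j in j0..j1 {
                            c[i * n + j] += a_ip * b[p * n + j];
                        }
                    }
                }
            }
        }
    }
    c
}
```

In this particular sketch every output element sums its products in ascending depth order regardless of the tile side, so calling it with tile = 2 and tile = 64 on the same inputs yields identical results. That property belongs to the loop nest shown here and should not be assumed of TiledMatmul itself.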

§Tile Size

Default tile size is 64×64 (64 × 64 × 8 bytes = 32 KiB per tile at f64, which fits in most L2 caches). Configurable via TiledMatmul::with_tile_size().
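The per-tile figure can be checked with a short calculation. The helpers tile_bytes and working_set_bytes below are hypothetical names introduced for illustration, not part of the module. The second helper assumes that a tiled multiply keeps one tile each of the two inputs and the output resident at a time, which is a common rule of thumb and not something this documentation states.

```rust
use std::mem::size_of;

/// Hypothetical helper: bytes occupied by one square `tile`×`tile` block of f64.
pub fn tile_bytes(tile: usize) -> usize {
    tile * tile * size_of::<f64>()
}

/// Hypothetical helper: bytes for three resident tiles (A, B and C).
/// The factor of three is an assumption about the working set.
pub fn working_set_bytes(tile: usize) -> usize {
    3 * tile_bytes(tile)
}
```

For the default of 64, tile_bytes gives 32,768 bytes (32 KiB) and working_set_bytes gives 98,304 bytes (96 KiB). When choosing a custom tile size, comparing the three-tile figure against the target L2 capacity is a reasonable starting point.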

Structs§

TiledMatmul
Tiled matrix multiplication engine.