
Module tiling


Tiling Compute Blocks (TCB) - Work Partitioning for High-Performance Kernels

TCBs are the fundamental unit of work partitioning within ComputeBrick kernels. While a ComputeBrick defines a logical operation (e.g., Q4_K MatMul), a TCB defines the physical execution strategy: how data is partitioned across the memory hierarchy.

§Architecture

Tiling occurs at three levels:

  1. Macro-Tile (L3/Global Memory): Partitioning across CPU sockets or GPU SMs
  2. Midi-Tile (L2/Shared Memory): Partitioning within a thread block or Rayon task
  3. Micro-Tile (Registers): Smallest unit processed by SIMD or CUDA warps
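
The three levels above can be pictured as a nested loop in which each level subdivides the range handed down by the level before it. The following is a minimal sketch along a single dimension, not this module's implementation: `TileSizes` and `for_each_micro_tile` are hypothetical names introduced only for illustration, and the real `TcbGeometry` and `TcbIndexCalculator` types may be organized differently.

```rust
/// Hypothetical per-level tile extents along one dimension, in elements.
/// All three sizes must be non-zero.
#[derive(Clone, Copy)]
struct TileSizes {
    macro_tile: usize, // L3 / global memory level
    midi_tile: usize,  // L2 / shared memory level
    micro_tile: usize, // register level
}

/// Walks `0..n` macro -> midi -> micro and calls `visit(start, len)`
/// once per micro-tile. Edge tiles are clipped to their parent tile,
/// so every element is visited exactly once.
fn for_each_micro_tile(n: usize, t: TileSizes, mut visit: impl FnMut(usize, usize)) {
    for macro_start in (0..n).step_by(t.macro_tile) {
        let macro_end = (macro_start + t.macro_tile).min(n);
        for midi_start in (macro_start..macro_end).step_by(t.midi_tile) {
            let midi_end = (midi_start + t.midi_tile).min(macro_end);
            for micro_start in (midi_start..midi_end).step_by(t.micro_tile) {
                let micro_end = (micro_start + t.micro_tile).min(midi_end);
                visit(micro_start, micro_end - micro_start);
            }
        }
    }
}
```

Clipping each tile to its parent rather than to `n` keeps a micro-tile from straddling two midi-tiles, which is what lets each level map onto one tier of the memory hierarchy.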

§Modules

  • geometry - TcbGeometry dimensions and level definitions
  • config - TilingConfig and backend selection
  • calculator - TcbIndexCalculator for index computation
  • packing - Memory layout packing utilities
  • prefetch - Prefetch locality hints
  • q4k_matvec - Q4_K quantized matrix-vector tiling
  • error - TilingError types

Structs§

TcbGeometry
Dimensions for a Tiling Compute Block
TcbIndexCalculator
Index calculator for hierarchical tiling
TiledQ4KMatvec
Tiled Q4_K MatVec executor
TilingConfig
Complete tiling configuration for a kernel
TilingStats
Statistics for a tiled operation

Enums§

PackingLayout
Memory layout for packed matrices
PrefetchLocality
Prefetch locality hint
TcbLevel
Tiling hierarchy level
TilingBackend
Backend target for tiling configuration
TilingError
Tiling configuration errors

Constants§

Q4K_SUPERBLOCK_BYTES
Q4K_SUPERBLOCK_SIZE
Q4_K superblock constants (per GGML specification)
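
This page does not list the values of these constants. In GGML's `block_q4_K` layout, a superblock covers 256 weights and occupies 144 bytes, so the two constants presumably hold those values; treat that as an assumption and check the source. The sketch below rebuilds the byte count from the GGML field sizes using hypothetical constant names, and shows how a row size might be derived from it.

```rust
// Hypothetical names mirroring the fields of GGML's `block_q4_K`;
// they are not this crate's constants.
const WEIGHTS_PER_SUPERBLOCK: usize = 256; // GGML QK_K
const D_BYTES: usize = 2; // f16 super-scale applied to the sub-block scales
const DMIN_BYTES: usize = 2; // f16 super-scale applied to the sub-block mins
const SCALES_BYTES: usize = 12; // 8 scales + 8 mins, 6 bits each = 96 bits
const QS_BYTES: usize = WEIGHTS_PER_SUPERBLOCK / 2; // two 4-bit quants per byte
const SUPERBLOCK_BYTES: usize = D_BYTES + DMIN_BYTES + SCALES_BYTES + QS_BYTES;

/// Bytes occupied by one quantized row of `cols` weights, or `None`
/// when `cols` is not a whole number of superblocks.
fn q4k_row_bytes(cols: usize) -> Option<usize> {
    if cols % WEIGHTS_PER_SUPERBLOCK != 0 {
        return None;
    }
    Some(cols / WEIGHTS_PER_SUPERBLOCK * SUPERBLOCK_BYTES)
}
```

At 144 bytes per 256 weights, the format costs 4.5 bits per weight.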

Functions§

extract_scale_min_6bit
Extract 6-bit scale and min values from packed scales array
f16_to_f32
Convert 2 bytes (f16 IEEE 754) to f32
optimal_prefetch_distance
Calculate optimal prefetch distance based on tile geometry and cache level
pack_a_index
Calculate packed index for panel-major A layout
pack_b_index
Calculate packed index for panel-major B layout
swizzle_index
Apply XOR swizzling for shared memory bank conflict avoidance
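
The signatures of these helpers are not shown on this page. As an illustration of what `f16_to_f32` and `extract_scale_min_6bit` are described as doing, here is a sketch of the two underlying techniques: decoding an IEEE 754 binary16 value, and unpacking the 6-bit scale/min pairs the way GGML's `get_scale_min_k4` does. The function names, argument types, and the little-endian byte order are assumptions made for this example, not the crate's API.

```rust
/// 2^-24, the spacing of binary16 subnormals (exactly representable in f32).
const F16_SUBNORMAL_STEP: f32 = 1.0 / 16_777_216.0;

/// Decodes an IEEE 754 binary16 value stored as two little-endian bytes.
/// (Hypothetical helper; the byte order is an assumption.)
fn half_bytes_to_f32(bytes: [u8; 2]) -> f32 {
    let h = u16::from_le_bytes(bytes);
    let negative = (h & 0x8000) != 0;
    let exp = u32::from((h >> 10) & 0x1F); // 5-bit biased exponent
    let frac = u32::from(h & 0x03FF); // 10-bit fraction
    let magnitude = match exp {
        // Zero and subnormals: value = frac * 2^-24.
        0 => frac as f32 * F16_SUBNORMAL_STEP,
        // All-ones exponent: infinity or NaN.
        0x1F => {
            if frac == 0 {
                f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: rebias the exponent (127 - 15 = 112) and
        // widen the fraction from 10 to 23 bits.
        _ => f32::from_bits(((exp + 112) << 23) | (frac << 13)),
    };
    if negative {
        -magnitude
    } else {
        magnitude
    }
}

/// Unpacks the 6-bit `(scale, min)` pair for sub-block `j` (0..8) from
/// the 12-byte packed scales array, following GGML's `get_scale_min_k4`.
/// (Hypothetical helper name and signature.)
fn scale_min_6bit(scales: &[u8; 12], j: usize) -> (u8, u8) {
    assert!(j < 8, "a Q4_K superblock has 8 sub-blocks");
    if j < 4 {
        // Sub-blocks 0..4: the low 6 bits of bytes j and j + 4.
        (scales[j] & 63, scales[j + 4] & 63)
    } else {
        // Sub-blocks 4..8: the low 4 bits come from byte j + 4; the high
        // 2 bits are borrowed from the top of bytes j - 4 and j.
        let scale = (scales[j + 4] & 0x0F) | ((scales[j - 4] >> 6) << 4);
        let min = (scales[j + 4] >> 4) | ((scales[j] >> 6) << 4);
        (scale, min)
    }
}
```

In GGML's dequantization, each 4-bit quant `q` in sub-block `j` then becomes `d * scale * q - dmin * min`, where `d` and `dmin` are the two f16 values stored at the start of the superblock.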