Module cpu


CPU backend that uses Accelerate on macOS and a portable fallback on Linux. Context = (): every op executes immediately, so no batching is needed.

Structs§

CpuBackend
CpuGptqStore
CPU-side GPTQ store: dequantized f32 weights in row-major [n, k] layout. Trades memory for simplicity: the weights are repacked once at load, after which a normal GEMM runs on them.
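
The "repack once at load, then run normal GEMM" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not this crate's implementation: DenseStore, from_packed, and gemm are hypothetical names, and the packing used here (two 4-bit codes per byte, one scale and zero point per row) is a toy scheme that is simpler than real GPTQ packing.

```rust
/// Hypothetical stand-in for a dequantized store: f32 weights, row-major [n, k].
struct DenseStore {
    n: usize,
    k: usize,
    weights: Vec<f32>, // len == n * k; row j holds the k weights of output j
}

impl DenseStore {
    /// Toy load-time dequantization (assumed format, not real GPTQ):
    /// two 4-bit codes per byte, low nibble first, one scale and zero per row.
    fn from_packed(n: usize, k: usize, packed: &[u8], scales: &[f32], zeros: &[f32]) -> Self {
        assert!(k % 2 == 0, "k must be even so rows start on a byte boundary");
        assert_eq!(packed.len(), n * k / 2);
        assert_eq!(scales.len(), n);
        assert_eq!(zeros.len(), n);
        let mut weights = Vec::with_capacity(n * k);
        for row in 0..n {
            for col in 0..k {
                let byte = packed[(row * k + col) / 2];
                let code = if col % 2 == 0 { byte & 0x0F } else { byte >> 4 };
                weights.push((code as f32 - zeros[row]) * scales[row]);
            }
        }
        DenseStore { n, k, weights }
    }

    /// Plain GEMM on the dense weights: x is row-major [m, k],
    /// the result is row-major [m, n], computed as y = x * W^T.
    fn gemm(&self, x: &[f32], m: usize) -> Vec<f32> {
        assert_eq!(x.len(), m * self.k);
        let mut y = vec![0.0f32; m * self.n];
        for i in 0..m {
            let x_row = &x[i * self.k..(i + 1) * self.k];
            for j in 0..self.n {
                let w_row = &self.weights[j * self.k..(j + 1) * self.k];
                y[i * self.n + j] = x_row.iter().zip(w_row).map(|(a, b)| a * b).sum::<f32>();
            }
        }
        y
    }
}
```

After from_packed returns, the packed bytes are no longer needed; every later gemm call reads only the f32 buffer, which is the memory-for-simplicity trade described above.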

Enums§

CpuQuantStore
CPU-side container for any GGUF k-quant flavour. Each variant holds dense fp32 weights that were eagerly dequantized. The CPU is not the benchmark target, so this backend does not pay for the complexity of on-the-fly dequantization; the variant tag exists so that gemm_quant can route consistently.
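
A minimal sketch of how a tagged container with dense payloads might be routed. Everything here is assumed for illustration: the variant names (borrowed from the GGUF Q4_K and Q6_K types), the Dense payload type, and the signature given to gemm_quant are hypothetical and may not match the real items.

```rust
/// Hypothetical payload: dense fp32 weights, row-major [n, k].
struct Dense {
    n: usize,
    k: usize,
    weights: Vec<f32>,
}

/// Hypothetical variants; the real enum may carry different ones.
enum QuantStore {
    Q4K(Dense),
    Q6K(Dense),
}

impl QuantStore {
    /// Every variant holds the same kind of dense payload on the CPU.
    fn dense(&self) -> &Dense {
        match self {
            QuantStore::Q4K(d) | QuantStore::Q6K(d) => d,
        }
    }

    /// The tag survives dequantization, so callers can still tell formats apart.
    fn tag(&self) -> &'static str {
        match self {
            QuantStore::Q4K(_) => "Q4_K",
            QuantStore::Q6K(_) => "Q6_K",
        }
    }
}

/// Assumed signature: x is row-major [m, k], result is row-major [m, n].
/// On the CPU every variant funnels into the same dense GEMM.
fn gemm_quant(store: &QuantStore, x: &[f32], m: usize) -> Vec<f32> {
    let d = store.dense();
    assert_eq!(x.len(), m * d.k);
    let mut y = vec![0.0f32; m * d.n];
    for i in 0..m {
        for j in 0..d.n {
            let mut acc = 0.0f32;
            for p in 0..d.k {
                acc += x[i * d.k + p] * d.weights[j * d.k + p];
            }
            y[i * d.n + j] = acc;
        }
    }
    y
}
```

In this sketch the match on the variant is trivial because both arms reach the same dense path; a backend that dequantizes on the fly would instead dispatch to a different kernel per variant.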