GPTQ linear projection — thin factory wrapper.
Phase 3e/2: the actual kernel dispatch lives inside the boxed
Linear<B> returned by B::load_gptq (CudaMarlinLinear on
CUDA, CpuGptqLinear on CPU). This module only re-exposes the
historical constructor names so existing call sites don’t have to change.
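The delegation described above can be illustrated with a minimal, self-contained sketch. It is not this crate's code: the `Backend` and `Linear` traits, the `scale` parameter, the `kernel_name` method, and the toy "multiply by a scale" forward pass are all assumptions made up for illustration. Only the names `load_gptq`, `CpuGptqLinear`, and `GptqLinear` are borrowed from the description, and their real signatures may differ.

```rust
// Hypothetical stand-in for the crate's `Linear<B>`: a layer tied to backend `B`.
trait Linear<B: Backend> {
    fn forward(&self, x: &[f32]) -> Vec<f32>;
    fn kernel_name(&self) -> &'static str;
}

// Hypothetical backend trait: `load_gptq` chooses the concrete kernel type
// and hands it back boxed, so callers never name that type directly.
trait Backend: Sized + 'static {
    fn load_gptq(scale: f32) -> Box<dyn Linear<Self>>;
}

// Toy CPU backend. A real implementation would dequantize packed GPTQ weights;
// here the "weights" are a single scale factor (an assumption for brevity).
struct Cpu;

struct CpuGptqLinear {
    scale: f32,
}

impl Linear<Cpu> for CpuGptqLinear {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        x.iter().map(|v| v * self.scale).collect()
    }

    fn kernel_name(&self) -> &'static str {
        "CpuGptqLinear"
    }
}

impl Backend for Cpu {
    fn load_gptq(scale: f32) -> Box<dyn Linear<Self>> {
        Box::new(CpuGptqLinear { scale })
    }
}

// Thin factory wrapper: keeps the historical constructor name and forwards
// everything to whatever the backend returned.
struct GptqLinear<B: Backend> {
    inner: Box<dyn Linear<B>>,
}

impl<B: Backend> GptqLinear<B> {
    fn new(scale: f32) -> Self {
        Self { inner: B::load_gptq(scale) }
    }

    fn forward(&self, x: &[f32]) -> Vec<f32> {
        self.inner.forward(x)
    }

    fn kernel_name(&self) -> &'static str {
        self.inner.kernel_name()
    }
}
```

In this sketch, `GptqLinear::<Cpu>::new(2.0)` builds a wrapper whose `forward` call is served by the boxed `CpuGptqLinear`. Supporting another backend would only mean adding another `Backend` impl, with call sites left untouched.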
Structs§
- GptqLinear - GPTQ-format Linear projection, polymorphic over backend.
- StackedExpertLinear - View into a single column-slice of a shared stacked GPTQ store.