Module gptq

GPTQ linear projection — thin factory wrapper.

Phase 3e/2: the actual kernel dispatch lives inside the boxed Linear<B> returned by B::load_gptq (CudaMarlinLinear on CUDA, CpuGptqLinear on CPU). This module only re-exposes the historical constructor names, so existing call sites keep working unchanged.
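
The wrapper pattern described above can be sketched roughly as follows. This is an illustration only, not the crate's code: the trait shapes, the `load_gptq` signature, the `Cpu` marker type, the `kernel_name` method, and the dense `f32` weights (standing in for real packed GPTQ tensors) are all assumptions. Only the names `Linear<B>`, `B::load_gptq`, `CpuGptqLinear`, and `GptqLinear` come from the description.

```rust
/// Hypothetical stand-in for the crate's `Linear<B>` trait.
trait Linear<B: Backend> {
    fn forward(&self, x: &[f32]) -> Vec<f32>;
    fn kernel_name(&self) -> &'static str;
}

/// Hypothetical backend trait; the real `load_gptq` signature is not shown in the docs.
trait Backend: Sized {
    fn load_gptq(
        weights: Vec<f32>,
        in_features: usize,
        out_features: usize,
    ) -> Box<dyn Linear<Self>>;
}

/// Assumed marker type for the CPU backend.
struct Cpu;

/// Toy CPU layer: dense row-major `[out_features][in_features]` weights
/// instead of quantized GPTQ data.
struct CpuGptqLinear {
    weights: Vec<f32>,
    in_features: usize,
    out_features: usize,
}

impl Linear<Cpu> for CpuGptqLinear {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        (0..self.out_features)
            .map(|o| {
                let row = &self.weights[o * self.in_features..(o + 1) * self.in_features];
                row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f32>()
            })
            .collect()
    }

    fn kernel_name(&self) -> &'static str {
        "cpu-gptq"
    }
}

impl Backend for Cpu {
    fn load_gptq(
        weights: Vec<f32>,
        in_features: usize,
        out_features: usize,
    ) -> Box<dyn Linear<Self>> {
        // The backend picks the concrete kernel; callers never name it.
        Box::new(CpuGptqLinear { weights, in_features, out_features })
    }
}

/// Thin factory wrapper: keeps the historical constructor name and
/// forwards everything to the boxed layer chosen by the backend.
struct GptqLinear<B: Backend> {
    inner: Box<dyn Linear<B>>,
}

impl<B: Backend> GptqLinear<B> {
    fn new(weights: Vec<f32>, in_features: usize, out_features: usize) -> Self {
        Self { inner: B::load_gptq(weights, in_features, out_features) }
    }

    fn forward(&self, x: &[f32]) -> Vec<f32> {
        self.inner.forward(x)
    }

    fn kernel_name(&self) -> &'static str {
        self.inner.kernel_name()
    }
}
```

Under these assumptions, a caller would write `GptqLinear::<Cpu>::new(...)` and never mention `CpuGptqLinear` directly; swapping the type parameter is the only change needed to target another backend.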

Structs

GptqLinear
GPTQ-format Linear projection, polymorphic over backend.
StackedExpertLinear
View into a single column-slice of a shared stacked GPTQ store.
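
To make the idea of a column-slice view over a shared store concrete, here is a minimal sketch. It is not the crate's implementation: the `StackedStore` type, its dense row-major `f32` layout, the equal-width-per-expert assumption, and the constructor and `forward` signatures are all hypothetical. Only the struct name `StackedExpertLinear` and the notion of a view into a shared stacked store come from the description.

```rust
use std::sync::Arc;

/// Hypothetical shared store: one dense row-major matrix of shape
/// `[rows][cols]` holding every expert's columns side by side.
/// (A real GPTQ store would hold packed quantized data.)
struct StackedStore {
    data: Vec<f32>,
    rows: usize,
    cols: usize,
}

/// View into one expert's columns; it owns no weights, only a
/// reference-counted handle to the store plus a column range.
struct StackedExpertLinear {
    store: Arc<StackedStore>,
    col_start: usize,
    col_len: usize,
}

impl StackedExpertLinear {
    /// Assumes every expert occupies `col_len` consecutive columns.
    /// Returns `None` if the requested slice falls outside the store.
    fn new(store: Arc<StackedStore>, expert: usize, col_len: usize) -> Option<Self> {
        let col_start = expert.checked_mul(col_len)?;
        if col_len == 0 || col_start + col_len > store.cols {
            return None;
        }
        Some(Self { store, col_start, col_len })
    }

    /// Computes `x * W[:, col_start..col_start + col_len]` without copying weights.
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.store.rows, "input length must match store rows");
        let mut out = vec![0.0f32; self.col_len];
        for (r, xr) in x.iter().enumerate() {
            let base = r * self.store.cols + self.col_start;
            let row = &self.store.data[base..base + self.col_len];
            for (o, w) in out.iter_mut().zip(row) {
                *o += xr * w;
            }
        }
        out
    }
}
```

The point of the design, under these assumptions, is that creating a view per expert costs one `Arc` clone and two integers rather than a copy of the expert's weights.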