pub trait QuantLlmBackend:
    LlmBackend
    + BackendQuantMarlin
    + BackendQuantGguf { }
An LLM backend that also supports loading quantized weights (GPTQ Marlin
on CUDA; GGUF k-quant on Metal). Required by models that hold
Box<dyn Linear<B>> values, where the Linear implementation may be a quantized variant.
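The following is a minimal sketch of how a model might use this bound. It assumes simplified stand-ins for the three supertraits and for Linear; `ToyBackend`, `DenseLinear`, `QuantLinear`, `Model`, and the `name`/`describe` methods are all hypothetical and are not part of the crate's API. The sketch only illustrates the pattern described above: the model is generic over `B: QuantLlmBackend`, so any layer stored as Box<dyn Linear<B>> may be either a dense or a quantized implementation.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins for the real supertraits; the crate's actual
// definitions are not shown in this page and will differ.
trait LlmBackend: 'static {
    fn name() -> &'static str;
}
trait BackendQuantMarlin {}
trait BackendQuantGguf {}

// Same shape as the documented trait: a bundle of supertraits, no own items.
trait QuantLlmBackend: LlmBackend + BackendQuantMarlin + BackendQuantGguf {}

// Assumed, simplified layer trait. It must itself be dyn compatible so that
// it can be stored as Box<dyn Linear<B>>.
trait Linear<B: LlmBackend> {
    fn describe(&self) -> String;
}

struct DenseLinear<B>(PhantomData<B>);
struct QuantLinear<B>(PhantomData<B>);

impl<B: LlmBackend> Linear<B> for DenseLinear<B> {
    fn describe(&self) -> String {
        format!("dense on {}", B::name())
    }
}

// The quantized layer is only available when the backend supports quantization.
impl<B: QuantLlmBackend> Linear<B> for QuantLinear<B> {
    fn describe(&self) -> String {
        format!("quantized on {}", B::name())
    }
}

// The model requires QuantLlmBackend because a boxed layer might be quantized.
struct Model<B: QuantLlmBackend> {
    layers: Vec<Box<dyn Linear<B>>>,
}

impl<B: QuantLlmBackend> Model<B> {
    fn new(quantized: bool) -> Self {
        let layer: Box<dyn Linear<B>> = if quantized {
            Box::new(QuantLinear::<B>(PhantomData))
        } else {
            Box::new(DenseLinear::<B>(PhantomData))
        };
        Model { layers: vec![layer] }
    }

    fn describe(&self) -> Vec<String> {
        self.layers.iter().map(|l| l.describe()).collect()
    }
}

// A toy backend that opts in to every supertrait.
struct ToyBackend;
impl LlmBackend for ToyBackend {
    fn name() -> &'static str {
        "toy"
    }
}
impl BackendQuantMarlin for ToyBackend {}
impl BackendQuantGguf for ToyBackend {}
impl QuantLlmBackend for ToyBackend {}
```

In this sketch the backend is always named as a type parameter (for example `Model::<ToyBackend>::new(true)`), never as `dyn QuantLlmBackend`; only the layers are held as trait objects.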
Dyn Compatibility
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety".