Trait QuantLlmBackend 

pub trait QuantLlmBackend:
    LlmBackend
    + BackendQuantMarlin
    + BackendQuantGguf { }

An LLM backend that also supports loading quantized weights (GPTQ Marlin for CUDA; GGUF k-quant for Metal). Required by models that hold a Box<dyn Linear<B>> whose Linear implementation may be a quantized variant.
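As a rough illustration of how such a bound might be used, the sketch below mirrors the shape of the trait declaration with stand-in definitions. Only the four trait names and the Box<dyn Linear<B>> field come from this page; the trait bodies, DenseLinear, QuantLinear, FakeBackend, Model, build_model, and describe_model are hypothetical and not part of the crate's documented API.

```rust
// Stand-in supertraits: the real definitions live in the crate and are not
// shown on this page, so these bodies are assumptions.
trait LlmBackend {
    fn name(&self) -> &'static str;
}
trait BackendQuantMarlin {}
trait BackendQuantGguf {}

// Same shape as the documented trait: supertraits only, no items of its own.
trait QuantLlmBackend: LlmBackend + BackendQuantMarlin + BackendQuantGguf {}

// Hypothetical layer trait, generic over the backend.
trait Linear<B> {
    fn describe(&self, backend: &B) -> String;
}

struct DenseLinear;
struct QuantLinear;

// A dense layer only needs the base backend.
impl<B: LlmBackend> Linear<B> for DenseLinear {
    fn describe(&self, backend: &B) -> String {
        format!("dense on {}", backend.name())
    }
}

// A quantized layer needs the stronger bound.
impl<B: QuantLlmBackend> Linear<B> for QuantLinear {
    fn describe(&self, backend: &B) -> String {
        format!("quantized on {}", backend.name())
    }
}

// Because any boxed layer may be a quant variant, the model asks for
// QuantLlmBackend up front.
struct Model<B: QuantLlmBackend> {
    layers: Vec<Box<dyn Linear<B>>>,
}

fn build_model<B: QuantLlmBackend>() -> Model<B> {
    let mut layers: Vec<Box<dyn Linear<B>>> = Vec::new();
    layers.push(Box::new(DenseLinear));
    layers.push(Box::new(QuantLinear));
    Model { layers }
}

fn describe_model<B: QuantLlmBackend>(model: &Model<B>, backend: &B) -> Vec<String> {
    model.layers.iter().map(|layer| layer.describe(backend)).collect()
}

// Hypothetical backend that opts in to every supertrait.
struct FakeBackend;
impl LlmBackend for FakeBackend {
    fn name(&self) -> &'static str {
        "fake"
    }
}
impl BackendQuantMarlin for FakeBackend {}
impl BackendQuantGguf for FakeBackend {}
impl QuantLlmBackend for FakeBackend {}

fn main() {
    let backend = FakeBackend;
    let model: Model<FakeBackend> = build_model();
    for line in describe_model(&model, &backend) {
        println!("{line}");
    }
}
```

In this sketch, the single QuantLlmBackend bound on Model stands in for listing all three supertraits wherever a model is declared.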

Dyn Compatibility

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety".
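In practice this means the trait can appear as a generic bound but not as dyn QuantLlmBackend. This page does not say which supertrait item rules out trait objects; the sketch below assumes, purely for illustration, an associated function without a self receiver, which is one common cause. All trait bodies, FakeBackend, and report are hypothetical.

```rust
// Assumption for illustration: an associated function with no `self`
// receiver makes this stand-in supertrait (and anything that inherits
// from it) not dyn compatible.
trait LlmBackend {
    fn device_name() -> &'static str;
}
trait BackendQuantMarlin {}
trait BackendQuantGguf {}
trait QuantLlmBackend: LlmBackend + BackendQuantMarlin + BackendQuantGguf {}

struct FakeBackend;
impl LlmBackend for FakeBackend {
    fn device_name() -> &'static str {
        "fake"
    }
}
impl BackendQuantMarlin for FakeBackend {}
impl BackendQuantGguf for FakeBackend {}
impl QuantLlmBackend for FakeBackend {}

// Static dispatch: B is resolved at compile time, so no trait object is needed.
fn report<B: QuantLlmBackend>() -> String {
    format!("quant backend on {}", B::device_name())
}

// With these stand-in definitions, the following would be rejected (E0038):
// fn report_dyn(_backend: &dyn QuantLlmBackend) {}

fn main() {
    println!("{}", report::<FakeBackend>());
}
```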

Implementors