pub struct QuantLinear<B: Backend + BackendQuantGguf> { /* private fields */ }
Linear projection backed by a GGUF k-quant weight.
forward() is a tail-call to the inner backend-specific Linear
(Metal: MetalGgufLinear, CPU: CpuGgufLinear). LTO inlines through
the dispatch.
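The delegation described above can be sketched with a backend trait that names its own linear type. This is a minimal illustration only: `LinearSketch`, `BackendSketch`, `CpuSketch`, `CpuToyLinear`, and `QuantLinearSketch` are hypothetical stand-ins, not the crate's real traits or types, and the toy CPU layer uses dense f32 weights rather than k-quant blocks.

```rust
/// Hypothetical stand-in for the crate's `Linear` trait.
trait LinearSketch {
    fn forward(&self, x: &[f32]) -> Vec<f32>;
}

/// Hypothetical stand-in for a backend that picks its own GGUF linear type.
trait BackendSketch {
    type GgufLinear: LinearSketch;
}

/// Marker type for a CPU backend (assumption for illustration).
struct CpuSketch;

/// Toy CPU layer: a dense row-major f32 matrix instead of quantized blocks.
struct CpuToyLinear {
    rows: usize,
    cols: usize,
    weight: Vec<f32>,
}

impl LinearSketch for CpuToyLinear {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.cols, "input length must equal cols");
        (0..self.rows)
            .map(|r| {
                // Dot product of row `r` with the input vector.
                self.weight[r * self.cols..(r + 1) * self.cols]
                    .iter()
                    .zip(x)
                    .map(|(w, v)| w * v)
                    .sum::<f32>()
            })
            .collect()
    }
}

impl BackendSketch for CpuSketch {
    type GgufLinear = CpuToyLinear;
}

/// Wrapper that is generic over the backend and owns the backend's layer.
struct QuantLinearSketch<B: BackendSketch> {
    inner: B::GgufLinear,
}

impl<B: BackendSketch> LinearSketch for QuantLinearSketch<B> {
    #[inline]
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        // The call is the last expression and targets a concrete type once
        // `B` is known, so the compiler is free to inline it.
        self.inner.forward(x)
    }
}
```

Because the inner type is an associated type rather than a trait object, the call is resolved statically for each backend, which is what makes inlining across the wrapper possible.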
Implementations
impl<B: Backend + BackendQuantGguf> QuantLinear<B>
pub fn from_gguf_bytes(
    kind: GgufQuantType,
    bytes: &[u8],
    out_features: usize,
    in_features: usize,
) -> Result<Self>
Build from raw GGUF block bytes.
kind: which k-quant flavour the bytes encode (Q4_K, Q5_K, …).
bytes: the on-disk payload, sized by the kind’s block layout.
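To make "sized by the kind's block layout" concrete, the sketch below computes the payload length a caller might expect to pass. It assumes the ggml reference layout (256-element super-blocks; 144, 176, and 210 bytes per block for Q4_K, Q5_K, and Q6_K) and a row-major `out_features × in_features` weight. `KQuant` and `expected_len` are hypothetical names, and whether `from_gguf_bytes` performs exactly this check is an assumption.

```rust
/// Hypothetical stand-in for the crate's `GgufQuantType`; only three
/// k-quant kinds are listed here.
#[derive(Clone, Copy, Debug, PartialEq)]
enum KQuant {
    Q4K,
    Q5K,
    Q6K,
}

/// Elements per k-quant super-block in the ggml reference layout.
const SUPER_BLOCK: usize = 256;

impl KQuant {
    /// Bytes per super-block in the ggml reference layout.
    fn block_bytes(self) -> usize {
        match self {
            KQuant::Q4K => 144,
            KQuant::Q5K => 176,
            KQuant::Q6K => 210,
        }
    }
}

/// Expected payload size in bytes, or `None` if `in_features` is not a
/// multiple of the super-block size or the product overflows `usize`.
fn expected_len(kind: KQuant, out_features: usize, in_features: usize) -> Option<usize> {
    if in_features % SUPER_BLOCK != 0 {
        return None;
    }
    let blocks_per_row = in_features / SUPER_BLOCK;
    out_features
        .checked_mul(blocks_per_row)?
        .checked_mul(kind.block_bytes())
}
```

Under these assumptions a 4096 × 4096 Q4_K weight occupies 4096 × 16 × 144 = 9,437,184 bytes, so comparing `bytes.len()` against such a value before construction is a cheap sanity check.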
pub fn from_gguf_fused(
    parts: &[(GgufQuantType, &[u8], usize)],
    in_features: usize,
) -> Result<Self>
Build a fused projection from multiple (kind, bytes, rows)
parts that share in_features. Each part stays in its own
QuantStore (no byte concatenation); forward dispatches one matvec
per part. Used for Qwen3 qkv_proj when q and k are Q4_K and v is
Q6_K: with mixed kinds, the homogeneous fused-Q4 fast path would
have to fall back to eager fp32, costing 100 MB per layer.
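The per-part dispatch can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `Part` and `FusedSketch` are hypothetical, each part holds dense f32 rows in place of a quantized store, and the outputs are assumed to be concatenated in part order.

```rust
/// Hypothetical stand-in for one fused part: dense f32 rows instead of a
/// quantized store.
struct Part {
    rows: usize,
    /// Row-major weights, `rows * in_features` values.
    weight: Vec<f32>,
}

/// Hypothetical fused projection that keeps every part separate.
struct FusedSketch {
    in_features: usize,
    parts: Vec<Part>,
}

impl FusedSketch {
    /// Validates that every part matches the shared `in_features`.
    fn new(parts: Vec<Part>, in_features: usize) -> Result<Self, String> {
        if in_features == 0 {
            return Err("in_features must be non-zero".to_string());
        }
        for (i, p) in parts.iter().enumerate() {
            if p.weight.len() != p.rows * in_features {
                return Err(format!("part {}: weight length != rows * in_features", i));
            }
        }
        Ok(Self { in_features, parts })
    }

    /// Total output rows across all parts.
    fn out_features(&self) -> usize {
        self.parts.iter().map(|p| p.rows).sum()
    }

    /// One matvec per part; results are appended in part order.
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert_eq!(x.len(), self.in_features, "input length must equal in_features");
        let mut out = Vec::with_capacity(self.out_features());
        for part in &self.parts {
            for row in part.weight.chunks_exact(self.in_features) {
                out.push(row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>());
            }
        }
        out
    }
}
```

Keeping the parts separate means no part has to be converted to another part's format; the only shared quantity is `in_features`, which is why it is passed once rather than per part.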
Trait Implementations
impl<B: Backend + BackendQuantGguf> Linear<B> for QuantLinear<B>
Auto Trait Implementations
impl<B> !RefUnwindSafe for QuantLinear<B>
impl<B> !UnwindSafe for QuantLinear<B>
impl<B> Freeze for QuantLinear<B>
impl<B> Send for QuantLinear<B>
impl<B> Sync for QuantLinear<B>
impl<B> Unpin for QuantLinear<B>
impl<B> UnsafeUnpin for QuantLinear<B>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> ErasedDestructor for T
where
    T: 'static,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.