pub enum MatWeight {
F32(Vec<f32>),
Packed {
key: String,
scheme: QuantScheme,
shape: Vec<usize>,
},
}
Storage variant for matmul weight tensors. The big projections
(qkv / gate / ffn / lm_head) dominate the load footprint; the
Packed variant keeps GGUF K-quant bytes in-place so the graph
can emit Op::DequantMatMul instead of a full F32 dequant.
Norm weights, conv kernels, scalar params etc. stay as
Vec<f32> in the layer structs (their footprint is negligible
and the RmsNorm / Conv ops don’t have a packed variant).
Variants§
F32(Vec<f32>)
Already dequantized to f32, row-major [out, in]. The
builder transposes to [in, out] before issuing MatMul.
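The [out, in] to [in, out] transposition mentioned above can be sketched as follows. This is only an illustration of the index arithmetic; transpose_out_in is a hypothetical helper name, not the builder's actual code, and the real builder may transpose differently (for example on the device).

```rust
/// Transposes a row-major `[out, in]` matrix into row-major `[in, out]`.
/// `transpose_out_in` is a hypothetical helper, not the crate's builder code.
fn transpose_out_in(w: &[f32], out_dim: usize, in_dim: usize) -> Vec<f32> {
    assert_eq!(w.len(), out_dim * in_dim, "weight length must equal out * in");
    let mut t = vec![0.0f32; w.len()];
    for o in 0..out_dim {
        for i in 0..in_dim {
            // Element (o, i) of the source becomes element (i, o) of the result.
            t[i * out_dim + o] = w[o * in_dim + i];
        }
    }
    t
}
```

A naive double loop like this is fine for a one-off load-time transform; a cache-blocked version would matter only for very large projections.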
Packed
GGUF-packed K-quant metadata only. The actual bytes are
looked up in the loader at upload time via
rlx_core::weight_loader::GgufLoader::tensor_bytes_borrowed,
which eliminates the per-tensor Vec<u8> allocation that
otherwise costs ~16 GB of memcpy on Qwen3.6-27B Q4_K_M.
key is the loader-resolvable name (post-HF↔GGUF mapping);
shape is [out, in] after the safetensors-style dim
reversal.
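A minimal sketch of how a caller might branch on the two variants, under stated assumptions: the enum below is a local mirror of the documented one, QuantScheme::Q4K is a stand-in variant name, PlannedOp and plan are hypothetical, and a HashMap plays the role of the loader. The point it illustrates is that the Packed arm borrows the bytes by key instead of owning a copy.

```rust
use std::collections::HashMap;

// Stand-in for the crate's QuantScheme; the real variants are not shown here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QuantScheme {
    Q4K, // hypothetical variant name
}

// Local mirror of the documented enum, for illustration only.
enum MatWeight {
    F32(Vec<f32>),
    Packed {
        key: String,
        scheme: QuantScheme,
        shape: Vec<usize>,
    },
}

// Hypothetical description of what a graph builder could emit.
#[derive(Debug, PartialEq)]
enum PlannedOp<'a> {
    MatMul { elems: usize },
    DequantMatMul {
        scheme: QuantScheme,
        bytes: &'a [u8],
        shape: &'a [usize],
    },
}

// A HashMap stands in for the loader; packed bytes are borrowed, never copied.
fn plan<'a>(
    w: &'a MatWeight,
    loader: &'a HashMap<String, Vec<u8>>,
) -> Option<PlannedOp<'a>> {
    match w {
        MatWeight::F32(data) => Some(PlannedOp::MatMul { elems: data.len() }),
        MatWeight::Packed { key, scheme, shape } => {
            // Returns None when the key does not resolve in the loader.
            let bytes = loader.get(key)?;
            Some(PlannedOp::DequantMatMul {
                scheme: *scheme,
                bytes: bytes.as_slice(),
                shape: shape.as_slice(),
            })
        }
    }
}
```

Keeping only key, scheme, and shape in the enum means the weight struct stays small and the loader remains the single owner of the quantized bytes.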
Implementations§
impl MatWeight
pub fn len(&self) -> usize
pub fn is_empty(&self) -> bool
pub fn shape(&self) -> &[usize]
[out, in] on-disk shape. For the F32 variant the caller is
expected to track this externally (we return an empty slice).
pub fn is_packed(&self) -> bool
pub fn packed_key(&self) -> Option<&str>
Loader-resolvable key for the packed variant. None for F32.
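The documented accessor behavior can be sketched on a local mirror of the enum. This is an assumption-laden illustration, not the crate's source: QuantScheme::Q4K is a stand-in variant, and len / is_empty are left out because their semantics for the Packed variant are not described here.

```rust
// Stand-in for the crate's QuantScheme; the real variants are not shown here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum QuantScheme {
    Q4K, // hypothetical variant name
}

// Local mirror of the documented enum, for illustration only.
enum MatWeight {
    F32(Vec<f32>),
    Packed {
        key: String,
        scheme: QuantScheme,
        shape: Vec<usize>,
    },
}

impl MatWeight {
    /// `[out, in]` for Packed; empty for F32, where the caller tracks the shape.
    fn shape(&self) -> &[usize] {
        match self {
            MatWeight::F32(_) => &[],
            MatWeight::Packed { shape, .. } => shape.as_slice(),
        }
    }

    /// True only for the Packed variant.
    fn is_packed(&self) -> bool {
        matches!(self, MatWeight::Packed { .. })
    }

    /// Loader-resolvable key for Packed, None for F32.
    fn packed_key(&self) -> Option<&str> {
        match self {
            MatWeight::F32(_) => None,
            MatWeight::Packed { key, .. } => Some(key.as_str()),
        }
    }
}
```

Because shape() is empty for F32, callers that need dimensions should check is_packed() (or the slice length) before indexing into the result.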
Trait Implementations§
Auto Trait Implementations§
impl Freeze for MatWeight
impl RefUnwindSafe for MatWeight
impl Send for MatWeight
impl Sync for MatWeight
impl Unpin for MatWeight
impl UnsafeUnpin for MatWeight
impl UnwindSafe for MatWeight
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> CloneToUninit for T where
T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.