Struct GgufLoader

pub struct GgufLoader { /* private fields */ }

Implementations

pub fn mtp_layer_threshold(&self) -> Option<u32>

Returns the first blk.N index that the GGUF metadata reports as an MTP head, computed as {arch}.block_count - {arch}.nextn_predict_layers. Returns None when the nextn_predict_layers key is absent, which means either the file has no MTP heads or MTP is encoded under a different naming scheme; fall back to is_mtp_weight in that case.
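
As a worked illustration of the subtraction described above, here is a minimal sketch. The function name mtp_threshold and its plain Option<u32> parameters are hypothetical stand-ins for the two metadata lookups, and treating a missing block_count as None is an assumption; this is not the crate's implementation.

```rust
/// Hypothetical helper: computes the first MTP block index from two metadata
/// values that a caller has already read from the GGUF file.
/// `block_count` stands for `{arch}.block_count` and `nextn_predict_layers`
/// for `{arch}.nextn_predict_layers`; `None` means the key was absent.
fn mtp_threshold(block_count: Option<u32>, nextn_predict_layers: Option<u32>) -> Option<u32> {
    let total = block_count?;
    let nextn = nextn_predict_layers?;
    // checked_sub guards against malformed metadata where nextn > total.
    total.checked_sub(nextn)
}
```

With 62 blocks and one predict layer, the sketch yields Some(61), so blk.61 would be the first MTP block under these assumptions.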

pub fn file(&self) -> &GgufFile

Borrows the underlying parsed GgufFile so that callers (e.g. arch builders that read general.architecture-specific keys) don’t have to parse 800+ tensor headers a second time.

pub fn tensor_bytes_borrowed(&self, key: &str) -> Option<&[u8]>

Borrow the raw on-disk byte slice for a tensor without marking it taken. Returns None if the key doesn’t resolve or the byte range is invalid. Used by the qwen35 packed-upload path to stream K-quant bytes from mmap straight into the compiled arena, skipping a per-tensor Vec<u8> allocation (≈ 16 GB on Qwen3.6-27B Q4_K_M).

pub fn take_packed_metadata( &mut self, key: &str, ) -> Result<Option<(QuantScheme, Vec<usize>)>, Error>

Variant of Self::take_packed that returns only the (scheme, shape) metadata without copying bytes. The caller uploads the bytes separately via Self::tensor_bytes_borrowed after the graph is compiled, which eliminates the per-tensor Vec<u8> allocation. Marks the tensor taken on success; returns Ok(None) for non-K-quant dtypes so the caller can fall back to the dequant path.
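
The two-phase flow described above (metadata first, bytes later) can be sketched as follows. MockLoader and upload are hypothetical stand-ins written for this illustration: the mock drops the QuantScheme and the Result wrapper of the real signatures and models a tensor as a (is_k_quant, shape, bytes) tuple.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the loader: tensor name -> (is_k_quant, shape, bytes).
struct MockLoader {
    tensors: HashMap<String, (bool, Vec<usize>, Vec<u8>)>,
    taken: HashSet<String>,
}

impl MockLoader {
    /// Mirrors the documented contract of `take_packed_metadata` in simplified
    /// form: returns the shape and marks the tensor taken for K-quant tensors,
    /// and returns `None` (without marking) otherwise.
    fn take_packed_metadata(&mut self, key: &str) -> Option<Vec<usize>> {
        let (is_k_quant, shape, _) = self.tensors.get(key)?;
        if !*is_k_quant {
            return None;
        }
        let shape = shape.clone();
        self.taken.insert(key.to_string());
        Some(shape)
    }

    /// Mirrors `tensor_bytes_borrowed`: a borrowed slice, no copy, no state change.
    fn tensor_bytes_borrowed(&self, key: &str) -> Option<&[u8]> {
        self.tensors.get(key).map(|(_, _, bytes)| bytes.as_slice())
    }
}

/// Phase 1 records metadata before graph compilation; phase 2 copies the
/// borrowed bytes into an arena afterwards, with no intermediate Vec<u8>.
fn upload(loader: &mut MockLoader, key: &str, arena: &mut Vec<u8>) -> Option<Vec<usize>> {
    let shape = loader.take_packed_metadata(key)?;
    // ... graph compilation would happen here in a real builder ...
    let bytes = loader.tensor_bytes_borrowed(key)?;
    arena.extend_from_slice(bytes);
    Some(shape)
}
```

Note that the borrow in phase 2 still succeeds after the tensor has been marked taken, which is what makes the split workable.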

pub fn is_mtp_tensor(&self, name: &str) -> bool

Returns true if name is an MTP weight under this file’s naming scheme. Combines the substring heuristic (is_mtp_weight) with the model-aware check that the name is blk.N with N >= the threshold from mtp_layer_threshold.
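
A minimal sketch of how the two checks might combine, assuming they are OR-ed together. The helper names block_index and looks_like_mtp are hypothetical, and the substrings matched by looks_like_mtp are placeholders; the real is_mtp_weight heuristic may match different patterns.

```rust
/// Extracts N from a tensor name of the form `blk.N.<rest>`.
fn block_index(name: &str) -> Option<u32> {
    let rest = name.strip_prefix("blk.")?;
    let (index, _) = rest.split_once('.')?;
    index.parse().ok()
}

/// Assumed substring heuristic (placeholder patterns, not the crate's list).
fn looks_like_mtp(name: &str) -> bool {
    name.contains("nextn") || name.contains("mtp")
}

/// Combined check: either the name matches the heuristic, or it sits in a
/// block at or past the threshold. With no threshold, only the heuristic applies.
fn is_mtp_tensor(name: &str, threshold: Option<u32>) -> bool {
    let in_mtp_block = match (block_index(name), threshold) {
        (Some(n), Some(t)) => n >= t,
        _ => false,
    };
    looks_like_mtp(name) || in_mtp_block
}
```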

pub fn include_mtp(&mut self, include: bool) -> &mut GgufLoader

Toggles MTP-weight visibility. With include = true, MTP heads show up in remaining_keys() and count toward len(), so drain-style consumers like Qwen3Generator::from_loader will pull them into the weights cache. The default is off, so non-MTP models behave exactly as before. Call this before any take() or drain so that the inclusion choice is consistent across the load.
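
The visibility semantics can be pictured with a small model. MiniLoader is a hypothetical miniature written for this illustration, and its contains("nextn") test is an assumed placeholder for the real MTP check.

```rust
/// Hypothetical miniature of the visibility toggle; not the crate's type.
struct MiniLoader {
    names: Vec<String>,
    include_mtp: bool, // starts as false, matching the documented default
}

impl MiniLoader {
    fn new(names: &[&str]) -> Self {
        MiniLoader {
            names: names.iter().map(|n| n.to_string()).collect(),
            include_mtp: false,
        }
    }

    /// Returns `&mut Self` so the call can be chained, like the documented signature.
    fn include_mtp(&mut self, include: bool) -> &mut Self {
        self.include_mtp = include;
        self
    }

    /// MTP names are hidden unless the toggle is on.
    fn remaining_keys(&self) -> Vec<String> {
        self.names
            .iter()
            .filter(|n| self.include_mtp || !n.contains("nextn"))
            .cloned()
            .collect()
    }

    /// Counts only the currently visible names.
    fn len(&self) -> usize {
        self.remaining_keys().len()
    }
}
```

Because both remaining_keys and len read the same flag, flipping it midway through a load would change what a drain sees, which is why the toggle belongs before the first take.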

pub fn take_packed( &mut self, key: &str, ) -> Result<Option<(Vec<u8>, QuantScheme, Vec<usize>)>, Error>

Take a tensor’s packed bytes (no dequant), plus its rlx_ir::quant::QuantScheme and safetensors-style shape. Returns Ok(None) when the tensor is stored uncompressed (F32/F16/BF16); the caller should fall back to take() for those.

Used by the qwen3 builder’s packed-weights mode: the LM head + per-layer matmul weights stay in the arena as raw K-quant bytes, and the graph emits Op::DequantMatMul { scheme } instead of Op::MatMul for them. Cuts the load-time memory footprint by ~7-9× on Q4_K_M / Q6_K models — the unblocker for ≥14 B Qwen3 / Llama GGUFs on commodity Macs.
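
The packed-or-dense fallback described above can be sketched as below. MockLoader, Stored, Weight, and load_weight are hypothetical names for this illustration; the mock omits the QuantScheme and shape, uses String as its error type, and does not model dequantization.

```rust
use std::collections::HashMap;

/// Hypothetical stored tensor: raw quantized bytes or plain f32 data.
enum Stored {
    Quantized(Vec<u8>),
    Float(Vec<f32>),
}

/// What a builder ends up holding for one weight.
#[derive(Debug, PartialEq)]
enum Weight {
    Packed(Vec<u8>), // would feed a dequantizing matmul
    Dense(Vec<f32>), // would feed a regular matmul
}

struct MockLoader {
    tensors: HashMap<String, Stored>,
}

impl MockLoader {
    /// Simplified `take_packed`: `Ok(None)` signals an uncompressed tensor.
    fn take_packed(&mut self, key: &str) -> Result<Option<Vec<u8>>, String> {
        match self.tensors.remove(key) {
            None => Err(format!("missing tensor: {}", key)),
            Some(Stored::Quantized(bytes)) => Ok(Some(bytes)),
            Some(float) => {
                // Not packed: put it back so `take` can still find it.
                self.tensors.insert(key.to_string(), float);
                Ok(None)
            }
        }
    }

    /// Simplified `take`: only the uncompressed case is modeled here.
    fn take(&mut self, key: &str) -> Result<Vec<f32>, String> {
        match self.tensors.remove(key) {
            Some(Stored::Float(data)) => Ok(data),
            Some(Stored::Quantized(_)) => Err(format!("dequant not modeled: {}", key)),
            None => Err(format!("missing tensor: {}", key)),
        }
    }
}

/// Try the packed path first; fall back to `take` when it reports `None`.
fn load_weight(loader: &mut MockLoader, key: &str) -> Result<Weight, String> {
    match loader.take_packed(key)? {
        Some(bytes) => Ok(Weight::Packed(bytes)),
        None => loader.take(key).map(Weight::Dense),
    }
}
```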

pub fn take_mtp(&mut self, key: &str) -> Result<(Vec<f32>, Vec<usize>), Error>

Take a single MTP weight by name. Bypasses the include_mtp filter so callers can grab specific heads without flipping the global visibility. Returns an error if the name isn’t a recognized MTP weight (use take for non-MTP keys).

impl GgufLoader

pub fn mtp_keys(&self) -> Vec<String>

Tensor names that look like MTP heads under this file’s scheme (combines the substring heuristic with the model-aware blk.N where N >= threshold check; see is_mtp_tensor). The result is not subject to the filtering that remaining_keys applies, so consumers that want to wire up MTP can find these tensors explicitly.

Trait Implementations

impl WeightLoader for GgufLoader

fn take_transposed( &mut self, key: &str, ) -> Result<(Vec<f32>, Vec<usize>), Error>

BREAKING CHANGE in 0.2.0: before 0.2.0 this method was a no-op for GGUF. It returned the bytes unchanged with the GGUF shape label, which silently produced garbage logits when the builder expected [in, out] row-major data. From 0.2.0 onwards, take normalizes GGUF’s reverse-shape convention, so this method matches the safetensors variant byte-for-byte. Downstream code that worked around the old behavior by manually re-transposing the result must drop that workaround.
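
To make the shape convention concrete, here is a minimal sketch of the two operations involved. Both function names are hypothetical, and whether the loader performs exactly these steps internally is an assumption; the sketch only shows what reversing a GGUF dimension list and transposing a row-major matrix mean.

```rust
/// GGUF lists dimensions fastest-varying first, so reversing the dimension
/// list yields the row-major shape without touching the data.
fn row_major_shape(gguf_dims: &[usize]) -> Vec<usize> {
    gguf_dims.iter().rev().copied().collect()
}

/// Transposes a row-major `rows x cols` matrix into a row-major `cols x rows` one.
fn transpose(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols, "data length must equal rows * cols");
    let mut out = vec![0.0; data.len()];
    for r in 0..rows {
        for c in 0..cols {
            // Element (r, c) of the input becomes element (c, r) of the output.
            out[c * rows + r] = data[r * cols + c];
        }
    }
    out
}
```

Applying transpose twice returns the original data, which is why a leftover manual re-transpose in downstream code would undo the corrected behavior.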

fn format_id(&self) -> &'static str

Format id (safetensors, gguf, or a custom registration).

fn arch_hint(&self) -> Option<&str>

Architecture name from the underlying file (general.architecture for GGUF, None for safetensors). Drain-style consumers use this to pick an arch-specific reverse name mapping when the canonical HF name depends on the model family (e.g. Gemma 2’s 4 norms per layer don’t share the Llama 2-norm reverse alias).

fn take_packed( &mut self, key: &str, ) -> Result<Option<(Vec<u8>, QuantScheme, Vec<usize>)>, Error>

Take packed K-quant bytes when supported; default returns None.

fn tensor_bytes_borrowed(&self, key: &str) -> Option<&[u8]>

Borrow packed bytes without marking taken (GGUF mmap path).

fn len(&self) -> usize

Number of distinct weights in the file.

fn take(&mut self, key: &str) -> Result<(Vec<f32>, Vec<usize>), Error>

Take the named tensor as (f32_data, shape). Removes the tensor from the loader so callers can detect “weights I never used.”

fn remaining_keys(&self) -> Vec<String>

Names that haven’t been taken yet — useful for “did the model use every weight?” hygiene checks.
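
A hygiene check of this kind could look like the sketch below. The function check_all_weights_used is a hypothetical helper, not part of the crate; it takes whatever remaining_keys() returned after model construction.

```rust
/// Returns an error listing every weight the builder never consumed.
/// `remaining` is the list of names still left in the loader.
fn check_all_weights_used(remaining: &[String]) -> Result<(), String> {
    if remaining.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "{} unused weight(s): {}",
            remaining.len(),
            remaining.join(", ")
        ))
    }
}
```

A caller would typically run this once after the last take and either log the message or fail the load, depending on how strict the builder should be.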

fn is_empty(&self) -> bool

Auto Trait Implementations

impl Freeze for GgufLoader

impl RefUnwindSafe for GgufLoader

impl Send for GgufLoader

impl Sync for GgufLoader

impl Unpin for GgufLoader

impl UnsafeUnpin for GgufLoader

impl UnwindSafe for GgufLoader

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl GgufLoader

pub fn from_file(path: &str) -> Result<GgufLoader, Error>

pub fn architecture(&self) -> &str