Struct GgufFile

pub struct GgufFile { /* private fields */ }

A parsed GGUF file, ready for lazy tensor loading.

The file is kept open so that tensor data can be read on demand via load_tensor and load_tensor_f32.

Implementations

impl GgufFile

pub fn open(path: &Path) -> Result<Self>

Open and parse a GGUF v3 file.

This reads the full header (magic, version, tensor count, metadata KV pairs, tensor info entries) but does not read any tensor data. Tensor data is loaded lazily via load_tensor or load_tensor_f32.
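As a rough illustration of the first header fields listed above, the sketch below parses the fixed-size prefix of a GGUF header (magic, version, tensor count, metadata KV count) from a byte slice using only the standard library. It is not this crate's implementation: GgufHeaderPrefix and parse_header_prefix are hypothetical names, the error type is a plain String rather than MlxError, and little-endian byte order is assumed.

```rust
use std::convert::TryInto;

/// Hypothetical struct for illustration: the fixed-size start of a GGUF header.
#[derive(Debug, PartialEq)]
pub struct GgufHeaderPrefix {
    pub version: u32,
    pub tensor_count: u64,
    pub metadata_kv_count: u64,
}

/// Parse the first 24 bytes of a GGUF file (assumed little-endian).
/// Layout: 4-byte magic "GGUF", u32 version, u64 tensor count, u64 KV count.
/// The metadata KV pairs and tensor info entries that follow are not parsed here.
pub fn parse_header_prefix(bytes: &[u8]) -> Result<GgufHeaderPrefix, String> {
    if bytes.len() < 24 {
        return Err(format!("header too short: {} bytes", bytes.len()));
    }
    if !bytes.starts_with(b"GGUF") {
        return Err("bad magic: expected \"GGUF\"".to_string());
    }
    // The slices below have fixed lengths, so the conversions cannot fail.
    let version = u32::from_le_bytes(bytes[4..8].try_into().unwrap());
    if version != 3 {
        return Err(format!("unsupported GGUF version {}", version));
    }
    let tensor_count = u64::from_le_bytes(bytes[8..16].try_into().unwrap());
    let metadata_kv_count = u64::from_le_bytes(bytes[16..24].try_into().unwrap());
    Ok(GgufHeaderPrefix {
        version,
        tensor_count,
        metadata_kv_count,
    })
}
```

Rejecting anything other than version 3 mirrors the documented behaviour of open, which reports a parse error for files that are not valid GGUF v3.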

Errors

Returns MlxError::IoError if the file cannot be opened. Returns MlxError::GgufParseError if the file is not valid GGUF v3.

pub fn metadata(&self, key: &str) -> Option<&MetadataValue>

Look up a metadata value by key.

pub fn metadata_string(&self, key: &str) -> Option<&str>

Look up a metadata string value by key.

pub fn metadata_u32(&self, key: &str) -> Option<u32>

Look up a metadata u32 value by key.

pub fn metadata_f32(&self, key: &str) -> Option<f32>

Look up a metadata f32 value by key.
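The typed accessors above can be thought of as thin wrappers over the generic lookup. The sketch below shows one way such a layering could work. Everything in it is a hypothetical stand-in: the MetadataValue enum here has only three variants, the Metadata wrapper and its get_* methods are invented for illustration, and returning None on a type mismatch is an assumption about behaviour, not something this page states.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's MetadataValue; the real enum
/// likely has more variants (arrays, other integer widths, and so on).
#[derive(Debug, Clone, PartialEq)]
pub enum MetadataValue {
    String(String),
    U32(u32),
    F32(f32),
}

/// Hypothetical container holding parsed metadata key-value pairs.
pub struct Metadata {
    entries: HashMap<String, MetadataValue>,
}

impl Metadata {
    pub fn new(entries: HashMap<String, MetadataValue>) -> Self {
        Self { entries }
    }

    /// Generic lookup: returns the value regardless of its type.
    pub fn get(&self, key: &str) -> Option<&MetadataValue> {
        self.entries.get(key)
    }

    /// Typed lookup: None if the key is missing or the value is not a string.
    pub fn get_string(&self, key: &str) -> Option<&str> {
        match self.get(key) {
            Some(MetadataValue::String(s)) => Some(s.as_str()),
            _ => None,
        }
    }

    /// Typed lookup: None if the key is missing or the value is not a u32.
    pub fn get_u32(&self, key: &str) -> Option<u32> {
        match self.get(key) {
            Some(MetadataValue::U32(v)) => Some(*v),
            _ => None,
        }
    }

    /// Typed lookup: None if the key is missing or the value is not an f32.
    pub fn get_f32(&self, key: &str) -> Option<f32> {
        match self.get(key) {
            Some(MetadataValue::F32(v)) => Some(*v),
            _ => None,
        }
    }
}
```

With this shape, a caller that only needs one scalar can skip matching on the enum, while the generic lookup remains available for values of any type.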

pub fn tensor_names(&self) -> Vec<&str>

Return the names of all tensors in the file.

pub fn tensor_info(&self, name: &str) -> Option<&TensorInfo>

Look up info for a specific tensor by name.

pub fn tensor_count(&self) -> usize

Number of tensors in the file.

pub fn metadata_count(&self) -> usize

Number of metadata key-value pairs.

pub fn load_tensor(&self, name: &str, device: &MlxDevice) -> Result<MlxBuffer>

Load a tensor as a raw buffer on the Metal device.

For quantized types (Q4_0, Q8_0, Q4_K, Q6_K) the buffer contains the raw GGML blocks with dtype U8. These blocks are consumed directly by the quantized_matmul_ggml kernels.

For F32 and F16 tensors the buffer has the corresponding typed dtype.
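To give a sense of how large a raw U8 buffer of GGML blocks is, the sketch below computes the byte length of a tensor from its element count, using the standard GGML block layouts for the types named above. GgmlType and tensor_byte_len are hypothetical names for illustration and are not part of this crate's API. As a simplification, the check is on the total element count, whereas GGML itself requires each row to be a whole number of blocks.

```rust
/// Hypothetical enum covering only the tensor types mentioned on this page.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GgmlType {
    F32,
    F16,
    Q4_0,
    Q8_0,
    Q4_K,
    Q6_K,
}

impl GgmlType {
    /// Returns (elements per block, bytes per block) for the GGML layout.
    pub fn block_layout(self) -> (u64, u64) {
        match self {
            GgmlType::F32 => (1, 4),
            GgmlType::F16 => (1, 2),
            // f16 scale (2 bytes) + 32 four-bit values (16 bytes)
            GgmlType::Q4_0 => (32, 18),
            // f16 scale (2 bytes) + 32 eight-bit values (32 bytes)
            GgmlType::Q8_0 => (32, 34),
            // two f16 super-block scales (4) + packed scales (12) + 4-bit values (128)
            GgmlType::Q4_K => (256, 144),
            // low 4 bits (128) + high 2 bits (64) + i8 scales (16) + f16 scale (2)
            GgmlType::Q6_K => (256, 210),
        }
    }
}

/// Byte length of a tensor with `n_elements` values of type `ty`.
/// Returns None if the element count is not a whole number of blocks
/// or if the multiplication would overflow u64.
pub fn tensor_byte_len(ty: GgmlType, n_elements: u64) -> Option<u64> {
    let (block_elems, block_bytes) = ty.block_layout();
    if n_elements % block_elems != 0 {
        return None;
    }
    (n_elements / block_elems).checked_mul(block_bytes)
}
```

For example, a 4096-element Q4_0 tensor occupies 128 blocks of 18 bytes each, so its raw buffer would be 2304 bytes, compared with 16384 bytes for the same tensor stored as F32.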