Module gguf

GGUF v3 file format parser.

Parses the GGUF header, metadata, and tensor info on open. Tensor data is loaded lazily into MlxBuffers, either as raw GGML blocks (for GPU quantized matmul) or dequantized to F32 (for example, norm weights).
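For orientation, the fixed-size header that is read on open can be sketched as below. This is a standalone illustration and not this module's implementation: `GgufHeader` and `parse_header` are hypothetical names, and the sketch assumes the common little-endian layout (4-byte magic `GGUF`, `u32` version, `u64` tensor count, `u64` metadata key/value count).

```rust
/// Fixed-size GGUF header fields.
/// Hypothetical type for illustration; not the crate's own struct.
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Read a little-endian u32 at `offset`. The caller guarantees bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&bytes[offset..offset + 4]);
    u32::from_le_bytes(raw)
}

/// Read a little-endian u64 at `offset`. The caller guarantees bounds.
fn read_u64_le(bytes: &[u8], offset: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[offset..offset + 8]);
    u64::from_le_bytes(raw)
}

/// Parse the 24-byte header at the start of `bytes`.
/// Assumption: little-endian file; only version 3 is accepted here.
fn parse_header(bytes: &[u8]) -> Result<GgufHeader, String> {
    if bytes.len() < 24 {
        return Err("file too short for a GGUF header".to_string());
    }
    if &bytes[0..4] != b"GGUF" {
        return Err("bad magic: expected \"GGUF\"".to_string());
    }
    let version = read_u32_le(bytes, 4);
    if version != 3 {
        return Err(format!("unsupported GGUF version {}", version));
    }
    Ok(GgufHeader {
        version,
        tensor_count: read_u64_le(bytes, 8),
        metadata_kv_count: read_u64_le(bytes, 16),
    })
}
```

The metadata key/value pairs and the tensor info entries follow this header in the file. Reading only those on open is what lets the tensor data itself be deferred until it is requested.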

Example

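use mlx_native::gguf::GgufFile;
use std::path::Path;

// `device` is assumed to be an already-initialized device handle (setup not shown).
// Opening parses the header, metadata, and tensor info; no tensor data is read yet.
let gguf = GgufFile::open(Path::new("model.gguf"))?;
let names = gguf.tensor_names();
// Raw GGML blocks, for GPU quantized matmul.
let buf = gguf.load_tensor("blk.0.attn_q.weight", &device)?;
// Dequantized to F32, e.g. for norm weights.
let norm = gguf.load_tensor_f32("blk.0.attn_norm.weight", &device)?;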

Structs

GgufFile
A parsed GGUF file, ready for lazy tensor loading.
TensorInfo
Information about a single tensor in the GGUF file.

Enums

MetadataValue
GGUF metadata value types.