Module gguf

GGUF v3 file format parser.

Parses the GGUF header, metadata, and tensor info on open. Tensor data is loaded lazily, on demand, into MlxBuffers: either as raw GGML blocks (for GPU quantized matmul) or dequantized to F32 (for norm weights and similar small tensors).
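To make the "parsed on open" step concrete, here is a minimal sketch of reading the fixed-size fields at the start of a GGUF v3 file: the 4-byte magic `GGUF`, a `u32` version, a `u64` tensor count, and a `u64` metadata key/value count, all little-endian. `GgufHeader` and `parse_gguf_header` are hypothetical names used for illustration only; they are not part of this module's API, and the crate's own parser may be structured differently.

```rust
/// Fixed-size fields at the start of a GGUF v3 file (all little-endian).
/// Hypothetical type for illustration; not part of `mlx_native::gguf`.
#[derive(Debug, PartialEq)]
pub struct GgufHeader {
    pub version: u32,
    pub tensor_count: u64,
    pub metadata_kv_count: u64,
}

/// Read a little-endian u32 starting at `offset`. Caller checks bounds.
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&bytes[offset..offset + 4]);
    u32::from_le_bytes(raw)
}

/// Read a little-endian u64 starting at `offset`. Caller checks bounds.
fn read_u64_le(bytes: &[u8], offset: usize) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[offset..offset + 8]);
    u64::from_le_bytes(raw)
}

/// Parse the 24-byte fixed header: magic, version, tensor count, KV count.
/// Hypothetical helper; metadata and tensor info would follow these bytes.
pub fn parse_gguf_header(bytes: &[u8]) -> Result<GgufHeader, String> {
    if bytes.len() < 24 {
        return Err(format!("header too short: {} bytes", bytes.len()));
    }
    if &bytes[0..4] != b"GGUF" {
        return Err("bad magic: expected \"GGUF\"".to_string());
    }
    Ok(GgufHeader {
        version: read_u32_le(bytes, 4),
        tensor_count: read_u64_le(bytes, 8),
        metadata_kv_count: read_u64_le(bytes, 16),
    })
}
```

Only these 24 bytes plus the metadata and tensor-info tables need to be read up front; the tensor payloads themselves can stay on disk until a tensor is requested, which is what makes lazy loading possible.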

§Example

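use mlx_native::gguf::GgufFile;
use std::path::Path;

// Assumes `device` is an already-initialized device handle and that the
// enclosing function returns a Result, so that `?` compiles.

// Parse the header, metadata, and tensor info; no tensor data is read yet.
let gguf = GgufFile::open(Path::new("model.gguf"))?;
let names = gguf.tensor_names();
// Load as raw GGML blocks (for GPU quantized matmul).
let buf = gguf.load_tensor("blk.0.attn_q.weight", &device)?;
// Load dequantized to F32 (for norm weights etc.).
let norm = gguf.load_tensor_f32("blk.0.attn_norm.weight", &device)?;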

Structs§

GgufFile: A parsed GGUF file, ready for lazy tensor loading.
TensorInfo: Information about a single tensor in the GGUF file.

Enums§

MetadataValue: GGUF metadata value types.
