GGUF v3 file format parser.
Parses GGUF headers, metadata, and tensor info on open. Tensor data is
loaded lazily on demand into MlxBuffers — either as raw GGML blocks
(for GPU quantized matmul) or dequantized to F32 (for norm weights etc.).
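To make the "parses GGUF headers" step concrete, here is a minimal sketch of reading the fixed-size header fields, assuming the public GGUF v3 layout (a 4-byte `GGUF` magic, a little-endian `u32` version, then `u64` tensor and metadata key/value counts). The names `GgufHeader`, `read_header`, `read_u32`, and `read_u64` are hypothetical and are not taken from this crate; its actual parser may be structured differently.

```rust
use std::io::{self, Read};

/// Fixed-size fields at the start of a GGUF file.
/// Hypothetical struct for illustration; not this crate's type.
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Reads a little-endian u32 from the reader.
fn read_u32(r: &mut impl Read) -> io::Result<u32> {
    let mut b = [0u8; 4];
    r.read_exact(&mut b)?;
    Ok(u32::from_le_bytes(b))
}

/// Reads a little-endian u64 from the reader.
fn read_u64(r: &mut impl Read) -> io::Result<u64> {
    let mut b = [0u8; 8];
    r.read_exact(&mut b)?;
    Ok(u64::from_le_bytes(b))
}

/// Validates the magic and version, then reads the two counts.
/// Assumption: only version 3 is accepted in this sketch.
fn read_header(r: &mut impl Read) -> io::Result<GgufHeader> {
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    if &magic != b"GGUF" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a GGUF file"));
    }
    let version = read_u32(r)?;
    if version != 3 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "unsupported GGUF version"));
    }
    let tensor_count = read_u64(r)?;
    let metadata_kv_count = read_u64(r)?;
    Ok(GgufHeader { version, tensor_count, metadata_kv_count })
}
```

Any `Read` implementor works as input, including a `&[u8]` slice or a buffered `std::fs::File`. The metadata key/value pairs and tensor info records follow the header and are not covered by this sketch.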
§Example
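use mlx_native::gguf::GgufFile;
use std::path::Path;
// Opening the file parses the header, metadata, and tensor info only.
let gguf = GgufFile::open(Path::new("model.gguf"))?;
let names = gguf.tensor_names();
// `device` is assumed to be an MLX device handle already in scope.
// Raw GGML blocks, e.g. for GPU quantized matmul.
let buf = gguf.load_tensor("blk.0.attn_q.weight", &device)?;
// Dequantized to F32, e.g. for norm weights.
let norm = gguf.load_tensor_f32("blk.0.attn_norm.weight", &device)?;
Structs§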
- GgufFile - A parsed GGUF file, ready for lazy tensor loading.
- TensorInfo - Information about a single tensor in the GGUF file.
Enums§
- MetadataValue - GGUF metadata value types.
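For context on what "metadata value types" covers, the sketch below maps the numeric type tags that precede each metadata value in a GGUF file, as listed in the public GGUF specification (0 through 12). The `ValueTag` enum and `tag_from_u32` function are hypothetical names for illustration; the variants of this crate's `MetadataValue` are not shown in this page and may differ.

```rust
/// On-disk type tags for GGUF metadata values.
/// Hypothetical enum for illustration; not this crate's `MetadataValue`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValueTag {
    U8,
    I8,
    U16,
    I16,
    U32,
    I32,
    F32,
    Bool,
    String,
    Array,
    U64,
    I64,
    F64,
}

/// Converts a raw little-endian u32 tag into a `ValueTag`.
/// Returns `None` for tags outside the 0..=12 range.
fn tag_from_u32(raw: u32) -> Option<ValueTag> {
    Some(match raw {
        0 => ValueTag::U8,
        1 => ValueTag::I8,
        2 => ValueTag::U16,
        3 => ValueTag::I16,
        4 => ValueTag::U32,
        5 => ValueTag::I32,
        6 => ValueTag::F32,
        7 => ValueTag::Bool,
        8 => ValueTag::String,
        9 => ValueTag::Array,
        10 => ValueTag::U64,
        11 => ValueTag::I64,
        12 => ValueTag::F64,
        _ => return None,
    })
}
```

Returning `Option` rather than panicking lets a parser report an unknown tag as a malformed-file error.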