Crate pmetal_gguf

GGUF file format implementation.

GGUF (GGML Universal Format) is a file format for storing models for inference with GGML-based executors like llama.cpp and Ollama.

This crate provides:

  • Types representing the GGUF format
  • A reader for loading GGUF files
  • A writer for creating GGUF files
  • Dequantization routines for quantized tensors

Example

use pmetal_gguf::{GgufContent, dequant};

// Read GGUF file
let content = GgufContent::from_file("model.gguf")?;

// Get architecture
if let Some(arch) = content.architecture() {
    println!("Model architecture: {}", arch);
}

// Read and dequantize a tensor
let mut file = std::fs::File::open("model.gguf")?;
let info = content
    .get_tensor_info("token_embd.weight")
    .expect("tensor `token_embd.weight` not found in file");
// Raw, still-quantized bytes for the tensor
let data = content.read_tensor_data(&mut file, "token_embd.weight")?;
// `dequantize` expects the shape as `i32` dimensions
let shape: Vec<i32> = info.dimensions.iter().map(|&d| d as i32).collect();
let floats = dequant::dequantize(&data, info.dtype, &shape)?;
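
To give a feel for what a dequantization routine does, here is a minimal sketch for a single format, GGML's Q8_0, using only the standard library. It is not this crate's implementation: the names `dequantize_q8_0_sketch` and `f16_bits_to_f32` are hypothetical, and the block layout (a little-endian f16 scale followed by 32 signed 8-bit values) is assumed from the GGML Q8_0 definition.

```rust
/// Number of quantized values per Q8_0 block in GGML.
const QK8_0: usize = 32;
/// Bytes per block: a 2-byte f16 scale followed by 32 signed 8-bit values.
const Q8_0_BLOCK_BYTES: usize = 2 + QK8_0;

/// Convert IEEE 754 half-precision bits to f32 (hypothetical helper;
/// stable std has no f16 type).
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exp = ((bits >> 10) & 0x1f) as u32;
    let frac = (bits & 0x03ff) as u32;
    match exp {
        // Zero and subnormals: value = frac * 2^-24
        0 => sign * (frac as f32) * 2f32.powi(-24),
        // Infinity or NaN
        0x1f => {
            if frac == 0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: rebias the exponent (127 - 15 = 112) and widen the fraction
        _ => {
            let sign_bit = ((bits as u32) & 0x8000) << 16;
            f32::from_bits(sign_bit | ((exp + 112) << 23) | (frac << 13))
        }
    }
}

/// Dequantize raw Q8_0 bytes into f32 values (hypothetical sketch).
/// Returns `None` if the input is not a whole number of blocks.
fn dequantize_q8_0_sketch(data: &[u8]) -> Option<Vec<f32>> {
    if data.len() % Q8_0_BLOCK_BYTES != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(data.len() / Q8_0_BLOCK_BYTES * QK8_0);
    for block in data.chunks_exact(Q8_0_BLOCK_BYTES) {
        // Per-block scale, stored as little-endian f16 bits
        let scale = f16_bits_to_f32(u16::from_le_bytes([block[0], block[1]]));
        // Each remaining byte is a signed 8-bit quant; value = scale * quant
        out.extend(block[2..].iter().map(|&q| scale * (q as i8) as f32));
    }
    Some(out)
}
```

Other formats (the K-quants and IQ-quants listed under Modules) pack scales and values more densely, so in practice `dequant::dequantize` is the call to use; the sketch only illustrates the scale-times-quant idea.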

Re-exports

pub use reader::GgufContent;
pub use reader::GgufReadError;
pub use reader::GgufVersion;
pub use reader::MAX_ARRAY_LENGTH;
pub use reader::MAX_METADATA_COUNT;
pub use reader::MAX_STRING_LENGTH;
pub use reader::MAX_TENSOR_COUNT;

Modules

config
Generate HuggingFace-compatible config.json from GGUF metadata.
dequant
Dequantization routines for GGUF quantized tensors.
dynamic
Dynamic quantization scheduling.
imatrix
Importance Matrix (IMatrix) implementation.
iq_quants
IQ (Importance-weighted Quantization) block structures and operations.
k_quants
K-Quant (Q2K-Q8K) block structures and operations.
keys
Standard metadata keys for GGUF files.
quantize
K-Quant quantization utilities.
reader
GGUF file reader for loading quantized models.
tensors
Standard tensor names for transformer models.
vec_dot
SIMD-optimized vector dot product operations for K-quant blocks.

Structs

GgufBuilder
Builder for creating GGUF files.
TensorInfo
Information about a tensor in the GGUF file.

Enums

FileType
File type enumeration for quantized models.
GgmlType
GGML tensor data types.
MetadataValue
A metadata value in GGUF format.
MetadataValueType
GGUF metadata value types.
TensorSizeError
Error type for tensor size calculations.
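
For orientation, the GGUF specification assigns each metadata value type a numeric ID that precedes the value in the file. The sketch below mirrors that table in a standalone enum. `ValueTypeSketch` and its methods are hypothetical names for illustration, and the variant names of this crate's `MetadataValueType` may differ.

```rust
/// Metadata value type IDs as listed in the GGUF specification
/// (hypothetical mirror, not this crate's `MetadataValueType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum ValueTypeSketch {
    Uint8 = 0,
    Int8 = 1,
    Uint16 = 2,
    Int16 = 3,
    Uint32 = 4,
    Int32 = 5,
    Float32 = 6,
    Bool = 7,
    String = 8,
    Array = 9,
    Uint64 = 10,
    Int64 = 11,
    Float64 = 12,
}

impl ValueTypeSketch {
    /// Map a raw ID read from a file to a value type, rejecting unknown IDs.
    fn from_u32(id: u32) -> Option<Self> {
        use ValueTypeSketch::*;
        Some(match id {
            0 => Uint8,
            1 => Int8,
            2 => Uint16,
            3 => Int16,
            4 => Uint32,
            5 => Int32,
            6 => Float32,
            7 => Bool,
            8 => String,
            9 => Array,
            10 => Uint64,
            11 => Int64,
            12 => Float64,
            _ => return None,
        })
    }

    /// Encoded size in bytes for fixed-width types; `None` for the
    /// variable-length string and array types.
    fn fixed_size(self) -> Option<usize> {
        use ValueTypeSketch::*;
        match self {
            Uint8 | Int8 | Bool => Some(1),
            Uint16 | Int16 => Some(2),
            Uint32 | Int32 | Float32 => Some(4),
            Uint64 | Int64 | Float64 => Some(8),
            String | Array => None,
        }
    }
}
```

Rejecting unknown IDs up front keeps a reader from misinterpreting the bytes that follow, which matters because strings and arrays carry their own length prefixes.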

Constants

GGUF_DEFAULT_ALIGNMENT
Default alignment for tensor data.
GGUF_MAGIC
GGUF magic number: “GGUF” in bytes.
GGUF_VERSION
Current GGUF version (v3 with big-endian support).
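
As a rough illustration of how these constants are typically used, the sketch below checks the first eight bytes of a file and pads an offset to the tensor data alignment. The names `MAGIC_SKETCH`, `ALIGNMENT_SKETCH`, `parse_header_prefix`, and `align_offset` are hypothetical. The value 32 is the default alignment given in the GGUF specification and is assumed, not confirmed, to match `GGUF_DEFAULT_ALIGNMENT`. The sketch also assumes a little-endian file.

```rust
/// The four magic bytes at the start of every GGUF file.
const MAGIC_SKETCH: [u8; 4] = *b"GGUF";
/// Default tensor data alignment from the GGUF specification
/// (assumed to match this crate's `GGUF_DEFAULT_ALIGNMENT`).
const ALIGNMENT_SKETCH: u64 = 32;

/// Validate the magic bytes and return the format version that follows.
/// Returns `None` if the input is too short or the magic does not match.
/// Assumes a little-endian file; a big-endian v3 file stores the version
/// with its bytes reversed.
fn parse_header_prefix(bytes: &[u8]) -> Option<u32> {
    let magic = bytes.get(0..4)?;
    if magic != MAGIC_SKETCH.as_slice() {
        return None;
    }
    let v = bytes.get(4..8)?;
    Some(u32::from_le_bytes([v[0], v[1], v[2], v[3]]))
}

/// Round `offset` up to the next multiple of `alignment`.
/// `alignment` must be non-zero.
fn align_offset(offset: u64, alignment: u64) -> u64 {
    offset + (alignment - offset % alignment) % alignment
}
```

A reader would typically call something like `parse_header_prefix` before trusting any counts or lengths in the file, and a writer would pad with `align_offset(position, ALIGNMENT_SKETCH)` before emitting each tensor's data.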