Skip to main content

Crate oxillama_gguf

Crate oxillama_gguf 

Source
Expand description

§oxillama-gguf

GGUF v3 binary format parser and tensor loader for OxiLLaMa.

This crate provides complete parsing of the GGUF file format, including:

  • Binary header validation (magic, version)
  • Typed key-value metadata extraction
  • Tensor info parsing (name, shape, quantization type, offset)
  • Memory-mapped tensor data access (via mmap feature)
  • Full file loading with mmap or read-to-memory

§Supported GGUF Versions

  • Version 2 (legacy)
  • Version 3 (current standard)

§Quick Start

use oxillama_gguf::GgufModel;

let model = GgufModel::load("model.gguf").unwrap();
println!("Architecture: {}", model.architecture().unwrap());
println!("Tensors: {}", model.file.header.tensor_count);

Re-exports§

pub use error::GgufError;
pub use error::GgufResult;
pub use header::GgufHeader;
pub use metadata::MetadataStore;
pub use metadata::MetadataValue;
pub use parser::GgufFile;
pub use reader::BinaryReader;
pub use reader_core::align_up;
pub use reader_core::parse_gguf;
pub use reader_core::ParsedGguf;
pub use source::SliceSource;
pub use source::Source;
pub use tensor_info::TensorInfo;
pub use tensor_info::TensorStore;
pub use types::GgufTensorType;
pub use types::GgufValueType;
pub use source::FileSource;std
pub use source::ReadSource;std
pub use loader::GgufModel;std
pub use safetensors::SafetensorsConverter;std
pub use quantize_on_load::QuantPlan;std
pub use quantize_on_load::QuantTarget;std
pub use resume::checkpoint_path_for;std
pub use resume::compute_fingerprint;std
pub use resume::compute_fingerprint_with_probe;std
pub use resume::load_checkpoint;std
pub use resume::save_checkpoint;std
pub use resume::validate_checkpoint;std
pub use resume::PrefixFingerprint;std
pub use resume::ResumeCheckpoint;std
pub use resume::ResumeHandle;std
pub use schema::validate_schema;std
pub use schema::SchemaValidator;std
pub use schema::SchemaViolation;std
pub use sharded::ShardedGgufModel;std
pub use streaming::StreamingGgufParser;std
pub use streaming::TensorInfoIter;std
pub use writer::GgufWriter;std

Modules§

error
Error types for the GGUF parser.
header
GGUF file header parsing.
http_source
HTTP Range-request backed crate::source::Source for remote GGUF loading.
loaderstd
GGUF file loader with memory mapping support.
metadata
GGUF metadata key-value store with typed access.
parser
Complete GGUF file parser.
quantize_on_loadstd
Quantize-on-load: convert F16/F32 tensors to Q4_0 or Q8_0 while loading.
reader
Binary reader utility for safe parsing of GGUF byte streams.
reader_core
Core GGUF parse logic generic over any Source.
resumestd
Partial-download resume support for GGUF files.
safetensorsstd
Safetensors binary-format import bridge.
schemastd
Pluggable metadata schema validators.
shardedstd
Sharded GGUF model support.
source
Abstract byte-stream source for no_std-compatible GGUF parsing.
streamingstd
Streaming / lazy GGUF parser.
tensor_info
GGUF tensor information and storage.
types
Core types for the GGUF format — quantization type IDs, value types, and constants.
writerstd
GGUF v3 binary writer — serialize models to the GGUF format.