§oxillama-gguf
GGUF v3 binary format parser and tensor loader for OxiLLaMa.
This crate provides complete parsing of the GGUF file format, including:
- Binary header validation (magic, version)
- Typed key-value metadata extraction
- Tensor info parsing (name, shape, quantization type, offset)
- Memory-mapped tensor data access (via the mmap feature)
- Full file loading with mmap or read-to-memory
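To make the typed key-value step concrete, here is a minimal, standalone sketch of reading one metadata entry from a little-endian byte slice. It does not use this crate's API: `Cursor`, `Value`, and `read_kv` are hypothetical names invented for the illustration, and only four of the value types are handled. The type IDs (4 = u32, 6 = f32, 7 = bool, 8 = string) and the string layout (u64 length followed by UTF-8 bytes) follow the GGUF specification.

```rust
use std::convert::{TryFrom, TryInto};

/// A small subset of GGUF metadata values (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Value {
    U32(u32),
    F32(f32),
    Bool(bool),
    Str(String),
}

/// Bounds-checked cursor over a byte slice (hypothetical helper).
struct Cursor<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    /// Returns the next `n` bytes, or `None` if the buffer is too short.
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(n)?;
        let bytes = self.buf.get(self.pos..end)?;
        self.pos = end;
        Some(bytes)
    }

    fn u32(&mut self) -> Option<u32> {
        Some(u32::from_le_bytes(self.take(4)?.try_into().ok()?))
    }

    fn u64(&mut self) -> Option<u64> {
        Some(u64::from_le_bytes(self.take(8)?.try_into().ok()?))
    }

    /// GGUF string: u64 byte length, then UTF-8 bytes with no terminator.
    fn string(&mut self) -> Option<String> {
        let len = usize::try_from(self.u64()?).ok()?;
        String::from_utf8(self.take(len)?.to_vec()).ok()
    }
}

/// Reads one `key, value type, value` entry; unsupported types yield `None`.
fn read_kv(c: &mut Cursor) -> Option<(String, Value)> {
    let key = c.string()?;
    let value = match c.u32()? {
        4 => Value::U32(c.u32()?),
        6 => Value::F32(f32::from_le_bytes(c.take(4)?.try_into().ok()?)),
        7 => Value::Bool(c.take(1)?[0] != 0),
        8 => Value::Str(c.string()?),
        _ => return None, // arrays and the remaining scalar types are omitted here
    };
    Some((key, value))
}
```

Returning `None` on any short read keeps the sketch panic-free on truncated input; a real parser would more likely report a typed error describing what went wrong and where.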
§Supported GGUF Versions
- Version 2 (legacy)
- Version 3 (current standard)
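As an illustration of what header validation for these two versions involves, the sketch below checks the magic bytes and version and then reads the two counts. It is independent of this crate: `Header`, `HeaderError`, and `parse_header` are hypothetical names, and the input is assumed to be little-endian. The layout (4-byte magic `GGUF`, u32 version, u64 tensor count, u64 metadata key-value count) is the one the GGUF specification gives for versions 2 and 3.

```rust
use std::convert::TryInto;

/// Fixed-size GGUF header fields (hypothetical struct for this sketch).
#[derive(Debug, PartialEq)]
struct Header {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u32),
}

/// Parses the first 24 bytes of a GGUF v2/v3 file, assuming little-endian data.
fn parse_header(bytes: &[u8]) -> Result<Header, HeaderError> {
    // 4 (magic) + 4 (version) + 8 (tensor count) + 8 (metadata count)
    if bytes.len() < 24 {
        return Err(HeaderError::TooShort);
    }
    if !bytes.starts_with(b"GGUF") {
        return Err(HeaderError::BadMagic);
    }
    // The slices below have fixed lengths, so the conversions cannot fail.
    let version = u32::from_le_bytes(bytes[4..8].try_into().unwrap());
    if version != 2 && version != 3 {
        return Err(HeaderError::UnsupportedVersion(version));
    }
    let tensor_count = u64::from_le_bytes(bytes[8..16].try_into().unwrap());
    let metadata_kv_count = u64::from_le_bytes(bytes[16..24].try_into().unwrap());
    Ok(Header { version, tensor_count, metadata_kv_count })
}
```

Version 1 files used 32-bit counts, which is one reason a parser might reject them outright rather than try to read them with this layout.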
§Quick Start
use oxillama_gguf::GgufModel;
let model = GgufModel::load("model.gguf").unwrap();
println!("Architecture: {}", model.architecture().unwrap());
println!("Tensors: {}", model.file.header.tensor_count);
Re-exports§
- pub use error::GgufError;
- pub use error::GgufResult;
- pub use header::GgufHeader;
- pub use metadata::MetadataStore;
- pub use metadata::MetadataValue;
- pub use parser::GgufFile;
- pub use reader::BinaryReader;
- pub use reader_core::align_up;
- pub use reader_core::parse_gguf;
- pub use reader_core::ParsedGguf;
- pub use source::SliceSource;
- pub use source::Source;
- pub use tensor_info::TensorInfo;
- pub use tensor_info::TensorStore;
- pub use types::GgufTensorType;
- pub use types::GgufValueType;
- pub use source::FileSource; (std)
- pub use source::ReadSource; (std)
- pub use loader::GgufModel; (std)
- pub use safetensors::SafetensorsConverter; (std)
- pub use quantize_on_load::QuantPlan; (std)
- pub use quantize_on_load::QuantTarget; (std)
- pub use resume::checkpoint_path_for; (std)
- pub use resume::compute_fingerprint; (std)
- pub use resume::compute_fingerprint_with_probe; (std)
- pub use resume::load_checkpoint; (std)
- pub use resume::save_checkpoint; (std)
- pub use resume::validate_checkpoint; (std)
- pub use resume::PrefixFingerprint; (std)
- pub use resume::ResumeCheckpoint; (std)
- pub use resume::ResumeHandle; (std)
- pub use schema::validate_schema; (std)
- pub use schema::SchemaValidator; (std)
- pub use schema::SchemaViolation; (std)
- pub use sharded::ShardedGgufModel; (std)
- pub use streaming::StreamingGgufParser; (std)
- pub use streaming::TensorInfoIter; (std)
- pub use writer::GgufWriter; (std)
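The re-exported `align_up` suggests the usual GGUF alignment step: the tensor data section starts at an offset rounded up to the file's `general.alignment` value (32 when the key is absent, per the GGUF specification). The signature of the crate's own `align_up` is not shown on this page, so the function below is a hypothetical stand-in with an assumed signature, meant only to show the arithmetic.

```rust
/// Rounds `offset` up to the next multiple of `alignment`.
/// Hypothetical stand-in; the crate's `align_up` may have a different signature.
/// Returns `None` for a zero alignment or if the result would overflow `u64`.
fn align_up_sketch(offset: u64, alignment: u64) -> Option<u64> {
    if alignment == 0 {
        return None;
    }
    let remainder = offset % alignment;
    if remainder == 0 {
        Some(offset) // already aligned
    } else {
        offset.checked_add(alignment - remainder)
    }
}
```

With the default alignment of 32, an offset of 33 rounds up to 64, while an offset that is already a multiple of 32 is returned unchanged.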
Modules§
- error - Error types for the GGUF parser.
- header - GGUF file header parsing.
- http_source - HTTP Range-request backed crate::source::Source for remote GGUF loading.
- loader (std) - GGUF file loader with memory mapping support.
- metadata - GGUF metadata key-value store with typed access.
- parser - Complete GGUF file parser.
- quantize_on_load (std) - Quantize-on-load: convert F16/F32 tensors to Q4_0 or Q8_0 while loading.
- reader - Binary reader utility for safe parsing of GGUF byte streams.
- reader_core - Core GGUF parse logic generic over any Source.
- resume (std) - Partial-download resume support for GGUF files.
- safetensors (std) - Safetensors binary-format import bridge.
- schema (std) - Pluggable metadata schema validators.
- sharded (std) - Sharded GGUF model support.
- source - Abstract byte-stream source for no_std-compatible GGUF parsing.
- streaming (std) - Streaming / lazy GGUF parser.
- tensor_info - GGUF tensor information and storage.
- types - Core types for the GGUF format: quantization type IDs, value types, and constants.
- writer (std) - GGUF v3 binary writer: serialize models to the GGUF format.
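For readers unfamiliar with what converting F32 tensors to Q8_0 means, the following is a rough sketch of the block scheme, not the `quantize_on_load` implementation. In Q8_0, values are grouped into blocks of 32, each stored as one scale plus 32 signed 8-bit integers, with the scale chosen so the largest magnitude in the block maps to 127. The names `BlockQ8`, `quantize_q8_sketch`, and `dequantize_block` are hypothetical, and the scale is kept as `f32` here for simplicity, whereas the on-disk format stores it as a 16-bit float.

```rust
/// Number of values per Q8_0 block.
const BLOCK: usize = 32;

/// One quantized block (hypothetical struct; the real format stores an f16 scale).
struct BlockQ8 {
    scale: f32,
    quants: [i8; BLOCK],
}

/// Quantizes `values` block by block; returns `None` if the length
/// is not a multiple of the block size.
fn quantize_q8_sketch(values: &[f32]) -> Option<Vec<BlockQ8>> {
    if values.len() % BLOCK != 0 {
        return None;
    }
    let mut blocks = Vec::with_capacity(values.len() / BLOCK);
    for chunk in values.chunks_exact(BLOCK) {
        // Largest magnitude in the block maps to +/-127.
        let amax = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
        let scale = amax / 127.0;
        // An all-zero block gets scale 0 and all-zero quants.
        let inv = if scale > 0.0 { 1.0 / scale } else { 0.0 };
        let mut quants = [0i8; BLOCK];
        for (q, v) in quants.iter_mut().zip(chunk) {
            *q = (v * inv).round() as i8;
        }
        blocks.push(BlockQ8 { scale, quants });
    }
    Some(blocks)
}

/// Reconstructs approximate f32 values from one block.
fn dequantize_block(block: &BlockQ8) -> [f32; BLOCK] {
    let mut out = [0.0f32; BLOCK];
    for (o, q) in out.iter_mut().zip(block.quants.iter()) {
        *o = f32::from(*q) * block.scale;
    }
    out
}
```

Because each value is rounded to the nearest multiple of the block scale, the reconstruction error per value is at most about half the scale, which is why blocks with one large outlier lose more precision on their small values.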