Crate oxillama_gguf

§oxillama-gguf

GGUF v3 binary format parser and tensor loader for OxiLLaMa.

This crate provides complete parsing of the GGUF file format, including:

Binary header validation (magic, version)
Typed key-value metadata extraction
Tensor info parsing (name, shape, quantization type, offset)
Memory-mapped tensor data access (via mmap feature)
Full file loading with mmap or read-to-memory
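As a rough illustration of the first item, header validation can be sketched with the standard library alone. This does not use or reproduce this crate's `GgufHeader` API. `RawHeader` and `parse_raw_header` are hypothetical names, and the sketch assumes the little-endian GGUF v2/v3 layout: a 4-byte `GGUF` magic, a `u32` version, a `u64` tensor count, and a `u64` metadata key-value count.

```rust
/// Fixed-size fields at the start of a GGUF v2/v3 file.
/// Hypothetical type for illustration; not this crate's `GgufHeader`.
#[derive(Debug, PartialEq)]
struct RawHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Read a little-endian u32 starting at `at`. The caller checks bounds.
fn read_u32_le(bytes: &[u8], at: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[at..at + 4]);
    u32::from_le_bytes(buf)
}

/// Read a little-endian u64 starting at `at`. The caller checks bounds.
fn read_u64_le(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

/// Validate the magic and version, then read the two counts.
fn parse_raw_header(bytes: &[u8]) -> Result<RawHeader, String> {
    // 4 (magic) + 4 (version) + 8 (tensor count) + 8 (metadata count)
    if bytes.len() < 24 {
        return Err(format!("need 24 header bytes, got {}", bytes.len()));
    }
    if &bytes[0..4] != b"GGUF" {
        return Err("bad magic: expected \"GGUF\"".to_string());
    }
    let version = read_u32_le(bytes, 4);
    if version != 2 && version != 3 {
        return Err(format!("unsupported GGUF version {}", version));
    }
    Ok(RawHeader {
        version,
        tensor_count: read_u64_le(bytes, 8),
        metadata_kv_count: read_u64_le(bytes, 16),
    })
}

fn main() {
    // Build a synthetic header: version 3, 2 tensors, 5 metadata entries.
    let mut bytes = Vec::new();
    bytes.extend_from_slice(b"GGUF");
    bytes.extend_from_slice(&3u32.to_le_bytes());
    bytes.extend_from_slice(&2u64.to_le_bytes());
    bytes.extend_from_slice(&5u64.to_le_bytes());

    let header = parse_raw_header(&bytes).expect("valid header");
    println!("{:?}", header);
}
```

The metadata key-value pairs and tensor infos follow these 24 bytes, which is why the two counts must be read before anything else can be parsed.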

§Supported GGUF Versions

Version 2 (legacy)
Version 3 (current standard)

§Quick Start

use oxillama_gguf::GgufModel;

let model = GgufModel::load("model.gguf").unwrap();
println!("Architecture: {}", model.architecture().unwrap());
println!("Tensors: {}", model.file.header.tensor_count);

Re-exports§

pub use error::GgufError;
pub use error::GgufResult;
pub use header::GgufHeader;
pub use metadata::MetadataStore;
pub use metadata::MetadataValue;
pub use parser::GgufFile;
pub use reader::BinaryReader;
pub use reader_core::align_up;
pub use reader_core::parse_gguf;
pub use reader_core::ParsedGguf;
pub use source::SliceSource;
pub use source::Source;
pub use tensor_info::TensorInfo;
pub use tensor_info::TensorStore;
pub use types::GgufTensorType;
pub use types::GgufValueType;
pub use source::FileSource; // feature: std
pub use source::ReadSource; // feature: std
pub use loader::GgufModel; // feature: std
pub use safetensors::SafetensorsConverter; // feature: std
pub use quantize_on_load::QuantPlan; // feature: std
pub use quantize_on_load::QuantTarget; // feature: std
pub use resume::checkpoint_path_for; // feature: std
pub use resume::compute_fingerprint; // feature: std
pub use resume::compute_fingerprint_with_probe; // feature: std
pub use resume::load_checkpoint; // feature: std
pub use resume::save_checkpoint; // feature: std
pub use resume::validate_checkpoint; // feature: std
pub use resume::PrefixFingerprint; // feature: std
pub use resume::ResumeCheckpoint; // feature: std
pub use resume::ResumeHandle; // feature: std
pub use schema::validate_schema; // feature: std
pub use schema::SchemaValidator; // feature: std
pub use schema::SchemaViolation; // feature: std
pub use sharded::ShardedGgufModel; // feature: std
pub use streaming::StreamingGgufParser; // feature: std
pub use streaming::TensorInfoIter; // feature: std
pub use writer::GgufWriter; // feature: std
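The `align_up` re-export suggests an offset-rounding helper. GGUF pads the start of the tensor data section to a multiple of the `general.alignment` metadata value, which defaults to 32 when the key is absent. The sketch below shows that rounding under an assumed signature. `align_up_sketch` is a hypothetical stand-in, and the real `reader_core::align_up` may take different argument types.

```rust
/// Round `offset` up to the next multiple of `alignment`.
/// Hypothetical stand-in; the crate's own `align_up` signature is not shown here.
fn align_up_sketch(offset: u64, alignment: u64) -> u64 {
    assert!(alignment > 0, "alignment must be non-zero");
    let remainder = offset % alignment;
    if remainder == 0 {
        offset
    } else {
        offset + (alignment - remainder)
    }
}

fn main() {
    // GGUF's default alignment when `general.alignment` is not set.
    const DEFAULT_ALIGNMENT: u64 = 32;

    // An offset of 100 bytes is padded to the next 32-byte boundary.
    println!("{}", align_up_sketch(100, DEFAULT_ALIGNMENT));
    // An already aligned offset is returned unchanged.
    println!("{}", align_up_sketch(64, DEFAULT_ALIGNMENT));
}
```

The modulo form works for any non-zero alignment. A bit-mask variant would be faster but is only correct when the alignment is a power of two.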

Modules§

error: Error types for the GGUF parser.
header: GGUF file header parsing.
http_source: HTTP Range-request backed crate::source::Source for remote GGUF loading.
loader (std): GGUF file loader with memory mapping support.
metadata: GGUF metadata key-value store with typed access.
parser: Complete GGUF file parser.
quantize_on_load (std): Quantize-on-load: convert F16/F32 tensors to Q4_0 or Q8_0 while loading.
reader: Binary reader utility for safe parsing of GGUF byte streams.
reader_core: Core GGUF parse logic generic over any Source.
resume (std): Partial-download resume support for GGUF files.
safetensors (std): Safetensors binary-format import bridge.
schema (std): Pluggable metadata schema validators.
sharded (std): Sharded GGUF model support.
source: Abstract byte-stream source for no_std-compatible GGUF parsing.
streaming (std): Streaming / lazy GGUF parser.
tensor_info: GGUF tensor information and storage.
types: Core types for the GGUF format (quantization type IDs, value types, and constants).
writer (std): GGUF v3 binary writer that serializes models to the GGUF format.
