Crate oxibonsai_core

§oxibonsai-core

GGUF Q1_0_g128 format parser, tensor types, and model configuration for OxiBonsai — the Pure Rust 1-bit LLM inference engine.

This crate provides the foundational data types and parsing logic used by the rest of the OxiBonsai stack:

GGUF v3 binary format parsing — header, metadata key-value store, and tensor info directory (see gguf).
Q1_0_g128 block type — the 18-byte packed representation used for 1-bit weights (see tensor::BlockQ1_0G128).
Memory-mapped tensor loading — zero-copy access to weight data from disk via memmap2.
Model configuration — config::Qwen3Config extracted from GGUF metadata or constructed for known Bonsai variants (8B, 4B, 1.7B).
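
As a rough illustration of the first item, the fixed-size GGUF v3 header can be read with the standard library alone. This is a minimal sketch based on the public GGUF layout (4-byte magic `GGUF`, `u32` version, `u64` tensor count, `u64` metadata key-value count, little-endian). The names `SketchHeader`, `SketchHeaderError`, and `parse_header` are hypothetical and are not this crate's `GgufHeader` API, whose signatures are not shown here.

```rust
/// Hypothetical header type for illustration; not the crate's `GgufHeader`.
#[derive(Debug, PartialEq)]
pub struct SketchHeader {
    pub version: u32,
    pub tensor_count: u64,
    pub metadata_kv_count: u64,
}

/// Hypothetical error type for illustration; not the crate's `BonsaiError`.
#[derive(Debug, PartialEq)]
pub enum SketchHeaderError {
    Truncated,
    BadMagic,
    UnsupportedVersion(u32),
}

/// Read a little-endian u32 at `at`, or None if the buffer is too short.
fn read_u32_le(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read a little-endian u64 at `at`, or None if the buffer is too short.
fn read_u64_le(buf: &[u8], at: usize) -> Option<u64> {
    let b = buf.get(at..at + 8)?;
    Some(u64::from_le_bytes([
        b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
    ]))
}

/// Parse the 24-byte fixed header: magic, version, tensor count, KV count.
/// Assumes a little-endian file and accepts only version 3.
pub fn parse_header(buf: &[u8]) -> Result<SketchHeader, SketchHeaderError> {
    let magic = buf.get(0..4).ok_or(SketchHeaderError::Truncated)?;
    if magic != b"GGUF" {
        return Err(SketchHeaderError::BadMagic);
    }
    let version = read_u32_le(buf, 4).ok_or(SketchHeaderError::Truncated)?;
    if version != 3 {
        return Err(SketchHeaderError::UnsupportedVersion(version));
    }
    let tensor_count = read_u64_le(buf, 8).ok_or(SketchHeaderError::Truncated)?;
    let metadata_kv_count = read_u64_le(buf, 16).ok_or(SketchHeaderError::Truncated)?;
    Ok(SketchHeader { version, tensor_count, metadata_kv_count })
}
```

The metadata key-value store and tensor info directory follow these 24 bytes in the file; parsing them is left to the `gguf` module.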

§GGUF Q1_0_g128 Format

Each block is 18 bytes: a 2-byte FP16 scale followed by 16 bytes holding 128 sign bits. Each weight decodes as bit ? +scale : -scale. The effective storage cost is 1.125 bits per weight (144 bits per block / 128 weights).
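
The block layout above can be sketched in plain Rust as follows. This is an illustration under stated assumptions, not the crate's `BlockQ1_0G128` implementation: the type `SketchBlock`, its field names, the little-endian scale, and the bit order (least significant bit first within each byte) are all assumptions made for the example.

```rust
/// Hypothetical stand-in for an 18-byte Q1_0_g128 block.
/// Field names and ordering are assumptions for illustration.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct SketchBlock {
    /// Raw IEEE 754 half-precision bits of the per-block scale.
    pub scale_bits: u16,
    /// 128 sign bits, one per weight.
    pub signs: [u8; 16],
}

/// Convert raw IEEE 754 binary16 bits to f32 (zero, subnormal, inf, NaN included).
pub fn f16_bits_to_f32(h: u16) -> f32 {
    let sign = ((h >> 15) & 1) as u32;
    let exp = ((h >> 10) & 0x1f) as u32;
    let frac = (h & 0x3ff) as u32;
    let bits = match (exp, frac) {
        // Signed zero.
        (0, 0) => sign << 31,
        // Subnormal: value = frac * 2^-24.
        (0, _) => {
            let v = frac as f32 * (1.0 / 16_777_216.0);
            return if sign == 1 { -v } else { v };
        }
        // Infinity or NaN: all-ones exponent in f32 as well.
        (0x1f, _) => (sign << 31) | 0x7f80_0000 | (frac << 13),
        // Normal: rebias exponent from 15 to 127, widen the fraction.
        _ => (sign << 31) | ((exp + 112) << 23) | (frac << 13),
    };
    f32::from_bits(bits)
}

impl SketchBlock {
    /// Build a block from 18 raw bytes (scale assumed little-endian).
    pub fn from_bytes(raw: &[u8; 18]) -> Self {
        let scale_bits = u16::from_le_bytes([raw[0], raw[1]]);
        let mut signs = [0u8; 16];
        signs.copy_from_slice(&raw[2..18]);
        Self { scale_bits, signs }
    }

    /// Expand the block to 128 f32 weights: bit set gives +scale, bit clear gives -scale.
    pub fn dequantize(&self) -> [f32; 128] {
        let scale = f16_bits_to_f32(self.scale_bits);
        let mut out = [0.0f32; 128];
        for (i, w) in out.iter_mut().enumerate() {
            // Assumed bit order: weight i lives in byte i / 8, bit i % 8 (LSB first).
            let bit = (self.signs[i / 8] >> (i % 8)) & 1;
            *w = if bit == 1 { scale } else { -scale };
        }
        out
    }
}
```

With a scale of 1.0 (half-precision bits `0x3C00`) and a first sign byte of `0b0000_0101`, this sketch decodes the first three weights as +1.0, -1.0, +1.0 and every weight whose bit is clear as -1.0.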

§Crate Organisation

Module	Purpose
`config`	`Qwen3Config` with named constructors for each variant
`gguf`	Low-level GGUF v3 reader (header, metadata, tensors)
`quant_ternary`	`BlockTQ2_0_g128`, `BlockTQ2_0`, `TernaryCode` — ternary block types
`tensor`	`BlockQ1_0G128` and `OneBitTensor` types
`error`	`BonsaiError` / `BonsaiResult`

§Re-exports

pub use config::Qwen3Config;
pub use error::BonsaiError;
pub use error::BonsaiResult;
pub use gguf::compat::build_compat_report;
pub use gguf::compat::check_gguf_header;
pub use gguf::compat::CompatError;
pub use gguf::compat::ExtendedQuantType;
pub use gguf::compat::GgufCompatReport;
pub use gguf::compat::GgufVersion;
pub use gguf::header::GgufHeader;
pub use gguf::metadata::MetadataStore;
pub use gguf::metadata::MetadataValue;
pub use gguf::model_card::keys as model_card_keys;
pub use gguf::model_card::extract_known_fields;
pub use gguf::model_card::extract_model_card;
pub use gguf::model_card::ModelCard;
pub use gguf::streaming::GgufStreamParser;
pub use gguf::streaming::GgufValue;
pub use gguf::streaming::StreamState;
pub use gguf::streaming::StreamedGguf;
pub use gguf::streaming::StreamedTensorInfo;
pub use gguf::tensor_info::TensorInfo;
pub use gguf::tensor_info::TensorStore;
pub use gguf::types::GgufTensorType;
pub use gguf::types::GgufValueType;
pub use gguf::writer::MetadataWriteValue;
pub use gguf::writer::GgufWriter;
pub use gguf::writer::TensorEntry;
pub use gguf::writer::TensorType;
pub use gguf::writer::WriteError;
pub use quant_fp8::fp8_e4m3_decode;
pub use quant_fp8::fp8_e4m3_encode;
pub use quant_fp8::fp8_e5m2_decode;
pub use quant_fp8::fp8_e5m2_encode;
pub use quant_fp8::BlockFP8E4M3;
pub use quant_fp8::BlockFP8E5M2;
pub use quant_fp8::BLOCK_FP8_BYTES;
pub use quant_fp8::FP8_E4M3_MAX;
pub use quant_fp8::FP8_E5M2_MAX;
pub use quant_fp8::QK_FP8;
pub use quant_k::BlockQ2K;
pub use quant_k::BlockQ3K;
pub use quant_k::BlockQ4K;
pub use quant_k::BlockQ8K;
pub use quant_k::BLOCK_Q2_K_BYTES;
pub use quant_k::BLOCK_Q3K_BYTES;
pub use quant_k::BLOCK_Q4_K_BYTES;
pub use quant_k::BLOCK_Q8K_BYTES;
pub use quant_k_ext::BlockQ5K;
pub use quant_k_ext::BlockQ6K;
pub use quant_k_ext::BLOCK_Q5K_BYTES;
pub use quant_k_ext::BLOCK_Q6K_BYTES;
pub use quant_std::BlockQ4_0;
pub use quant_std::BlockQ8_0;
pub use quant_std::BLOCK_Q4_0_BYTES;
pub use quant_std::BLOCK_Q8_0_BYTES;
pub use quant_std::QK_Q4_0;
pub use quant_std::QK_Q8_0;
pub use quant_ternary::BlockTQ2_0;
pub use quant_ternary::BlockTQ2_0_g128;
pub use quant_ternary::TernaryCode;
pub use quant_ternary::BLOCK_TQ2_0_BYTES;
pub use quant_ternary::BLOCK_TQ2_0_G128_BYTES;
pub use quant_ternary::QK_TQ2_0;
pub use quant_ternary::QK_TQ2_0_G128;
pub use tensor::BlockQ1_0G128;
pub use tensor::OneBitTensor;

§Modules

config: Qwen3 model configuration extracted from GGUF metadata.
error: Error types for OxiBonsai core operations.
gguf: GGUF v3 binary format parser.
quant_fp8: FP8 quantization block types: E4M3FN and E5M2.
quant_k: K-quant block types for Q2_K, Q3_K, Q4_K, and Q8_K quantization formats.
quant_k_ext: K-quant block types for Q5_K and Q6_K quantization formats.
quant_std: Standard GGUF quantization block types: Q4_0 (4-bit) and Q8_0 (8-bit).
quant_ternary: Ternary quantization block types for TQ2_0_g128 and TQ2_0 formats.
tensor: Q1_0_g128 tensor types and 1-bit data access.
