Model quantization for AxonML — GGUF block formats plus BitNet I2_S ternary.

- types: QuantType enum and block structs for Q8_0/Q4_0/Q4_1/Q5_0/Q5_1/F16.
- quantize: tensor and model quantization with RMSE error analysis.
- dequantize: block and tensor reconstruction to f32.
- bitnet: I2_S 1.58-bit ternary quantization with 128-weight blocks, fused add-only matmul, an int8 activation quantizer, and scaffolded AVX-VNNI dispatch.
- calibration: MinMax, Percentile, MeanStd, and Entropy methods.
- inference: QuantizedLinear drop-in layer and QuantizedModel wrapper.
- error: QuantError and QuantResult.
§File
crates/axonml-quant/src/lib.rs
§Author
Andrew Jewell Sr. — AutomataNexus LLC (ORCID: 0009-0005-2158-7060)
§Updated
April 14, 2026 11:15 PM EST
§Disclaimer
Use at your own risk. This software is provided “as is”, without warranty of any kind, express or implied. The author and AutomataNexus shall not be held liable for any damages arising from the use of this software.
Re-exports§
- pub use calibration::CalibrationData;
- pub use calibration::calibrate;
- pub use dequantize::dequantize_block;
- pub use dequantize::dequantize_tensor;
- pub use error::QuantError;
- pub use error::QuantResult;
- pub use inference::QuantizedLinear;
- pub use inference::QuantizedModel;
- pub use inference::deserialize_quantized;
- pub use inference::serialize_quantized;
- pub use quantize::quantize_model;
- pub use quantize::quantize_tensor;
- pub use types::QuantType;
- pub use types::QuantizedBlock;
- pub use types::QuantizedTensor;
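To make the quantize/dequantize round trip and the RMSE error analysis concrete, here is a self-contained sketch of GGUF-style Q8_0 block quantization. The block layout (one f32 scale plus 32 int8 codes per block) follows the GGUF convention, but the type and function names are illustrative assumptions, not the crate's actual `quantize_tensor`/`dequantize_tensor` API.

```rust
// Sketch of GGUF-style Q8_0 block quantization (illustrative names).
// Each 32-value block stores one f32 scale and 32 int8 weights;
// RMSE against the original measures reconstruction error.

const Q8_0_BLOCK: usize = 32;

struct BlockQ80 {
    scale: f32,          // d = max(|x|) / 127
    qs: [i8; Q8_0_BLOCK],
}

fn quantize_q8_0(x: &[f32; Q8_0_BLOCK]) -> BlockQ80 {
    let amax = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = amax / 127.0;
    let inv = if scale > 0.0 { 1.0 / scale } else { 0.0 };
    let mut qs = [0i8; Q8_0_BLOCK];
    for (q, &v) in qs.iter_mut().zip(x) {
        *q = (v * inv).round() as i8; // |v * inv| <= 127 by construction
    }
    BlockQ80 { scale, qs }
}

fn dequantize_q8_0(b: &BlockQ80) -> [f32; Q8_0_BLOCK] {
    let mut out = [0.0f32; Q8_0_BLOCK];
    for (o, &q) in out.iter_mut().zip(&b.qs) {
        *o = q as f32 * b.scale;
    }
    out
}

fn rmse(a: &[f32], b: &[f32]) -> f32 {
    let se: f32 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
    (se / a.len() as f32).sqrt()
}

fn main() {
    let mut x = [0.0f32; Q8_0_BLOCK];
    for (i, v) in x.iter_mut().enumerate() {
        *v = (i as f32 * 0.37).sin();
    }
    let block = quantize_q8_0(&x);
    let y = dequantize_q8_0(&block);
    println!("Q8_0 RMSE = {:.6}", rmse(&x, &y));
}
```

Because rounding error per element is bounded by half the scale, the round-trip RMSE stays below `max(|x|) / 254`, which is the kind of per-tensor figure an RMSE error analysis would report.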
Modules§
- bitnet
- BitNet b1.58 I2_S Ternary Quantization — Dequant + Fused Add-Only Matmul
- calibration
- Calibration for Quantization
- dequantize
- Dequantization Functions
- error
- Quantization Error Types — Block, Shape, and Calibration Failures
- inference
- Quantized Inference — fast inference with quantized weights
- quantize
- Quantization Functions
- types
- Quantization Types
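Of the calibration methods listed above, MinMax is the simplest to illustrate. The sketch below is an assumption about how such a method derives an int8 affine mapping from observed activations; the function name `minmax_calibrate` and the (scale, zero-point) return shape are illustrative, not the crate's `calibrate` signature.

```rust
// Sketch of MinMax calibration (illustrative; the crate's calibrate() API
// may differ): observe the activation range over calibration samples, then
// derive an affine int8 scale and zero-point covering [-128, 127].

fn minmax_calibrate(samples: &[f32]) -> (f32, i32) {
    assert!(!samples.is_empty(), "calibration needs at least one sample");
    let (mut lo, mut hi) = (f32::INFINITY, f32::NEG_INFINITY);
    for &v in samples {
        lo = lo.min(v);
        hi = hi.max(v);
    }
    // Include zero in the range so 0.0 maps exactly to an integer code.
    let (lo, hi) = (lo.min(0.0), hi.max(0.0));
    let scale = (hi - lo) / 255.0;
    if scale == 0.0 {
        return (1.0, 0); // all-zero activations: identity mapping
    }
    // lo must map to -128: zero_point = -128 - lo / scale.
    let zero_point = (-128.0 - lo / scale).round() as i32;
    (scale, zero_point)
}

fn main() {
    let (scale, zp) = minmax_calibrate(&[-1.0, 0.0, 3.0]);
    println!("scale = {scale:.6}, zero_point = {zp}");
}
```

Percentile and Entropy calibration differ only in how they pick the clipping range (discarding outliers or minimizing KL divergence, respectively) before the same scale/zero-point derivation.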
Constants§
- DEFAULT_BLOCK_SIZE - Default block size for quantization.
- MAX_BLOCK_SIZE - Maximum block size supported.