K-Quantization formats for GGUF/APR model weights (Toyota Way: ONE source of truth)
This crate provides quantization functions for converting F32 data to
K-quantization formats (Q4_K, Q5_K, Q6_K). This is the ONLY implementation
in the Sovereign AI Stack - aprender and realizar import from here.
§Stack Architecture (Toyota Way)
┌─────────┐
│ apr CLI │
└────┬────┘
│
┌───────┼───────┬───────────┐
▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌─────────┐
│entrenar│ │aprender│ │realizar │
└───┬────┘ └───┬────┘ └────┬────┘
│ │ │
└────┬─────┴───────────┴────┘
▼
┌────────────────┐
│ trueno-quant │ ← YOU ARE HERE
└───────┬────────┘
▼
┌────────────────┐
│ trueno │
└────────────────┘

§Format Specifications
- Q4_K: 256-element super-blocks, 144 bytes (4.5 bits/weight)
- Q5_K: 256-element super-blocks, 176 bytes (5.5 bits/weight)
- Q6_K: 256-element super-blocks, 210 bytes (6.5625 bits/weight)
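The byte counts follow from the standard llama.cpp/GGUF super-block layouts. A minimal sketch of the arithmetic, assuming those field layouts (this is illustrative, not taken from this crate's source):

```rust
fn main() {
    const QK_K: usize = 256; // elements per super-block

    // Q4_K: f16 d + f16 dmin + 12 packed scale/min bytes + 4-bit quants
    let q4_k = 2 + 2 + 12 + QK_K / 2;
    assert_eq!(q4_k, 144);

    // Q5_K: same fields as Q4_K plus one high bit per element (32 bytes)
    let q5_k = 2 + 2 + 12 + QK_K / 8 + QK_K / 2;
    assert_eq!(q5_k, 176);

    // Q6_K: 4-bit lows + 2-bit highs + 16 i8 sub-block scales + f16 d
    let q6_k = QK_K / 2 + QK_K / 4 + QK_K / 16 + 2;
    assert_eq!(q6_k, 210);

    println!("{} {} {}", q4_k, q5_k, q6_k);
}
```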
§Usage
use trueno_quant::{quantize_q4_k, dequantize_q4_k_to_f32};
let data: Vec<f32> = (0..256).map(|i| i as f32 / 10.0).collect();
let quantized = quantize_q4_k(&data);
let restored = dequantize_q4_k_to_f32(&quantized, 256);

Constants§
- F16_MIN_NORMAL - Minimum valid f16 normal value (~6.1e-5); prevents NaN on round-trip through f16 encoding
- Q4_K_BLOCK_BYTES - Q4_K super-block byte size
- Q4_K_BLOCK_SIZE - Q4_K super-block size (elements per block)
- Q5_K_BLOCK_BYTES - Q5_K super-block byte size
- Q5_K_BLOCK_SIZE - Q5_K super-block size (elements per block)
- Q6_K_BLOCK_BYTES - Q6_K super-block byte size
- Q6_K_BLOCK_SIZE - Q6_K super-block size (elements per block)
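The F16_MIN_NORMAL clamp exists because block scales are stored as f16: a scale that underflows to zero (or a subnormal) in f16 would make dequantization divide by zero and produce inf/NaN. A hedged sketch of the idea in plain Rust (the clamp site and `safe_scale` helper are assumptions for illustration, not this crate's exact code):

```rust
// Smallest positive normal f16 value: 2^-14 ≈ 6.1e-5.
const F16_MIN_NORMAL: f32 = 6.103515625e-5;

// Hypothetical scale computation: clamp so the scale survives an
// f32 -> f16 -> f32 round trip without flushing to zero.
fn safe_scale(max_abs: f32, levels: f32) -> f32 {
    let scale = max_abs / levels;
    scale.max(F16_MIN_NORMAL)
}

fn main() {
    // A block of tiny values would otherwise yield a scale ~6.7e-10,
    // which flushes to 0.0 in f16; 0/0 on quantize would then be NaN.
    let s = safe_scale(1e-8, 15.0);
    assert!(s >= F16_MIN_NORMAL);
    assert!((1e-8f32 / s).is_finite()); // quantize step stays finite
}
```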
Functions§
- dequantize_q4_k_to_f32 - Dequantize Q4_K bytes to F32
- dequantize_q5_k_to_f32 - Dequantize Q5_K bytes to F32
- dequantize_q6_k_to_f32 - Dequantize Q6_K bytes to F32
- f16_to_f32 - Convert f16 to f32 (using the half crate)
- f32_to_f16 - Convert f32 to f16 (using the half crate)
- quantize_q4_k - Quantize F32 data to Q4_K format (llama.cpp/candle compatible)
- quantize_q4_k_matrix - Quantize F32 matrix to Q4_K format with proper row layout
- quantize_q5_k - Quantize F32 data to Q5_K format
- quantize_q5_k_matrix - Quantize F32 matrix to Q5_K format with proper row layout
- quantize_q6_k - Quantize F32 data to Q6_K format (candle/GGUF compatible)
- quantize_q6_k_matrix - Quantize F32 matrix to Q6_K format with proper row layout
- transpose_q4k_for_matmul - Transpose Q4K tensor from GGUF column-major to APR row-major layout
- transpose_q5k_for_matmul - Transpose Q5K tensor from GGUF column-major to APR row-major layout
- transpose_q6k_for_matmul - Transpose Q6K tensor from GGUF column-major to APR row-major layout
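The transpose helpers above bridge two storage conventions: GGUF tensors are column-major relative to APR's row-major matmul. The underlying index remapping can be illustrated on plain f32 data (a sketch of the layout change only; the real transpose_* functions operate on quantized super-blocks, so the hypothetical `col_major_to_row_major` helper below is not this crate's API):

```rust
// Reinterpret a (rows x cols) tensor stored column-major as row-major
// by permuting elements.
fn col_major_to_row_major(src: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    let mut dst = vec![0.0; rows * cols];
    for r in 0..rows {
        for c in 0..cols {
            // column-major index: c * rows + r; row-major index: r * cols + c
            dst[r * cols + c] = src[c * rows + r];
        }
    }
    dst
}

fn main() {
    // 2x3 matrix [[1,2,3],[4,5,6]] stored column-major.
    let col = [1.0, 4.0, 2.0, 5.0, 3.0, 6.0];
    let row = col_major_to_row_major(&col, 2, 3);
    assert_eq!(row, vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0]);
}
```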