
Crate trueno_quant


K-Quantization formats for GGUF/APR model weights (Toyota Way: ONE source of truth)

This crate provides quantization functions for converting F32 data to K-quantization formats (Q4_K, Q5_K, Q6_K). It is the ONLY implementation in the Sovereign AI Stack; aprender and realizar import from here.

§Stack Architecture (Toyota Way)

       ┌─────────┐
       │ apr CLI │
       └────┬────┘
            │
    ┌───────┼───────┬───────────┐
    ▼       ▼       ▼           ▼
┌────────┐ ┌────────┐ ┌─────────┐
│entrenar│ │aprender│ │realizar │
└───┬────┘ └───┬────┘ └────┬────┘
    │          │           │
    └────┬─────┴───────────┴────┘
         ▼
      ┌────────────────┐
      │  trueno-quant  │  ← YOU ARE HERE
      └───────┬────────┘
              ▼
      ┌────────────────┐
      │     trueno     │
      └────────────────┘

§Format Specifications

  • Q4_K: 256-element super-blocks, 144 bytes (4.5 bits/weight)
  • Q5_K: 256-element super-blocks, 176 bytes (5.5 bits/weight)
  • Q6_K: 256-element super-blocks, 210 bytes (6.5625 bits/weight)
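The effective bits per weight follow directly from the byte sizes above: each super-block covers 256 elements, so bits/weight = bytes × 8 / 256. A self-contained arithmetic check:

```rust
// Effective bits per weight = (super-block bytes * 8) / 256 elements.
fn bits_per_weight(block_bytes: u32) -> f64 {
    (block_bytes * 8) as f64 / 256.0
}

fn main() {
    println!("Q4_K: {} bpw", bits_per_weight(144)); // 4.5
    println!("Q5_K: {} bpw", bits_per_weight(176)); // 5.5
    println!("Q6_K: {} bpw", bits_per_weight(210)); // 6.5625
}
```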

§Usage

use trueno_quant::{quantize_q4_k, dequantize_q4_k_to_f32};

// One Q4_K super-block holds exactly 256 elements.
let data: Vec<f32> = (0..256).map(|i| i as f32 / 10.0).collect();
let quantized = quantize_q4_k(&data);  // 256 f32 values -> 144 bytes
let restored = dequantize_q4_k_to_f32(&quantized, 256);
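To illustrate what K-quantization does at its core, here is a minimal, self-contained sketch of absmax 4-bit block quantization. It is NOT the crate's Q4_K implementation (which uses 256-element super-blocks with f16 super-block scales and per-sub-block scales/mins), only the underlying idea: map floats to 4-bit codes via a per-block scale.

```rust
// Minimal absmax 4-bit quantization of one small block (illustrative only).
fn quantize_4bit(block: &[f32]) -> (f32, Vec<u8>) {
    let absmax = block.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = if absmax == 0.0 { 0.0 } else { absmax / 7.0 };
    let codes = block
        .iter()
        .map(|&x| {
            if scale == 0.0 {
                8
            } else {
                // Map to the signed 4-bit range [-8, 7], then bias to [0, 15].
                ((x / scale).round().clamp(-8.0, 7.0) as i8 + 8) as u8
            }
        })
        .collect();
    (scale, codes)
}

fn dequantize_4bit(scale: f32, codes: &[u8]) -> Vec<f32> {
    codes.iter().map(|&c| (c as i8 - 8) as f32 * scale).collect()
}

fn main() {
    let block = [0.0f32, 0.1, -0.2, 0.35, 0.7, -0.7, 0.05, -0.05];
    let (scale, codes) = quantize_4bit(&block);
    let restored = dequantize_4bit(scale, &codes);
    for (a, b) in block.iter().zip(&restored) {
        println!("{a:+.3} -> {b:+.3}");
    }
}
```

The round-trip error is bounded by half the scale, which is why the real formats spend extra bits on per-sub-block scales: a smaller scale per group of weights tightens that bound.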

Constants§

F16_MIN_NORMAL
Minimum valid f16 normal value (~6.1e-5); prevents NaN on round-trip through f16 encoding
Q4_K_BLOCK_BYTES
Q4_K super-block byte size
Q4_K_BLOCK_SIZE
Q4_K super-block size (elements per block)
Q5_K_BLOCK_BYTES
Q5_K super-block byte size
Q5_K_BLOCK_SIZE
Q5_K super-block size (elements per block)
Q6_K_BLOCK_BYTES
Q6_K super-block byte size
Q6_K_BLOCK_SIZE
Q6_K super-block size (elements per block)
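F16_MIN_NORMAL corresponds to the smallest normal f16 value, 2⁻¹⁴ ≈ 6.1035e-5: block scales are stored as f16, and a scale below this threshold could flush toward zero and produce NaN/Inf on dequantization. A self-contained check of the value, with an illustrative clamp (the `clamp_scale` helper here is hypothetical, not the crate's API):

```rust
// Smallest normal f16: minimum normal exponent is -14, so the value is 2^-14.
fn f16_min_normal() -> f32 {
    2.0f32.powi(-14)
}

// Illustrative clamp, mirroring what a quantizer must do before encoding a
// scale as f16. Hypothetical helper, NOT part of trueno_quant's API.
fn clamp_scale(s: f32) -> f32 {
    if s != 0.0 && s.abs() < f16_min_normal() {
        f16_min_normal() * s.signum()
    } else {
        s
    }
}

fn main() {
    println!("{:e}", f16_min_normal()); // ~6.1035e-5
    println!("{}", clamp_scale(1.0e-6)); // tiny scale clamped up to the minimum
}
```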

Functions§

dequantize_q4_k_to_f32
Dequantize Q4_K bytes to F32
dequantize_q5_k_to_f32
Dequantize Q5_K bytes to F32
dequantize_q6_k_to_f32
Dequantize Q6_K bytes to F32
f16_to_f32
Convert f16 to f32 (using half crate)
f32_to_f16
Convert f32 to f16 (using half crate)
quantize_q4_k
Quantize F32 data to Q4_K format (llama.cpp/candle compatible)
quantize_q4_k_matrix
Quantize F32 matrix to Q4_K format with proper row layout
quantize_q5_k
Quantize F32 data to Q5_K format
quantize_q5_k_matrix
Quantize F32 matrix to Q5_K format with proper row layout
quantize_q6_k
Quantize F32 data to Q6_K format (candle/GGUF compatible)
quantize_q6_k_matrix
Quantize F32 matrix to Q6_K format with proper row layout
transpose_q4k_for_matmul
Transpose Q4K tensor from GGUF column-major to APR row-major layout
transpose_q5k_for_matmul
Transpose Q5K tensor from GGUF column-major to APR row-major layout
transpose_q6k_for_matmul
Transpose Q6K tensor from GGUF column-major to APR row-major layout
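The transpose_* helpers reorder quantized tensors from GGUF's column-major layout to APR's row-major layout. They operate on packed Q4_K/Q5_K/Q6_K bytes at super-block granularity, but the index mapping is that of an ordinary transpose. A self-contained sketch of the mapping on plain f32 data (illustrative only; this is not the crate's implementation):

```rust
/// Transpose a rows x cols matrix stored column-major into row-major order.
fn col_major_to_row_major(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols);
    let mut out = vec![0.0f32; rows * cols];
    for r in 0..rows {
        for c in 0..cols {
            // Column-major: element (r, c) lives at index c * rows + r.
            // Row-major:    element (r, c) lives at index r * cols + c.
            out[r * cols + c] = data[c * rows + r];
        }
    }
    out
}

fn main() {
    // 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored column-major.
    let col_major = [1.0f32, 4.0, 2.0, 5.0, 3.0, 6.0];
    let row_major = col_major_to_row_major(&col_major, 2, 3);
    println!("{row_major:?}"); // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
}
```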