
Crate trueno_quant


K-Quantization formats for GGUF/APR model weights (Toyota Way: ONE source of truth)

This crate provides quantization functions for converting F32 data to K-quantization formats (Q4_K, Q5_K, Q6_K). It is the ONLY implementation in the Sovereign AI Stack; aprender and realizar import from here.

§Stack Architecture (Toyota Way)

       ┌─────────┐
       │ apr CLI │
       └────┬────┘
            │
    ┌───────┼───────┬───────────┐
    ▼       ▼       ▼           ▼
┌────────┐ ┌────────┐ ┌─────────┐
│entrenar│ │aprender│ │realizar │
└───┬────┘ └───┬────┘ └────┬────┘
    │          │           │
    └────┬─────┴───────────┴────┘
         ▼
      ┌────────────────┐
      │  trueno-quant  │  ← YOU ARE HERE
      └───────┬────────┘
              ▼
      ┌────────────────┐
      │     trueno     │
      └────────────────┘

§Format Specifications

  • Q4_K: 256-element super-blocks, 144 bytes (4.5 bits/weight)
  • Q5_K: 256-element super-blocks, 176 bytes (5.5 bits/weight)
  • Q6_K: 256-element super-blocks, 210 bytes (6.5625 bits/weight)
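The effective bits per weight follow directly from the byte sizes above: each super-block covers 256 elements, so bits/weight = bytes × 8 / 256. A self-contained arithmetic check:

```rust
// Effective bits per weight = (super-block bytes * 8) / 256 elements.
fn bits_per_weight(block_bytes: u32) -> f64 {
    (block_bytes * 8) as f64 / 256.0
}

fn main() {
    println!("Q4_K: {} bpw", bits_per_weight(144)); // 4.5
    println!("Q5_K: {} bpw", bits_per_weight(176)); // 5.5
    println!("Q6_K: {} bpw", bits_per_weight(210)); // 6.5625
}
```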

§Usage

use trueno_quant::{quantize_q4_k, dequantize_q4_k_to_f32};

// One Q4_K super-block holds exactly 256 elements.
let data: Vec<f32> = (0..256).map(|i| i as f32 / 10.0).collect();
let quantized = quantize_q4_k(&data);  // 256 f32 values -> 144 bytes
let restored = dequantize_q4_k_to_f32(&quantized, 256);
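To illustrate what K-quantization does at its core, here is a minimal, self-contained sketch of absmax 4-bit block quantization. It is NOT the crate's Q4_K implementation (which uses 256-element super-blocks with f16 super-block scales and per-sub-block scales/mins), only the underlying idea: map floats to 4-bit codes via a per-block scale.

```rust
// Minimal absmax 4-bit quantization of one small block (illustrative only).
fn quantize_4bit(block: &[f32]) -> (f32, Vec<u8>) {
    let absmax = block.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    let scale = if absmax == 0.0 { 0.0 } else { absmax / 7.0 };
    let codes = block
        .iter()
        .map(|&x| {
            if scale == 0.0 {
                8
            } else {
                // Map to the signed 4-bit range [-8, 7], then bias to [0, 15].
                ((x / scale).round().clamp(-8.0, 7.0) as i8 + 8) as u8
            }
        })
        .collect();
    (scale, codes)
}

fn dequantize_4bit(scale: f32, codes: &[u8]) -> Vec<f32> {
    codes.iter().map(|&c| (c as i8 - 8) as f32 * scale).collect()
}

fn main() {
    let block = [0.0f32, 0.1, -0.2, 0.35, 0.7, -0.7, 0.05, -0.05];
    let (scale, codes) = quantize_4bit(&block);
    let restored = dequantize_4bit(scale, &codes);
    for (a, b) in block.iter().zip(&restored) {
        println!("{a:+.3} -> {b:+.3}");
    }
}
```

The round-trip error is bounded by half the scale, which is why the real formats spend extra bits on per-sub-block scales: a smaller scale per group of weights tightens that bound.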

Constants§

F16_MIN_NORMAL
Minimum valid f16 normal value (~6.1e-5); prevents NaN on round-trip through f16 encoding
Q4_K_BLOCK_BYTES
Q4_K super-block byte size
Q4_K_BLOCK_SIZE
Q4_K super-block size (elements per block)
Q5_K_BLOCK_BYTES
Q5_K super-block byte size
Q5_K_BLOCK_SIZE
Q5_K super-block size (elements per block)
Q6_K_BLOCK_BYTES
Q6_K super-block byte size
Q6_K_BLOCK_SIZE
Q6_K super-block size (elements per block)
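F16_MIN_NORMAL corresponds to the smallest normal f16 value, 2⁻¹⁴ ≈ 6.1035e-5: block scales are stored as f16, and a scale below this threshold could flush toward zero and produce NaN/Inf on dequantization. A self-contained check of the value, with an illustrative clamp (the `clamp_scale` helper here is hypothetical, not the crate's API):

```rust
// Smallest normal f16: minimum normal exponent is -14, so the value is 2^-14.
fn f16_min_normal() -> f32 {
    2.0f32.powi(-14)
}

// Illustrative clamp, mirroring what a quantizer must do before encoding a
// scale as f16. Hypothetical helper, NOT part of trueno_quant's API.
fn clamp_scale(s: f32) -> f32 {
    if s != 0.0 && s.abs() < f16_min_normal() {
        f16_min_normal() * s.signum()
    } else {
        s
    }
}

fn main() {
    println!("{:e}", f16_min_normal()); // ~6.1035e-5
    println!("{}", clamp_scale(1.0e-6)); // tiny scale clamped up to the minimum
}
```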

Functions§

dequantize_q4_k_to_f32
Dequantize Q4_K bytes to F32
dequantize_q5_k_to_f32
Dequantize Q5_K bytes to F32
dequantize_q6_k_to_f32
Dequantize Q6_K bytes to F32
f16_to_f32
Convert f16 to f32 (using half crate)
f32_to_f16
Convert f32 to f16 (using half crate)
quantize_q4_k
Quantize F32 data to Q4_K format (llama.cpp/candle compatible)
quantize_q4_k_matrix
Quantize F32 matrix to Q4_K format with proper row layout
quantize_q5_k
Quantize F32 data to Q5_K format
quantize_q5_k_matrix
Quantize F32 matrix to Q5_K format with proper row layout
quantize_q6_k
Quantize F32 data to Q6_K format (candle/GGUF compatible)
quantize_q6_k_matrix
Quantize F32 matrix to Q6_K format with proper row layout
transpose_q4k_for_matmul
Transpose Q4K tensor from GGUF column-major to APR row-major layout
transpose_q5k_for_matmul
Transpose Q5K tensor from GGUF column-major to APR row-major layout
transpose_q6k_for_matmul
Transpose Q6K tensor from GGUF column-major to APR row-major layout
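The transpose_* helpers reorder quantized tensors from GGUF's column-major layout to APR's row-major layout. They operate on packed Q4_K/Q5_K/Q6_K bytes at super-block granularity, but the index mapping is that of an ordinary transpose. A self-contained sketch of the mapping on plain f32 data (illustrative only; this is not the crate's implementation):

```rust
/// Transpose a rows x cols matrix stored column-major into row-major order.
fn col_major_to_row_major(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(data.len(), rows * cols);
    let mut out = vec![0.0f32; rows * cols];
    for r in 0..rows {
        for c in 0..cols {
            // Column-major: element (r, c) lives at index c * rows + r.
            // Row-major:    element (r, c) lives at index r * cols + c.
            out[r * cols + c] = data[c * rows + r];
        }
    }
    out
}

fn main() {
    // 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored column-major.
    let col_major = [1.0f32, 4.0, 2.0, 5.0, 3.0, 6.0];
    let row_major = col_major_to_row_major(&col_major, 2, 3);
    println!("{row_major:?}"); // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
}
```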