aprender-quant 0.29.0

K-Quantization formats for GGUF/APR model weights (Toyota Way: ONE source of truth)

This crate provides quantization functions for converting F32 data to the K-quantization formats Q4_K, Q5_K, and Q6_K. It is the ONLY implementation in the Sovereign AI Stack: aprender and realizar import from here.

Stack Architecture (Toyota Way)

          ┌─────────┐
          │ apr CLI │
          └────┬────┘
               │
    ┌──────────┼───────────┐
    ▼          ▼           ▼
┌────────┐ ┌────────┐ ┌─────────┐
│entrenar│ │aprender│ │realizar │
└───┬────┘ └───┬────┘ └────┬────┘
    │          │           │
    └──────────┼───────────┘
               ▼
       ┌────────────────┐
       │  trueno-quant  │  ← YOU ARE HERE
       └───────┬────────┘
               ▼
       ┌────────────────┐
       │     trueno     │
       └────────────────┘

Format Specifications

Q4_K: 256-element super-blocks, 144 bytes (4.5 bits/weight)
Q5_K: 256-element super-blocks, 176 bytes (5.5 bits/weight)
Q6_K: 256-element super-blocks, 210 bytes (6.5625 bits/weight)
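
The bits-per-weight figures follow directly from the block sizes: bytes per super-block × 8 / 256. The sketch below reproduces that arithmetic and estimates the packed size of a tensor. The `KQuant` enum and its methods are hypothetical names used for illustration only and are not part of this crate's API; rounding a partial final super-block up to a whole one is likewise an assumption, not documented behavior.

```rust
/// Elements per K-quant super-block, as listed in the format specifications.
const SUPER_BLOCK: usize = 256;

/// Hypothetical helper type (not part of trueno-quant) naming the three formats.
#[derive(Clone, Copy, Debug, PartialEq)]
enum KQuant {
    Q4K,
    Q5K,
    Q6K,
}

impl KQuant {
    /// Bytes per 256-element super-block, taken from the specification list.
    fn block_bytes(self) -> usize {
        match self {
            KQuant::Q4K => 144,
            KQuant::Q5K => 176,
            KQuant::Q6K => 210,
        }
    }

    /// Average storage cost per weight: bytes * 8 / 256.
    fn bits_per_weight(self) -> f64 {
        (self.block_bytes() * 8) as f64 / SUPER_BLOCK as f64
    }

    /// Bytes needed for `n` weights, ASSUMING a partial final
    /// super-block is rounded up to a whole one.
    fn packed_len(self, n: usize) -> usize {
        let blocks = (n + SUPER_BLOCK - 1) / SUPER_BLOCK;
        blocks * self.block_bytes()
    }
}
```

Under these assumptions, a 4096-element tensor in Q4_K occupies 16 super-blocks, or 2304 bytes, compared with 16384 bytes as F32.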

Usage

use trueno_quant::{quantize_q4_k, dequantize_q4_k_to_f32};

// 256 values fill exactly one Q4_K super-block.
let data: Vec<f32> = (0..256).map(|i| i as f32 / 10.0).collect();
// Round trip: pack to Q4_K, then restore 256 f32 values.
let quantized = quantize_q4_k(&data);
let restored = dequantize_q4_k_to_f32(&quantized, 256);
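
K-quantization is lossy, so `restored` will generally differ slightly from `data`. The two helpers below are a minimal sketch for working with that: one measures the worst-case round-trip error, the other zero-pads an input to a multiple of 256 elements. Both `max_abs_error` and `pad_to_super_block` are hypothetical names, not functions exported by this crate, and whether the crate itself requires or performs such padding is an assumption to check against its documentation.

```rust
/// Largest absolute difference between original and restored values.
/// Hypothetical helper, not part of trueno-quant.
fn max_abs_error(original: &[f32], restored: &[f32]) -> f32 {
    assert_eq!(original.len(), restored.len(), "slices must have equal length");
    original
        .iter()
        .zip(restored)
        .map(|(a, b)| (a - b).abs())
        .fold(0.0_f32, f32::max)
}

/// Copy `data` and zero-pad it to a multiple of 256 elements
/// (one super-block). ASSUMPTION: padding with zeros is acceptable
/// for the caller; the crate may handle partial blocks differently.
fn pad_to_super_block(data: &[f32]) -> Vec<f32> {
    const SUPER_BLOCK: usize = 256;
    let mut padded = data.to_vec();
    let remainder = padded.len() % SUPER_BLOCK;
    if remainder != 0 {
        padded.resize(padded.len() + SUPER_BLOCK - remainder, 0.0);
    }
    padded
}
```

With the usage example above, `max_abs_error(&data, &restored)` would report the worst per-element deviation introduced by the Q4_K round trip.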