Module quantization

Quantization techniques for memory compression

This module provides tiered quantization strategies as specified in ADR-001:

Quantization    Compression    Use Case
Scalar (u8)     4x             Warm data (40-80% access)
Int4            8x             Cool data (10-40% access)
Product         8-16x          Cold data (1-10% access)
Binary          32x            Archive (<1% access)
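
The compression ratios in the table follow from simple arithmetic if the uncompressed baseline is assumed to be f32 (4 bytes per component). The sketch below is illustrative only and is not this module's API: the helper names and the `subvector_dim` parameter are hypothetical, and the product quantization figure assumes one u8 code per subspace with codebook storage ignored.

```rust
/// Bytes needed to store `dim` components at `bits_per_dim` bits each,
/// rounded up to a whole byte.
fn packed_bytes(dim: usize, bits_per_dim: usize) -> usize {
    (dim * bits_per_dim + 7) / 8
}

/// Bytes for a product-quantized vector, assuming one u8 code per subspace.
/// `subvector_dim` is a hypothetical parameter; codebook storage is ignored.
fn pq_bytes(dim: usize, subvector_dim: usize) -> usize {
    assert!(subvector_dim > 0 && dim % subvector_dim == 0);
    dim / subvector_dim
}

/// Compression ratio relative to an assumed f32 baseline (4 bytes per component).
fn ratio_vs_f32(dim: usize, stored_bytes: usize) -> f64 {
    (dim * 4) as f64 / stored_bytes as f64
}
```

Under these assumptions, a 128-dimensional vector takes 128 bytes as u8 codes (4x), 64 bytes as packed 4-bit codes (8x), and 16 bytes as one bit per component (32x). Product quantization lands at 8x or 16x when each code covers 2 or 4 components respectively.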

§Performance Optimizations v2

  • SIMD-accelerated distance calculations for scalar (int8) quantization
  • SIMD popcnt for binary Hamming distance
  • 4x loop unrolling for better instruction-level parallelism
  • Separate accumulator strategy to reduce data dependencies
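
The unrolling and accumulator ideas above can be sketched in portable Rust. This is a minimal illustration, not the module's implementation: it uses no explicit SIMD intrinsics, the function names are hypothetical, and it assumes the distance is squared Euclidean over u8 codes and that binary vectors are packed into u64 words. Four independent accumulators break the dependency chain between iterations, and `count_ones` typically compiles to a popcnt instruction where the target supports it.

```rust
/// Squared Euclidean distance over u8 codes, unrolled 4x with one
/// accumulator per lane so the additions do not depend on each other.
/// Assumes inputs are short enough that the u32 sums do not overflow.
fn squared_distance_u8(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len());
    let (mut acc0, mut acc1, mut acc2, mut acc3) = (0u32, 0u32, 0u32, 0u32);

    let chunks_a = a.chunks_exact(4);
    let chunks_b = b.chunks_exact(4);
    // Grab the leftover elements before the iterators are consumed.
    let (rest_a, rest_b) = (chunks_a.remainder(), chunks_b.remainder());

    for (ca, cb) in chunks_a.zip(chunks_b) {
        let d0 = ca[0] as i32 - cb[0] as i32;
        let d1 = ca[1] as i32 - cb[1] as i32;
        let d2 = ca[2] as i32 - cb[2] as i32;
        let d3 = ca[3] as i32 - cb[3] as i32;
        acc0 += (d0 * d0) as u32;
        acc1 += (d1 * d1) as u32;
        acc2 += (d2 * d2) as u32;
        acc3 += (d3 * d3) as u32;
    }

    // Handle the tail that does not fill a full group of four.
    let mut tail = 0u32;
    for (&x, &y) in rest_a.iter().zip(rest_b) {
        let d = x as i32 - y as i32;
        tail += (d * d) as u32;
    }

    acc0 + acc1 + acc2 + acc3 + tail
}

/// Hamming distance between two bit vectors packed into u64 words:
/// XOR marks differing bits, and a population count tallies them.
fn hamming_distance(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```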

Structs§

BinaryQuantized
Binary quantization (32x compression)
Int4Quantized
Int4 quantization (8x compression)
ProductQuantized
Product quantization (8-16x compression)
ScalarQuantized
Scalar quantization to int8 (4x compression)
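
To make the two ends of the compression range concrete, here is a minimal sketch of scalar and binary quantization. It does not reproduce the `ScalarQuantized` or `BinaryQuantized` types, whose fields and methods are not shown on this page. The free functions below are hypothetical, and they assume min-max scaling to unsigned 8-bit codes for the scalar case and a sign test for the binary case.

```rust
/// Min-max scalar quantization to u8 codes (hypothetical helper).
/// Returns the codes plus the `min` and `scale` needed to reconstruct values.
fn quantize_u8(values: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = values.iter().copied().fold(f32::INFINITY, f32::min);
    let max = values.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    // Fall back to a scale of 1.0 for empty or constant input.
    let scale = if range > 0.0 { range / 255.0 } else { 1.0 };
    let codes = values
        .iter()
        .map(|&v| ((v - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    (codes, min, scale)
}

/// Approximate reconstruction of the original values from u8 codes.
fn dequantize_u8(codes: &[u8], min: f32, scale: f32) -> Vec<f32> {
    codes.iter().map(|&c| min + c as f32 * scale).collect()
}

/// Binary quantization: one bit per component, set when the value is
/// strictly positive, packed least-significant bit first into bytes.
fn quantize_binary(values: &[f32]) -> Vec<u8> {
    let mut bits = vec![0u8; (values.len() + 7) / 8];
    for (i, &v) in values.iter().enumerate() {
        if v > 0.0 {
            bits[i / 8] |= 1u8 << (i % 8);
        }
    }
    bits
}
```

Scalar codes keep an approximation of each component (the rounding error is at most half of `scale`), while binary codes keep only the sign, which is why they suit rarely accessed archive data.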

Traits§

QuantizedVector
Trait for quantized vector representations