Module quantization

Vector Quantization for Memory Optimization

Provides scalar (8-bit), 4-bit, binary (1-bit), and FP16 (half-precision) quantization.

§Scalar Quantization (8-bit)

Compresses float32 vectors to int8/uint8, reducing memory usage by ~4x.

Benefits:

  • Memory: 4x reduction (float32 → uint8)
  • Speed: Faster distance computations with SIMD
  • Scalability: Fit 4x more vectors in memory

Trade-offs:

  • Small accuracy loss (~1-2% recall degradation)
  • One-time quantization cost during build
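
Conceptually, 8-bit scalar quantization maps each float onto one of 256 evenly spaced levels between a fitted min and max. A minimal sketch of such a min/max affine scheme (illustrative only; ScalarQuantizer's actual internals may differ):

// Illustrative min/max affine quantization to u8; not the crate's real code.
fn quantize_u8(v: &[f32], min: f32, max: f32) -> Vec<u8> {
    // Guard against a degenerate range where all values are equal.
    let scale = ((max - min) / 255.0).max(f32::EPSILON);
    v.iter()
        .map(|&x| ((x - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect()
}

fn dequantize_u8(q: &[u8], min: f32, max: f32) -> Vec<f32> {
    let scale = ((max - min) / 255.0).max(f32::EPSILON);
    // Lossy reconstruction: code 0 maps back to min, code 255 to max.
    q.iter().map(|&c| min + c as f32 * scale).collect()
}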

§4-bit Quantization

Compresses float32 vectors to 4-bit, reducing memory usage by ~8x.

Benefits:

  • Memory: 8x reduction (float32 → 4-bit)
  • Speed: Fast distance computations with nibble packing (sketched after this list)
  • Scalability: Fit 8x more vectors in memory
  • Sweet Spot: Best balance between memory and accuracy

Trade-offs:

  • Moderate accuracy loss (~2-4% recall degradation)
  • Nibble packing/unpacking overhead
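
With 4-bit codes, two values share a single byte. A minimal sketch of the nibble packing idea (an illustration; FourBitQuantizer's actual layout may differ):

// Pack two 4-bit codes per byte, low nibble first; illustrative only.
fn pack_nibbles(codes: &[u8]) -> Vec<u8> {
    codes
        .chunks(2)
        .map(|pair| {
            let lo = pair[0] & 0x0F;
            let hi = pair.get(1).copied().unwrap_or(0) & 0x0F;
            (hi << 4) | lo
        })
        .collect()
}

// `len` drops the padding nibble when the original code count is odd.
fn unpack_nibbles(packed: &[u8], len: usize) -> Vec<u8> {
    packed
        .iter()
        .flat_map(|&b| [b & 0x0F, b >> 4])
        .take(len)
        .collect()
}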

§Binary Quantization (1-bit)

Compresses float32 vectors to 1-bit, reducing memory usage by ~32x.

Benefits:

  • Memory: 32x reduction (float32 → 1-bit)
  • Speed: Extremely fast Hamming distance with bitwise operations (sketched after this list)
  • Scalability: Fit 32x more vectors in memory

Trade-offs:

  • Higher accuracy loss (~5-10% recall degradation)
  • Best for high-dimensional vectors (>128 dims)
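
A minimal sketch of the underlying idea: keep one sign bit per dimension and compare vectors with XOR plus popcount (illustrative only; BinaryQuantizer's encoding and thresholding may differ):

// Binarize by sign (threshold at zero, an assumed choice) into packed u64 words.
fn binarize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

// Hamming distance: count differing bits via XOR + popcount.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}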

§FP16 (Half-Precision) Quantization

Compresses float32 vectors to float16 (16-bit), reducing memory usage by 2x.

Benefits:

  • Memory: 2x reduction (float32 → float16)
  • Accuracy: Minimal accuracy loss (<0.1% recall degradation)
  • Speed: No quantization overhead, direct float16 operations
  • Hardware Support: Native support on modern CPUs/GPUs

Trade-offs:

  • Lower compression ratio than 8-bit/4-bit/binary quantization
  • Requires FP16 hardware support for maximum performance
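
One way to express the f32 to f16 round trip in Rust is via the external half crate (an assumption for illustration; this module's own FP16 representation is not shown here):

// Illustrative f32 <-> f16 conversion using the `half` crate.
use half::f16;

fn compress_f16(v: &[f32]) -> Vec<f16> {
    v.iter().map(|&x| f16::from_f32(x)).collect()
}

fn decompress_f16(v: &[f16]) -> Vec<f32> {
    v.iter().map(|&h| h.to_f32()).collect()
}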

§Example

use oxify_vector::quantization::{ScalarQuantizer, QuantizationConfig};

let config = QuantizationConfig::default();
let mut quantizer = ScalarQuantizer::new(config);

// Fit quantizer to data
let vectors = vec![vec![1.0, 2.0, 3.0], vec![4.0, 5.0, 6.0]];
quantizer.fit(&vectors);

// Quantize a vector
let quantized = quantizer.quantize(&[1.5, 2.5, 3.5]);
assert_eq!(quantized.len(), 3);

// Dequantize back to floats
let dequantized = quantizer.dequantize(&quantized);
assert_eq!(dequantized.len(), 3);

Structs§

BinaryQuantizationConfig
    Binary quantization configuration
BinaryQuantizedIndex
    Binary quantized vector index for extreme memory efficiency
BinaryQuantizedIndexStats
    Statistics for binary quantized index
BinaryQuantizer
    Binary quantizer for extreme memory compression (32x reduction)
FourBitQuantizedIndex
    4-bit quantized vector index for balanced memory efficiency
FourBitQuantizedIndexStats
    Statistics for 4-bit quantized index
FourBitQuantizer
    4-bit quantizer for balanced memory/accuracy trade-off (8x compression)
QuantizationConfig
    Quantization configuration
QuantizedIndexStats
    Statistics for quantized index
QuantizedVectorIndex
    Quantized vector index for memory-efficient search
ScalarQuantizer
    Scalar quantizer for compressing float32 vectors