Vector quantization for memory-efficient storage
This module provides quantization methods to reduce memory footprint while maintaining acceptable search quality:
- Scalar Quantization (SQ8): f32 → u8 (4x compression, ~95% recall)
- Binary Quantization (BQ): f32 → bit (32x compression, ~85% recall)
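To make the scalar case concrete, here is a minimal, dependency-free sketch of per-dimension SQ8: fit a `(min, scale)` pair per dimension over the training set, then map each f32 affinely onto `0..=255`. The `fit_params`/`quantize`/`dequantize` helpers are illustrative names under that assumption, not `ScalarQuantizer`'s actual internals:

```rust
// SQ8 sketch: per-dimension affine mapping onto u8.
// Illustrative only; foxstash_core's ScalarQuantizer may differ in detail.
fn fit_params(vectors: &[Vec<f32>]) -> Vec<(f32, f32)> {
    let dims = vectors[0].len();
    (0..dims)
        .map(|d| {
            let mut min = f32::INFINITY;
            let mut max = f32::NEG_INFINITY;
            for v in vectors {
                min = min.min(v[d]);
                max = max.max(v[d]);
            }
            // (min, scale): x ≈ min + scale * q, with q in 0..=255
            (min, (max - min).max(f32::EPSILON) / 255.0)
        })
        .collect()
}

fn quantize(v: &[f32], params: &[(f32, f32)]) -> Vec<u8> {
    v.iter()
        .zip(params)
        .map(|(&x, &(min, scale))| ((x - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect()
}

fn dequantize(q: &[u8], params: &[(f32, f32)]) -> Vec<f32> {
    q.iter()
        .zip(params)
        .map(|(&b, &(min, scale))| min + scale * b as f32)
        .collect()
}
```

Each u8 code replaces a 4-byte float, which is where the 4x figure comes from; the rounding error introduced here is what costs the few points of recall.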
§Memory Comparison (1M vectors × 384 dims)
| Format | Size | Compression |
|---|---|---|
| f32 | 1.5 GB | 1x |
| int8 | 384 MB | 4x |
| binary | 48 MB | 32x |
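The table's numbers follow directly from the per-dimension storage cost. A quick check in Rust (using 1 MB = 10^6 bytes, matching the table's rounding):

```rust
fn main() {
    let (n, dims) = (1_000_000u64, 384u64);

    let f32_bytes = n * dims * 4; // 4 bytes per dimension
    let u8_bytes = n * dims;      // 1 byte per dimension
    let bin_bytes = n * dims / 8; // 1 bit per dimension

    println!("f32:    {:.1} GB", f32_bytes as f64 / 1e9); // ~1.5 GB
    println!("int8:   {} MB", u8_bytes / 1_000_000);      // 384 MB
    println!("binary: {} MB", bin_bytes / 1_000_000);     // 48 MB

    // Compression ratios relative to f32: 4x and 32x
    println!("ratios: {}x, {}x", f32_bytes / u8_bytes, f32_bytes / bin_bytes);
}
```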
§Usage
```rust
use foxstash_core::vector::quantize::{ScalarQuantizer, BinaryQuantizer, Quantizer};

let vector = vec![0.5, -0.3, 0.8, -0.1];

// Scalar quantization (4x compression)
let sq = ScalarQuantizer::fit(&[vector.clone()]);
let quantized = sq.quantize(&vector);
let reconstructed = sq.dequantize(&quantized);

// Binary quantization (32x compression)
let bq = BinaryQuantizer::new(4);
let binary = bq.quantize(&vector);
```
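For the binary case, the essential trick is packing one bit per dimension and comparing vectors with XOR plus popcount. A self-contained sketch, assuming a simple sign-threshold encoding packed into u64 words (not necessarily `BinaryQuantizer`'s actual encoding):

```rust
// Binary quantization sketch: one sign bit per dimension, packed into u64 words.
// Illustrative; foxstash_core's BinaryQuantizer encoding may differ.
fn binarize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

// Hamming distance = number of differing bits (XOR, then popcount).
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(&x, &y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let a = binarize(&[0.5, -0.3, 0.8, -0.1]);
    let b = binarize(&[0.4, 0.2, 0.7, -0.2]);
    assert_eq!(hamming(&a, &b), 1); // the signs disagree only in the second dimension
}
```

On modern CPUs the popcount lowers to a single instruction, which is why Hamming comparisons over packed bits are so much cheaper than f32 arithmetic; the hamming_distance_simd function listed under Functions below takes the same idea further.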
§Structs

- BinaryQuantizedVector - Quantized vector representation for binary quantization
- BinaryQuantizer - Binary quantization: f32 → bit
- ProductQuantizerConfig - Product Quantization configuration
- ScalarQuantizationParams - Scalar quantization parameters for a single dimension
- ScalarQuantizedVector - Quantized vector representation for SQ8
- ScalarQuantizer - Scalar quantization (SQ8): f32 → u8
§Traits

- Quantizer - Trait for vector quantization
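The usage example above calls quantize and dequantize through this trait. A trait of roughly this shape is one plausible way those calls could be unified across both quantizers; the actual Quantizer trait's associated types and signatures are not shown on this page and may differ:

```rust
// Hypothetical shape of a quantization trait; not foxstash_core's actual definition.
trait Quantizer {
    type Code;
    fn quantize(&self, v: &[f32]) -> Self::Code;
    fn dequantize(&self, code: &Self::Code) -> Vec<f32>;
}
```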
§Functions

- binary_dot_product - Compute dot product between f32 query and binary quantized vector
- hamming_distance_simd - SIMD-accelerated Hamming distance for binary vectors
- sq8_asymmetric_l2_distance_simd - SIMD-accelerated asymmetric L2 distance (f32 query vs SQ8 database)
- sq8_l2_distance_simd - SIMD-accelerated L2 distance for SQ8 vectors
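As a scalar reference for what the asymmetric distance computes: the query stays in f32 while each database code is decoded on the fly, so only the stored vector carries quantization error. A minimal sketch, reusing the assumed per-dimension `(min, scale)` parameters from the SQ8 sketch above (the `*_simd` functions vectorize this same loop):

```rust
// Scalar reference for asymmetric L2: f32 query vs SQ8-encoded database vector.
// Illustrative only; sq8_asymmetric_l2_distance_simd vectorizes this loop.
fn sq8_asymmetric_l2(query: &[f32], code: &[u8], params: &[(f32, f32)]) -> f32 {
    query
        .iter()
        .zip(code)
        .zip(params)
        .map(|((&q, &c), &(min, scale))| {
            let x = min + scale * c as f32; // decode the stored code on the fly
            (q - x) * (q - x)
        })
        .sum()
}
```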