§iqdb-quantize
Vector quantization for the iqdb embedded vector-database spine. The
crate compresses f32 embedding vectors into compact codes that preserve
similarity-search quality. It ships three schemes behind one trait:
- ScalarQuantizer: scalar quantization (SQ8, 4× compression). Per-dimension affine calibration learned from a training sample; codes are u8. Asymmetric distance dequantizes the candidate to a temporary buffer and routes through iqdb_distance::compute for every metric.
- BinaryQuantizer: binary quantization (BQ, 32× compression). One bit per dimension, thresholded against a trained per-dimension mean; codes are packed into u64 words. Hamming distance is computed directly on the packed codes via XOR + popcount. BQ supports DistanceMetric::Hamming only; other metrics return IqdbError::InvalidMetric.
- ProductQuantizer: product quantization (PQ, configurable compression: M bytes per code, e.g. M = 16 shrinks a 768-dim f32 vector from 3072 bytes to 16). Splits each vector into M subvectors and learns a K-centroid codebook per position via deterministic k-means (k-means++ seeding, Lloyd’s iterations, seeded by ProductQuantizer::seed). Asymmetric distance computation (ADC) precomputes per-subvector distance tables and scores codes by table lookup + summation. PQ supports DistanceMetric::Euclidean, DistanceMetric::DotProduct, and DistanceMetric::Manhattan; DistanceMetric::Cosine (no global norm) and DistanceMetric::Hamming (wrong code space) return IqdbError::InvalidMetric.
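To make the binary scheme concrete, here is a minimal standalone sketch of thresholding against a per-dimension mean, packing bits into u64 words, and computing Hamming distance with XOR + popcount. It does not use this crate: the function names per_dimension_mean, binarize, and hamming are hypothetical, and the bit order within each word is an assumption.

```rust
/// Hypothetical helper: per-dimension mean of a training sample.
/// Returns an empty vector for an empty sample.
fn per_dimension_mean(sample: &[&[f32]]) -> Vec<f32> {
    let dim = sample.first().map_or(0, |v| v.len());
    let mut means = vec![0.0_f32; dim];
    for v in sample {
        for (m, x) in means.iter_mut().zip(v.iter()) {
            *m += *x;
        }
    }
    let n = sample.len() as f32;
    means.iter_mut().for_each(|m| *m /= n);
    means
}

/// One bit per dimension: bit i is set when v[i] exceeds the trained mean.
/// Assumes `v` and `means` have the same length; bit i lives in word i / 64
/// at position i % 64 (an assumed layout, not necessarily the crate's).
fn binarize(v: &[f32], means: &[f32]) -> Vec<u64> {
    let mut words = vec![0_u64; (v.len() + 63) / 64];
    for (i, (x, m)) in v.iter().zip(means).enumerate() {
        if x > m {
            words[i / 64] |= 1_u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance on packed codes: XOR the words, then count set bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

Because the distance works on whole 64-bit words, a 768-dimension code needs only 12 XOR + popcount steps per comparison.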
Every method of the Quantizer trait is fallible and returns
iqdb_types::Result. The library never panics on bad input.
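The ADC idea described for ProductQuantizer above can be sketched without the crate. The example below assumes codebooks are already trained (k-means is omitted), uses squared Euclidean distance, and panics on malformed input for brevity, unlike the library. The Codebooks alias and the functions encode, build_tables, and adc_score are hypothetical names, not this crate's API.

```rust
/// Hypothetical layout: codebooks[m][k] is centroid k for subvector position m.
/// Assumes M >= 1, K <= 256, and a vector length divisible by M.
type Codebooks = Vec<Vec<Vec<f32>>>;

fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Encode: for each subvector, store the index of its nearest centroid.
fn encode(v: &[f32], books: &Codebooks) -> Vec<u8> {
    let sub_dim = v.len() / books.len();
    v.chunks(sub_dim)
        .zip(books)
        .map(|(sub, book)| {
            let mut best = 0_usize;
            for (k, c) in book.iter().enumerate() {
                if squared_l2(sub, c) < squared_l2(sub, &book[best]) {
                    best = k;
                }
            }
            best as u8
        })
        .collect()
}

/// Per-query tables: tables[m][k] = distance from query subvector m to centroid k.
fn build_tables(query: &[f32], books: &Codebooks) -> Vec<Vec<f32>> {
    let sub_dim = query.len() / books.len();
    query
        .chunks(sub_dim)
        .zip(books)
        .map(|(sub, book)| book.iter().map(|c| squared_l2(sub, c)).collect())
        .collect()
}

/// Score a code by M table lookups plus a sum; no centroid is touched here.
fn adc_score(tables: &[Vec<f32>], code: &[u8]) -> f32 {
    tables.iter().zip(code).map(|(t, &k)| t[k as usize]).sum()
}
```

The tables cost M × K distance evaluations once per query; after that, each candidate costs only M lookups, which is where the speedup over dequantize-and-compare comes from.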
§How to use quantization correctly
Quantization is lossy by design. Two rules:
- Train on representative data. Per-dimension calibration is only as good as the sample it was learned from. Train on the embeddings you intend to index, not a synthetic placeholder.
- Search quantized, rerank with full f32. Quantized distance narrows the candidate set cheaply; the final ranking should use the original f32 vectors. Skipping the rerank step is the most common cause of “quantization broke recall” reports.
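The two rules can be illustrated with a standalone sketch of a two-stage search. It assumes a simple min/max affine calibration per dimension and squared Euclidean distance; the crate's actual calibration and API may differ, and every name here (train_ranges, quantize, dequantize, top_by, search) is hypothetical.

```rust
/// Hypothetical calibration: per-dimension (min, max) from the training sample.
fn train_ranges(sample: &[&[f32]]) -> Vec<(f32, f32)> {
    let dim = sample.first().map_or(0, |v| v.len());
    let mut ranges = vec![(f32::INFINITY, f32::NEG_INFINITY); dim];
    for v in sample {
        for (r, &x) in ranges.iter_mut().zip(v.iter()) {
            r.0 = r.0.min(x);
            r.1 = r.1.max(x);
        }
    }
    ranges
}

/// Map each value affinely onto 0..=255; values outside the range are clamped.
fn quantize(v: &[f32], ranges: &[(f32, f32)]) -> Vec<u8> {
    v.iter()
        .zip(ranges)
        .map(|(&x, &(lo, hi))| {
            let span = hi - lo;
            if span > 0.0 {
                ((x - lo) / span * 255.0).round().clamp(0.0, 255.0) as u8
            } else {
                0
            }
        })
        .collect()
}

fn dequantize(code: &[u8], ranges: &[(f32, f32)]) -> Vec<f32> {
    code.iter()
        .zip(ranges)
        .map(|(&c, &(lo, hi))| lo + f32::from(c) / 255.0 * (hi - lo))
        .collect()
}

fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Keep the `keep` ids with the smallest score.
fn top_by(ids: impl Iterator<Item = usize>, score: impl Fn(usize) -> f32, keep: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = ids.map(|i| (i, score(i))).collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.into_iter().take(keep).map(|(i, _)| i).collect()
}

/// Stage 1 shortlists on quantized codes; stage 2 reranks with the original f32 vectors.
fn search(
    query: &[f32],
    vectors: &[Vec<f32>],
    codes: &[Vec<u8>],
    ranges: &[(f32, f32)],
    shortlist: usize,
    top_k: usize,
) -> Vec<usize> {
    let coarse = top_by(
        0..codes.len(),
        |i| squared_l2(query, &dequantize(&codes[i], ranges)),
        shortlist,
    );
    top_by(coarse.into_iter(), |i| squared_l2(query, &vectors[i]), top_k)
}
```

Choosing a shortlist several times larger than top_k gives the rerank stage room to correct ordering errors introduced by quantization. Values outside the trained range are clamped, which is why an unrepresentative training sample hurts recall.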
§Example
use iqdb_quantize::{Quantizer, ScalarQuantizer};
use iqdb_types::DistanceMetric;
let training = [
    vec![0.10_f32, 0.20, 0.30],
    vec![0.15, 0.18, 0.32],
    vec![0.12, 0.22, 0.28],
];
let refs: Vec<&[f32]> = training.iter().map(Vec::as_slice).collect();
let mut sq = ScalarQuantizer::new();
sq.train(&refs).expect("non-empty, consistent dims, finite values");
let code = sq.quantize(&[0.11, 0.21, 0.29]).expect("dim matches training");
let d = sq
    .distance(&[0.10, 0.20, 0.30], &code, DistanceMetric::Cosine)
    .expect("dim matches");
assert!(d.is_finite());
§Errors
Every fallible call returns iqdb_types::Result. Empty or non-finite
inputs surface as IqdbError::InvalidVector; dimension drift as
IqdbError::DimensionMismatch; calling a hot method before
Quantizer::train returns IqdbError::InvalidConfig; a non-Hamming
metric against BinaryQuantizer or an unsupported metric
(DistanceMetric::Cosine, DistanceMetric::Hamming) against
ProductQuantizer returns IqdbError::InvalidMetric.
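As an illustration of the "return an error, never panic" pattern described above, here is a standalone sketch of input validation. SketchError is a stand-in enum that only borrows the variant names; it is not the real IqdbError, and its fields, the validate helper, and the order of the checks are all assumptions.

```rust
/// Stand-in for the error kinds named above; NOT the real `IqdbError` type.
/// The fields on `DimensionMismatch` are hypothetical.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidVector,
    DimensionMismatch { expected: usize, found: usize },
    InvalidConfig,
}

/// Hypothetical validation helper. `trained_dim` is `None` before training.
/// The check order (trained, then well-formed, then dimension) is an assumption.
fn validate(v: &[f32], trained_dim: Option<usize>) -> Result<(), SketchError> {
    // Not trained yet: the quantizer has no dimension to check against.
    let expected = trained_dim.ok_or(SketchError::InvalidConfig)?;
    // Empty input, NaN, and infinities are rejected instead of propagated.
    if v.is_empty() || v.iter().any(|x| !x.is_finite()) {
        return Err(SketchError::InvalidVector);
    }
    // Dimension drift between training and use.
    if v.len() != expected {
        return Err(SketchError::DimensionMismatch { expected, found: v.len() });
    }
    Ok(())
}
```

Callers can then match on the error kind to decide whether to retrain, drop the offending vector, or surface a configuration problem.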
Structs§
- BinaryQuantizer
  Binary quantizer (BQ): one bit per dimension, 32× compression.
- BqCode
  A binary-quantized (BQ) code: one bit per dimension, packed into u64 words.
- PqAdcTables
  Per-(query, metric) precomputed ADC lookup tables built from a ProductQuantizer.
- PqCode
  A product-quantized (PQ) code: one u8 centroid index per subvector.
- ProductQuantizer
  Product quantizer: M subvectors × K centroids per subvector.
- ScalarQuantizer
  Scalar quantizer (SQ8): one u8 per dimension, 4× compression.
- Sq8Code
  A scalar-quantized (SQ8) code: one u8 per dimension of the trained vector space.
Constants§
- VERSION
  The version of this crate, taken from Cargo.toml at compile time.
Traits§
- Quantizer
  A vector quantizer.