Module quantization

Scalar Quantization (SQ8) and Binary Quantization for memory-efficient vector storage.

This module implements quantization strategies that reduce memory usage.

§Benefits

| Metric | f32 | SQ8 | Binary |
|---|---|---|---|
| RAM/vector (768d) | 3 KB | 770 bytes | 96 bytes |
| Cache efficiency | Baseline | ~4x better | ~32x better |
| Recall loss | 0% | ~0.5-1% | ~5-10% |
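
The memory figures above follow from simple per-dimension arithmetic. The sketch below reproduces them for the raw payload only; the function names are hypothetical and not part of this module. The table's 770 bytes for SQ8 presumably includes a small amount of per-vector metadata on top of the 768 code bytes, but the exact layout is not stated here, so the sketch leaves it out.

```rust
/// Payload bytes for one vector of `dim` f32 components.
pub fn f32_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Payload bytes for one SQ8 vector: one u8 code per dimension.
/// Per-vector metadata (for example the value range) is NOT counted (assumption).
pub fn sq8_code_bytes(dim: usize) -> usize {
    dim
}

/// Payload bytes for one binary vector: one bit per dimension, rounded up to whole bytes.
pub fn binary_bytes(dim: usize) -> usize {
    (dim + 7) / 8
}
```

For 768 dimensions this gives 3072 bytes (3 KB) for f32, 768 code bytes for SQ8, and 96 bytes for binary, which matches the ~4x and ~32x ratios in the table.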

Structs§

BinaryQuantizedVector
A binary quantized vector using 1 bit per dimension.
PQCodebook
Per-subspace centroid tables learned with k-means.
PQVector
Compressed representation of a vector: one centroid id per subspace.
ProductQuantizer
Product quantizer model and helpers for train/encode/decode.
QuantizedVector
A quantized vector using 8-bit scalar quantization.
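
As an illustration of what 8-bit scalar and 1-bit binary quantization typically involve, here is a minimal sketch. It is not this module's implementation: `Sq8Sketch`, `sq8_encode`, `sq8_decode`, `binary_encode`, and `hamming` are hypothetical names, and the choices of a per-vector min/max range for SQ8 and a sign threshold at 0.0 for binary are assumptions.

```rust
/// Hypothetical SQ8 container: one u8 code per dimension plus the value range.
pub struct Sq8Sketch {
    pub codes: Vec<u8>,
    pub min: f32,
    pub max: f32,
}

/// Maps each component linearly from [min, max] onto 0..=255 (assumed scheme).
pub fn sq8_encode(v: &[f32]) -> Sq8Sketch {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    // A constant or empty vector has no positive range; scale 0 makes every code 0.
    let scale = if range > 0.0 { 255.0 / range } else { 0.0 };
    let codes = v.iter().map(|&x| ((x - min) * scale).round() as u8).collect();
    Sq8Sketch { codes, min, max }
}

/// Reconstructs approximate f32 values from the codes.
pub fn sq8_decode(q: &Sq8Sketch) -> Vec<f32> {
    let step = (q.max - q.min) / 255.0;
    q.codes.iter().map(|&c| q.min + c as f32 * step).collect()
}

/// Packs one bit per dimension, least significant bit first.
/// A bit is set when the component is >= 0.0 (assumed threshold).
pub fn binary_encode(v: &[f32]) -> Vec<u8> {
    let mut bits = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            bits[i / 8] |= 1 << (i % 8);
        }
    }
    bits
}

/// Hamming distance between two packed bit vectors of equal length.
pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}
```

With this scheme the SQ8 reconstruction error per component is at most half a quantization step, `(max - min) / 510`, while binary codes keep only the sign and are usually compared with Hamming distance.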

Enums§

StorageMode
Storage mode for vectors.

Functions§

cosine_similarity_quantized
Computes the approximate cosine similarity between a query vector (f32) and a quantized vector.
cosine_similarity_quantized_simd
SIMD-optimized cosine similarity between an f32 query and an SQ8 quantized vector.
distance_pq
Asymmetric distance computation (ADC): query is f32, candidate is PQ-coded.
dot_product_quantized
Computes the approximate dot product between a query vector (f32) and a quantized vector.
dot_product_quantized_simd
SIMD-optimized dot product between an f32 query and an SQ8 quantized vector.
euclidean_squared_quantized
Computes the approximate squared Euclidean distance between a query vector (f32) and a quantized vector.
euclidean_squared_quantized_simd
SIMD-optimized squared Euclidean distance between an f32 query and an SQ8 quantized vector.
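
The functions listed above are all asymmetric: the query stays in f32 and only the stored vector is compressed. A scalar (non-SIMD) sketch of that idea follows. It is not this module's API: the names `dequant`, `dot_sq8`, `l2sq_sq8`, `cosine_sq8`, and `adc_pq` are hypothetical, the SQ8 codes are assumed to map a per-vector `[min, max]` range linearly onto 0..=255, and the PQ distance is assumed to be squared Euclidean with a `codebook[subspace][centroid]` layout.

```rust
/// Dequantizes one SQ8 code, assuming codes map [min, max] linearly onto 0..=255.
fn dequant(code: u8, min: f32, max: f32) -> f32 {
    min + code as f32 * (max - min) / 255.0
}

/// Approximate dot product between an f32 query and SQ8 codes.
pub fn dot_sq8(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    query.iter().zip(codes).map(|(&q, &c)| q * dequant(c, min, max)).sum()
}

/// Approximate squared Euclidean distance between an f32 query and SQ8 codes.
pub fn l2sq_sq8(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    query
        .iter()
        .zip(codes)
        .map(|(&q, &c)| {
            let d = q - dequant(c, min, max);
            d * d
        })
        .sum()
}

/// Approximate cosine similarity; returns 0.0 when either norm is zero.
pub fn cosine_sq8(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    let (mut dot, mut qn, mut vn) = (0.0f32, 0.0f32, 0.0f32);
    for (&q, &c) in query.iter().zip(codes) {
        let x = dequant(c, min, max);
        dot += q * x;
        qn += q * q;
        vn += x * x;
    }
    let denom = (qn * vn).sqrt();
    if denom > 0.0 { dot / denom } else { 0.0 }
}

/// Asymmetric distance computation for PQ codes: for each subspace, take the
/// squared distance from the query slice to the centroid selected by the code,
/// then sum over subspaces. Each centroid has `sub_dim` components.
pub fn adc_pq(query: &[f32], codes: &[u8], codebook: &[Vec<Vec<f32>>], sub_dim: usize) -> f32 {
    codes
        .iter()
        .enumerate()
        .map(|(s, &k)| {
            let q = &query[s * sub_dim..(s + 1) * sub_dim];
            let c = &codebook[s][k as usize];
            q.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum::<f32>()
        })
        .sum()
}
```

Keeping the query in full precision means only the stored side contributes quantization error. A production ADC implementation would usually precompute a per-query table of query-to-centroid distances for each subspace so that scoring a candidate reduces to table lookups; the sketch recomputes them for clarity.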