Module quantization

Scalar Quantization (SQ8) for memory-efficient vector storage.

This module implements 8-bit scalar quantization, which stores each vector component as one byte instead of a 4-byte f32. This reduces memory usage by about 4x while keeping recall above 95%.
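
As an illustration of the technique, here is a minimal sketch of min/max scalar quantization. `Sq8Sketch`, its fields, and its methods are hypothetical names chosen for this example; the field layout is an assumption and may differ from the module's `QuantizedVector`.

```rust
/// Hypothetical SQ8 container (assumed layout, not the module's `QuantizedVector`).
#[derive(Debug, Clone, PartialEq)]
pub struct Sq8Sketch {
    /// One 8-bit code per dimension.
    pub codes: Vec<u8>,
    /// Smallest component of the original vector.
    pub min: f32,
    /// Largest component of the original vector.
    pub max: f32,
}

impl Sq8Sketch {
    /// Maps each component linearly from [min, max] onto the codes 0..=255.
    pub fn from_f32(v: &[f32]) -> Self {
        if v.is_empty() {
            return Self { codes: Vec::new(), min: 0.0, max: 0.0 };
        }
        let min = v.iter().copied().fold(f32::INFINITY, f32::min);
        let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let range = max - min;
        // A constant vector has zero range; every component then maps to code 0.
        let scale = if range > 0.0 { 255.0 / range } else { 0.0 };
        let codes = v
            .iter()
            .map(|&x| ((x - min) * scale).round().clamp(0.0, 255.0) as u8)
            .collect();
        Self { codes, min, max }
    }

    /// Reconstructs approximate f32 values from the stored codes.
    pub fn to_f32(&self) -> Vec<f32> {
        let step = (self.max - self.min) / 255.0;
        self.codes
            .iter()
            .map(|&c| self.min + f32::from(c) * step)
            .collect()
    }
}
```

Under this scheme the reconstruction error per component is at most half a quantization step, `(max - min) / 510`, which is why recall typically degrades only slightly.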

§Benefits

Metric              f32        SQ8
RAM/vector (768d)   3 KB       770 bytes
Cache efficiency    Baseline   ~4x better
Recall loss         0%         ~0.5-1%
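
The RAM figures above follow from component sizes. The sketch below counts only raw component bytes; `payload_bytes` is a hypothetical helper written for this example. The 770-byte SQ8 figure presumably includes a small amount of per-vector metadata on top of the 768 code bytes, but this page does not state the exact layout.

```rust
/// Bytes needed for the raw components of one vector of `dim` dimensions,
/// excluding any per-vector metadata. Hypothetical helper for illustration.
pub fn payload_bytes<T>(dim: usize) -> usize {
    dim * std::mem::size_of::<T>()
}

/// Ratio between the f32 payload and the 8-bit payload for the same dimension.
pub fn f32_to_u8_ratio(dim: usize) -> usize {
    payload_bytes::<f32>(dim) / payload_bytes::<u8>(dim)
}
```

For 768 dimensions this gives 3072 bytes (3 KB) for f32 and 768 bytes for the 8-bit codes, a 4x reduction before metadata.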

Structs§

BinaryQuantizedVector
A binary quantized vector using 1 bit per dimension.
QuantizedVector
A quantized vector using 8-bit scalar quantization.
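
For comparison with SQ8, here is a minimal sketch of 1-bit quantization. `BinarySketch` is a hypothetical type, and both the sign threshold at zero and the Hamming-distance comparison are assumptions made for this example, not the documented behavior of `BinaryQuantizedVector`.

```rust
/// Hypothetical 1-bit-per-dimension container (not the module's `BinaryQuantizedVector`).
pub struct BinarySketch {
    /// Packed bits, 8 dimensions per byte, least significant bit first.
    pub bits: Vec<u8>,
    /// Number of dimensions actually encoded.
    pub dim: usize,
}

impl BinarySketch {
    /// Sets bit `i` when component `i` is non-negative (assumed threshold: 0.0).
    pub fn from_f32(v: &[f32]) -> Self {
        let mut bits = vec![0u8; (v.len() + 7) / 8];
        for (i, &x) in v.iter().enumerate() {
            if x >= 0.0 {
                bits[i / 8] |= 1 << (i % 8);
            }
        }
        Self { bits, dim: v.len() }
    }

    /// Counts the dimensions in which the two bit patterns differ.
    pub fn hamming(&self, other: &Self) -> u32 {
        assert_eq!(self.dim, other.dim, "dimension mismatch");
        self.bits
            .iter()
            .zip(&other.bits)
            .map(|(a, b)| (a ^ b).count_ones())
            .sum()
    }
}
```

A 768-dimensional vector packs into 96 bytes this way, at the cost of a coarser similarity estimate than SQ8.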

Enums§

StorageMode
Storage mode for vectors.

Functions§

cosine_similarity_quantized
Computes the approximate cosine similarity between an f32 query and a quantized vector.
cosine_similarity_quantized_simd
SIMD-optimized cosine similarity between an f32 query and an SQ8 quantized vector.
dot_product_quantized
Computes the approximate dot product between an f32 query and a quantized vector.
dot_product_quantized_simd
SIMD-optimized dot product between an f32 query and an SQ8 quantized vector.
euclidean_squared_quantized
Computes the approximate squared Euclidean distance between an f32 query and a quantized vector.
euclidean_squared_quantized_simd
SIMD-optimized squared Euclidean distance between an f32 query and an SQ8 quantized vector.
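
The functions above compare a full-precision query against quantized storage. The following is a minimal scalar sketch of that idea, assuming codes that map linearly onto `[min, max]`. The names `decode`, `dot_sketch`, `euclidean_squared_sketch`, and `cosine_sketch` and their signatures are hypothetical; the module's real functions may take different arguments, and the SIMD variants are not reproduced here.

```rust
/// Decodes one 8-bit code back to an approximate f32 value.
fn decode(code: u8, min: f32, step: f32) -> f32 {
    min + f32::from(code) * step
}

/// Approximate dot product between an f32 query and SQ8 codes.
pub fn dot_sketch(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    assert_eq!(query.len(), codes.len(), "dimension mismatch");
    let step = (max - min) / 255.0;
    query
        .iter()
        .zip(codes)
        .map(|(&q, &c)| q * decode(c, min, step))
        .sum()
}

/// Approximate squared Euclidean distance between an f32 query and SQ8 codes.
pub fn euclidean_squared_sketch(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    assert_eq!(query.len(), codes.len(), "dimension mismatch");
    let step = (max - min) / 255.0;
    query
        .iter()
        .zip(codes)
        .map(|(&q, &c)| {
            let d = q - decode(c, min, step);
            d * d
        })
        .sum()
}

/// Approximate cosine similarity; returns 0.0 when either vector has zero norm.
pub fn cosine_sketch(query: &[f32], codes: &[u8], min: f32, max: f32) -> f32 {
    assert_eq!(query.len(), codes.len(), "dimension mismatch");
    let step = (max - min) / 255.0;
    let (mut dot, mut q_norm, mut x_norm) = (0.0_f32, 0.0_f32, 0.0_f32);
    for (&q, &c) in query.iter().zip(codes) {
        let x = decode(c, min, step);
        dot += q * x;
        q_norm += q * q;
        x_norm += x * x;
    }
    let denom = (q_norm * x_norm).sqrt();
    if denom == 0.0 { 0.0 } else { dot / denom }
}
```

Only the stored vector is quantized in this sketch; the query stays in f32, so the approximation error comes from one side of the comparison only.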