Scalar Quantization (SQ8) for memory-efficient vector storage.
This module implements 8-bit scalar quantization, which stores each dimension in one byte instead of a 4-byte f32. This reduces vector memory usage by about 4x while maintaining >95% recall.
§Benefits
| Metric | f32 | SQ8 |
|---|---|---|
| RAM/vector (768d) | 3 KB | 770 bytes |
| Cache efficiency | Baseline | ~4x better |
| Recall loss | 0% | ~0.5-1% |
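The numbers in the table follow from storing one byte per dimension instead of four. As a minimal sketch of how 8-bit scalar quantization can work, assuming a per-vector min/max range (the type `SketchSq8`, its fields, and the functions `quantize` and `dequantize` are hypothetical names for illustration, not this module's actual API or layout):

```rust
/// Hypothetical SQ8 container. Field names are illustrative only and are
/// not taken from this module's real `QuantizedVector`.
#[derive(Debug, Clone)]
pub struct SketchSq8 {
    /// One byte per dimension.
    pub codes: Vec<u8>,
    /// Smallest value of the original vector.
    pub min: f32,
    /// Width of one quantization step: (max - min) / 255.
    pub scale: f32,
}

/// Maps each f32 component linearly from [min, max] onto the codes 0..=255.
pub fn quantize(v: &[f32]) -> SketchSq8 {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against constant (or empty) input so we never divide by zero.
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|&x| ((x - min) / scale).round().clamp(0.0, 255.0) as u8)
        .collect();
    SketchSq8 { codes, min, scale }
}

/// Reconstructs an approximation of the original vector.
/// Each component is off by at most about half a quantization step.
pub fn dequantize(q: &SketchSq8) -> Vec<f32> {
    q.codes
        .iter()
        .map(|&c| q.min + c as f32 * q.scale)
        .collect()
}
```

Under this assumed layout, a 768-dimensional vector needs 768 code bytes plus a few bytes of per-vector parameters, compared with 768 × 4 = 3072 bytes for f32. The exact per-vector overhead depends on the real layout, which this page does not specify.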
Structs§
- BinaryQuantizedVector - A binary quantized vector using 1 bit per dimension.
- QuantizedVector - A quantized vector using 8-bit scalar quantization.
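To make "1 bit per dimension" concrete, here is a minimal sketch of one common binary quantization scheme: keep only the sign of each component and compare vectors by Hamming distance. The zero threshold, the bit order, and the function names `binarize` and `hamming` are assumptions for illustration. This page does not say how the binary struct actually encodes or compares vectors.

```rust
/// Hypothetical 1-bit quantizer: bit `i` is set when `v[i] > 0.0`.
/// Bits are packed least-significant-bit first, eight dimensions per byte.
pub fn binarize(v: &[f32]) -> Vec<u8> {
    let mut bits = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            bits[i / 8] |= 1 << (i % 8);
        }
    }
    bits
}

/// Number of differing bits between two packed vectors of equal length.
/// A smaller value means the sign patterns agree in more dimensions.
pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
    assert_eq!(a.len(), b.len(), "packed vectors must have equal length");
    a.iter().zip(b).map(|(&x, &y)| (x ^ y).count_ones()).sum()
}
```

With this packing, a 768-dimensional vector occupies 768 / 8 = 96 bytes.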
Enums§
- StorageMode - Storage mode for vectors.
Functions§
- cosine_similarity_quantized - Computes the approximate cosine similarity between a query vector (f32) and a quantized vector.
- dot_product_quantized - Computes the approximate dot product between a query vector (f32) and a quantized vector.
- euclidean_squared_quantized - Computes the approximate squared Euclidean distance between a query vector (f32) and a quantized vector.
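The three functions above compare an unquantized f32 query against a quantized vector. The following sketch shows how such asymmetric distances can be computed without materializing a dequantized copy, assuming each component is reconstructed as `min + scale * code`. The type `Sq8View` and the `*_sketch` function names are hypothetical. The real signatures and internals are not shown on this page.

```rust
/// Hypothetical borrowed view of an SQ8 vector, where component `i`
/// is approximated as `min + scale * codes[i]`.
pub struct Sq8View<'a> {
    pub codes: &'a [u8],
    pub min: f32,
    pub scale: f32,
}

/// Approximate dot product, using
/// sum(q_i * (min + scale * c_i)) = min * sum(q_i) + scale * sum(q_i * c_i).
pub fn dot_sketch(query: &[f32], v: &Sq8View<'_>) -> f32 {
    assert_eq!(query.len(), v.codes.len(), "dimension mismatch");
    let mut sum_q = 0.0_f32;
    let mut sum_qc = 0.0_f32;
    for (&q, &c) in query.iter().zip(v.codes) {
        sum_q += q;
        sum_qc += q * c as f32;
    }
    v.min * sum_q + v.scale * sum_qc
}

/// Approximate squared Euclidean distance, dequantizing one component at a time.
pub fn euclidean_squared_sketch(query: &[f32], v: &Sq8View<'_>) -> f32 {
    assert_eq!(query.len(), v.codes.len(), "dimension mismatch");
    query
        .iter()
        .zip(v.codes)
        .map(|(&q, &c)| {
            let d = q - (v.min + v.scale * c as f32);
            d * d
        })
        .sum::<f32>()
}

/// Approximate cosine similarity. Returns 0.0 if either vector has zero norm.
pub fn cosine_sketch(query: &[f32], v: &Sq8View<'_>) -> f32 {
    let dot = dot_sketch(query, v);
    let q_norm = query.iter().map(|&q| q * q).sum::<f32>().sqrt();
    let v_norm = v
        .codes
        .iter()
        .map(|&c| {
            let x = v.min + v.scale * c as f32;
            x * x
        })
        .sum::<f32>()
        .sqrt();
    if q_norm == 0.0 || v_norm == 0.0 {
        0.0
    } else {
        dot / (q_norm * v_norm)
    }
}
```

Keeping the query in f32 and quantizing only the stored side is a common design choice: the quantization error enters through one operand rather than two. A production implementation could also cache the stored vector's norm instead of recomputing it for every cosine call.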