Scalar quantization (SQ8) for f32 → i8 vector compression.
Per-vector affine quantization: each vector gets its own scale and zero_point, computed from its min/max values. This gives a 4x memory reduction with <0.5% cosine-similarity error on normalized embedding vectors.
This module is independent of the HNSW backend and can also be used with brute-force search.
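The min/max calibration described above can be sketched as follows. This is a minimal illustration, not the crate's actual implementation; the `quantize` and `dequantize` names and the exact rounding/clamping choices are assumptions.

```rust
/// Hypothetical sketch of per-vector i8 quantization (names assumed, not the
/// crate's actual API): scale and zero_point are derived from the vector's
/// min/max so that [min, max] maps onto the i8 range [-128, 127].
fn quantize(v: &[f32]) -> (Vec<i8>, f32, f32) {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a constant vector (max == min) to avoid division by zero.
    let scale = ((max - min) / 255.0).max(f32::EPSILON);
    let zero_point = min + 128.0 * scale; // the f32 value that encodes as 0i8
    let codes = v
        .iter()
        .map(|&x| ((x - zero_point) / scale).round().clamp(-128.0, 127.0) as i8)
        .collect();
    (codes, scale, zero_point)
}

/// Inverse mapping: reconstruct approximate f32 values from the i8 codes.
fn dequantize(codes: &[i8], scale: f32, zero_point: f32) -> Vec<f32> {
    codes.iter().map(|&c| c as f32 * scale + zero_point).collect()
}

fn main() {
    let v = vec![0.1_f32, -0.5, 0.9, 0.0];
    let (codes, scale, zero_point) = quantize(&v);
    let approx = dequantize(&codes, scale, zero_point);
    // Per-element reconstruction error is bounded by one quantization step.
    for (a, b) in v.iter().zip(&approx) {
        assert!((a - b).abs() <= scale);
    }
    println!("codes = {:?}", codes);
}
```

Because rounding error is at most half a quantization step, the reconstruction error per element is bounded by `scale`, which is what keeps cosine-similarity error small on normalized vectors.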
Structs

- QuantizedVector - Scalar quantization parameters for a single vector.
- Quantizer - Quantizer that converts f32 vectors to int8 with per-vector calibration.
Functions

- pack_quantized - Pack a QuantizedVector into bytes for SQLite storage.
- unpack_quantized - Unpack bytes from SQLite into a QuantizedVector.
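Packing for SQLite storage amounts to serializing the per-vector parameters alongside the codes. The byte layout below (scale and zero_point as little-endian f32, then one byte per code) is an assumption for illustration, not necessarily the crate's actual format, and the `pack`/`unpack` names are hypothetical.

```rust
use std::convert::TryInto;

/// Hypothetical byte layout (assumed, not necessarily the crate's on-disk
/// format): [scale: f32 LE][zero_point: f32 LE][codes: one i8 per byte].
fn pack(codes: &[i8], scale: f32, zero_point: f32) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + codes.len());
    out.extend_from_slice(&scale.to_le_bytes());
    out.extend_from_slice(&zero_point.to_le_bytes());
    out.extend(codes.iter().map(|&c| c as u8));
    out
}

/// Inverse of `pack`: split off the 8-byte header, reinterpret the rest as i8.
fn unpack(bytes: &[u8]) -> (Vec<i8>, f32, f32) {
    let scale = f32::from_le_bytes(bytes[0..4].try_into().unwrap());
    let zero_point = f32::from_le_bytes(bytes[4..8].try_into().unwrap());
    let codes = bytes[8..].iter().map(|&b| b as i8).collect();
    (codes, scale, zero_point)
}

fn main() {
    let codes = vec![-128_i8, 0, 64, 127];
    let bytes = pack(&codes, 0.005, -0.1);
    let (codes2, scale2, zp2) = unpack(&bytes);
    assert_eq!(codes, codes2);
    assert_eq!(scale2, 0.005);
    assert_eq!(zp2, -0.1);
    // Storage is one byte per dimension plus an 8-byte header,
    // versus four bytes per dimension for raw f32.
    assert_eq!(bytes.len(), 8 + codes.len());
}
```

A fixed-width header like this keeps the blob self-describing, so a row can be decoded without any out-of-band metadata.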