Skip to main content

Module quantize

Module quantize 

Source
Expand description

Scalar quantization (SQ8) for f32 → i8 vector compression.

Per-vector symmetric quantization: each vector gets its own scale and zero_point, computed from its min/max values. This gives 4x memory reduction with <0.5% cosine similarity error on normalized embedding vectors.

This module is independent of the HNSW backend and can be used with brute-force too.

Structs§

QuantizedVector
Scalar quantization parameters for a single vector.
Quantizer
Quantizer that converts f32 vectors to int8 with per-vector calibration.

Functions§

pack_quantized
Pack a QuantizedVector into bytes for SQLite storage.
unpack_quantized
Unpack bytes from SQLite into a QuantizedVector.