Scalar quantization (SQ8) for f32 → i8 vector compression.
Per-vector affine quantization: each vector gets its own scale and zero_point, computed from its min/max values. This gives a 4x memory reduction with <0.5% cosine-similarity error on normalized embedding vectors.
This module is independent of the HNSW backend and can also be used with brute-force search.
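The min/max calibration described above can be sketched as follows. This is a minimal illustration, not the crate's actual implementation; the `quantize` and `dequantize` names and the exact rounding/clamping choices are assumptions.

```rust
/// Hypothetical sketch of per-vector i8 quantization (names assumed, not the
/// crate's actual API): scale and zero_point are derived from the vector's
/// min/max so that [min, max] maps onto the i8 range [-128, 127].
fn quantize(v: &[f32]) -> (Vec<i8>, f32, f32) {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Guard against a constant vector (max == min) to avoid division by zero.
    let scale = ((max - min) / 255.0).max(f32::EPSILON);
    let zero_point = min + 128.0 * scale; // the f32 value that encodes as 0i8
    let codes = v
        .iter()
        .map(|&x| ((x - zero_point) / scale).round().clamp(-128.0, 127.0) as i8)
        .collect();
    (codes, scale, zero_point)
}

/// Inverse mapping: reconstruct approximate f32 values from the i8 codes.
fn dequantize(codes: &[i8], scale: f32, zero_point: f32) -> Vec<f32> {
    codes.iter().map(|&c| c as f32 * scale + zero_point).collect()
}

fn main() {
    let v = vec![0.1_f32, -0.5, 0.9, 0.0];
    let (codes, scale, zero_point) = quantize(&v);
    let approx = dequantize(&codes, scale, zero_point);
    // Per-element reconstruction error is bounded by one quantization step.
    for (a, b) in v.iter().zip(&approx) {
        assert!((a - b).abs() <= scale);
    }
    println!("codes = {:?}", codes);
}
```

Because rounding error is at most half a quantization step, the reconstruction error per element is bounded by `scale`, which is what keeps cosine-similarity error small on normalized vectors.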
Structs

- QuantizedVector - Scalar quantization parameters for a single vector.
- Quantizer - Quantizer that converts f32 vectors to int8 with per-vector calibration.
Functions

- pack_quantized - Pack a QuantizedVector into bytes for SQLite storage.
- unpack_quantized - Unpack bytes from SQLite into a QuantizedVector.
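Packing for SQLite storage amounts to serializing the per-vector parameters alongside the codes. The byte layout below (scale and zero_point as little-endian f32, then one byte per code) is an assumption for illustration, not necessarily the crate's actual format, and the `pack`/`unpack` names are hypothetical.

```rust
use std::convert::TryInto;

/// Hypothetical byte layout (assumed, not necessarily the crate's on-disk
/// format): [scale: f32 LE][zero_point: f32 LE][codes: one i8 per byte].
fn pack(codes: &[i8], scale: f32, zero_point: f32) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + codes.len());
    out.extend_from_slice(&scale.to_le_bytes());
    out.extend_from_slice(&zero_point.to_le_bytes());
    out.extend(codes.iter().map(|&c| c as u8));
    out
}

/// Inverse of `pack`: split off the 8-byte header, reinterpret the rest as i8.
fn unpack(bytes: &[u8]) -> (Vec<i8>, f32, f32) {
    let scale = f32::from_le_bytes(bytes[0..4].try_into().unwrap());
    let zero_point = f32::from_le_bytes(bytes[4..8].try_into().unwrap());
    let codes = bytes[8..].iter().map(|&b| b as i8).collect();
    (codes, scale, zero_point)
}

fn main() {
    let codes = vec![-128_i8, 0, 64, 127];
    let bytes = pack(&codes, 0.005, -0.1);
    let (codes2, scale2, zp2) = unpack(&bytes);
    assert_eq!(codes, codes2);
    assert_eq!(scale2, 0.005);
    assert_eq!(zp2, -0.1);
    // Storage is one byte per dimension plus an 8-byte header,
    // versus four bytes per dimension for raw f32.
    assert_eq!(bytes.len(), 8 + codes.len());
}
```

A fixed-width header like this keeps the blob self-describing, so a row can be decoded without any out-of-band metadata.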