// diskann_quantization/minmax/mod.rs
/*
 * Copyright (c) Microsoft Corporation.
 * Licensed under the MIT license.
 */

//! # MinMax Quantization
//!
//! MinMax quantization provides memory-efficient vector compression by converting
//! floating-point values to small n-bit integers on a per-vector basis.
//!
//! ## Core Concept
//!
//! Each vector is independently quantized using the formula:
//! ```math
//! X' = round((X - s) * (2^n - 1) / c).clamp(0, 2^n - 1)
//! ```
//! where `s` is a shift value and `c` is a scaling parameter, both computed from
//! the range of values.
//!
//! For most bit widths (>1), given a positive scaling parameter `grid_scale: f32`,
//! these are computed as:
//! ```math
//! m = (max_i X[i] + min_i X[i]) / 2.0
//! w = max_i X[i] - min_i X[i]
//!
//! s = m - w * grid_scale
//! c = 2 * w * grid_scale
//! ```
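//!
//! As an illustrative sketch only (hypothetical helpers, not this crate's API,
//! which lives in [`MinMaxQuantizer`]), the n-bit encode/decode round trip could
//! look like:
//!
//! ```rust
//! /// Hypothetical helper: quantize one vector to n-bit codes (n <= 8, so a
//! /// code fits in a `u8`), returning `(codes, s, c)`. Assumes a non-constant
//! /// vector so that `c > 0`.
//! fn minmax_encode(x: &[f32], n: u32, grid_scale: f32) -> (Vec<u8>, f32, f32) {
//!     let min = x.iter().copied().fold(f32::INFINITY, f32::min);
//!     let max = x.iter().copied().fold(f32::NEG_INFINITY, f32::max);
//!     let m = (max + min) / 2.0;
//!     let w = max - min;
//!     let s = m - w * grid_scale;
//!     let c = 2.0 * w * grid_scale;
//!     let levels = ((1u32 << n) - 1) as f32;
//!     let codes = x
//!         .iter()
//!         .map(|&v| ((v - s) * levels / c).round().clamp(0.0, levels) as u8)
//!         .collect();
//!     (codes, s, c)
//! }
//!
//! /// Hypothetical helper: invert the encoding (up to rounding error).
//! fn minmax_decode(codes: &[u8], n: u32, s: f32, c: f32) -> Vec<f32> {
//!     let levels = ((1u32 << n) - 1) as f32;
//!     codes.iter().map(|&q| q as f32 * c / levels + s).collect()
//! }
//!
//! // With `grid_scale = 0.5`, `s` is the minimum and `c` the full range, so a
//! // uniformly spaced vector round-trips exactly at 2 bits.
//! let (codes, s, c) = minmax_encode(&[0.0, 1.0, 2.0, 3.0], 2, 0.5);
//! assert_eq!(codes, vec![0, 1, 2, 3]);
//! assert_eq!(minmax_decode(&codes, 2, s, c), vec![0.0, 1.0, 2.0, 3.0]);
//! ```
//!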
//! For 1-bit quantization, to avoid outliers, `s` and `c` are derived differently:
//!   i) Values are first split into two groups: those below and those above the mean.
//!  ii) `s` is the average of the values below the mean.
//! iii) `c` is the difference between the average of the values above the mean and `s`.
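//!
//! The 1-bit rule above can be sketched as follows (hypothetical helper, not
//! this crate's API; assumes the vector is not constant, so both groups are
//! non-empty):
//!
//! ```rust
//! /// Returns `(s, c)` for 1-bit quantization of `x`.
//! fn one_bit_params(x: &[f32]) -> (f32, f32) {
//!     let mean = x.iter().sum::<f32>() / x.len() as f32;
//!     // Split into values below the mean and values at or above it.
//!     let (below, above): (Vec<f32>, Vec<f32>) =
//!         x.iter().copied().partition(|&v| v < mean);
//!     let avg = |g: &[f32]| g.iter().sum::<f32>() / g.len() as f32;
//!     let s = avg(&below);
//!     let c = avg(&above) - s;
//!     (s, c)
//! }
//!
//! // mean = 1.0; the lower group averages to 0.0 (= s), the upper to 2.0,
//! // so c = 2.0 - 0.0.
//! assert_eq!(one_bit_params(&[0.0, 0.0, 2.0, 2.0]), (0.0, 2.0));
//! ```
//!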
33//!
34//! This encoding is similar to scalar quantization, but, since both 's' and 'c'
35//! are computed on a per-vector basis, this allows this quantization mechanism
36//! to be applied in a **streaming setting**; making it qualitatively different
37//! than scalar quantization.
38//!
//! ## Module Components
//!
//! - [`MinMaxQuantizer`]: Handles vector encoding and decoding.
//! - [`Data`]: Stores quantized vectors with compensation parameters.
//! - Distance functions:
//!   - [`MinMaxIP`]: Inner product distance for quantized vectors.
//!   - [`MinMaxL2Squared`]: Squared L2 (Euclidean) distance for quantized vectors.
//!   - [`MinMaxCosine`]: Cosine similarity for quantized vectors.
//!   - [`MinMaxCosineNormalized`]: Cosine similarity for quantized vectors assuming the
//!     original full-precision vectors were normalized.
//!
//! To (approximately) reconstruct the original vector, the inverse operation is
//! applied:
//! ```math
//! X = X' * c / (2^n - 1) + s
//! ```
mod multi;
mod quantizer;
mod recompress;
mod vectors;

/////////////
// Exports //
/////////////

pub use multi::{MinMaxKernel, MinMaxMeta};
pub use quantizer::{L2Loss, MinMaxQuantizer};
pub use recompress::{RecompressError, Recompressor};
pub use vectors::{
    Data, DataMutRef, DataRef, DecompressError, FullQuery, FullQueryMeta, FullQueryMut,
    FullQueryRef, MetaParseError, MinMaxCompensation, MinMaxCosine, MinMaxCosineNormalized,
    MinMaxIP, MinMaxL2Squared,
};