Skip to main content

Module minmax

Module minmax 

Source
Expand description

§MinMax Quantization

MinMax quantization provides memory-efficient vector compression by converting floating-point values to small n-bit integers on a per-vector basis.

§Core Concept

Each vector is independently quantized using the formula:

X' = round((X - s) * (2^n - 1) / c).clamp(0, 2^n - 1)

where s is a shift value and c is a scaling parameter computed from the range of values.

For most bit widths (>1), given a positive scaling parameter grid_scale : f32, these are computed as:

- m = (max_i X[i] + min_i X[i]) / 2.0
- w = max_i X[i] - min_i X[i]

- s = m - w * grid_scale
- c = 2 * w * grid_scale

For 1-bit quantization, to avoid outliers, s and c are derived differently: i) Values are first split into two groups: those below and above the mean. ii) s is the average of values below the mean. iii) c is the difference between the average of values above the mean and s.

This encoding is similar to scalar quantization, but, since both ‘s’ and ‘c’ are computed on a per-vector basis, this allows this quantization mechanism to be applied in a streaming setting; making it qualitatively different than scalar quantization.

§Module Components

  • MinMaxQuantizer: Handles vector encoding and decoding
  • Data: Stores quantized vectors with compensation parameters
  • Distance functions:
    • MinMaxIP: Inner product distance for quantized vectors.
    • MinMaxL2Squared: L2 (Euclidean) distance for quantized vectors.
    • MinMaxCosine: Cosine similarity for quantized vectors.
    • MinMaxCosineNormalized: Cosine similarity for quantized vectors assuming the original full-precision vectors were normalized.

To reconstruct the original vector, the inverse operation is applied:

X = X' * c / (2^n - 1) + s

Structs§

FullQueryMeta
A meta struct storing the sum and norm_squared of a full query after transformation is applied to it.
MinMaxCompensation
A per-vector precomputed coefficients to help compute inner products and squared L2 distances for the MinMax quantized vectors.
MinMaxCosine
MinMaxCosineNormalized
MinMaxIP
MinMaxKernel
Kernel for computing MaxSim and Chamfer distance using MinMax quantized vectors.
MinMaxL2Squared
MinMaxMeta
Metadata for MinMax quantized multi-vectors.
MinMaxQuantizer
Recall that from the module-level documentation, MinMaxQuantizer, quantizes X into n bit vectors as follows -
Recompressor
Recompression utilities for MinMax quantized vectors.

Enums§

DecompressError
L2Loss
A struct defining euclidean loss from quantization.
MetaParseError
Error type for parsing a slice of bytes as a DataRef and returning corresponding dimension.
RecompressError
Error type for recompression operations.

Type Aliases§

Data
An owning compressed data vector
DataMutRef
A mutable borrowed Data vector
DataRef
A borrowed Data vector
FullQuery
A full precision query.
FullQueryMut
A mutable borrowed full precision query.
FullQueryRef
A borrowed full precision query.