Module quantization_calibration

§Quantization Error Calibration (Task 5)

Calibrates quantization error per-list (or per-cluster) and uses it in stopping decisions.

§Problem

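With PQ/ADC (product quantization with asymmetric distance computation), the kth-best score is a proxy score computed from quantized codes, not the true score. A stopping test that compares list bounds against the kth score is only sound when both sides are expressed in the true metric.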

§Solution

Learn empirical error envelopes:

  • Per-list quantiles of the error ε = ŝ - s (proxy score ŝ minus true score s), estimated from representative queries
  • At query time, convert proxy thresholds into safe true-score thresholds
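
The first step above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `empirical_quantile` and `per_list_envelopes` are hypothetical helpers, the nearest-rank quantile definition is an assumption, and samples are assumed to arrive as `(list_idx, proxy_score, true_score)` tuples.

```rust
/// Hypothetical helper, not part of the crate's API.
/// Returns the empirical (1 - delta)-quantile of `errors` using the
/// nearest-rank definition: the smallest sample such that at least a
/// (1 - delta) fraction of all samples are at or below it.
fn empirical_quantile(errors: &[f64], delta: f64) -> Option<f64> {
    // Reject empty input and delta outside [0, 1) (this also rejects NaN).
    if errors.is_empty() || !(0.0..1.0).contains(&delta) {
        return None;
    }
    let mut sorted = errors.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len();
    // Nearest rank: ceil((1 - delta) * n), clamped to the valid range [1, n].
    let rank = (((1.0 - delta) * n as f64).ceil() as usize).clamp(1, n);
    Some(sorted[rank - 1])
}

/// Hypothetical helper: groups error samples by list and computes one
/// quantile per list. Each sample is (list_idx, proxy_score, true_score).
/// Lists without samples get `None`; out-of-range indices are skipped.
fn per_list_envelopes(
    n_lists: usize,
    samples: &[(usize, f64, f64)],
    delta: f64,
) -> Vec<Option<f64>> {
    let mut errors: Vec<Vec<f64>> = vec![Vec::new(); n_lists];
    for &(list_idx, proxy, true_score) in samples {
        if let Some(bucket) = errors.get_mut(list_idx) {
            // Error convention from this module: epsilon = proxy - true.
            bucket.push(proxy - true_score);
        }
    }
    errors.iter().map(|e| empirical_quantile(e, delta)).collect()
}
```

Returning `None` for a list with no samples keeps "no calibration data" distinct from "zero error", so a caller can fall back to a conservative default for that list.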

§Math/Algorithm

PAC-style calibration:

  • Store ε_L(1-δ), the (1-δ)-quantile of ε for list L, such that P(ε ≤ ε_L(1-δ)) ≥ 1-δ
  • Stopping compares LB_true(list) vs UB_true(kth) using these envelopes
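
Because ε = ŝ - s, the event ε ≤ ε_L(1-δ) is the same event as s ≥ ŝ - ε_L(1-δ), so the stored quantile yields a lower bound on the true score that holds with probability at least 1-δ. The sketch below shows only that conversion. `safe_true_lower_bound` is a hypothetical name and does not claim to match how `safe_true_threshold` is implemented; an upper bound on the true score would need a lower-tail quantile of ε, which is not shown here.

```rust
/// Hypothetical helper, not part of the crate's API.
/// Converts a proxy score into a lower bound on the true score.
///
/// `eps_quantile` is the stored (1 - delta)-quantile of
/// epsilon = proxy - true for the list in question. With probability
/// at least 1 - delta, true_score >= proxy_score - eps_quantile.
/// Returns `None` when the list has no calibrated quantile, so the
/// caller cannot certify a bound and should not stop early on it.
fn safe_true_lower_bound(eps_quantile: Option<f64>, proxy_score: f64) -> Option<f64> {
    eps_quantile.map(|q| proxy_score - q)
}
```

For example, with a proxy score of 0.875 and a stored quantile of 0.125, the certified lower bound on the true score is 0.75.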

§Usage

use sochdb_vector::quantization_calibration::{ErrorCalibrator, ErrorEnvelope};

// During offline training
let mut calibrator = ErrorCalibrator::new(n_lists);
calibrator.record_error(list_idx, proxy_score, true_score);
let envelopes = calibrator.finalize();

// At query time
let proxy_kth = 0.85;
let safe_threshold = envelopes.safe_true_threshold(list_idx, proxy_kth, 0.99);

Structs§

CalibrationRunner
Runs calibration using representative queries
ErrorCalibrator
Collects error samples and computes envelopes
ErrorEnvelope
Pre-computed error envelope for a list
ErrorEnvelopeSet
Collection of error envelopes for all lists
ErrorSample
A single error sample: ε = proxy - true