§Quantization Error Calibration (Task 5)
Calibrates quantization error per-list (or per-cluster) and uses it in stopping decisions.
§Problem
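With PQ/ADC (product quantization with asymmetric distance computation), the kth-best score is a proxy score computed from quantized vectors, not the true score. A stopping rule that compares list bounds against the kth score is only sound if both quantities are expressed in the true metric.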
§Solution
Learn empirical error envelopes:
- Per-list quantiles of the error ε = ŝ - s (proxy score ŝ minus true score s), estimated from representative queries
- At query time, convert proxy thresholds into safe true-score thresholds
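The two steps above can be sketched in a few lines of plain Rust. This is a minimal illustration, not the crate's implementation: the helper names `empirical_quantile` and `safe_true_lower_bound` are hypothetical, the nearest-rank quantile rule is an assumption, and scores are assumed to be similarities (higher is better), so the conservative direction is to subtract the error envelope from the proxy score.

```rust
/// Hypothetical helper (not part of sochdb_vector): nearest-rank empirical
/// q-quantile of the error samples eps = proxy - true collected for one list.
/// Returns None for an empty sample set or a q outside [0, 1].
pub fn empirical_quantile(samples: &[f64], q: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let mut sorted = samples.to_vec();
    // total_cmp gives a total order on f64, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: smallest sample with at least a q fraction of the
    // samples at or below it (rank is 1-based, clamped to at least 1).
    let rank = ((q * sorted.len() as f64).ceil() as usize).max(1);
    Some(sorted[rank - 1])
}

/// Hypothetical helper: if eps = proxy - true <= eps_hi holds with
/// probability >= 1 - delta, then true >= proxy - eps_hi holds with the
/// same probability, so this value is a conservative true-score threshold.
pub fn safe_true_lower_bound(proxy: f64, eps_hi: f64) -> f64 {
    proxy - eps_hi
}
```

For example, with error samples `[0.5, -0.25, 0.125, 0.25]` for one list and `q = 0.75`, the nearest-rank quantile is `0.25`, and a proxy threshold of `0.875` maps to a conservative true-score threshold of `0.625`.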
§Math/Algorithm
PAC-style calibration:
- For each list L, store the quantile ε_L(1-δ) such that P(ε ≤ ε_L(1-δ)) ≥ 1-δ
- Stopping compares LB_true(list) vs UB_true(kth) using these envelopes
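The step from the stored quantile to a bound on the true score follows directly from the definition ε = ŝ - s. The derivation below is a sketch under the assumption that the calibration samples are representative of query-time errors; it covers only the one-sided (lower) bound on s. An upper bound on s would need a lower quantile of ε as well, which the text above does not specify.

```latex
\begin{align*}
  \Pr\bigl[\varepsilon \le \varepsilon_L(1-\delta)\bigr] &\ge 1-\delta,
    \qquad \varepsilon = \hat{s} - s \\
  \Longrightarrow\quad
  \Pr\bigl[\hat{s} - s \le \varepsilon_L(1-\delta)\bigr] &\ge 1-\delta \\
  \Longrightarrow\quad
  \Pr\bigl[s \ge \hat{s} - \varepsilon_L(1-\delta)\bigr] &\ge 1-\delta
\end{align*}
```

So, with probability at least 1-δ, the true score of an item in list L is no smaller than its proxy score minus the stored envelope.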
§Usage
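use sochdb_vector::quantization_calibration::{ErrorCalibrator, ErrorEnvelope};

// Illustrative placeholder values (assumed for this example only)
let n_lists = 16;
let list_idx = 3;
let (proxy_score, true_score) = (0.91, 0.88);

// During offline training: record one (proxy, true) pair per sample
let mut calibrator = ErrorCalibrator::new(n_lists);
calibrator.record_error(list_idx, proxy_score, true_score);
let envelopes = calibrator.finalize();

// At query time: convert the proxy kth score into a safe true-score threshold
let proxy_kth = 0.85;
let safe_threshold = envelopes.safe_true_threshold(list_idx, proxy_kth, 0.99);
Structs§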
- CalibrationRunner - Runs calibration using representative queries
- ErrorCalibrator - Collects error samples and computes envelopes
- ErrorEnvelope - Pre-computed error envelope for a list
- ErrorEnvelopeSet - Collection of error envelopes for all lists
- ErrorSample - A single error sample: ε = proxy - true