Crate iqdb_eval


§iqdb-eval

Index-agnostic evaluation harness for the HiveDB iqdb vector-database spine. Measures recall@k and per-query latency percentiles for any type that implements iqdb_index::IndexCore.

§Surface

All measurements are top-level free functions generic over the index under test, so a single harness call works against iqdb_flat::FlatIndex, an HNSW index, or any future index that implements the same trait:

  • recall_at_k — recall@k for an index against an externally supplied Vec<Vec<u32>> ground truth (typically loaded from a SIFT .ivecs file via read_ivecs).
  • recall_at_k_vs_oracle — convenience wrapper that takes a second IndexCore (typically iqdb_flat::FlatIndex) as the oracle and computes ground truth on the fly.
  • compute_ground_truth — the oracle-only half: returns the per-query ground-truth ids as Vec<Vec<u32>>, matching the .ivecs shape.
  • latency — collect per-query wall-clock samples and report mean / min / max / p50 / p95 / p99 (nearest-rank) and single-thread QPS.
  • build_index_from_base — build a fresh index from a &[Vec<f32>] base set, inserting each row at VectorId::U64(row_index) so the resulting ids align with .ivecs ground-truth files.
  • read_fvecs / read_ivecs / load_sift_dataset — TEXMEX SIFT-family loaders.
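
To make the recall@k measurement concrete, the following is a minimal standalone sketch that does not use this crate's API. The helper names `recall_at_k_single` and `mean_recall_at_k` are hypothetical, and the choice of denominator (the number of ground-truth ids actually available among the first k) is an assumption; the crate's own definition may differ.

```rust
use std::collections::HashSet;

/// Recall@k for one query: the fraction of the first `k` ground-truth ids
/// that also appear among the first `k` returned ids.
/// Hypothetical helper for illustration, not part of iqdb-eval.
fn recall_at_k_single(returned: &[u32], truth: &[u32], k: usize) -> f64 {
    let expected: HashSet<u32> = truth.iter().take(k).copied().collect();
    if expected.is_empty() {
        // Covers k == 0 and an empty ground-truth row.
        return 0.0;
    }
    let got: HashSet<u32> = returned.iter().take(k).copied().collect();
    let hits = expected.intersection(&got).count();
    // Assumption: divide by the number of expected ids, not by k itself.
    hits as f64 / expected.len() as f64
}

/// Mean recall@k over all queries; rows are paired by position.
fn mean_recall_at_k(returned: &[Vec<u32>], truth: &[Vec<u32>], k: usize) -> f64 {
    if returned.is_empty() {
        return 0.0;
    }
    let total: f64 = returned
        .iter()
        .zip(truth)
        .map(|(r, t)| recall_at_k_single(r, t, k))
        .sum();
    total / returned.len() as f64
}
```

Both inputs use the Vec<Vec<u32>> shape described above, one row of ids per query, so the same comparison applies whether the ground truth came from an .ivecs file or from an oracle index.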

§Correctness invariants

  • Row-index ↔ VectorId::U64. build_index_from_base inserts each row of the base set at VectorId::U64(row_index as u64). Callers that build oracles or indexes by hand must do the same; otherwise the ids in an .ivecs ground-truth file will not correspond to the ids returned by search.
  • Latency excludes build cost. latency takes a borrowed &I, so the index must already be built, and its construction cost already paid, before timing begins.
  • Percentiles are nearest-rank. No interpolation; every reported percentile is an observed sample. See LatencyReport.
  • Metric is read from the oracle. compute_ground_truth derives the metric from oracle.metric() so a mismatched metric cannot silently corrupt the ground-truth set.
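
The latency and percentile invariants above can be sketched without the crate. In this illustration the timer wraps only the search call, and the percentile is the nearest-rank one: for n sorted samples, the p-th percentile is the sample at 1-based rank ceil(p / 100 × n). The names `collect_samples_us` and `nearest_rank` are hypothetical, and the microsecond unit is an assumption suggested by the `p50_us` field in the example below, not a statement about how LatencyReport is computed internally.

```rust
use std::time::Instant;

/// Time each query individually. `search` is a caller-supplied closure, so
/// whatever index it borrows was built before the first timer starts.
/// Hypothetical helper for illustration, not part of iqdb-eval.
fn collect_samples_us<Q>(queries: &[Q], mut search: impl FnMut(&Q)) -> Vec<u64> {
    queries
        .iter()
        .map(|q| {
            let start = Instant::now();
            search(q);
            start.elapsed().as_micros() as u64
        })
        .collect()
}

/// Nearest-rank percentile over samples already sorted in ascending order.
/// Returns an observed sample and never interpolates between two samples.
fn nearest_rank(sorted: &[u64], p: f64) -> Option<u64> {
    if sorted.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let n = sorted.len();
    // 1-based rank; clamp so that p = 0 maps to the smallest sample.
    let rank = ((p / 100.0) * n as f64).ceil() as usize;
    Some(sorted[rank.clamp(1, n) - 1])
}
```

A caller would sort the collected samples once (for example with `sort_unstable`) and then read p50, p95, and p99 from the same sorted slice.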

§Example

use iqdb_eval::{
    build_index_from_base, latency, recall_at_k_vs_oracle, LatencyConfig,
};
use iqdb_flat::{FlatConfig, FlatIndex};
use iqdb_types::{DistanceMetric, SearchParams};

let base: Vec<Vec<f32>> = vec![
    vec![0.0, 0.0],
    vec![3.0, 4.0],
    vec![1.0, 1.0],
];
let queries: Vec<Vec<f32>> = vec![vec![0.5, 0.5]];

// The `?` operators below assume an enclosing function that returns a
// compatible `Result`.
let target: FlatIndex =
    build_index_from_base(FlatConfig, 2, DistanceMetric::Euclidean, &base)?;
let oracle: FlatIndex =
    build_index_from_base(FlatConfig, 2, DistanceMetric::Euclidean, &base)?;
let params = SearchParams::new(2, DistanceMetric::Euclidean);

let recall = recall_at_k_vs_oracle(&target, &oracle, &queries, &params)?;
assert_eq!(recall.mean_recall, 1.0);

let lat = latency(&target, &queries, &params, &LatencyConfig::default())?;
assert!(lat.p50_us <= lat.p95_us);

Structs§

LatencyConfig
Controls how latency runs its measurement loop.
LatencyReport
Summary of a per-query latency measurement.
RecallReport
Summary of a recall@k measurement against a known or computed ground-truth set.
SiftDataset
One full SIFT-family dataset: base vectors, query vectors, per-query ground-truth neighbour ids, and the shared dimensionality.

Enums§

EvalError
An error from an iqdb-eval measurement or dataset-loading operation.

Constants§

VERSION
The version of this crate, taken from Cargo.toml at compile time.

Functions§

build_index_from_base
Build a fresh index from a &[Vec<f32>] base set, inserting each row at VectorId::U64(row_index).
compute_ground_truth
Compute per-query top-k ground truth using oracle.
latency
Measure per-query latency for index over queries and return a LatencyReport.
load_sift_dataset
Load a SIFT-family dataset rooted at root and named by prefix.
read_fvecs
Read a .fvecs file (TEXMEX corpus format) into one Vec<f32> per record.
read_ivecs
Read an .ivecs file (TEXMEX corpus format) into one Vec<u32> per record.
recall_at_k
Measure recall@k for index against an externally-supplied ground_truth.
recall_at_k_vs_oracle
Convenience wrapper: compute ground truth from oracle, then measure index against it.
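
For readers unfamiliar with the TEXMEX layout that read_fvecs and read_ivecs consume: each record is a little-endian 32-bit integer giving the dimension d, followed by d 4-byte values (f32 for .fvecs, 32-bit integers for .ivecs). The sketch below parses an in-memory .fvecs buffer under that layout. The function name `parse_fvecs`, the String error type, and the rejection of non-positive dimensions are choices made for this illustration; they are not the crate's implementation or its EvalError variants.

```rust
/// Parse the raw bytes of a `.fvecs` file into one `Vec<f32>` per record.
/// Hypothetical helper for illustration, not part of iqdb-eval.
fn parse_fvecs(bytes: &[u8]) -> Result<Vec<Vec<f32>>, String> {
    let mut out: Vec<Vec<f32>> = Vec::new();
    let mut pos = 0usize;
    while pos < bytes.len() {
        // Record header: the dimension as a little-endian i32.
        let header = bytes
            .get(pos..pos + 4)
            .ok_or("truncated dimension header")?;
        let dim = i32::from_le_bytes([header[0], header[1], header[2], header[3]]);
        if dim <= 0 {
            return Err(format!("invalid dimension {}", dim));
        }
        pos += 4;
        // Record body: `dim` little-endian f32 components.
        let end = pos + (dim as usize) * 4;
        let body = bytes.get(pos..end).ok_or("truncated record body")?;
        let row: Vec<f32> = body
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect();
        out.push(row);
        pos = end;
    }
    Ok(out)
}
```

An .ivecs parser would follow the same loop with `i32::from_le_bytes` (or `u32::from_le_bytes`) applied to the body chunks.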

Type Aliases§

Result
A specialized Result whose error is EvalError.