//! # iqdb-eval
//!
//! Index-agnostic evaluation harness for the HiveDB **iqdb** vector-database
//! spine. Measures **recall@k** and **per-query latency percentiles** for
//! any type that implements [`iqdb_index::IndexCore`].
//!
//! ## Surface
//!
//! All measurements are top-level free functions generic over the index
//! under test, so a single harness call works against
//! [`iqdb_flat::FlatIndex`], an HNSW index, or any future index that
//! implements the same trait:
//!
//! - [`recall_at_k`] — recall@k for an index against an externally
//! supplied `Vec<Vec<u32>>` ground truth (typically loaded from a SIFT
//! `.ivecs` file via [`read_ivecs`]).
//! - [`recall_at_k_vs_oracle`] — convenience wrapper that takes a second
//! `IndexCore` (typically [`iqdb_flat::FlatIndex`]) as the oracle and
//! computes ground truth on the fly.
//! - [`compute_ground_truth`] — the oracle-only half: returns the per-query
//! ground-truth ids as `Vec<Vec<u32>>`, matching the `.ivecs` shape.
//! - [`latency`] — collect per-query wall-clock samples and report
//! mean / min / max / p50 / p95 / p99 (nearest-rank) and single-thread QPS.
//! - [`build_index_from_base`] — build a fresh index from a `&[Vec<f32>]`
//! base set, inserting each row at `VectorId::U64(row_index)` so the
//! resulting ids align with `.ivecs` ground-truth files.
//! - [`read_fvecs`] / [`read_ivecs`] / [`load_sift_dataset`] — TEXMEX
//! SIFT-family loaders.
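//!
//! As a rough illustration of what the recall functions above measure, here
//! is a minimal sketch of recall@k over plain id lists. The helper names
//! `recall_for_query` and `mean_recall` are hypothetical and not part of this
//! crate. Dividing by the number of ground-truth ids actually available
//! (rather than by `k`) is an assumption of the sketch, not a statement about
//! how [`recall_at_k`] is implemented.
//!
//! ```rust
//! use std::collections::HashSet;
//!
//! // Hypothetical helper: recall@k for a single query.
//! // Counts how many of the first `k` ground-truth ids also occur among the
//! // first `k` returned ids, then divides by the ground-truth count.
//! fn recall_for_query(returned: &[u32], truth: &[u32], k: usize) -> f64 {
//!     let truth_k: HashSet<u32> = truth.iter().take(k).copied().collect();
//!     if truth_k.is_empty() {
//!         // Assumption: a query with no ground truth contributes 0.0.
//!         return 0.0;
//!     }
//!     // Deduplicate the returned ids so a repeated id is counted once.
//!     let returned_k: HashSet<u32> = returned.iter().take(k).copied().collect();
//!     let hits = truth_k.intersection(&returned_k).count();
//!     hits as f64 / truth_k.len() as f64
//! }
//!
//! // Hypothetical helper: mean recall over all queries.
//! // Assumes `returned` and `truth` hold one entry per query, in the same order.
//! fn mean_recall(returned: &[Vec<u32>], truth: &[Vec<u32>], k: usize) -> f64 {
//!     if returned.is_empty() {
//!         return 0.0;
//!     }
//!     let total: f64 = returned
//!         .iter()
//!         .zip(truth)
//!         .map(|(r, t)| recall_for_query(r, t, k))
//!         .sum();
//!     total / returned.len() as f64
//! }
//! ```
//!
//! With these helpers, `recall_for_query(&[1, 2, 3], &[1, 3, 5], 3)` finds two
//! of the three ground-truth ids and yields a recall of 2/3.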
//!
//! ## Correctness invariants
//!
//! - **Row-index ↔ `VectorId::U64`.** [`build_index_from_base`] inserts each
//! row of the base set at `VectorId::U64(row_index as u64)`. Callers that
//! build oracles or indexes by hand must do the same; otherwise ids in
//! `.ivecs` ground-truth cannot match the ids returned by `search`.
//! - **Latency excludes build cost.** [`latency`] takes a borrowed
//! `&I`, so the index is constructed (and therefore paid for) before
//! timing begins.
//! - **Percentiles are nearest-rank.** No interpolation; every reported
//! percentile is an observed sample. See [`LatencyReport`].
//! - **Metric is read from the oracle.** [`compute_ground_truth`] derives
//! the metric from `oracle.metric()` so a mismatched metric cannot
//! silently corrupt the ground-truth set.
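//!
//! To make the nearest-rank invariant above concrete, the following is a
//! minimal sketch of a nearest-rank percentile over already-sorted samples.
//! The function name `nearest_rank` and the `u64` microsecond sample type are
//! assumptions for illustration; the crate's own computation behind
//! [`LatencyReport`] may differ in detail.
//!
//! ```rust
//! // Hypothetical helper: nearest-rank percentile of ascending-sorted samples.
//! // The result is the sample at 1-based rank ceil(p / 100 * n), so it is
//! // always one of the observed values and never an interpolated one.
//! fn nearest_rank(sorted: &[u64], p: f64) -> Option<u64> {
//!     // Reject empty input and percentiles outside [0, 100] (including NaN).
//!     if sorted.is_empty() || !(0.0..=100.0).contains(&p) {
//!         return None;
//!     }
//!     let n = sorted.len();
//!     let rank = ((p / 100.0) * n as f64).ceil() as usize;
//!     // Clamp so that p = 0 maps to the smallest sample.
//!     let rank = rank.clamp(1, n);
//!     Some(sorted[rank - 1])
//! }
//! ```
//!
//! For the five samples `[10, 20, 30, 40, 50]`, p50 has rank ceil(2.5) = 3 and
//! returns `30`, while p95 has rank ceil(4.75) = 5 and returns `50`.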
//!
//! ## Example
//!
//! ```
//! use iqdb_eval::{
//!     build_index_from_base, latency, recall_at_k_vs_oracle, LatencyConfig,
//! };
//! use iqdb_flat::{FlatConfig, FlatIndex};
//! use iqdb_types::{DistanceMetric, SearchParams};
//!
//! let base: Vec<Vec<f32>> = vec![
//!     vec![0.0, 0.0],
//!     vec![3.0, 4.0],
//!     vec![1.0, 1.0],
//! ];
//! let queries: Vec<Vec<f32>> = vec![vec![0.5, 0.5]];
//!
//! let target: FlatIndex =
//!     build_index_from_base(FlatConfig, 2, DistanceMetric::Euclidean, &base)?;
//! let oracle: FlatIndex =
//!     build_index_from_base(FlatConfig, 2, DistanceMetric::Euclidean, &base)?;
//! let params = SearchParams::new(2, DistanceMetric::Euclidean);
//!
//! let recall = recall_at_k_vs_oracle(&target, &oracle, &queries, &params)?;
//! assert_eq!(recall.mean_recall, 1.0);
//!
//! let lat = latency(&target, &queries, &params, &LatencyConfig::default())?;
//! assert!(lat.p50_us <= lat.p95_us);
//! # Ok::<(), iqdb_eval::EvalError>(())
//! ```
use Arc;
use Index;
use ;
pub use crate;
pub use crate;
pub use crate;
pub use crate;
pub use crate;
/// The version of this crate, taken from `Cargo.toml` at compile time.
///
/// # Examples
///
/// ```
/// let v = iqdb_eval::VERSION;
/// assert_eq!(v.split('.').count(), 3);
/// ```
pub const VERSION: &str = env!;
/// Build a fresh index from a `&[Vec<f32>]` base set, inserting each row
/// at `VectorId::U64(row_index)`.
///
/// This is the harness's canonical way to construct both **the index under
/// test** and **the oracle** so the ids returned by `search` align with
/// row indices stored in `.ivecs` ground-truth files. The function is
/// generic over [`Index`], so any concrete index that implements the trait
/// (flat, HNSW, …) works.
///
/// Each base row is cloned into a fresh `Arc<[f32]>` so it can be handed to
/// [`iqdb_index::IndexCore::insert`]; that allocation is **O(N · dim)** and
/// is unavoidable given `insert`'s `Arc<[f32]>` signature.
///
/// # Errors
///
/// - [`EvalError::EmptyInput`] when `base` is empty.
/// - [`EvalError::DimensionMismatch`] when any row's `len()` differs from
/// `dim`.
/// - [`EvalError::Search`] when [`Index::new`] or
/// [`iqdb_index::IndexCore::insert`] returns an [`iqdb_types::IqdbError`].
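///
/// The validation and row-to-`Arc<[f32]>` conversion described above can be
/// sketched without any index at all. Everything named here (`SketchError`,
/// `rows_with_ids`) is hypothetical and only mirrors the documented behavior;
/// it is not this function's actual body, and the real error type is
/// [`EvalError`].
///
/// ```rust
/// use std::sync::Arc;
///
/// // Hypothetical stand-in for the two input-validation errors listed above.
/// #[derive(Debug, PartialEq)]
/// enum SketchError {
///     EmptyInput,
///     DimensionMismatch { row: usize, expected: usize, found: usize },
/// }
///
/// // Hypothetical helper: pair each base row with its row index as a `u64`
/// // id and clone the row into a fresh `Arc<[f32]>` (O(N * dim) in total).
/// fn rows_with_ids(
///     base: &[Vec<f32>],
///     dim: usize,
/// ) -> Result<Vec<(u64, Arc<[f32]>)>, SketchError> {
///     if base.is_empty() {
///         return Err(SketchError::EmptyInput);
///     }
///     let mut out = Vec::with_capacity(base.len());
///     for (row, v) in base.iter().enumerate() {
///         if v.len() != dim {
///             return Err(SketchError::DimensionMismatch {
///                 row,
///                 expected: dim,
///                 found: v.len(),
///             });
///         }
///         // `Arc<[T]>: From<&[T]>` copies the slice into a new allocation.
///         out.push((row as u64, Arc::<[f32]>::from(v.as_slice())));
///     }
///     Ok(out)
/// }
/// ```
///
/// In this sketch the id of each vector is simply its position in `base`,
/// which is what keeps ids comparable with `.ivecs` row indices.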
///
/// # Examples
///
/// ```
/// use iqdb_eval::build_index_from_base;
/// use iqdb_flat::{FlatConfig, FlatIndex};
/// use iqdb_index::IndexCore;
/// use iqdb_types::DistanceMetric;
///
/// let base: Vec<Vec<f32>> = vec![vec![0.0, 0.0], vec![3.0, 4.0]];
/// let idx: FlatIndex =
///     build_index_from_base(FlatConfig, 2, DistanceMetric::Euclidean, &base)?;
/// assert_eq!(idx.len(), 2);
/// # Ok::<(), iqdb_eval::EvalError>(())
/// ```