ruvector_rabitq/lib.rs
1#![allow(clippy::manual_div_ceil)]
2#![allow(clippy::needless_range_loop)]
3#![allow(clippy::doc_overindented_list_items)]
4
5//! RaBitQ: Rotation-Based 1-bit Quantization for Approximate Nearest-Neighbor Search
6//!
7//! Motivated by the SIGMOD 2024 algorithm by Jianyang Gao & Cheng Long:
8//! *"RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound
9//! for Approximate Nearest Neighbor Search"*. This crate ships two estimators:
10//!
11//! 1. **Symmetric (Charikar-style)** — both query and database are 1-bit.
12//! `est_cos = cos(π · (1 − B/D))` where B = padding-safe XNOR-popcount.
13//! Cheapest per candidate (O(D/64) popcount + 1 cos) — use when memory
14//! dominates throughput.
15//!
16//! 2. **Asymmetric (RaBitQ-2024-style)** — query is f32, database is 1-bit.
17//! `est_ip = ‖q‖ · ‖x‖ · (1/√D) · Σᵢ sign(x_rot,i) · q_rot,i`. Unbiased on
18//! Haar-uniform rotations; tighter variance than (1). O(D) per candidate.
19//!
20//! ## Indexes
21//!
22//! | Variant | Storage | Estimator | Rerank |
23//! |---|---|---|---|
24//! | `FlatF32Index` | f32 originals | exact L2 | N/A |
25//! | `RabitqIndex` | rotation + codes | symmetric | no |
26//! | `RabitqPlusIndex` | rotation + codes + originals | symmetric + exact rerank | yes |
27//! | `RabitqAsymIndex` | rotation + codes [+ originals] | asymmetric + optional rerank | opt |
28//!
29//! All satisfy [`index::AnnIndex`]. Search is top-k via a bounded max-heap
30//! (O(n log k)), and scoring uses `f32::total_cmp` so NaN never panics.
31//!
32//! ## Guarantees
33//!
34//! - Padding-safe popcount at any D (handles `D % 64 != 0` correctly).
35//! - Deterministic: `(dim, seed, data)` triple → bit-identical rotation +
36//! index build + search output across runs.
37//! - No `unsafe`, no external BLAS/LAPACK, no C/C++ deps.
38//!
39//! ## Benchmarks
40//!
41//! See `benches/rabitq_bench.rs` for distance-kernel micro-benchmarks and
42//! `src/main.rs` (`rabitq-demo`) for end-to-end recall + throughput across
43//! n ∈ {1 k, 5 k, 50 k, 100 k}.
44
45pub mod error;
46pub mod index;
47pub mod kernel;
48pub mod persist;
49pub mod quantize;
50pub mod rotation;
51pub mod scan;
52
53pub use error::RabitqError;
54pub use index::{
55 AnnIndex, FlatF32Index, RabitqAsymIndex, RabitqIndex, RabitqPlusIndex, SearchResult,
56};
57pub use kernel::{CpuKernel, KernelCaps, ScanRequest, ScanResponse, VectorKernel};
58pub use quantize::{pack_bits, unpack_bits, BinaryCode};
59pub use rotation::{RandomRotation, RandomRotationKind};