Expand description
RaBitQ: Rotation-Based 1-bit Quantization for Approximate Nearest-Neighbor Search
Motivated by the SIGMOD 2024 algorithm by Jianyang Gao & Cheng Long: “RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search”. This crate ships two estimators:
-
Symmetric (Charikar-style) — both query and database are 1-bit.
est_cos = cos(π · (1 − B/D))where B = padding-safe XNOR-popcount. Cheapest per candidate (O(D/64) popcount + 1 cos) — use when memory dominates throughput. -
Asymmetric (RaBitQ-2024-style) — query is f32, database is 1-bit.
est_ip = ‖q‖ · ‖x‖ · (1/√D) · Σᵢ sign(x_rot,i) · q_rot,i. Unbiased on Haar-uniform rotations; tighter variance than (1). O(D) per candidate.
§Indexes
| Variant | Storage | Estimator | Rerank |
|---|---|---|---|
FlatF32Index | f32 originals | exact L2 | N/A |
RabitqIndex | rotation + codes | symmetric | no |
RabitqPlusIndex | rotation + codes + originals | symmetric + exact rerank | yes |
RabitqAsymIndex | rotation + codes [+ originals] | asymmetric + optional rerank | opt |
All satisfy index::AnnIndex. Search is top-k via a bounded max-heap
(O(n log k)), and scoring uses f32::total_cmp so NaN never panics.
§Guarantees
- Padding-safe popcount at any D (handles
D % 64 != 0correctly). - Deterministic:
(dim, seed, data)triple → bit-identical rotation + index build + search output across runs. - No
unsafe, no external BLAS/LAPACK, no C/C++ deps.
§Benchmarks
See benches/rabitq_bench.rs for distance-kernel micro-benchmarks and
src/main.rs (rabitq-demo) for end-to-end recall + throughput across
n ∈ {1 k, 5 k, 50 k, 100 k}.
Re-exports§
pub use error::RabitqError;pub use index::AnnIndex;pub use index::FlatF32Index;pub use index::RabitqAsymIndex;pub use index::RabitqIndex;pub use index::RabitqPlusIndex;pub use index::SearchResult;pub use kernel::CpuKernel;pub use kernel::KernelCaps;pub use kernel::ScanRequest;pub use kernel::ScanResponse;pub use kernel::VectorKernel;pub use quantize::pack_bits;pub use quantize::unpack_bits;pub use quantize::BinaryCode;pub use rotation::RandomRotation;pub use rotation::RandomRotationKind;
Modules§
- error
- index
- RaBitQ flat indexes — four search backends behind one trait:
- kernel
VectorKerneltrait — the pluggable execution backend for RaBitQ scan + rerank. Defined here (ADR-157 §“Where each piece lives”) because kernels are RaBitQ primitives; the cache is a consumer.- persist
- Persistence for
RabitqPlusIndex— seed-based reconstruction format. - quantize
- Bit-packing, XNOR-popcount, and the two RaBitQ distance estimators:
- rotation
- Random orthogonal rotation.
- scan
- SIMD-accelerated symmetric-scan agreement-count kernel.