# vicinity

Approximate nearest-neighbor (ANN) search in Rust. HNSW, NSW, DiskANN, IVF-PQ, ScaNN, PQ, and RaBitQ -- pure Rust, no C/C++ bindings.
```toml
[dependencies]
vicinity = { version = "0.2.0", features = ["hnsw"] }
```
## Minimal API
Builder pattern (recommended):

```rust
use vicinity::hnsw::HNSWIndex;

// 1. Create index via builder
//    (builder entry point and parameter values reconstructed for illustration)
let mut index = HNSWIndex::builder(3)
    .m(16)
    .ef_search(50)
    .auto_normalize(true)
    .build()?;

// 2. Add vectors (auto-normalized when auto_normalize=true)
index.add_slice(0, &[1.0, 0.0, 0.0])?;
index.add_slice(1, &[0.0, 1.0, 0.0])?;

// 3. Build graph
index.build()?;

// 4. Search (k=1, ef_search=50)
let results = index.search(&[1.0, 0.0, 0.0], 1, 50)?;
```
Direct constructor (when you need explicit control over m_max):

```rust
use vicinity::hnsw::HNSWIndex;

// dim, m, m_max (argument order reconstructed for illustration)
let mut index = HNSWIndex::new(3, 16, 32)?;
index.add_slice(0, &[1.0, 0.0, 0.0])?;
index.add_slice(1, &[0.0, 1.0, 0.0])?;
index.build()?;
let results = index.search(&[1.0, 0.0, 0.0], 1, 50)?;
```
## The problem
Given a query vector, find the top-k most similar vectors from a collection. Brute force computes all N distances (O(N) per query). For 1,000,000 vectors, that's 1,000,000 distance computations per query.
ANN systems trade exactness for speed: they aim for high recall at much lower latency.
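For contrast with what ANN indexes avoid, here is a minimal exact brute-force top-k (a standalone sketch in plain Rust, not vicinity's API):

```rust
/// Exact top-k search by squared L2 distance: O(N * D) work per query.
fn brute_force_top_k(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| {
            // Squared L2 distance to the query.
            let d: f32 = v.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
            (i, d)
        })
        .collect();
    // Sort ascending by distance and keep the k closest.
    scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    scored.truncate(k);
    scored
}

fn main() {
    let data = vec![vec![0.0, 0.0], vec![1.0, 0.0], vec![0.0, 2.0]];
    let top = brute_force_top_k(&data, &[0.9, 0.1], 2);
    assert_eq!(top[0].0, 1); // [1.0, 0.0] is closest to [0.9, 0.1]
    assert_eq!(top[1].0, 0);
    println!("{:?}", top);
}
```

This exact scan is also the standard way to produce ground truth when measuring an ANN index's recall.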
## The key idea (graph search, not magic)
HNSW builds a multi-layer graph where each point has:
- a few long edges (good for jumping across the space)
- and more local edges near the bottom (good for refinement)
A query does a greedy, coarse-to-fine walk:
- start from an entry point at the top layer
- greedily descend toward the query through progressively denser layers
- maintain a candidate set (size `ef_search`) at the bottom to avoid getting stuck
A more accurate mental model than “shortcuts” is: HNSW is a cheap way to keep multiple plausible local minima alive until you can locally refine.
```
Layer 2 (coarse):  o---------o
                    \       /
                     \  o  /
                      \ | /
Layer 1:           o---o---o-o---o---o
                        \  |  /
                         \ | /
Layer 0 (dense):   o--o--o--o--o--o--o--o
                            ^
                   keep ~ef_search candidates here,
                   return the best k
```
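The bottom-layer walk described above can be sketched in a few lines: a best-first expansion over an adjacency list that keeps at most `ef` candidates alive. This is a toy standalone sketch of the idea, not vicinity's internals:

```rust
use std::collections::HashSet;

/// Greedy best-first walk over one graph layer, keeping up to `ef` candidates.
/// `graph[i]` lists the neighbors of node i; `dist` scores a node against the query.
fn greedy_search(graph: &[Vec<usize>], dist: impl Fn(usize) -> f32, entry: usize, ef: usize) -> Vec<usize> {
    let mut visited: HashSet<usize> = HashSet::from([entry]);
    // Candidate pool, kept sorted by distance (closest first), capped at ef.
    let mut pool: Vec<(f32, usize)> = vec![(dist(entry), entry)];
    let mut frontier = vec![entry];
    while let Some(node) = frontier.pop() {
        for &nb in &graph[node] {
            if visited.insert(nb) {
                let d = dist(nb);
                // Expand a neighbor only if it can enter the ef-bounded pool.
                if pool.len() < ef || d < pool.last().unwrap().0 {
                    pool.push((d, nb));
                    pool.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
                    pool.truncate(ef);
                    frontier.push(nb);
                }
            }
        }
    }
    pool.into_iter().map(|(_, n)| n).collect()
}

fn main() {
    // Path graph 0-1-2-3-4; the "query" is closest to node 3.
    let graph = vec![vec![1], vec![0, 2], vec![1, 3], vec![2, 4], vec![3]];
    let found = greedy_search(&graph, |n| (n as f32 - 3.0).abs(), 0, 2);
    assert_eq!(found, vec![3, 2]);
}
```

A larger `ef` keeps more plausible local minima alive, which is exactly the recall/latency trade-off `ef_search` controls.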
## Tuning knobs (HNSW)
### ef_search (query effort)
In HNSW, ef_search controls how many candidates you keep during the bottom-layer search.
Larger values usually increase recall, at the cost of query time.
Notes:
- Data from GloVe-25 (1.2M vectors, 25-d, cosine). See `doc/benchmark-results.md` for full numbers.
- Results are dataset-specific; do not assume these recall values generalize to other distributions.
Higher ef_search typically improves recall and increases query time. Start around ef_search=50-100
and measure recall@k vs latency for your dataset.
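Measuring recall@k is just a set overlap between the ANN result and brute-force ground truth. A minimal standalone metric sketch (not vicinity's API):

```rust
use std::collections::HashSet;

/// Fraction of the true top-k ids recovered by the ANN result (recall@k).
fn recall_at_k(approx: &[usize], exact: &[usize], k: usize) -> f64 {
    let truth: HashSet<_> = exact.iter().take(k).collect();
    let hits = approx.iter().take(k).filter(|id| truth.contains(id)).count();
    hits as f64 / k as f64
}

fn main() {
    // Ground truth from brute force vs a hypothetical ANN result.
    let exact = [7, 2, 9, 4];
    let approx = [7, 9, 1, 4];
    assert_eq!(recall_at_k(&approx, &exact, 4), 0.75); // 3 of 4 true neighbors found
}
```

Sweep `ef_search`, recompute this metric, and plot it against query latency to pick an operating point.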
### M / graph degree (build-time and memory)
Higher M generally improves recall, but increases build time and memory.
Notes:
- Build-time data from GloVe-25 (1.2M vectors). Memory plot is theoretical (formula: `N*D*4 + N*M*2*4*1.2` bytes).
- Treat these as reference points, not a stable performance contract.
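Reading the formula above as 4 bytes per f32 dimension plus roughly `M*2` u32 neighbor links per node with ~20% overhead (my interpretation of the expression, not a guarantee from the crate), a back-of-envelope estimator looks like:

```rust
/// Back-of-envelope HNSW memory estimate in bytes, assuming
/// N*D*4 (f32 vectors) + N*M*2*4*1.2 (~2M u32 links per node, 20% overhead).
fn estimated_bytes(n: usize, dim: usize, m: usize) -> f64 {
    (n * dim * 4) as f64 + (n * m * 2 * 4) as f64 * 1.2
}

fn main() {
    // Example: 1.2M GloVe-25 vectors with M = 16.
    let bytes = estimated_bytes(1_200_000, 25, 16);
    println!("~{:.2} GiB", bytes / (1024.0 * 1024.0 * 1024.0));
}
```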
## Distance semantics (current behavior)
Different components currently assume different distance semantics. This is intentionally surfaced here because it's an easy place to make silent mistakes (e.g. forgetting to normalize vectors).
| Component | Metric | Notes |
|---|---|---|
| `hnsw::HNSWIndex` | cosine distance | Fast path assumes L2-normalized vectors |
| `ivf_pq::IVFPQIndex` | cosine distance | Uses dot-based cosine distance for IVF + PQ |
| `scann::SCANNIndex` | dot (coarse) / cosine (rerank) | Dot product for partition scoring, cosine distance for reranking |
| `hnsw::dual_branch::DualBranchHNSW` | L2 distance | Experimental implementation |
| `quantization` | Hamming-like / binary distances | See `quantization::simd_ops::hamming_distance` and ternary helpers |
Planned direction: make distance semantics explicit via a shared metric/normalization contract so that “same input vectors” means “same meaning” across indexes.
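The practical consequence of the table: on L2-normalized vectors, cosine distance collapses to `1 - dot(a, b)`, which is why the fast paths assume normalized input and why forgetting to normalize silently corrupts rankings. A standalone sketch of that contract:

```rust
/// L2-normalize in place; afterwards cosine distance reduces to 1 - dot(a, b).
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let mut a = vec![3.0, 4.0];
    let mut b = vec![6.0, 8.0]; // parallel to a, so cosine distance should be ~0
    l2_normalize(&mut a);
    l2_normalize(&mut b);
    assert!((1.0 - dot(&a, &b)).abs() < 1e-6);
}
```

Note that without normalization, `1.0 - dot(&[3.0, 4.0], &[6.0, 8.0])` would be `-49.0`, which is not a cosine distance at all.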
## Algorithms
| Type | Implementations | Status |
|---|---|---|
| Graph | HNSW, NSW | Stable |
| Graph | Vamana (DiskANN), SNG, DEG | Experimental |
| Partition | IVF-PQ | Stable |
| Partition | ScaNN | Experimental |
| Quantization | PQ, RaBitQ, SQ8 (scalar) | Stable |
| Tree | KD-Tree, Ball Tree, RP-Forest, K-Means Tree | Experimental |
## Beyond basic search
| Capability | API | Notes |
|---|---|---|
| Batch add | `index.add_batch(ids, vectors)` | Bulk ingestion from flat f32 slice |
| Delete vectors | `index.delete(doc_id)` | Lazy tombstoning; deleted vectors excluded from results |
| Filtered search | `index.search_with_filter(query, k, ef, predicate)` | ACORN-style metadata filtering during graph traversal |
| Save / load | `index.save_to_writer(w)` / `HNSWIndex::load_from_reader(r)` | Requires `serde` feature |
| Batch search | `index.search_batch(&queries, k, ef)` | Parallel via rayon; requires `parallel` feature |
| Scalar quantization | `ScalarQuantizedHNSW::new(dim, m, m_max)` | ~4x memory reduction (uint8); asymmetric search + optional reranking |
| Streaming updates | `StreamingCoordinator::new(index, config)` | Buffer-and-compact architecture for online insert/delete |
| Index factory | `index_factory(dim, "HNSW16")` | Faiss-style string-based index construction |
| Trait-based API | `dyn ANNIndex` | Unified trait across all index types via `AnyANNIndex` |
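Lazy tombstoning, as used by delete, is a general pattern worth understanding: deletion only marks an id, and searches filter marked ids from candidate results instead of rewiring the graph. A toy standalone sketch of the pattern (not vicinity's implementation):

```rust
use std::collections::HashSet;

/// Tombstone set: deletions are O(1) marks; compaction can happen later.
struct Tombstones {
    deleted: HashSet<u64>,
}

impl Tombstones {
    fn new() -> Self {
        Self { deleted: HashSet::new() }
    }

    fn delete(&mut self, id: u64) {
        self.deleted.insert(id);
    }

    /// Drop tombstoned ids from (id, distance) candidates before returning them.
    fn filter_results(&self, results: Vec<(u64, f32)>) -> Vec<(u64, f32)> {
        results.into_iter().filter(|(id, _)| !self.deleted.contains(id)).collect()
    }
}

fn main() {
    let mut t = Tombstones::new();
    t.delete(2);
    let raw = vec![(1, 0.1), (2, 0.2), (3, 0.3)]; // candidate hits from the graph
    assert_eq!(t.filter_results(raw), vec![(1, 0.1), (3, 0.3)]);
}
```

The cost is that heavily deleted indexes return fewer than `k` results unless the search over-fetches, which is one reason buffer-and-compact streaming architectures exist.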
## Features
```toml
[dependencies]
vicinity = { version = "0.2.0", features = ["hnsw"] }
```
- `hnsw` — HNSW graph index (default, with `innr` SIMD)
- `innr` — Pure Rust SIMD distance kernels (default; alternative: `simsimd` for C-based SIMD)
- `simsimd` — SimSIMD C bindings for distance computation (replaces `innr` when enabled)
- `serde` — JSON serialization for index save/load (`save_to_writer` / `load_from_reader`)
- `parallel` — Rayon-based batch search (`search_batch`, `search_batch_flat`)
- `nsw` — Flat navigable small-world graph (alternative to HNSW, no hierarchy)
- `sng` — OPT-SNG auto-tuned sparse neighborhood graph (experimental)
- `diskann` / `vamana` — DiskANN-style graph variants (experimental)
- `ivf_pq` — Inverted File with Product Quantization (activates `clump` for k-means clustering)
- `scann` — ScaNN-style coarse-to-fine scaffolding (experimental; activates `clump`)
- `evoc` — EVoC hierarchical clustering (activates `clump`)
- `quantization` / `rabitq` / `saq` — vector quantization and RaBitQ-style compression
- `persistence` — on-disk persistence helpers (requires the `durability` crate)
- `python` — optional PyO3 bindings (feature-gated)
Compiles on wasm32-unknown-unknown with default features.
## Flat vs hierarchical graphs (why “H” may not matter)
HNSW’s hierarchy was designed to provide multi-scale “express lanes” for long-range navigation. However, recent empirical work suggests that on high-dimensional embedding datasets a flat navigable small-world graph can retain the key recall/latency benefits of HNSW, because “hub” nodes emerge and already provide effective routing.
Concrete reference:
- Munyampirwa et al. (2024). Down with the Hierarchy: The 'H' in HNSW Stands for "Hubs" (arXiv:2412.01940).
Practical guidance in vicinity:
- Try `HNSW{m}` first (default; well-tested).
- If you want to experiment with a simpler flat graph, enable `nsw` and try `NSW{m}` via the factory.

Benchmark recall@k vs latency on your workload; the “best” choice depends on data and constraints.
## Running benchmarks / examples

Quick benchmark (generates synthetic data if no pre-built files exist):

```sh
cargo run --release --example 03_quick_benchmark
```

With real ann-benchmarks datasets:

```sh
# Download and convert (requires Python + h5py)
# List available datasets
```

Criterion microbenchmarks:

```sh
cargo bench
```
## Examples quick reference

| Example | Features | What it demonstrates |
|---|---|---|
| `01_basic_search` | `hnsw` (default) | Minimal build-and-search workflow |
| `02_measure_recall` | `hnsw` (default) | Recall@k measurement against brute force |
| `03_quick_benchmark` | `hnsw` (default) | Benchmark with synthetic or bundled data (no downloads) |
| `04_rigorous_benchmark` | `hnsw` (default) | Multi-run benchmark with confidence intervals |
| `05_normalization_matters` | `hnsw` (default) | Why L2-normalization is required for HNSW cosine search |
| `hnsw_benchmark` | `hnsw` (default) | HNSW vs brute-force speed and recall |
| `semantic_search_demo` | `hnsw` (default) | End-to-end semantic search pipeline |
| `ivf_pq_demo` | `ivf_pq` | IVF-PQ index construction and querying |
| `rabitq_demo` | `rabitq`, `hnsw`, `quantization` | RaBitQ binary quantization |
| `evoc_demo` | `evoc` | EVoC hierarchical clustering |
| `lid_demo` | `hnsw` (default) | LID estimation on synthetic data |
| `lid_outlier_detection` | `hnsw` (default) | LID-based anomaly detection |
| `dual_branch_demo` | `hnsw` (default) | LID-driven dual-branch HNSW |
| `dual_branch_hnsw_demo` | `hnsw` (default) | LID-aware graph construction |
| `sift_benchmark` | `hnsw` | Synthetic 50K benchmark (SIFT-like) |
| `glove_benchmark` | `hnsw` (default) | GloVe-25 real dataset benchmark |
| `retrieve_and_rerank` | `hnsw` | Two-stage retrieval with reranking |
| `ann_benchmark` | `hnsw` | ann-benchmarks runner (real datasets, HNSW + NSW) |
For primary sources (papers) backing the algorithms and phenomena mentioned in docs, see `doc/references.md`.
## References
- Malkov & Yashunin (2016/2018). Efficient and robust approximate nearest neighbor search using HNSW graphs (HNSW). https://arxiv.org/abs/1603.09320
- Malkov et al. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs (NSW). https://doi.org/10.1016/j.is.2013.10.006
- Munyampirwa et al. (2024). Down with the Hierarchy: The “H” in HNSW Stands for “Hubs”. https://arxiv.org/abs/2412.01940
- Subramanya et al. (2019). DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. https://proceedings.neurips.cc/paper/2019/hash/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Abstract.html
- Jégou, Douze, Schmid (2011). Product Quantization for Nearest Neighbor Search (PQ / IVFADC). https://ieeexplore.ieee.org/document/5432202
- Ge et al. (2014). Optimized Product Quantization (OPQ). https://arxiv.org/abs/1311.4055
- Guo et al. (2020). Accelerating Large-Scale Inference with Anisotropic Vector Quantization (ScaNN line). https://arxiv.org/abs/1908.10396
- Gao & Long (2024). RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search. https://arxiv.org/abs/2405.12497
## See also

- `innr` -- SIMD-accelerated similarity kernels (direct dependency for distance computation)
## License
MIT OR Apache-2.0