vicinity

Approximate nearest-neighbor search. Each algorithm is a separate feature flag. Depend on only what you use. Rust crate vicinity; Python bindings ship as pyvicinity on PyPI.

Default distance is cosine; most graph indices L2-normalize on insert and operate on the cosine-equivalent unit sphere (so angular and inner product over normalized vectors fall out for free). True L2 distance is natively wired only in ivf_avq (which targets MIPS). The DistanceMetric enum in distance.rs is consumed by evaluation utilities and brute-force comparison; per-index metric selection is algorithm-specific.

Which index?

Which index?
├── General purpose: HNSW (default)
├── Memory constrained: HNSW + RaBitQ (SymphonyQG) or IVF-PQ
├── Disk-backed / large scale: DiskANN (experimental; file-based, mmap planned)
├── Streaming insert/delete: FreshGraph (per-op latency) or LsmIndex (write throughput)
├── Filtered search: ACORN (metadata filters) or Curator (label filters)
├── Batch/static: IVF-PQ or IVF-AVQ
└── Sparse vectors: SparseMIPS

For high-dimensional data (d ≥ 256), prefer SQ4U or SymphonyQG over plain HNSW. Quantized graph traversal reduces distance computation cost. At low dimensions (d ≤ 25), plain HNSW wins; quantization overhead outweighs savings.

Install

Requires Rust 1.89+. Each algorithm is a separate feature; enable what you need:

[dependencies]
vicinity = { version = "0.8", features = ["hnsw"] }          # graph index
# vicinity = { version = "0.8", features = ["ivf_pq"] }      # compressed index
# vicinity = { version = "0.8", features = ["nsw"] }         # flat graph

Usage

HNSW

High recall, in-memory. Default index.

use vicinity::hnsw::HNSWIndex;

let mut index = HNSWIndex::builder(128)
    .m(16)
    .ef_search(50)
    .auto_normalize(true) // cosine (default) requires unit-norm vectors
    .build()?;
index.add_slice(0, &[0.1; 128])?;
index.add_slice(1, &[0.2; 128])?;
index.build()?;

let results = index.search(&[0.1; 128], 5, 50)?;
// results: Vec<(doc_id, distance)>; distance in [0, 2], lower is closer.

IVF-PQ

Compressed index. 32–64× less memory than HNSW, lower recall. Use for datasets that don't fit in RAM.

use vicinity::ivf_pq::{IVFPQIndex, IVFPQParams};

// IVF-PQ trains a quantizer on `build()`; aim for ≳ codebook_size × num_codebooks
// training vectors (defaults: codebook_size=256, num_codebooks=8 → ≳ 2048).
let params = IVFPQParams { num_clusters: 1024, num_codebooks: 8, nprobe: 16, ..Default::default() };
let mut index = IVFPQIndex::new(128, params)?;
for (id, vec) in dataset.iter().enumerate() {
    index.add_slice(id as u32, vec)?;
}
index.build()?;

let results = index.search(&query, 5)?;

See examples/ivf_pq_demo.rs for a runnable end-to-end example.

Python (pyvicinity)

HNSW is available from Python via pyvicinity (abi3 wheel, CPython 3.9+). Useful for semantic search, recommendation, deduplication, and any other task where you have N embedding vectors and need to find the k closest to a query vector.

pip install pyvicinity

import numpy as np
from pyvicinity import HNSWIndex, DistanceMetric

# embeddings: any (n, 384) float32 array (sentence-transformers, openai, etc.)
embeddings = np.random.default_rng(0).standard_normal((10_000, 384), dtype=np.float32)

index = HNSWIndex(
    dim=384,
    metric=DistanceMetric.Cosine,
    auto_normalize=True,  # normalizes inserts and queries
    seed=42,
)
index.add_items(embeddings)
index.build()

# single query: top-10 neighbors. Distances in [0, 2]; lower means more similar.
ids, dists = index.search(embeddings[0], k=10)

# batch: same shape, one row per query.
queries = embeddings[:32]
batch_ids, batch_dists = index.batch_search(queries, k=10)  # (32, 10) int64

Runnable examples (in the source repo, under examples/python/, not shipped with the wheel):

01_text_similarity.py: semantic search over text with sentence-transformers
02_batch_and_recall.py: recall@10 vs ef_search sweep
03_ann_benchmarks_harness.py: drop-in ann-benchmarks / VIBE wrapper

The bindings ship hand-written .pyi stubs (py.typed) and are verified in CI by mypy.stubtest.

Persistence

Save and load indexes with the serde feature:

[dependencies]
vicinity = { version = "0.8", features = ["hnsw", "serde"] }

// Save
index.save_to_file("my_index.json")?;

// Load
let index = HNSWIndex::load_from_file("my_index.json")?;

See examples/06_save_and_load.rs for a full example.

Benchmark

GloVe-25 (1.18M vectors, 25-d, angular distance), Apple Silicon, single-threaded:

Summary at best recall per algorithm:

Algorithm	Recall@10	QPS
HNSW (M=16)	100.0%	2,857
Vamana	100.0%	1,177
DiskANN	100.0%	1,029
NSW (M=16)	99.2%	1,288
IVF-PQ	98.7%	69
IVF-AVQ	90.9%	194
RP-Forest	58.5%	4,221

Full numbers and SIFT-128 results in docs/benchmark-results.md.

Supported Algorithms

Each algorithm has a named feature flag:

Algorithm	Feature	Notes
HNSW	`hnsw` (default)	Best recall/QPS balance for in-memory search
SQ4U	`hnsw` + `sq4`	HNSW with 4-bit quantized graph traversal + exact rerank; benefits high-d data
SymphonyQG	`hnsw` + `ivf_rabitq`	HNSW with RaBitQ quantized graph traversal; cheap approximate beam search + exact rerank
NSW	`nsw`	Flat small-world graph; competitive with HNSW on high-d data
Vamana	`vamana`	DiskANN-style robust pruning; fast search, higher build time
NSG	`nsg`	Monotonic RNG pruning; build slows above ~50K vectors due to O(n) ensure_connectivity
EMG	`emg`	Multi-scale graph with alpha-pruning
FINGER	`finger`	Projection-based distance lower bounds for search pruning
PiPNN	`pipnn`	Partition-then-refine with HashPrune; reduces I/O during build
FreshGraph	`fresh_graph`	Streaming insert/delete with tombstones
IVF-PQ	`ivf_pq`	Compressed index; 32-64x less memory, lower recall
IVF-AVQ	`ivf_avq`	Anisotropic VQ + reranking; inner product search
IVF-RaBitQ	`ivf_rabitq`	RaBitQ binary quantization; provable error bounds
RpQuant	`rp_quant`	Random projection + scalar quantization
BinaryFlat	`binary_index`	1-bit quantization + full-precision rerank
Curator	`curator`	K-means tree with per-label Bloom filters; low-selectivity filtered search
FilteredGraph	`filtered_graph`	Predicate-filtered graph search (AND/OR metadata filters)
ACORN	`hnsw`	Filtered HNSW search with subgraph sampling (SIGMOD 2024)
RangeFiltered	`range_filtered`	HNSW + attribute-range post-filter (renamed from `esg` in 0.8.0)
SparseMIPS	`sparse_mips`	Graph index for sparse vectors (SPLADE/BM25)
LEMUR	`lemur`	Late-interaction MIPS; needs externally-provided encoder weights (no in-tree training); mean-pool used in place of OLS
LSH	`lsh`	Cross-Polytope LSH (Andoni et al. 2015); single hash table + multiprobe
LsmIndex	`hnsw`	LSM-tree tiered HNSW for streaming insert/delete/update workloads
DiskANN	`diskann`	Vamana + SSD I/O layout; experimental
SNG	`sng`	OPT-SNG (auto-tuned sparse neighborhood graph); sub-quadratic build per arXiv:2509.15531
DEG	`hnsw`	Density-adaptive edge budgets (in-house experimental variant; no benchmark)
KD-Tree	`kdtree`	Exact NN; fast for d <= 20 (experimental)
Ball Tree	`balltree`	Exact NN; slightly better than KD-Tree for d=20-50 (experimental)
RP-Forest	`rptree`	Approximate; fast build, moderate recall (experimental)

Quantization: RaBitQ, SAQ (quantization feature, via qntz crate). PQ is part of ivf_pq.

See docs.rs for the full API.

Documentation

User guide: quick start, distance metrics, LID, common pitfalls
Benchmarks: recall/QPS tables across datasets
ANN landscape: algorithmic principles, math foundations, research context
References: bibliography for every algorithm in the crate

License

MIT OR Apache-2.0

vicinity 0.8.1