vector-index
Approximate nearest neighbor search in Rust, with pluggable distance metrics.
What it is
An HNSW (Hierarchical Navigable Small World) index that's generic over the distance metric. Plug in L2, cosine, or any custom distance function — the index doesn't care.
Quickstart
Add to Cargo.toml:
[dependencies]
vector-index = "0.1"
Use it:
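use vector_index::{HnswIndex, L2};
// NOTE: the import path and the argument shapes below are assumptions;
// check the crate docs for the exact signatures.
// Build an index with the L2 metric
let mut index = HnswIndex::new(L2);
// Insert some vectors
index.insert(vec![1.0, 0.0, 0.0]).unwrap();
index.insert(vec![0.0, 1.0, 0.0]).unwrap();
index.insert(vec![0.0, 0.0, 1.0]).unwrap();
// Search for the 2 nearest neighbors
let query = vec![0.9, 0.1, 0.0];
let neighbors = index.search(&query, 2);
for n in &neighbors {
    println!("{:?}", n);
}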
Why this crate
Most Rust ANN crates hardcode their distance function — usually L2 or cosine. If you want to use a different metric (Sliced-Wasserstein, custom Hamming, learned distances, anything), you have to either fork the crate or build the index yourself.
This crate solves that. The Metric trait is the only API the index needs to know about your distance function.
Implement it for your type, plug it into HnswIndex<P, M>, and search.
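To make the idea concrete, here is a minimal sketch of what a pluggable metric can look like. The `Metric` trait below is a hypothetical stand-in written for this example, not the crate's actual definition (its method name, generics, and return type may differ), and `nearest` is an exact brute-force scan used only to show that the search code never needs to know which distance it is calling.

```rust
/// Hypothetical stand-in for a metric trait (assumption: the crate's real
/// `Metric` trait may use a different method name or signature).
pub trait Metric<P> {
    /// Distance between two points; smaller means more similar.
    fn distance(&self, a: &P, b: &P) -> f32;
}

/// A custom Hamming distance over 64-bit signatures.
pub struct Hamming;

impl Metric<u64> for Hamming {
    fn distance(&self, a: &u64, b: &u64) -> f32 {
        // XOR leaves a 1 in every bit position where the inputs differ.
        (a ^ b).count_ones() as f32
    }
}

/// Exact (brute-force) k-nearest-neighbor scan, generic over the metric.
/// Returns (position, distance) pairs sorted by ascending distance.
pub fn nearest<P, M: Metric<P>>(
    metric: &M,
    points: &[P],
    query: &P,
    k: usize,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = points
        .iter()
        .enumerate()
        .map(|(i, p)| (i, metric.distance(query, p)))
        .collect();
    scored.sort_by(|x, y| x.1.total_cmp(&y.1));
    scored.truncate(k);
    scored
}

fn main() {
    let points = [0b1111_0000u64, 0b1111_0001, 0b0000_1111];
    let hits = nearest(&Hamming, &points, &0b1111_0011, 2);
    println!("{:?}", hits);
}
```

Swapping in a different distance only means writing another `impl Metric<...>`; the scan itself is unchanged, which is the property a metric-generic index relies on.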
Built-in metrics
- L2: squared Euclidean distance, suitable for embeddings
- Cosine: cosine distance, suitable for direction-sensitive comparisons
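As a reference for what these two metrics compute, the following sketch writes both out as plain functions over `f32` slices. The function names are chosen for this example and are not the crate's API. Squared L2 skips the square root, which preserves neighbor ordering; cosine distance ignores vector length and depends only on direction.

```rust
/// Squared Euclidean distance: the sum of squared coordinate differences.
fn l2_squared(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Cosine distance: 1 minus the cosine of the angle between `a` and `b`.
/// Undefined (NaN) if either vector has zero length; callers should guard.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "dimension mismatch");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

fn main() {
    // Orthogonal vectors: far apart under both metrics.
    println!("{}", l2_squared(&[3.0, 0.0], &[0.0, 4.0])); // 25
    println!("{}", cosine_distance(&[3.0, 0.0], &[0.0, 4.0]));

    // Same direction, different length: L2 sees a gap, cosine does not.
    println!("{}", l2_squared(&[1.0, 0.0], &[2.0, 0.0]));
    println!("{}", cosine_distance(&[1.0, 0.0], &[2.0, 0.0]));
}
```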
For Sliced-Wasserstein over empirical distributions (point clouds), see the companion crate sliced-wasserstein.
Concurrent access
Use ConcurrentHnsw for multi-threaded reads with occasional writes:
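use vector_index::{ConcurrentHnsw, L2};
use std::sync::Arc;
// NOTE: the import path and the constructor and search arguments are
// assumptions; check the crate docs for the exact signatures.
let index = Arc::new(ConcurrentHnsw::new(L2));
// Multiple threads can search concurrently
let query = vec![0.9, 0.1, 0.0];
let neighbors = index.search(&query, 2);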
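The read-mostly access pattern described here can be illustrated with standard-library primitives alone. The sketch below is not how `ConcurrentHnsw` is implemented (its internals are not documented in this README). It wraps a hypothetical `FlatIndex` stand-in in `Arc<RwLock<...>>` so that many reader threads can search in parallel while a writer takes exclusive access only briefly.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in index for illustration: a flat list of vectors with an exact scan.
struct FlatIndex {
    points: Vec<Vec<f32>>,
}

impl FlatIndex {
    fn insert(&mut self, p: Vec<f32>) {
        self.points.push(p);
    }

    /// Position of the nearest stored point by squared L2, if any.
    fn nearest(&self, q: &[f32]) -> Option<usize> {
        self.points
            .iter()
            .enumerate()
            .map(|(i, p)| {
                let d: f32 = p.iter().zip(q).map(|(a, b)| (a - b) * (a - b)).sum();
                (i, d)
            })
            .min_by(|x, y| x.1.total_cmp(&y.1))
            .map(|(i, _)| i)
    }
}

fn concurrent_demo() -> Vec<Option<usize>> {
    let index = Arc::new(RwLock::new(FlatIndex {
        points: vec![vec![0.0, 0.0], vec![10.0, 10.0]],
    }));

    // Occasional write: takes the exclusive lock for the duration of the insert.
    index.write().unwrap().insert(vec![5.0, 5.0]);

    // Concurrent reads: each thread holds a shared lock while it searches.
    let queries = vec![[0.1f32, 0.2], [4.9, 5.2], [9.0, 9.5]];
    let handles: Vec<_> = queries
        .into_iter()
        .map(|q| {
            let index = Arc::clone(&index);
            thread::spawn(move || index.read().unwrap().nearest(&q))
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", concurrent_demo());
}
```

A single `RwLock` around the whole structure is the simplest option; a dedicated concurrent index may use finer-grained locking to reduce writer stalls.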
Configuration
HnswConfig exposes the standard HNSW parameters:
| Parameter | Default | Notes |
|---|---|---|
| m | 16 | Max neighbors per node. Higher = better recall, more memory. |
| m_max0 | 32 | Max neighbors at level 0. Default is 2 * m. |
| ef_construction | 200 | Build-time candidate pool. Higher = better recall, slower build. |
| ef_search | 50 | Search-time candidate pool. Higher = better recall, slower search. |
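The defaults in the table can be written down as a small struct. `HnswConfigSketch` below is a hypothetical mirror of those four parameters, written for this example. The real `HnswConfig` may use different field names, types, or a builder, so treat this only as a way to see how the defaults relate (in particular, `m_max0` derived from `m`).

```rust
/// Hypothetical mirror of the documented parameters (assumption: the crate's
/// real `HnswConfig` may differ in field names and types).
#[derive(Debug, Clone, PartialEq)]
struct HnswConfigSketch {
    m: usize,
    m_max0: usize,
    ef_construction: usize,
    ef_search: usize,
}

impl Default for HnswConfigSketch {
    fn default() -> Self {
        let m = 16;
        Self {
            m,
            m_max0: 2 * m, // level 0 allows twice as many neighbors
            ef_construction: 200,
            ef_search: 50,
        }
    }
}

fn main() {
    // Override one field and keep the remaining defaults.
    let tuned = HnswConfigSketch {
        ef_search: 200,
        ..HnswConfigSketch::default()
    };
    println!("{:?}", tuned);
}
```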
For most workloads the defaults work well. Tune ef_search if you need different recall/latency tradeoffs.
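When tuning `ef_search`, it helps to measure recall directly: run the same queries through an exact brute-force search and through the index, then compare the returned ids. The helper below is a sketch of that comparison. It assumes results are available as slices of `usize` ids, which may not match the crate's actual result type.

```rust
use std::collections::HashSet;

/// Recall@k: the fraction of the exact top-k ids that also appear in the
/// approximate result. Duplicated ids are counted once.
fn recall_at_k(exact: &[usize], approx: &[usize]) -> f64 {
    if exact.is_empty() {
        return 1.0; // nothing to find, so nothing was missed
    }
    let truth: HashSet<usize> = exact.iter().copied().collect();
    let found: HashSet<usize> = approx.iter().copied().collect();
    truth.intersection(&found).count() as f64 / truth.len() as f64
}

fn main() {
    let exact = [1, 2, 3, 4]; // ids from a brute-force scan
    let approx = [2, 4, 9, 7]; // ids returned by the approximate index
    println!("recall@4 = {}", recall_at_k(&exact, &approx));
}
```

Averaging this value over a held-out query set at several `ef_search` settings, alongside per-query latency, gives the recall/latency curve to choose from.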
Status
vector-index is at 0.1.0. The core HNSW algorithm is implemented (Algorithm 4 neighbor selection from the original paper), tested with recall verified on synthetic Gaussian benchmarks, and used in production by OmniPulse for distributional fingerprint retrieval.
Currently does not support:
- Deletion (HNSW deletion is non-trivial; planned for 0.2.0)
- Persistence to disk (in-memory only)
- Serialization of the index structure
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.