# vector-index
[](https://crates.io/crates/vector-index)
[](https://docs.rs/vector-index)
[](https://github.com/elli0t-yash/vector-index#license)
Approximate nearest neighbor search in Rust, with pluggable distance metrics.
## What it is
An HNSW (Hierarchical Navigable Small World) index that's generic over the distance metric. Plug in L2, cosine, or any custom distance function — the index doesn't care.
## Quickstart
Add to `Cargo.toml`:
```toml
[dependencies]
vector-index = "0.1"
```
Use it:
```rust
use vector_index::{HnswIndex, HnswConfig, L2};
// Build an index with the L2 metric
let mut index = HnswIndex::new(L2, HnswConfig::default());
// Insert some vectors
index.insert(vec![1.0, 0.0, 0.0]).unwrap();
index.insert(vec![0.0, 1.0, 0.0]).unwrap();
index.insert(vec![0.0, 0.0, 1.0]).unwrap();
// Search for the 2 nearest neighbors
let query = vec![0.9, 0.1, 0.0];
let neighbors = index.search(&query, 2);
for n in &neighbors {
println!("id={} distance={}", n.id, n.distance);
}
```
## Why this crate
Most Rust ANN crates hardcode their distance function — usually L2 or cosine. If you want to use a different metric (Sliced-Wasserstein, custom Hamming, learned distances, anything), you have to either fork the crate or build the index yourself.
This crate solves that. The `Metric` trait is the only API the index needs to know about your distance function:
```rust
pub trait Metric {
type Point;
fn distance(&self, a: &Self::Point, b: &Self::Point) -> f32;
fn dim(&self, point: &Self::Point) -> usize;
}
```
Implement it for your type, plug it into `HnswIndex<P, M>`, and search.
## Built-in metrics
- `L2` — squared Euclidean distance, suitable for embeddings
- `Cosine` — cosine distance, suitable for direction-sensitive comparisons
For Sliced-Wasserstein over empirical distributions (point clouds), see the companion crate [`sliced-wasserstein`](https://crates.io/crates/sliced-wasserstein).
## Concurrent access
Use `ConcurrentHnsw` for multi-threaded reads with occasional writes:
```rust
use vector_index::{ConcurrentHnsw, HnswConfig, L2};
use std::sync::Arc;
let index = Arc::new(ConcurrentHnsw::new(L2, HnswConfig::default()));
// Multiple threads can search concurrently
let neighbors = index.search(&query, 10);
```
## Configuration
`HnswConfig` exposes the standard HNSW parameters:
| `m` | 16 | Max neighbors per node. Higher = better recall, more memory. |
| `m_max0` | 32 | Max neighbors at level 0. Default is `2 * m`. |
| `ef_construction` | 200 | Build-time candidate pool. Higher = better recall, slower build. |
| `ef_search` | 50 | Search-time candidate pool. Higher = better recall, slower search. |
For most workloads the defaults work well. Tune `ef_search` if you need different recall/latency tradeoffs.
## Status
`vector-index` is at 0.1.0. The core HNSW algorithm is implemented (Algorithm 4 neighbor selection from the original paper), tested with recall verified on synthetic Gaussian benchmarks, and used in production by [OmniPulse](https://github.com/elli0t-yash) for distributional fingerprint retrieval.
Currently does **not** support:
- Deletion (HNSW deletion is non-trivial; planned for 0.2.0)
- Persistence to disk (in-memory only)
- Serialization of the index structure
## License
Licensed under either of:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT License ([LICENSE-MIT](LICENSE-MIT))
at your option.