# HNSW Vector Search
A high-performance semantic search engine using **Hierarchical Navigable Small World (HNSW)** graphs and **ONNX transformer embeddings**.
[Rust](https://www.rust-lang.org/)
[License](LICENSE)
## Features
- **Fast search**: O(log N) approximate nearest neighbor search
- **High recall**: 95%+ recall@10 with optimized parameters
- **SIMD optimized**: 3-4x speedup using ARM NEON on Apple Silicon
- **Real embeddings**: integration with ONNX Runtime for transformer models
- **Persistent storage**: memory-mapped indices for instant loading
- **CLI tool**: easy-to-use command-line interface
- **Production-ready**: comprehensive benchmarks and documentation
## Quick Start
### Installation
```bash
# Clone the repository
git clone https://github.com/reemabdullah/hnsw-vector-search
cd hnsw-vector-search
# Build the project
cargo build --release
# The CLI tool will be at target/release/hnsw
```
### Download Models
```bash
mkdir -p models
cd models
# Download ONNX model (all-MiniLM-L6-v2)
curl -L -o all-MiniLM-L6-v2.onnx \
"https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx"
# Download tokenizer
curl -L -o tokenizer.json \
"https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json"
cd ..
```
### Build an Index
Create a file `documents.txt` with one document per line:
```text
Rust is a systems programming language focused on safety
Python is popular for data science and machine learning
JavaScript runs in web browsers for interactive websites
```
Build the index:
```bash
./target/release/hnsw build \
--input documents.txt \
--output my_index.bin \
--m 16 \
--ef-construction 200
```
### Search
Interactive search:
```bash
./target/release/hnsw search \
--index my_index.bin \
--docs my_index_docs.txt
```
Single query:
```bash
./target/release/hnsw search \
--index my_index.bin \
--docs my_index_docs.txt \
--query "programming languages" \
--k 5
```
## Library Usage
Add to your `Cargo.toml`:
```toml
[dependencies]
hnsw_vector_search = "0.1.0"
```
Example code:
```rust
use hnsw_vector_search::{HnswGraph, OnnxEmbedder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load embedding model
    let mut embedder = OnnxEmbedder::new(
        "models/all-MiniLM-L6-v2.onnx",
        "models/tokenizer.json",
    )?;

    // Create index
    let mut index = HnswGraph::new(16, 200);

    // Add documents
    let documents = vec![
        "Rust is fast and memory-safe",
        "Python is great for ML",
        "JavaScript powers the web",
    ];
    for doc in &documents {
        let embedding = embedder.embed(doc)?;
        index.insert(embedding);
    }

    // Search
    let query = embedder.embed("programming languages")?;
    let results = index.search(&query, 3, 50);
    for (i, (doc_id, distance)) in results.iter().enumerate() {
        println!("{}. {} (distance: {:.3})", i + 1, documents[*doc_id], distance);
    }

    Ok(())
}
```
## Performance
Benchmarks on Apple M4 Pro (1000 vectors, 384 dimensions):
| Operation | Latency | Throughput |
|---|---|---|
| Search (k=10, ef=50) | 12 µs | 80K queries/sec |
| Insertion | 90 µs | 11K inserts/sec |
| Embedding | 18 ms | 55 embeds/sec |
| Index load (mmap) | 2 ms | - |
### SIMD Speedup
ARM NEON vs scalar on Apple Silicon:
| Operation | Scalar | NEON | Speedup |
|---|---|---|---|
| Dot product (384-dim) | 45 ns | 12 ns | 3.75x |
| Normalize | 120 ns | 35 ns | 3.43x |
| Euclidean distance | 68 ns | 18 ns | 3.78x |
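A portable sketch of the dot-product kernel behind these numbers (assumed shape; the actual `neon.rs` uses ARM intrinsics such as fused multiply-accumulate). Four independent accumulators mirror the four `f32` lanes of a 128-bit NEON register, which breaks the serial dependency chain of a naive loop:

```rust
// Sketch of a lane-parallel dot product (illustrative; not the real neon.rs).
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; 4]; // four accumulators, like one 128-bit NEON register
    let chunks = a.len() / 4;
    for i in 0..chunks {
        for lane in 0..4 {
            let j = i * 4 + lane;
            acc[lane] += a[j] * b[j];
        }
    }
    let mut sum: f32 = acc.iter().sum();
    for j in chunks * 4..a.len() {
        sum += a[j] * b[j]; // scalar tail for lengths not divisible by 4
    }
    sum
}

fn main() {
    let a = vec![1.0f32; 384];
    let b = vec![0.5f32; 384];
    println!("{}", dot(&a, &b)); // 192
}
```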
## Algorithm
HNSW creates a multi-layer graph:
```
Layer 2: A            ← sparse, for long-range navigation
         |
Layer 1: A-B-C        ← medium density
         |/|/|
Layer 0: A-B-C-D-E-F  ← dense, all vectors
```
Search proceeds top-down:
1. Start at entry point in highest layer
2. Greedily navigate to nearest neighbor
3. Drop to next layer
4. Repeat until layer 0
5. Return k nearest neighbors
**Time Complexity**: O(log N) expected
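The steps above can be sketched as a greedy descent (illustrative only, assuming a toy adjacency-list layout; the real `search.rs` additionally keeps an ef-sized candidate heap at layer 0 to return k results):

```rust
// Squared Euclidean distance between two vectors.
fn dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum()
}

// Greedy top-down descent: on each layer, hop to the closest neighbor
// until reaching a local minimum, then drop one layer down.
fn greedy_descent(
    layers: &[Vec<Vec<usize>>], // layers[l][node] = neighbor ids on layer l
    vecs: &[Vec<f32>],
    query: &[f32],
    entry: usize,
) -> usize {
    let mut cur = entry;
    for layer in layers.iter().rev() { // top (sparse) layer first
        loop {
            let mut best = cur;
            let mut best_d = dist(&vecs[cur], query);
            for &n in &layer[cur] {
                let d = dist(&vecs[n], query);
                if d < best_d {
                    best = n;
                    best_d = d;
                }
            }
            if best == cur {
                break; // local minimum on this layer: descend
            }
            cur = best;
        }
    }
    cur
}

fn main() {
    // Toy 1-D dataset: points 0, 1, 2, 3.
    let vecs = vec![vec![0.0], vec![1.0], vec![2.0], vec![3.0]];
    let layer0 = vec![vec![1], vec![0, 2], vec![1, 3], vec![2]]; // dense chain
    let layer1 = vec![vec![2], vec![], vec![0], vec![]];         // sparse long-range links
    let nearest = greedy_descent(&[layer0, layer1], &vecs, &[2.9], 0);
    println!("{}", nearest); // 3
}
```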
## Parameters
### Build Parameters
- **M** (default: 16): Max connections per node
- 8-12: Fast build, less memory, ~90% recall
- 16-24: Balanced (recommended)
- 32-48: High recall (~98%), more memory
- **ef_construction** (default: 200): Build quality
- 100: Fast build, decent quality
- 200: Balanced (recommended)
- 400: Slow build, excellent quality
### Search Parameters
- **k**: Number of results
- **ef** (default: 50): Search quality
  - ef = k: Fast, may miss results
  - ef = 2×k: Balanced (recommended)
  - ef = 4×k: High recall, slower
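To see how M trades recall against memory, here is a rough back-of-the-envelope estimate (hypothetical layout: 4-byte floats, 8-byte neighbor ids, and the conventional 2×M link budget at layer 0; the actual on-disk format in `persistence/` may differ):

```rust
// Hypothetical memory estimate for an HNSW index: vector data plus links.
fn estimate_bytes(n: usize, dim: usize, m: usize) -> usize {
    let vectors = n * dim * 4;        // f32 vector data
    let layer0_links = n * 2 * m * 8; // layer 0 typically allows 2*M neighbors
    let upper_links = n * m * 8 / 2;  // upper layers hold a shrinking fraction of nodes
    vectors + layer0_links + upper_links
}

fn main() {
    // The benchmark setup above: 1000 vectors, 384 dims, M = 16.
    println!("{} bytes", estimate_bytes(1000, 384, 16)); // 1856000
}
```

Doubling M roughly doubles the link storage but leaves the (usually dominant) vector data unchanged.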
## CLI Commands
### Build
```bash
hnsw build [OPTIONS]
Options:
-i, --input <FILE> Input documents (one per line)
-o, --output <FILE> Output index file
-m, --model <FILE> ONNX model path
-t, --tokenizer <FILE> Tokenizer path
--m <NUM> Max connections (default: 16)
--ef-construction <NUM> Build quality (default: 200)
```
### Search
```bash
hnsw search [OPTIONS]
Options:
-i, --index <FILE> Index file
-m, --model <FILE> ONNX model path
-t, --tokenizer <FILE> Tokenizer path
-d, --docs <FILE> Documents file
-k <NUM> Number of results (default: 10)
--ef <NUM> Search quality (default: 50)
-q, --query <TEXT> Query (omit for interactive mode)
```
### Info
```bash
hnsw info --index <FILE>
```
### Benchmark
```bash
hnsw bench [OPTIONS]
Options:
-i, --index <FILE> Index file
-n, --num-queries <NUM> Number of queries (default: 1000)
```
## Project Structure
```
hnsw-vector-search/
├── src/
│   ├── simd/            # SIMD vector operations
│   │   ├── neon.rs      # ARM NEON intrinsics
│   │   └── fallback.rs  # Portable fallback
│   ├── hnsw/            # HNSW algorithm
│   │   ├── graph.rs     # Graph structure
│   │   ├── search.rs    # Search algorithm
│   │   └── insert.rs    # Insertion algorithm
│   ├── embeddings/      # ONNX integration
│   ├── persistence/     # Serialization
│   └── cli/             # Command-line tool
├── benches/             # Criterion benchmarks
└── examples/            # Usage examples
```
## Contributing
Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Run `cargo test` and `cargo clippy`
5. Submit a pull request
## License
MIT License - see LICENSE file for details
## References
- [HNSW Paper](https://arxiv.org/abs/1603.09320) - Malkov & Yashunin (2018)
- [ONNX Runtime](https://onnxruntime.ai/)
- [Sentence Transformers](https://www.sbert.net/)
## Acknowledgments
Built with ❤️ using Rust and inspired by production vector databases like Pinecone, Weaviate, and Qdrant.