# HNSW Vector Search

A high-performance semantic search engine using Hierarchical Navigable Small World (HNSW) graphs and ONNX transformer embeddings.
## Features

- **Fast search**: O(log N) approximate nearest-neighbor search
- **High recall**: 95%+ recall@10 with optimized parameters
- **SIMD optimized**: 3-4x speedup using ARM NEON on Apple Silicon
- **Real embeddings**: integration with ONNX Runtime for transformer models
- **Persistent storage**: memory-mapped indices for instant loading
- **CLI tool**: easy-to-use command-line interface
- **Production-ready**: comprehensive benchmarks and documentation
## Quick Start

### Installation

```sh
# Clone the repository
git clone <repository-url>
cd hnsw-vector-search

# Build the project
cargo build --release

# The CLI tool will be at target/release/hnsw
```
### Download Models

```sh
# Download ONNX model (all-MiniLM-L6-v2) -- path assumes the repo's ONNX export
curl -L -o model.onnx \
  https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx

# Download tokenizer
curl -L -o tokenizer.json \
  https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json
```
### Build an Index

Create a file `documents.txt` with one document per line:

```text
Rust is a systems programming language focused on safety
Python is popular for data science and machine learning
JavaScript runs in web browsers for interactive websites
```
Build the index:
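A plausible invocation, assuming subcommand and flag names (check `hnsw --help` for the actual interface):

```sh
./target/release/hnsw build --input documents.txt --output index.hnsw
```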
### Search
Interactive search:
Single query:
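Both commands might look like this (subcommand and flag names are assumptions, not the actual interface):

```sh
# Interactive search
./target/release/hnsw search --index index.hnsw

# Single query
./target/release/hnsw search --index index.hnsw --query "memory-safe systems language"
```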
## Library Usage

Add to your `Cargo.toml` (crate name assumed from the repository name):

```toml
[dependencies]
hnsw-vector-search = "0.1.0"
```
Example code:
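A sketch of typical usage; the `HnswIndex` type, constructor, and method signatures here are assumptions rather than the crate's documented API:

```rust
use hnsw_vector_search::HnswIndex; // hypothetical import path

fn main() {
    // Hypothetical constructor: dimension, M, ef_construction
    let mut index = HnswIndex::new(384, 16, 200);

    // Insert vectors under integer ids
    index.insert(0, &[0.1_f32; 384]);
    index.insert(1, &[0.2_f32; 384]);

    // Query: k = 10 nearest neighbors with a search beam of ef = 50
    let hits = index.search(&[0.1_f32; 384], 10, 50);
    println!("{hits:?}");
}
```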
## Performance

Benchmarks on Apple M4 Pro (1000 vectors, 384 dimensions):

| Operation | Time | Throughput |
|---|---|---|
| Search (k=10, ef=50) | 12 µs | 80K queries/sec |
| Insertion | 90 µs | 11K inserts/sec |
| Embedding | 18 ms | 55 embeds/sec |
| Index load (mmap) | 2 ms | - |
### SIMD Speedup
ARM NEON vs scalar on Apple Silicon:
| Operation | Fallback | NEON | Speedup |
|---|---|---|---|
| Dot product (384-dim) | 45 ns | 12 ns | 3.75x |
| Normalize | 120 ns | 35 ns | 3.43x |
| Euclidean distance | 68 ns | 18 ns | 3.78x |
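The NEON kernels follow a standard pattern: process four `f32` lanes per instruction behind a `target_arch` gate, with a portable scalar fallback elsewhere. A self-contained sketch of a dot product in that style (illustrative, not the crate's actual code):

```rust
// Dot product with an ARM NEON fast path and a portable scalar fallback.
// Function name and structure are illustrative.

#[cfg(target_arch = "aarch64")]
fn dot(a: &[f32], b: &[f32]) -> f32 {
    use std::arch::aarch64::*;
    assert_eq!(a.len(), b.len());
    unsafe {
        let mut acc = vdupq_n_f32(0.0);
        let chunks = a.len() / 4;
        for i in 0..chunks {
            let va = vld1q_f32(a.as_ptr().add(i * 4));
            let vb = vld1q_f32(b.as_ptr().add(i * 4));
            acc = vfmaq_f32(acc, va, vb); // fused multiply-add across 4 lanes
        }
        let mut sum = vaddvq_f32(acc); // horizontal add of the 4 lanes
        for i in chunks * 4..a.len() {
            sum += a[i] * b[i]; // scalar tail for lengths not divisible by 4
        }
        sum
    }
}

#[cfg(not(target_arch = "aarch64"))]
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    let a = vec![1.0_f32; 384];
    let b = vec![0.5_f32; 384];
    println!("{}", dot(&a, &b)); // 192
}
```

On non-ARM targets the `cfg` attribute selects the scalar version, so the same call site works everywhere.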
## Algorithm

HNSW creates a multi-layer graph:

```text
Layer 2: A            ← sparse, for long-range navigation
         |
Layer 1: A-B-C        ← medium density
         |/|/|
Layer 0: A-B-C-D-E-F  ← dense, all vectors
```
Search proceeds top-down:
- Start at entry point in highest layer
- Greedily navigate to nearest neighbor
- Drop to next layer
- Repeat until layer 0
- Return k nearest neighbors
Time Complexity: O(log N) expected
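The descent above can be sketched on a toy graph. This illustrates only the greedy layer-by-layer hop (the graph layout and names are made up for the example; real HNSW additionally keeps an `ef`-sized candidate list at layer 0 to return k results):

```rust
use std::collections::HashMap;

// Toy multi-layer graph: layers[l][node] = neighbors of `node` on layer l.
type Layer = HashMap<usize, Vec<usize>>;

// Squared Euclidean distance (monotone in the true distance, so fine for argmin).
fn dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

// Greedy top-down descent: on each layer, hop to the closest neighbor
// until no neighbor improves, then drop one layer down.
fn greedy_search(layers: &[Layer], vectors: &[Vec<f32>], entry: usize, query: &[f32]) -> usize {
    let mut current = entry;
    for layer in layers.iter().rev() { // highest (sparsest) layer first
        loop {
            let improved = layer
                .get(&current)
                .into_iter()
                .flatten()
                .copied()
                .min_by(|&a, &b| {
                    dist(&vectors[a], query)
                        .partial_cmp(&dist(&vectors[b], query))
                        .unwrap()
                })
                .filter(|&n| dist(&vectors[n], query) < dist(&vectors[current], query));
            match improved {
                Some(n) => current = n,
                None => break, // local minimum on this layer: drop down
            }
        }
    }
    current
}

fn main() {
    // 1-D points for readability; layer 0 is a dense chain, layer 1 is sparse.
    let vectors: Vec<Vec<f32>> = (0..6).map(|i| vec![i as f32]).collect();
    let layer0: Layer = (0..6)
        .map(|i| (i, (0..6).filter(|&j| (j as i64 - i as i64).abs() == 1).collect()))
        .collect();
    let layer1: Layer = HashMap::from([(0, vec![3]), (3, vec![0, 5]), (5, vec![3])]);
    let layers = vec![layer0, layer1];

    let nearest = greedy_search(&layers, &vectors, 0, &[4.2]);
    println!("{nearest}"); // 4
}
```

With the query 4.2, the search jumps 0 → 3 → 5 on the sparse layer, then refines to node 4 on layer 0.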
## Parameters

### Build Parameters

- **M** (default: 16): Max connections per node
  - 8-12: Fast build, less memory, ~90% recall
  - 16-24: Balanced (recommended)
  - 32-48: High recall (~98%), more memory
- **ef_construction** (default: 200): Build quality
  - 100: Fast build, decent quality
  - 200: Balanced (recommended)
  - 400: Slow build, excellent quality
### Search Parameters

- **k**: Number of results
- **ef** (default: 50): Search quality
  - ef = k: Fast, may miss results
  - ef = 2×k: Balanced (recommended)
  - ef = 4×k: High recall, slower
## CLI Commands

### Build

### Search

### Info

### Benchmark
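Plausible shapes for the four subcommands (all subcommand and flag names below are assumptions; consult `hnsw --help` for the real interface):

```sh
# Build: index a line-per-document file
hnsw build --input documents.txt --output index.hnsw --m 16 --ef-construction 200

# Search: query an existing index
hnsw search --index index.hnsw --query "rust memory safety" --k 10 --ef 50

# Info: print index statistics
hnsw info --index index.hnsw

# Benchmark: run the built-in benchmarks
hnsw bench --index index.hnsw
```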
## Project Structure

```text
hnsw-vector-search/
├── src/
│   ├── simd/            # SIMD vector operations
│   │   ├── neon.rs      # ARM NEON intrinsics
│   │   └── fallback.rs  # Portable fallback
│   ├── hnsw/            # HNSW algorithm
│   │   ├── graph.rs     # Graph structure
│   │   ├── search.rs    # Search algorithm
│   │   └── insert.rs    # Insertion algorithm
│   ├── embeddings/      # ONNX integration
│   ├── persistence/     # Serialization
│   └── cli/             # Command-line tool
├── benches/             # Criterion benchmarks
└── examples/            # Usage examples
```
## Contributing

Contributions welcome! Please:

- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Run `cargo test` and `cargo clippy`
- Submit a pull request
## License

MIT License - see LICENSE file for details
## References

- Malkov & Yashunin, "Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs" (2018)
- ONNX Runtime
- Sentence Transformers
## Acknowledgments

Built with ❤️ using Rust and inspired by production vector databases like Pinecone, Weaviate, and Qdrant.