# HNSW Vector Search 🚀

A high-performance semantic search engine using **Hierarchical Navigable Small World (HNSW)** graphs and **ONNX transformer embeddings**.

[![Rust](https://img.shields.io/badge/rust-1.75%2B-orange.svg)](https://www.rust-lang.org/)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

## Features

- ⚡ **Fast search**: O(log N) approximate nearest neighbor search
- 🎯 **High recall**: 95%+ recall@10 with optimized parameters
- 🚀 **SIMD optimized**: 3-4x speedup using ARM NEON on Apple Silicon
- 🤖 **Real embeddings**: Integration with ONNX Runtime for transformer models
- 💾 **Persistent storage**: Memory-mapped indices for instant loading
- 🛠️ **CLI tool**: Easy-to-use command-line interface
- 📊 **Production-ready**: Comprehensive benchmarks and documentation

## Quick Start

### Installation

```bash
# Clone the repository
git clone https://github.com/reemabdullah/hnsw-vector-search
cd hnsw-vector-search

# Build the project
cargo build --release

# The CLI tool will be at target/release/hnsw
```

### Download Models

```bash
mkdir -p models
cd models

# Download ONNX model (all-MiniLM-L6-v2)
curl -L -o all-MiniLM-L6-v2.onnx \
  "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx"

# Download tokenizer
curl -L -o tokenizer.json \
  "https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/tokenizer.json"

cd ..
```

### Build an Index

Create a file `documents.txt` with one document per line:

```text
Rust is a systems programming language focused on safety
Python is popular for data science and machine learning
JavaScript runs in web browsers for interactive websites
```

Build the index:

```bash
./target/release/hnsw build \
  --input documents.txt \
  --output my_index.bin \
  --m 16 \
  --ef-construction 200
```

### Search

Interactive search:

```bash
./target/release/hnsw search \
  --index my_index.bin \
  --docs my_index_docs.txt
```

Single query:

```bash
./target/release/hnsw search \
  --index my_index.bin \
  --docs my_index_docs.txt \
  --query "programming languages" \
  --k 5
```

## Library Usage

Add to your `Cargo.toml`:

```toml
[dependencies]
hnsw_vector_search = "0.1.0"
```

Example code:

```rust
use hnsw_vector_search::{HnswGraph, OnnxEmbedder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load embedding model
    let mut embedder = OnnxEmbedder::new(
        "models/all-MiniLM-L6-v2.onnx",
        "models/tokenizer.json"
    )?;

    // Create index
    let mut index = HnswGraph::new(16, 200);

    // Add documents
    let documents = vec![
        "Rust is fast and memory-safe",
        "Python is great for ML",
        "JavaScript powers the web",
    ];

    for doc in &documents {
        let embedding = embedder.embed(doc)?;
        index.insert(embedding);
    }

    // Search
    let query = embedder.embed("programming languages")?;
    let results = index.search(&query, 3, 50);

    for (i, (doc_id, distance)) in results.iter().enumerate() {
        println!("{}. {} (distance: {:.3})",
                 i + 1, documents[*doc_id], distance);
    }

    Ok(())
}
```

## Performance

Benchmarks on Apple M4 Pro (1000 vectors, 384 dimensions):

| Operation            | Time  | Throughput      |
| -------------------- | ----- | --------------- |
| Search (k=10, ef=50) | 12 µs | 80K queries/sec |
| Insertion            | 90 µs | 11K inserts/sec |
| Embedding            | 18 ms | 55 embeds/sec   |
| Index load (mmap)    | 2 ms  | -               |

### SIMD Speedup

ARM NEON vs scalar on Apple Silicon:

| Operation             | Fallback | NEON  | Speedup |
| --------------------- | -------- | ----- | ------- |
| Dot product (384-dim) | 45 ns    | 12 ns | 3.75x   |
| Normalize             | 120 ns   | 35 ns | 3.43x   |
| Euclidean distance    | 68 ns    | 18 ns | 3.78x   |
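The speedups above come from a standard pattern: keep several independent partial sums so the CPU can overlap multiply-adds, exactly as a 128-bit NEON register holds four f32 lanes. Below is a portable Rust sketch of that pattern (an illustration of the idea, not the crate's actual `src/simd` code; the function name is invented for this example):

```rust
// Portable sketch of the 4-accumulator dot product a NEON kernel uses:
// four independent partial sums hide multiply-add latency and map
// directly onto the four f32 lanes of a 128-bit vector register.
fn dot_unrolled(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; 4];
    let chunks = a.len() / 4;
    for i in 0..chunks {
        for lane in 0..4 {
            let j = i * 4 + lane;
            acc[lane] += a[j] * b[j]; // one lane of a vfmaq_f32
        }
    }
    // Horizontal reduction across lanes, then the scalar tail
    // for lengths that are not a multiple of 4.
    let mut sum = acc.iter().sum::<f32>();
    for j in chunks * 4..a.len() {
        sum += a[j] * b[j];
    }
    sum
}

fn main() {
    let a: Vec<f32> = (0..384).map(|i| i as f32 / 384.0).collect();
    let b = vec![1.0f32; 384];
    println!("dot = {}", dot_unrolled(&a, &b));
}
```

On aarch64 the inner loop corresponds to `vld1q_f32` loads feeding `vfmaq_f32`, with a final `vaddvq_f32` horizontal add; the scalar fallback above is what the autovectorizer or the NEON path replaces.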

## Algorithm

HNSW creates a multi-layer graph:

```
Layer 2:  A ← sparse, for long-range navigation
          |
Layer 1:  A-B-C ← medium density
          |/|/|
Layer 0:  A-B-C-D-E-F ← dense, all vectors
```

Search proceeds top-down:

1. Start at entry point in highest layer
2. Greedily navigate to nearest neighbor
3. Drop to next layer
4. Repeat until layer 0
5. Return k nearest neighbors

**Time Complexity**: O(log N) expected
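The top-down descent can be sketched in a few lines of plain Rust (a toy illustration of the greedy walk, not this crate's API; the layered adjacency lists and `euclidean` helper are invented for the example, and k is fixed at 1 for brevity):

```rust
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

/// Greedily walk each layer toward the query, dropping a layer at
/// each local minimum, and return the node reached at layer 0.
fn greedy_search(
    layers: &[Vec<Vec<usize>>], // layers[l][node] = neighbor ids at layer l
    vectors: &[Vec<f32>],       // vectors[node] = stored embedding
    query: &[f32],
    entry: usize,
) -> usize {
    let mut current = entry;
    // Highest (sparsest) layer first, finishing at dense layer 0.
    for layer in layers.iter().rev() {
        loop {
            let best = layer[current].iter().copied().min_by(|&a, &b| {
                euclidean(&vectors[a], query).total_cmp(&euclidean(&vectors[b], query))
            });
            match best {
                Some(n)
                    if euclidean(&vectors[n], query)
                        < euclidean(&vectors[current], query) =>
                {
                    current = n // a neighbor is closer: keep walking
                }
                _ => break, // local minimum: drop to the next layer
            }
        }
    }
    current
}

fn main() {
    // Five points on a line; layer 1 links only the endpoints 0 and 4,
    // so a query near 3 takes the long-range hop 0 → 4, then refines.
    let vectors = vec![
        vec![0.0, 0.0], vec![1.0, 0.0], vec![2.0, 0.0],
        vec![3.0, 0.0], vec![4.0, 0.0],
    ];
    let layers = vec![
        vec![vec![1], vec![0, 2], vec![1, 3], vec![2, 4], vec![3]], // layer 0
        vec![vec![4], vec![], vec![], vec![], vec![0]],             // layer 1
    ];
    println!("nearest: {}", greedy_search(&layers, &vectors, &[3.2, 0.0], 0));
}
```

A real implementation keeps a candidate beam of size `ef` at layer 0 instead of a single walker, which is what trades speed for recall.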

## Parameters

### Build Parameters

- **M** (default: 16): Max connections per node
  - 8-12: Fast build, less memory, ~90% recall
  - 16-24: Balanced (recommended)
  - 32-48: High recall (~98%), more memory

- **ef_construction** (default: 200): Build quality
  - 100: Fast build, decent quality
  - 200: Balanced (recommended)
  - 400: Slow build, excellent quality
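M also dominates index memory. A rough back-of-envelope sketch (assumptions, not constants from this crate: f32 vector components, u32 neighbor ids, and the common convention that layer 0 allows 2×M links; upper layers add only a few percent and are ignored):

```rust
// Rough memory estimate for an HNSW index: stored vectors plus
// layer-0 adjacency lists (the usual convention budgets 2*M links
// per node at layer 0; sparse upper layers are ignored here).
fn estimated_bytes(num_vectors: usize, dims: usize, m: usize) -> usize {
    let vector_bytes = dims * 4; // f32 components
    let link_bytes = 2 * m * 4;  // u32 neighbor ids at layer 0
    num_vectors * (vector_bytes + link_bytes)
}

fn main() {
    // 1M vectors at 384 dims with M = 16 lands around 1.66 GB.
    let bytes = estimated_bytes(1_000_000, 384, 16);
    println!("{:.2} GB", bytes as f64 / 1e9);
}
```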

### Search Parameters

- **k**: Number of results
- **ef** (default: 50): Search quality
  - ef = k: Fast, may miss results
  - ef = 2×k: Balanced (recommended)
  - ef = 4×k: High recall, slower

## CLI Commands

### Build

```bash
hnsw build [OPTIONS]

Options:
  -i, --input <FILE>              Input documents (one per line)
  -o, --output <FILE>             Output index file
  -m, --model <FILE>              ONNX model path
  -t, --tokenizer <FILE>          Tokenizer path
      --m <NUM>                   Max connections (default: 16)
      --ef-construction <NUM>     Build quality (default: 200)
```

### Search

```bash
hnsw search [OPTIONS]

Options:
  -i, --index <FILE>              Index file
  -m, --model <FILE>              ONNX model path
  -t, --tokenizer <FILE>          Tokenizer path
  -d, --docs <FILE>               Documents file
  -k <NUM>                        Number of results (default: 10)
      --ef <NUM>                  Search quality (default: 50)
  -q, --query <TEXT>              Query (omit for interactive mode)
```

### Info

```bash
hnsw info --index <FILE>
```

### Benchmark

```bash
hnsw bench [OPTIONS]

Options:
  -i, --index <FILE>              Index file
  -n, --num-queries <NUM>         Number of queries (default: 1000)
```

## Project Structure

```
hnsw-vector-search/
├── src/
│   ├── simd/           # SIMD vector operations
│   │   ├── neon.rs     # ARM NEON intrinsics
│   │   └── fallback.rs # Portable fallback
│   ├── hnsw/           # HNSW algorithm
│   │   ├── graph.rs    # Graph structure
│   │   ├── search.rs   # Search algorithm
│   │   └── insert.rs   # Insertion algorithm
│   ├── embeddings/     # ONNX integration
│   ├── persistence/    # Serialization
│   └── cli/            # Command-line tool
├── benches/            # Criterion benchmarks
└── examples/           # Usage examples
```

## Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Run `cargo test` and `cargo clippy`
5. Submit a pull request

## License

MIT License - see LICENSE file for details

## References

- [HNSW Paper](https://arxiv.org/abs/1603.09320) - Malkov & Yashunin (2018)
- [ONNX Runtime](https://onnxruntime.ai/)
- [Sentence Transformers](https://www.sbert.net/)

## Acknowledgments

Built with โค๏ธ using Rust and inspired by production vector databases like Pinecone, Weaviate, and Qdrant.