# DiskANN Implementation in Rust

A Rust implementation of DiskANN (Disk-based Approximate Nearest Neighbor search) using the Vamana graph algorithm. This project provides an efficient and scalable solution for large-scale vector similarity search with minimal memory footprint.
## When to Use diskann-rs

| Use diskann-rs when... | Use in-memory indexes (hnsw_rs) when... |
|---|---|
| Index is larger than available RAM | Index fits comfortably in RAM |
| You need incremental updates without rebuilding | Build time is critical (one-time cost) |
| Memory-constrained environments (containers, edge) | Maximum recall needed (98%+) |
| Multiple large indexes on the same machine | Single index, dedicated resources |
| Cost-sensitive deployments | Latency-critical applications |

**TL;DR:** diskann-rs trades ~60% slower build time for 6-10x lower memory usage and 15x faster incremental updates.
## Key Features

| Feature | Description |
|---|---|
| Incremental Updates | Add/delete vectors without rebuilding the entire index |
| Filtered Search | Query with metadata predicates (e.g., category filters) |
| Composable Features | Combine incremental + filtered + quantized on a single index |
| SIMD Acceleration | Optimized distance calculations (AVX2, SSE4.1, NEON) |
| Product Quantization | Compress vectors up to 64x with PQ encoding |
| Scalar Quantization | F16 (2x) and Int8 (4x) compression with SIMD-accelerated distance |
| Memory-Mapped I/O | Single-file storage with minimal RAM footprint |
| Byte Serialization | Load indexes from bytes (network, embedded, no filesystem) |
| Benchmark Formats | Read/write fvecs, ivecs, bvecs (standard ANN benchmark formats) |
| Parallel Processing | Concurrent index building and batch queries |
## Quick Start

### Basic Index Operations

```rust
use anndists::dist::DistL2;
use diskann_rs::{DiskANN, DiskAnnParams};

// Build an index from in-memory vectors, persisted to "index.db".
let vectors: Vec<Vec<f32>> = vec![vec![0.1, 0.2, 0.3], vec![0.4, 0.5, 0.6]];
let index = DiskANN::<DistL2>::build_index_default(&vectors, DistL2 {}, "index.db")?;

// Search: k = 10 neighbors with a beam width of 256.
let query = vec![0.1, 0.2, 0.4];
let neighbors: Vec<u32> = index.search(&query, 10, 256);
```
### Incremental Updates (No Rebuild Required)

```rust
use anndists::dist::DistL2;
use diskann_rs::IncrementalDiskANN;

// Build the base index.
let vectors = vec![vec![0.0; 128]; 1000];
let index = IncrementalDiskANN::<DistL2>::build_default(&vectors, "index.db")?;

// Add and delete without rebuilding.
let new_vectors = vec![vec![1.0; 128]; 100];
index.add_vectors(&new_vectors)?;
index.delete_vectors(&[0, 1, 2])?;

// Periodically merge the delta layer back into the base index.
if index.should_compact() {
    index.compact("index_v2.db")?;
}
```
### Filtered Search (Metadata Predicates)

```rust
use anndists::dist::DistL2;
use diskann_rs::{FilteredDiskANN, Filter};

// Each vector carries a set of u64 labels.
let vectors = vec![vec![0.0; 128]; 1000];
let labels: Vec<Vec<u64>> = (0..1000).map(|i| vec![i % 10]).collect();
let index = FilteredDiskANN::<DistL2>::build(&vectors, &labels, "filtered.db")?;

// Only return neighbors whose label field 0 equals 5.
let filter = Filter::label_eq(0, 5);
let results = index.search_filtered(&query, 10, 128, &filter);

// Predicates compose: field 0 == 5 AND field 1 in [10, 100].
let filter = Filter::and(vec![
    Filter::label_eq(0, 5),
    Filter::label_range(1, 10, 100),
]);
```
### Composable Incremental Index (Filtered + Quantized + Incremental)

```rust
use anndists::dist::DistL2;
use diskann_rs::{IncrementalDiskANN, IncrementalQuantizedConfig, QuantizerKind, Filter};

let vectors = vec![vec![0.0; 128]; 1000];
let labels: Vec<Vec<u64>> = (0..1000).map(|i| vec![i % 5]).collect();

// F16 quantization with exact re-ranking of the top 50 candidates.
let quant_config = IncrementalQuantizedConfig { rerank_size: 50 };
let index = IncrementalDiskANN::<DistL2>::build_full(
    &vectors, &labels, "composable.db",
    Default::default(), QuantizerKind::F16,
    quant_config,
)?;

// Filtered search over the quantized index.
let filter = Filter::label_eq(0, 3);
let results = index.search_filtered(&query, 10, 128, &filter);

// Incremental updates still work.
let new_vecs = vec![vec![1.0; 128]; 50];
let new_labels = vec![vec![2u64]; 50];
index.add_vectors_with_labels(&new_vecs, &new_labels)?;
index.delete_vectors(&[0, 1, 2])?;

// Serialize the whole composed index to bytes.
let bytes = index.to_bytes();
```
### Product Quantization (64x Compression)

```rust
use diskann_rs::pq::{ProductQuantizer, PQConfig};

// 8 subspaces × 256 centroids → 8 bytes per vector.
let config = PQConfig {
    num_subspaces: 8,
    num_centroids: 256,
    ..Default::default()
};
let pq = ProductQuantizer::train(&vectors, config)?;
let codes: Vec<Vec<u8>> = pq.encode_batch(&vectors);

// Asymmetric distance: precompute a lookup table per query.
let table = pq.create_distance_table(&query);
let dist = pq.distance_with_table(&table, &codes[0]);
```
### SIMD-Accelerated Distance

```rust
use diskann_rs::{SimdL2, DiskANN, simd_info};

// Report which SIMD features were detected at runtime.
println!("{}", simd_info());

let index = DiskANN::<SimdL2>::build_index_default(&vectors, SimdL2, "index.db")?;

// Standalone SIMD distance functions are also exposed.
use diskann_rs::simd::{l2_squared, dot_product, cosine_distance};
let dist = l2_squared(&vec_a, &vec_b);
```
### Scalar Quantization (F16 / Int8)

```rust
use diskann_rs::{F16Quantizer, Int8Quantizer, VectorQuantizer};

// F16: 2x compression, no training required.
let f16q = F16Quantizer::new(128);
let codes = f16q.encode(&vector);
let decoded = f16q.decode(&codes);
let dist = f16q.asymmetric_distance(&query, &codes);

// Int8: 4x compression, scaling learned from training data.
let int8q = Int8Quantizer::train(&training_vectors)?;
let codes = int8q.encode(&vector);
let dist = int8q.asymmetric_distance(&query, &codes);

// Quantizers share the VectorQuantizer trait, so they are interchangeable.
fn search_with_quantizer(q: &dyn VectorQuantizer, query: &[f32], codes: &[u8]) -> f32 {
    q.asymmetric_distance(query, codes)
}
```
### Byte Loading (No Filesystem Required)

```rust
use anndists::dist::DistL2;
use diskann_rs::DiskANN;
use std::sync::Arc;

// Serialize an index to bytes...
let index = DiskANN::<DistL2>::build_index_default(&vectors, DistL2 {}, "index.db")?;
let bytes: Vec<u8> = index.to_bytes();

// ...and load it back without touching the filesystem.
let index = DiskANN::<DistL2>::from_bytes(bytes, DistL2 {})?;

// Shared bytes allow multiple readers over a single buffer.
let shared: Arc<[u8]> = load_from_somewhere().into();
let index = DiskANN::<DistL2>::from_shared_bytes(shared, DistL2 {})?;

// Filtered and incremental indexes serialize the same way.
let filtered_bytes = filtered_index.to_bytes();
let incremental_bytes = incremental_index.to_bytes();
```
### Benchmark Format Support (fvecs/ivecs/bvecs)

```rust
use diskann_rs::formats::{read_fvecs, write_fvecs, read_ivecs, read_bvecs_as_f32};

// Read standard ANN benchmark files (e.g., SIFT1M).
let base_vectors = read_fvecs("sift_base.fvecs")?;
let ground_truth = read_ivecs("sift_groundtruth.ivecs")?;
let queries = read_fvecs("sift_query.fvecs")?;

// bvecs files hold u8 vectors; read them as f32 for indexing.
let mnist = read_bvecs_as_f32("mnist.bvecs")?;

write_fvecs("my_vectors.fvecs", &vectors)?;
```
## Performance

### Why diskann-rs? Memory-Mapped I/O

Unlike in-memory indexes, which must load the entire graph into RAM, diskann-rs uses memory-mapped files. The OS loads only the pages you access, making it ideal for large-scale deployments:

| Workload | diskann-rs | hnsw_rs | Savings |
|---|---|---|---|
| Light (10 queries) | 90 MB | 896 MB | 10x less RAM |
| Medium (100 queries) | 136 MB | 896 MB | 6.6x less RAM |
| Heavy (1K queries) | 147 MB | 896 MB | 6x less RAM |
| Stress (5K queries) | 139 MB | 896 MB | 6.4x less RAM |

Tested with 200K vectors, 128 dimensions. hnsw_rs must hold the full index in RAM; diskann-rs loads pages on demand.
### Benchmark Comparisons (vs hnsw_rs)

| Metric | diskann-rs | hnsw_rs | Winner |
|---|---|---|---|
| QPS at 93% recall | 586 | 170 | diskann-rs (3.4x) |
| Add vectors | 31,000 vec/s | 2,000 vec/s | diskann-rs (15x) |
| Delete vectors | Instant (tombstone) | Full rebuild | diskann-rs |
| Build time | 33s / 100K | 21s / 100K | hnsw_rs (1.6x) |
| Max recall | 97.5% | 99.2% | hnsw_rs |
### Dataset Benchmarks

Benchmarks on Apple M1 Max:

| Dataset | Vectors | Build Time | QPS | Recall@10 |
|---|---|---|---|---|
| SIFT-1M | 1,000,000 | 295s | 8,590 | 99.6% |
| Fashion-MNIST | 60,000 | 111s | 18,000 | 98.8% |
| Random-50K | 50,000 | 38s | 2,200 | 85.6% |
### Memory Efficiency

- ~330MB RAM for a 2GB index (16% of file size)
- Product Quantization: 64x compression (512 bytes → 8 bytes per vector)
- Int8 Quantization: 4x compression (512 bytes → 128 bytes per vector)
- F16 Quantization: 2x compression (512 bytes → 256 bytes per vector)
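The per-vector sizes above are straight arithmetic on a 128-dimensional f32 vector (4 bytes per dimension, so 512 bytes raw). A quick sanity check:

```rust
fn main() {
    let dim = 128;
    let raw = dim * 4;  // f32: 4 bytes per dimension
    let f16 = dim * 2;  // F16: 2 bytes per dimension → 2x
    let int8 = dim;     // Int8: 1 byte per dimension → 4x
    let pq8 = 8;        // PQ-8: 8 subspace codes of 1 byte each → 64x
    assert_eq!((raw, f16, int8, pq8), (512, 256, 128, 8));
    assert_eq!((raw / f16, raw / int8, raw / pq8), (2, 4, 64));
}
```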
### Quantization Trade-offs

| Method | Compression | Recall@10 | Use Case |
|---|---|---|---|
| None (f32) | 1x | 100% | Maximum accuracy |
| F16 | 2x | 100% | General purpose, nearly lossless |
| Int8 | 4x | 99% | Memory-constrained, high accuracy |
| PQ-32 | 16x | ~62% | Large-scale, with re-ranking |
| PQ-8 | 64x | ~12% | Massive scale, coarse filtering |
## Architecture

### Storage Abstraction

```
┌──────────────────────────────────────────────────────────────┐
│                           Storage                            │
├──────────────────────────────────────────────────────────────┤
│  Mmap(Mmap)        - Memory-mapped file (default, lazy I/O)  │
│  Owned(Vec<u8>)    - Owned bytes (network, embedded)         │
│  Shared(Arc<[u8]>) - Reference-counted (multi-reader)        │
└──────────────────────────────────────────────────────────────┘
```

All variants deref to `&[u8]`, so the search logic is unified.
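The idea is easy to see in miniature: every backing store ultimately exposes the same byte-slice view. A simplified sketch (the enum and its variants here are illustrative, not the crate's actual type; `memmap2::Mmap`, which also derefs to `[u8]`, would slot in as the third variant):

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Simplified storage abstraction: every variant derefs to `&[u8]`.
enum Storage {
    // Mmap(memmap2::Mmap) would go here; it derefs to [u8] as well.
    Owned(Vec<u8>),
    Shared(Arc<[u8]>),
}

impl Deref for Storage {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        match self {
            Storage::Owned(v) => v,
            Storage::Shared(a) => a,
        }
    }
}

fn main() {
    let owned = Storage::Owned(vec![1, 2, 3]);
    let shared = Storage::Shared(Arc::from(vec![4u8, 5, 6]));
    // Search code only ever sees &[u8], regardless of the backing store.
    assert_eq!(&owned[..], &[1u8, 2, 3][..]);
    assert_eq!(&shared[..], &[4u8, 5, 6][..]);
}
```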
### File Layout

```
[ metadata_len:u64 ][ metadata (bincode) ][ padding to 1 MiB ]
[ vectors (n × dim × f32) ][ adjacency (n × max_degree × u32) ]
```
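Because every record is fixed-size, the byte offset of any vector or adjacency list can be computed directly, which is what makes lazy mmap access possible. A sketch of the offset arithmetic (the 1 MiB data start comes from the layout above; the helper names are illustrative, not the crate's API):

```rust
/// Illustrative offset arithmetic for the single-file layout above.
const DATA_START: u64 = 1 << 20; // vectors begin after metadata + padding to 1 MiB

fn vector_offset(id: u64, dim: u64) -> u64 {
    // Each vector is `dim` f32 values (4 bytes each).
    DATA_START + id * dim * 4
}

fn adjacency_offset(id: u64, n: u64, dim: u64, max_degree: u64) -> u64 {
    // Adjacency lists start after all n vectors; each list holds
    // `max_degree` u32 neighbor ids (4 bytes each).
    DATA_START + n * dim * 4 + id * max_degree * 4
}

fn main() {
    let (n, dim, max_degree) = (1_000u64, 128u64, 64u64);
    assert_eq!(vector_offset(0, dim), 1_048_576);
    assert_eq!(vector_offset(1, dim), 1_049_088); // +512 bytes per vector
    assert_eq!(adjacency_offset(0, n, dim, max_degree), 1_560_576);
}
```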
### Quantization (Composable)

```
┌─────────────────────────────────────────────────────────────┐
│                    VectorQuantizer trait                    │
├─────────────────────────────────────────────────────────────┤
│  encode(&[f32]) -> Vec<u8>  - Compress vector               │
│  decode(&[u8]) -> Vec<f32>  - Decompress vector             │
│  asymmetric_distance(q, c)  - Query vs compressed           │
│  compression_ratio(dim)     - Bytes saved                   │
├─────────────────────────────────────────────────────────────┤
│  Implementations:                                           │
│    F16Quantizer     - 2x compression, ~lossless             │
│    Int8Quantizer    - 4x compression, trained scaling       │
│    ProductQuantizer - 64x compression, codebook-based       │
└─────────────────────────────────────────────────────────────┘
```
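To show how this trait shape composes, here is a toy quantizer behind a VectorQuantizer-like trait. It truncates each f32 to its top 16 bits (bfloat16-style) rather than using real IEEE f16, so it is a structural sketch only, not the crate's F16Quantizer:

```rust
/// Illustrative trait mirroring the shape of the crate's VectorQuantizer.
trait Quantizer {
    fn encode(&self, v: &[f32]) -> Vec<u8>;
    fn decode(&self, codes: &[u8]) -> Vec<f32>;
    fn asymmetric_distance(&self, query: &[f32], codes: &[u8]) -> f32;
}

/// Toy 2x quantizer: keep only the top 16 bits of each f32 (bfloat16-style).
struct TruncQuantizer;

impl Quantizer for TruncQuantizer {
    fn encode(&self, v: &[f32]) -> Vec<u8> {
        v.iter()
            .flat_map(|x| ((x.to_bits() >> 16) as u16).to_le_bytes())
            .collect()
    }
    fn decode(&self, codes: &[u8]) -> Vec<f32> {
        codes
            .chunks_exact(2)
            .map(|c| f32::from_bits((u16::from_le_bytes([c[0], c[1]]) as u32) << 16))
            .collect()
    }
    fn asymmetric_distance(&self, query: &[f32], codes: &[u8]) -> f32 {
        // Decode the compressed vector, compare against the full-precision query.
        self.decode(codes)
            .iter()
            .zip(query)
            .map(|(a, b)| (a - b) * (a - b))
            .sum()
    }
}

fn main() {
    let q = TruncQuantizer;
    let v = vec![0.25f32, -1.5, 3.0];
    let codes = q.encode(&v);
    assert_eq!(codes.len(), 6); // 2 bytes per dimension → 2x compression
    // These values fit in an 8-bit mantissa, so they round-trip exactly.
    assert_eq!(q.decode(&codes), v);
    assert_eq!(q.asymmetric_distance(&v, &codes), 0.0);
}
```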
### Incremental Updates (Composable Delta Layer)

```
┌──────────────────────────────────────────────────────────────┐
│                      IncrementalDiskANN                      │
├──────────────────────────────────────────────────────────────┤
│ ┌───────────────┐  ┌───────────────┐  ┌──────────────────┐   │
│ │  Base Index   │  │  Delta Layer  │  │    Tombstones    │   │
│ │  (mmap file)  │  │  (in-memory)  │  │    (HashSet)     │   │
│ │  + labels?    │  │  + mini-graph │  │                  │   │
│ │  + codes?     │  │  + labels?    │  │                  │   │
│ └───────┬───────┘  └───────┬───────┘  └──────────────────┘   │
│         └─────────┬────────┘                                 │
│       UnifiedView (GraphIndex trait)                         │
│       multi-seed beam search over both                       │
├──────────────────────────────────────────────────────────────┤
│ Optional: Labels    → search_filtered()                      │
│           Quantizer → quantized distance (base codes)        │
│           Both      → filtered + quantized search            │
└──────────────────────────────────────────────────────────────┘
```
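The add/delete/search flow over base + delta + tombstones can be modeled in a few lines. This sketch uses brute-force distance in place of beam search and invented names (`ToyIncremental`, `nearest`), so it shows the structure only, not the crate's implementation:

```rust
use std::collections::HashSet;

/// Toy model of the incremental design: an immutable base, an in-memory
/// delta of newly added vectors, and a tombstone set for deletions.
struct ToyIncremental {
    base: Vec<Vec<f32>>,  // stands in for the mmap'd base index
    delta: Vec<Vec<f32>>, // newly added vectors; ids continue after the base
    tombstones: HashSet<usize>,
}

impl ToyIncremental {
    fn add(&mut self, v: Vec<f32>) -> usize {
        self.delta.push(v);
        self.base.len() + self.delta.len() - 1
    }
    fn delete(&mut self, id: usize) {
        // O(1) tombstone; no graph surgery, no rebuild.
        self.tombstones.insert(id);
    }
    /// Unified view: search base and delta together, skipping tombstones.
    /// (The real index beam-searches a graph; brute force keeps this short.)
    fn nearest(&self, q: &[f32]) -> Option<usize> {
        self.base
            .iter()
            .chain(self.delta.iter())
            .enumerate()
            .filter(|(id, _)| !self.tombstones.contains(id))
            .min_by(|(_, a), (_, b)| dist(q, a).total_cmp(&dist(q, b)))
            .map(|(id, _)| id)
    }
}

fn dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

fn main() {
    let mut idx = ToyIncremental {
        base: vec![vec![0.0, 0.0], vec![1.0, 1.0]],
        delta: vec![],
        tombstones: HashSet::new(),
    };
    assert_eq!(idx.nearest(&[0.1, 0.1]), Some(0));
    idx.delete(0);                    // tombstone id 0
    let id = idx.add(vec![0.2, 0.2]); // lands in the delta layer
    assert_eq!(id, 2);
    assert_eq!(idx.nearest(&[0.1, 0.1]), Some(2));
}
```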
## Parameters

### Build Parameters

| Parameter | Default | Description |
|---|---|---|
| `max_degree` | 64 | Maximum neighbors per node (32-64) |
| `build_beam_width` | 128 | Construction beam width (128-256) |
| `alpha` | 1.2 | Pruning diversity factor (1.2-2.0) |
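The `alpha` factor governs Vamana's diversity pruning: a candidate edge is dropped when an already-kept neighbor is more than `alpha` times closer to the candidate than the node itself is, so `alpha > 1` retains some longer-range edges. A condensed, self-contained sketch of that rule (brute force, illustrative names):

```rust
/// Condensed sketch of Vamana-style robust pruning. A candidate is dropped
/// when some already-kept neighbor k satisfies
/// alpha * d(k, candidate) <= d(node, candidate), i.e. k "covers" it.
fn robust_prune(
    node: &[f32],
    mut cands: Vec<Vec<f32>>,
    alpha: f32,
    max_degree: usize,
) -> Vec<Vec<f32>> {
    // Consider closest candidates first.
    cands.sort_by(|a, b| dist(node, a).total_cmp(&dist(node, b)));
    let mut kept: Vec<Vec<f32>> = Vec::new();
    for c in cands {
        if kept.len() == max_degree {
            break;
        }
        let covered = kept.iter().any(|k| alpha * dist(k, &c) <= dist(node, &c));
        if !covered {
            kept.push(c);
        }
    }
    kept
}

fn dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum::<f32>().sqrt()
}

fn main() {
    let node = vec![0.0, 0.0];
    // Two near-duplicate candidates and one far one.
    let cands = vec![vec![1.0, 0.0], vec![1.1, 0.0], vec![0.0, 5.0]];
    // With alpha = 1.2 the second candidate is covered by the first and
    // pruned, while the far (diverse) candidate survives.
    let kept = robust_prune(&node, cands, 1.2, 64);
    assert_eq!(kept, vec![vec![1.0, 0.0], vec![0.0, 5.0]]);
}
```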
### Search Parameters

| Parameter | Typical | Trade-off |
|---|---|---|
| `beam_width` | 128-512 | Higher = better recall, slower |
| `k` | 10-100 | Number of neighbors to return |
## Building and Testing

```sh
cargo build --release
cargo test --lib
cargo bench --bench benchmark

# Opt in to the large-scale benchmarks
DISKANN_BENCH_LARGE=1 cargo bench --bench benchmark
```
## Comparison with Other Libraries

### vs rust-diskann

| Feature | diskann-rs | rust-diskann |
|---|---|---|
| Incremental updates | Yes | No |
| Filtered search | Yes | No |
| Composable features | Yes (incremental + filtered + quantized) | No |
| Product Quantization | Yes (64x) | No |
| Scalar Quantization | Yes (F16 2x, Int8 4x) | No |
| Byte loading (no files) | Yes | No |
| Benchmark formats | Yes (fvecs/ivecs/bvecs) | No |
| SIMD acceleration | Yes | Uses anndists |
| Memory-mapped I/O | Yes | Yes |
| Generic vector types | f32 | f32, u64, etc. |
### vs hnsw_rs

| Feature | diskann-rs | hnsw_rs |
|---|---|---|
| Memory usage | 6-10x lower | Full index in RAM |
| Incremental add | 15x faster | Slower |
| Incremental delete | Instant (tombstone) | Full rebuild |
| Build time | 1.6x slower | Faster |
| Max recall | ~97% | ~99% |
| Disk-based | Yes (mmap) | No (in-memory only) |
### vs Faiss (C++/Python)

| Feature | diskann-rs | Faiss |
|---|---|---|
| Language | Pure Rust | C++/Python |
| Deployment | Single binary | Complex dependencies |
| Memory-mapped | Native | Limited |
| GPU support | No | Yes |
| Index types | Vamana graph | IVF, HNSW, PQ, etc. |
| Maturity | Growing | Production-proven |

**When to choose diskann-rs:** pure Rust deployment, memory-constrained environments, need for incremental updates.

**When to choose Faiss:** GPU acceleration, specialized index types, Python ecosystem.
## License

MIT License - see LICENSE for details.