Expand description
DiskANN / Vamana SSD-Backed Approximate Nearest Neighbor Index
Implements the Vamana graph index from the DiskANN paper (Subramanya et al., 2019). Each node connects to R neighbors chosen via alpha-RNG pruning – a relaxed Relative Neighborhood Graph balancing proximity and angular diversity.
§Why DiskANN achieves 95%+ recall at sub-10ms
- Vamana graph: alpha > 1.0 retains long-range shortcuts for O(log n) hops.
- SSD layout: node vector + neighbors packed in aligned pages; one read per hop.
- Page cache: LRU cache keeps hot pages in memory (80-95% hit rates typical).
- Filtered traversal: predicates evaluated during search, not post-filter.
§Alpha-RNG Pruning
A candidate c is kept only if for every already-selected neighbor n,
dist(p, c) <= alpha * dist(n, c), ensuring angular diversity.
Structs§
- Disk
Index - Simulated SSD-backed index with page-aligned reads and LRU cache.
- Disk
Node - A node stored in SSD-backed layout: id + neighbors + vector in one page.
- IOStats
- IO statistics for disk-based search.
- Medoid
Finder - Finds the geometric medoid (point minimising sum of distances to all others).
- Page
Cache - LRU page cache tracking access recency via a clock counter.
- Vamana
Config - Configuration for the Vamana graph index.
- Vamana
Graph - In-memory Vamana graph for building and searching.