pub struct HNSWIndex { /* private fields */ }Expand description
HNSW index for approximate nearest neighbor search.
Implements the Hierarchical Navigable Small World algorithm (Malkov & Yashunin, 2016) with optimizations for SIMD acceleration and cache efficiency.
Important: This index uses cosine distance (1 - dot(a, b)) which requires
L2-normalized input vectors. Un-normalized vectors produce silently wrong results.
Use crate::distance::normalize before adding vectors.
§Scalability Note (Wilson Lin’s 3B Embedding Insight)
Standard in-memory HNSW (like this implementation) becomes cost-prohibitive at billion-scale (requires TBs of RAM).
Future Optimization (CoreNN approach):
- Move vector storage and graph structure to disk (SSD).
- Support live updates without full rebuilds.
- Use sharding (e.g., 64 shards by xxHash) to distribute load.
Implementations§
Source§impl HNSWIndex
impl HNSWIndex
Sourcepub fn builder(dimension: usize) -> HNSWBuilder
pub fn builder(dimension: usize) -> HNSWBuilder
Create a builder for configuring an HNSW index.
Sourcepub fn with_params(
dimension: usize,
params: HNSWParams,
) -> Result<Self, RetrieveError>
pub fn with_params( dimension: usize, params: HNSWParams, ) -> Result<Self, RetrieveError>
Create with custom parameters.
Sourcepub fn with_filtering(
dimension: usize,
m: usize,
m_max: usize,
filter_field: impl Into<String>,
) -> Result<Self, RetrieveError>
pub fn with_filtering( dimension: usize, m: usize, m_max: usize, filter_field: impl Into<String>, ) -> Result<Self, RetrieveError>
Create a new HNSW index with filtering support.
§Arguments
dimension- Vector dimensionm- Maximum connections per nodem_max- Maximum connections for new nodesfilter_field- Field name for filtering (e.g., “category”)
Sourcepub fn entry_point(&self) -> Option<(u32, usize)>
pub fn entry_point(&self) -> Option<(u32, usize)>
Return the cached entry point (node with highest layer assignment).
Available after build(). Returns None if the index is empty or not built.
Sourcepub fn raw_vectors(&self) -> &[f32]
pub fn raw_vectors(&self) -> &[f32]
Borrow the raw flat vector storage.
Layout: [v0[0..d], v1[0..d], ..., vn[0..d]] where d = dimension.
Useful for building external structures (PRT projections, quantization)
that need access to the stored vectors.
Sourcepub fn max_node_degree(&self) -> (usize, u32, usize)
pub fn max_node_degree(&self) -> (usize, u32, usize)
Return the maximum neighbor count across all nodes and layers.
Useful for verifying the graph respects the degree bound after construction.
Returns (layer_index, node_id, degree) for the node with the most neighbors.
Sourcepub fn delete(&mut self, doc_id: u32) -> Result<(), RetrieveError>
pub fn delete(&mut self, doc_id: u32) -> Result<(), RetrieveError>
Mark a vector as deleted. Deleted vectors are excluded from search results.
The vector’s storage is not reclaimed until the index is rebuilt. Graph edges from/to deleted nodes remain intact for navigation; deleted nodes are filtered from final results only.
Sourcepub fn delete_with_repair(
&mut self,
doc_id: u32,
) -> Result<usize, RetrieveError>
pub fn delete_with_repair( &mut self, doc_id: u32, ) -> Result<usize, RetrieveError>
Delete a vector with Wolverine++ graph repair.
Unlike HNSWIndex::delete (which only tombstones), this method repairs the graph:
- Finds all in-neighbors (nodes pointing to the deleted node) on each layer
- Removes edges to the deleted node
- For each in-neighbor, finds a replacement via 2-hop crescent-locus filtering
- Updates the entry point if the deleted node was the entry
The crescent-locus filter (Wolverine, VLDB 2025) selects repair candidates that are closer to the in-neighbor than the deleted node was, and far from the deleted node itself. This produces edges that survive diversity pruning and maintain monotonic reachability.
Returns the number of repair edges added. Requires uncompressed neighbor storage.
Sourcepub fn delete_batch_with_repair(
&mut self,
doc_ids: &[u32],
) -> Result<usize, RetrieveError>
pub fn delete_batch_with_repair( &mut self, doc_ids: &[u32], ) -> Result<usize, RetrieveError>
Batch delete with Wolverine++ repair.
More efficient than calling delete_with_repair in a loop because
in-neighbor scanning is done once per layer for all deleted nodes.
Sourcepub fn is_deleted(&self, doc_id: u32) -> bool
pub fn is_deleted(&self, doc_id: u32) -> bool
Check if a doc_id has been deleted.
Sourcepub fn num_active(&self) -> usize
pub fn num_active(&self) -> usize
Number of active (non-deleted) vectors.
Sourcepub fn memory_usage(&self) -> MemoryReport
pub fn memory_usage(&self) -> MemoryReport
Memory usage breakdown for this index.
Sourcepub fn add_metadata(
&mut self,
doc_id: u32,
metadata: DocumentMetadata,
) -> Result<(), RetrieveError>
pub fn add_metadata( &mut self, doc_id: u32, metadata: DocumentMetadata, ) -> Result<(), RetrieveError>
Add metadata for a document (required for filtering).
Sourcepub fn add(
&mut self,
doc_id: u32,
vector: Vec<f32>,
) -> Result<(), RetrieveError>
pub fn add( &mut self, doc_id: u32, vector: Vec<f32>, ) -> Result<(), RetrieveError>
Add a vector to the index.
Vectors should be L2-normalized for cosine similarity. Index must be built before searching.
Sourcepub fn add_slice(
&mut self,
doc_id: u32,
vector: &[f32],
) -> Result<(), RetrieveError>
pub fn add_slice( &mut self, doc_id: u32, vector: &[f32], ) -> Result<(), RetrieveError>
Add a vector to the index from a borrowed slice.
§Errors
Returns RetrieveError::InvalidParameter if:
- The vector is not L2-normalized (
norm^2outside[0.9, 1.1]), unlessauto_normalizeis enabled on the builder. - The
doc_idis a duplicate. - The index has already been built.
Sourcepub fn add_batch(
&mut self,
ids: &[u32],
vectors: &[f32],
) -> Result<(), RetrieveError>
pub fn add_batch( &mut self, ids: &[u32], vectors: &[f32], ) -> Result<(), RetrieveError>
Add multiple vectors in bulk from a flat f32 slice.
ids and vectors must be aligned: vectors.len() == ids.len() * self.dimension.
Each contiguous dimension-sized chunk in vectors corresponds to the ID at the same
position in ids.
Sourcepub fn build(&mut self) -> Result<(), RetrieveError>
pub fn build(&mut self) -> Result<(), RetrieveError>
Build the index (required before search).
Constructs the multi-layer graph structure.
Sourcepub fn search(
&self,
query: &[f32],
k: usize,
ef: usize,
) -> Result<Vec<(u32, f32)>, RetrieveError>
pub fn search( &self, query: &[f32], k: usize, ef: usize, ) -> Result<Vec<(u32, f32)>, RetrieveError>
Sourcepub fn search_with_distance<F: Fn(&[f32], u32) -> f32>(
&self,
query: &[f32],
k: usize,
ef: usize,
dist_fn: &F,
) -> Result<Vec<(u32, f32)>, RetrieveError>
pub fn search_with_distance<F: Fn(&[f32], u32) -> f32>( &self, query: &[f32], k: usize, ef: usize, dist_fn: &F, ) -> Result<Vec<(u32, f32)>, RetrieveError>
Search with a caller-provided distance function.
The graph structure (built with center-to-center distance) is used for
navigation, but the provided dist_fn is called for every distance
computation during search. The closure receives (query, internal_id)
and must return a distance value.
This enables asymmetric search: build the graph with point distances for navigability, search with box-to-point, quantized, or any other distance function.
The internal_id passed to the closure is the zero-based insertion
order index (same as the index into flat vector storage). The caller
must maintain a parallel array mapping these IDs to their custom data
(box geometry, quantization codes, etc.).
Returns (doc_id, distance) pairs sorted by distance ascending.
Sourcepub fn batch_search_mqo(
&self,
queries: &[&[f32]],
k: usize,
ef_search: usize,
) -> Result<Vec<Vec<(u32, f32)>>, RetrieveError>
pub fn batch_search_mqo( &self, queries: &[&[f32]], k: usize, ef_search: usize, ) -> Result<Vec<Vec<(u32, f32)>>, RetrieveError>
Search a batch of queries using Multiple Query Optimization (MQO).
Instead of routing every query independently from the global HNSW entry point, queries are processed in a greedy-nearest-neighbour chain order: the first query uses the normal entry point, and each subsequent query inherits the best result node from the nearest already-processed query as an additional warm-start seed.
This reuses graph locality between nearby queries, reducing the effective search depth for similar queries and improving cache utilisation.
Results are returned in the same order as the input queries.
If the batch contains only one query, this is equivalent to calling
search directly.
§Arguments
queries- Slice of query vectors; each must have lengthdimension.k- Number of nearest neighbours to return per query.ef_search- Beam width during base-layer search (higher = better recall).
§Errors
Returns RetrieveError if the index has not been built, any query has the
wrong dimension, or the index is empty.
Sourcepub fn search_with_filter(
&self,
query: &[f32],
k: usize,
ef: usize,
filter: &MetadataFilter,
) -> Result<Vec<(u32, f32)>, RetrieveError>
pub fn search_with_filter( &self, query: &[f32], k: usize, ef: usize, filter: &MetadataFilter, ) -> Result<Vec<(u32, f32)>, RetrieveError>
Search with filter using filterable graph (integrated filtering).
Uses intra-category edges to maintain graph connectivity during filtered search. Only explores neighbors that match the filter predicate.
§Arguments
query- Query vector (should be L2-normalized)k- Number of neighbors to returnef- Search width (higher = better recall, slower)filter- Filter predicate (must be equality filter on filter_field)
§Returns
Vector of (doc_id, distance) pairs matching the filter, sorted by distance ascending
Sourcepub fn search_adaptive(
&self,
query: &[f32],
k: usize,
ef: usize,
config: &AdaptiveConfig,
) -> Result<(Vec<(u32, f32)>, usize), RetrieveError>
pub fn search_adaptive( &self, query: &[f32], k: usize, ef: usize, config: &AdaptiveConfig, ) -> Result<(Vec<(u32, f32)>, usize), RetrieveError>
Search with adaptive early termination.
Behaves like HNSWIndex::search but uses an EarlyTerminationOracle on the
base layer to skip distance computations once the oracle is confident
the top-k has converged. Upper layers use the standard greedy search
(they visit few nodes and don’t benefit from early termination).
Returns Ok((results, num_evaluated)) where num_evaluated is the
number of distance computations performed on the base layer.
Sourcepub fn search_prt(
&self,
query: &[f32],
k: usize,
ef: usize,
prt: &ProbabilisticRoutingTest,
initial_ratio: f32,
decay: f32,
) -> Result<(Vec<(u32, f32)>, usize), RetrieveError>
pub fn search_prt( &self, query: &[f32], k: usize, ef: usize, prt: &ProbabilisticRoutingTest, initial_ratio: f32, decay: f32, ) -> Result<(Vec<(u32, f32)>, usize), RetrieveError>
Search with Probabilistic Routing Test (PRT) pre-filtering.
Uses random subspace projections to cheaply estimate distances before computing full distances, skipping candidates unlikely to improve results. The Test Feedback Buffer (TFB) adaptively tightens the filter threshold during the search to reduce false positives.
Returns Ok((results, full_distance_count)) where full_distance_count
is the number of full (O(d)) distance computations performed. Compare
against search() to measure the PRT savings.
prt must have been initialized with project_database(&self.vectors).
Sourcepub fn vectors_raw(&self) -> &[f32]
pub fn vectors_raw(&self) -> &[f32]
Raw vector storage in internal (reordered) layout.
After build(), vectors are reordered for cache locality (BFS order).
The internal node IDs used by search_with_distance’s closure correspond
to positions in this array, NOT to the original insertion order.
Use this to build auxiliary data structures (ADSampling rotation, quantization codes) that are indexed by internal node ID.
Prefer raw_vectors(); this is a legacy alias.