pub struct VectorIndex { /* private fields */ }
HNSW-based vector index for semantic search
Provides efficient approximate k-nearest neighbor search over high-dimensional vectors associated with content IDs.
Implementations§
impl VectorIndex
pub fn new(
    dimension: usize,
    metric: DistanceMetric,
    max_nb_connection: usize,
    ef_construction: usize,
) -> Result<Self>
Create a new vector index with the specified dimension
§Arguments
- dimension: Dimension of vectors to be indexed
- metric: Distance metric to use
- max_nb_connection: Maximum number of connections per layer (M parameter)
- ef_construction: Size of dynamic candidate list (efConstruction parameter)
pub fn with_defaults(dimension: usize) -> Result<Self>
Create a new index with default parameters
Uses M=16 and efConstruction=200, which are good defaults for most use cases
pub fn insert(&mut self, cid: &Cid, vector: &[f32]) -> Result<()>
Insert a vector associated with a CID
§Arguments
- cid: Content identifier
- vector: Feature vector to index
pub fn search(
    &self,
    query: &[f32],
    k: usize,
    ef_search: usize,
) -> Result<Vec<SearchResult>>
Search for k nearest neighbors
§Arguments
- query: Query vector
- k: Number of neighbors to return
- ef_search: Size of dynamic candidate list during search (higher = more accurate but slower)
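Because the search is approximate, it can help to measure how much accuracy a given ef_search buys. A minimal sketch of one way to do that, assuming Euclidean distance and a small sample that fits in memory: compute the exact neighbors by linear scan and compare them with the approximate result. The names exact_knn and recall are hypothetical helpers, not part of this crate, and neighbors are identified by position in the sample rather than by Cid.

```rust
use std::cmp::Ordering;

/// Squared Euclidean distance between two equal-length vectors.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact k-nearest neighbors by linear scan (hypothetical baseline).
/// Returns positions into `data`, closest first.
fn exact_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| (i, squared_l2(v, query)))
        .collect();
    // Sort ascending by distance; NaN distances are treated as equal.
    scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

/// Fraction of the exact neighbors that also appear in the approximate result.
fn recall(exact: &[usize], approx: &[usize]) -> f32 {
    if exact.is_empty() {
        return 1.0;
    }
    let hits = exact.iter().filter(|&&i| approx.contains(&i)).count();
    hits as f32 / exact.len() as f32
}
```

Running the same queries at several ef_search values and plotting recall against latency is a common way to pick a setting; the linear scan is only practical on a sample, since it visits every vector.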
pub fn metric(&self) -> DistanceMetric
Get the distance metric used by this index
pub fn get_all_cids(&self) -> Vec<Cid>
Get all CIDs in the index
Useful for synchronization and snapshots
pub fn get_embedding(&self, cid: &Cid) -> Option<Vec<f32>>
Get the embedding vector for a specific CID
Returns None if the CID is not in the index
pub fn get_all_embeddings(&self) -> Vec<(Cid, Vec<f32>)>
Get all embeddings in the index as (CID, vector) pairs
Useful for iteration, migration, and batch operations
pub fn iter(&self) -> Vec<(Cid, Vec<f32>)>
Iterate over all (CID, vector) pairs in the index
Despite the name, the signature returns the pairs collected into a Vec rather than a lazy iterator
pub fn compute_optimal_parameters(&self) -> (usize, usize)
Compute optimal HNSW parameters based on current index size
Returns recommended (max_nb_connection, ef_construction) based on:
- Small indexes (< 10k): M=16, ef=200
- Medium indexes (10k-100k): M=32, ef=400
- Large indexes (> 100k): M=48, ef=600
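The tiers above can be sketched as a stand-alone function. This is an illustration only, not the crate's implementation: recommended_params is a hypothetical name, and treating exactly 10,000 and exactly 100,000 entries as "medium" is an assumption, since the list does not say which tier the boundaries belong to.

```rust
/// Hypothetical stand-alone version of the documented size tiers.
/// Returns (max_nb_connection, ef_construction).
fn recommended_params(index_len: usize) -> (usize, usize) {
    match index_len {
        // Small: fewer than 10k vectors.
        n if n < 10_000 => (16, 200),
        // Medium: 10k to 100k vectors (boundary handling is an assumption).
        n if n <= 100_000 => (32, 400),
        // Large: more than 100k vectors.
        _ => (48, 600),
    }
}
```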
pub fn compute_optimal_ef_search(&self, k: usize) -> usize
Get recommended ef_search parameter based on k
Generally ef_search should be >= k and higher for better recall
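The exact formula used by this method is not documented here. As a rough sketch of the stated rule (never below k, larger for better recall), one hypothetical heuristic scales k by a multiplier and applies a floor. The function name, the multiplier, and the floor are all assumptions for illustration.

```rust
/// Hypothetical heuristic, not the crate's formula:
/// scale k by `multiplier`, keep at least `floor`, and never return less than k.
fn suggest_ef_search(k: usize, multiplier: usize, floor: usize) -> usize {
    k.saturating_mul(multiplier.max(1)).max(floor).max(k)
}
```

With a multiplier of 2 and a floor of 50, small k values all map to 50, while larger ones grow linearly with k.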
pub fn get_parameter_recommendations(
    &self,
    use_case: UseCase,
) -> ParameterRecommendation
Get detailed parameter recommendations based on use case
pub fn insert_batch(&mut self, items: &[(Cid, Vec<f32>)]) -> Result<()>
Insert multiple vectors in batch
More efficient than inserting one by one as it can use parallelization
§Arguments
- items: Vector of (CID, vector) pairs to insert
pub fn insert_incremental(
    &mut self,
    items: &[(Cid, Vec<f32>)],
    chunk_size: usize,
) -> Result<IncrementalBuildStats>
Insert vectors incrementally with periodic optimization
This method inserts vectors in chunks and tracks statistics to determine if index rebuild is beneficial. Returns statistics about the insertion.
§Arguments
- items: Vector of (CID, vector) pairs to insert
- chunk_size: Number of vectors to insert before checking optimization
§Returns
Statistics about the incremental build process
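The chunk-then-check pattern described above can be sketched without the real index. In this illustration a HashMap keyed by String stands in for the index and Cid, ChunkStats is a hypothetical stand-in for IncrementalBuildStats (whose fields are not shown on this page), and the on_chunk callback marks the point where a health check could run.

```rust
use std::collections::HashMap;

/// Hypothetical statistics; not the fields of the real `IncrementalBuildStats`.
#[derive(Debug, Default, PartialEq)]
struct ChunkStats {
    inserted: usize,
    chunks: usize,
}

/// Inserts `items` into a stand-in store one chunk at a time.
/// `on_chunk` is called after each chunk with the current store size,
/// which is where a rebuild check could be placed.
fn insert_in_chunks<F: FnMut(usize)>(
    store: &mut HashMap<String, Vec<f32>>,
    items: &[(String, Vec<f32>)],
    chunk_size: usize,
    mut on_chunk: F,
) -> ChunkStats {
    let mut stats = ChunkStats::default();
    // `chunks` panics on a size of zero, so clamp to at least 1.
    for chunk in items.chunks(chunk_size.max(1)) {
        for (id, vector) in chunk {
            store.insert(id.clone(), vector.clone());
            stats.inserted += 1;
        }
        stats.chunks += 1;
        on_chunk(store.len());
    }
    stats
}
```

A smaller chunk_size means more frequent checks at the cost of more overhead per inserted vector; the right trade-off depends on how expensive the check is.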
pub fn should_rebuild(&self) -> bool
Determine if index should be rebuilt for better performance
Rebuild is recommended when:
- Index has grown significantly (2x or more)
- Many deletions have occurred (fragmentation)
- Current parameters are suboptimal for index size
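The three conditions above could be combined roughly as follows. This is a sketch under stated assumptions, not the crate's logic: the IndexHealth struct and its fields are hypothetical (the real fields of VectorIndex are private), and the 20% deletion threshold is an illustrative choice, since the page does not say how many deletions count as "many".

```rust
/// Hypothetical health counters used only for this illustration.
#[derive(Clone)]
struct IndexHealth {
    len_at_last_build: usize,
    current_len: usize,
    deletions_since_build: usize,
    current_params: (usize, usize),
    recommended_params: (usize, usize),
}

/// Returns true when any of the three documented conditions holds.
fn should_rebuild(h: &IndexHealth) -> bool {
    // Growth: the index is at least twice as large as at the last build.
    let grew_2x = h.len_at_last_build > 0 && h.current_len >= 2 * h.len_at_last_build;
    // Fragmentation: deletions amount to 20% or more of the current size (assumed threshold).
    let fragmented = h.current_len > 0 && h.deletions_since_build * 5 >= h.current_len;
    // Parameters: the current (M, ef_construction) differ from the recommendation.
    let suboptimal = h.current_params != h.recommended_params;
    grew_2x || fragmented || suboptimal
}
```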
pub fn rebuild(&mut self, use_case: UseCase) -> Result<RebuildStats>
Rebuild the index with optimal parameters for current size
This creates a new index with better parameters and re-inserts all vectors.
Use this when should_rebuild() returns true.
§Arguments
- use_case: Target use case for parameter selection
pub fn get_build_stats(&self) -> BuildHealthStats
Get statistics about incremental build performance
Auto Trait Implementations§
impl Freeze for VectorIndex
impl RefUnwindSafe for VectorIndex
impl Send for VectorIndex
impl Sync for VectorIndex
impl Unpin for VectorIndex
impl UnwindSafe for VectorIndex
Blanket Implementations§
impl<'a, T, E> AsTaggedExplicit<'a, E> for T
where
    T: 'a,
impl<'a, T, E> AsTaggedImplicit<'a, E> for T
where
    T: 'a,
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.