pub struct WarpIndex { /* private fields */ }
WARP index for efficient multi-vector retrieval.
The index organizes token embeddings by centroid assignment (IVF structure) for cache-efficient access during search. Each token embedding is stored as:
- Centroid assignment
- Quantized residual (2-4 bits per dimension)
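As an illustration of this storage scheme, the sketch below assigns a vector to its nearest centroid and packs the residual at 2 bits per dimension. It is not the crate's ResidualCodec: the function names, the use of squared Euclidean distance, and the caller-supplied cut points are all assumptions made for the example.

```rust
/// Index of the centroid closest to `v` (squared Euclidean distance).
/// Hypothetical helper, not part of the WarpIndex API.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let dist: f32 = v.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Quantize each residual component (`v[d] - centroid[d]`) to a 2-bit code
/// using three ascending cut points, then pack four codes per byte.
/// The cut points are an assumption; a real codec would choose its own.
fn encode_residual(v: &[f32], centroid: &[f32], cuts: [f32; 3]) -> Vec<u8> {
    let mut packed = vec![0u8; (v.len() + 3) / 4];
    for (d, (x, c)) in v.iter().zip(centroid).enumerate() {
        let r = x - c;
        // Code = number of cut points that the residual reaches or exceeds (0..=3).
        let code = cuts.iter().filter(|&&t| r >= t).count() as u8;
        packed[d / 4] |= code << ((d % 4) * 2);
    }
    packed
}

/// Read back the 2-bit code stored for dimension `d`.
fn code_at(packed: &[u8], d: usize) -> u8 {
    (packed[d / 4] >> ((d % 4) * 2)) & 0b11
}
```

At 2 bits per dimension, the residual of a 4-dimensional vector fits in a single byte; wider codes (up to 4 bits) trade memory for reconstruction accuracy.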
§Lifecycle
- Create index with new(config)
- Train codec with train(samples)
- Insert documents with insert(chunk, embedding)
- Build index with build() (compacts for efficient search)
- Search with search(query, config)
§Memory Layout
After build(), data is organized by centroid:
Centroid 0: [chunk_ids...] [token_indices...] [residuals...]
Centroid 1: [chunk_ids...] [token_indices...] [residuals...]
...
Implementations§
impl WarpIndex
pub fn new(config: WarpIndexConfig) -> Self
Create a new WARP index with the given configuration.
pub fn config(&self) -> &WarpIndexConfig
Get the index configuration.
pub fn codec(&self) -> Option<&ResidualCodec>
Get the trained codec (if any).
pub fn is_trained(&self) -> bool
Check if the codec has been trained.
pub fn num_chunks(&self) -> usize
Get the number of indexed chunks.
pub fn num_tokens(&self) -> usize
Get the number of indexed tokens.
pub fn memory_usage(&self) -> usize
Get memory usage in bytes (approximate).
pub fn train(&mut self, samples: &[MultiVectorEmbedding]) -> Result<()>
pub fn insert(
    &mut self,
    chunk: Chunk,
    embedding: MultiVectorEmbedding,
) -> Result<()>
Insert a chunk with its token embeddings.
The chunk will be stored in pending state until build() is called.
§Errors
Returns an error if:
- Codec has not been trained
- Index has already been built (call rebuild() first)
pub fn build(&mut self) -> Result<()>
Build the index for efficient search.
This compacts all pending embeddings into a centroid-organized IVF structure optimized for cache-efficient search.
§Errors
Returns an error if the codec has not been trained.
pub fn clear_index(&mut self)
Clear the built index to allow new insertions.
Chunks are preserved, but the IVF structure is cleared.
Call build() again after inserting new chunks.
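The ordering rules documented for insert(), build(), and clear_index() can be pictured with a toy state machine. This is a sketch only: ToyIndex, its fields, and its error strings are hypothetical, chunks are reduced to bare ids, and the real WarpIndex may track its state differently.

```rust
/// Toy model of the documented ordering rules; every name here is hypothetical.
#[derive(Default)]
struct ToyIndex {
    trained: bool,
    built: bool,
    chunks: Vec<u64>, // chunk ids, preserved across clear_index()
}

impl ToyIndex {
    fn train(&mut self) {
        self.trained = true;
    }

    /// Fails before training and after build(), mirroring the documented errors.
    fn insert(&mut self, chunk_id: u64) -> Result<(), &'static str> {
        if !self.trained {
            return Err("codec has not been trained");
        }
        if self.built {
            return Err("index has already been built");
        }
        self.chunks.push(chunk_id);
        Ok(())
    }

    /// Fails before training; otherwise marks the index as searchable.
    fn build(&mut self) -> Result<(), &'static str> {
        if !self.trained {
            return Err("codec has not been trained");
        }
        self.built = true;
        Ok(())
    }

    /// Drops the built state but keeps the chunks, so inserts are allowed again.
    fn clear_index(&mut self) {
        self.built = false;
    }
}
```

In this model, an insert after build() is rejected until clear_index() is called, and a second build() then covers both the old and the newly inserted chunks.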
pub fn search(
    &self,
    query: &MultiVectorEmbedding,
    search_config: &WarpSearchConfig,
) -> Result<Vec<(ChunkId, f32)>>
pub fn centroid_size(&self, centroid_id: usize) -> usize
Get centroid size (number of tokens assigned).
pub fn centroid_offset(&self, centroid_id: usize) -> usize
Get centroid offset in the compacted arrays.
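One way to picture how a centroid's size and offset relate to the compacted arrays is a counting sort followed by a prefix sum. The sketch below shows that general technique under stated assumptions; it is not the crate's build() implementation, and compact_by_centroid and its return shape are hypothetical.

```rust
/// Group token positions by centroid with a counting sort.
/// Returns (sizes, offsets, order): the tokens assigned to centroid `c` occupy
/// `order[offsets[c]..offsets[c] + sizes[c]]`.
/// Panics if an assignment is not smaller than `num_centroids`.
fn compact_by_centroid(
    assignments: &[usize],
    num_centroids: usize,
) -> (Vec<usize>, Vec<usize>, Vec<usize>) {
    // Count how many tokens fall into each centroid.
    let mut sizes = vec![0usize; num_centroids];
    for &c in assignments {
        sizes[c] += 1;
    }
    // Exclusive prefix sum: where each centroid's run starts.
    let mut offsets = vec![0usize; num_centroids];
    for c in 1..num_centroids {
        offsets[c] = offsets[c - 1] + sizes[c - 1];
    }
    // Scatter token positions into their centroid's run, in insertion order.
    let mut cursor = offsets.clone();
    let mut order = vec![0usize; assignments.len()];
    for (token, &c) in assignments.iter().enumerate() {
        order[cursor[c]] = token;
        cursor[c] += 1;
    }
    (sizes, offsets, order)
}
```

Parallel arrays such as chunk ids, token indices, and residuals could be permuted by the same `order`, so that one (offset, size) pair addresses a centroid's slice in each of them.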
Trait Implementations§
impl<'de> Deserialize<'de> for WarpIndex
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations§
impl Freeze for WarpIndex
impl RefUnwindSafe for WarpIndex
impl Send for WarpIndex
impl Sync for WarpIndex
impl Unpin for WarpIndex
impl UnsafeUnpin for WarpIndex
impl UnwindSafe for WarpIndex
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> DeserializeOwned for T
where
    T: for<'de> Deserialize<'de>,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.