Struct WarpIndex 

pub struct WarpIndex { /* private fields */ }

WARP index for efficient multi-vector retrieval.

The index organizes token embeddings by centroid assignment (IVF structure) for cache-efficient access during search. Each token embedding is stored as:

  • Centroid assignment
  • Quantized residual (2-4 bits per dimension)
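
To make the storage scheme concrete, the following is a minimal sketch of encoding one token embedding as a centroid id plus a 2-bit-per-dimension residual. The names `EncodedToken`, `encode_token`, and `decode_token`, the uniform bucket boundaries, and the fixed `range` parameter are all assumptions made for illustration. The crate's ResidualCodec is not shown on this page and may choose buckets differently (for example, by learning them during training).

```rust
/// Hypothetical encoded token: nearest-centroid id plus packed 2-bit residual codes.
#[derive(Debug, Clone, PartialEq)]
pub struct EncodedToken {
    pub centroid: usize,
    pub codes: Vec<u8>, // four 2-bit codes per byte
}

const BITS: usize = 2;
const LEVELS: usize = 1 << BITS; // 4 buckets per dimension

/// Index of the centroid with the smallest squared Euclidean distance to `v`.
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d: f32 = v.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum();
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Quantize the residual `v - centroid` with uniform buckets over [-range, range].
/// Assumption: `range > 0`; residuals outside the range are clamped.
pub fn encode_token(v: &[f32], centroids: &[Vec<f32>], range: f32) -> EncodedToken {
    let centroid = nearest_centroid(v, centroids);
    let mut codes = vec![0u8; (v.len() * BITS + 7) / 8];
    for (dim, (x, c)) in v.iter().zip(&centroids[centroid]).enumerate() {
        let r = (x - c).clamp(-range, range);
        let t = (r + range) / (2.0 * range); // normalized to 0..=1
        let code = ((t * LEVELS as f32) as usize).min(LEVELS - 1) as u8;
        let bit = dim * BITS;
        codes[bit / 8] |= code << (bit % 8);
    }
    EncodedToken { centroid, codes }
}

/// Reconstruct an approximation: centroid plus the midpoint of each bucket.
pub fn decode_token(t: &EncodedToken, centroids: &[Vec<f32>], range: f32) -> Vec<f32> {
    let step = 2.0 * range / LEVELS as f32;
    centroids[t.centroid]
        .iter()
        .enumerate()
        .map(|(dim, c)| {
            let bit = dim * BITS;
            let code = (t.codes[bit / 8] >> (bit % 8)) & (LEVELS as u8 - 1);
            c + (-range + (code as f32 + 0.5) * step)
        })
        .collect()
}
```

With 2 bits per dimension, a 4-dimensional token packs into a single byte of codes, and any residual inside the range is reconstructed to within half a bucket width.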

§Lifecycle

  1. Create index with new(config)
  2. Train codec with train(samples)
  3. Insert documents with insert(chunk, embedding)
  4. Build index with build() (compacts for efficient search)
  5. Search with search(query, config)
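
The ordering constraints in this lifecycle can be sketched as a small state machine. `ToyIndex` and `ToyError` below are hypothetical stand-ins, not the crate's types: training takes no samples, chunks are plain strings, and search is a substring match. Only the order-of-operations checks mirror the errors documented for train, insert, build, and search.

```rust
/// Hypothetical error type mirroring the documented failure conditions.
#[derive(Debug, PartialEq)]
pub enum ToyError {
    NotTrained,
    AlreadyBuilt,
    NotBuilt,
}

/// Toy stand-in for the index that tracks only lifecycle state.
#[derive(Default)]
pub struct ToyIndex {
    trained: bool,
    built: bool,
    chunks: Vec<String>,
}

impl ToyIndex {
    pub fn new() -> Self {
        Self::default()
    }

    /// Step 2: a real codec would learn centroids from samples here.
    pub fn train(&mut self) {
        self.trained = true;
    }

    /// Step 3: allowed only after training and before build.
    pub fn insert(&mut self, chunk: &str) -> Result<(), ToyError> {
        if !self.trained {
            return Err(ToyError::NotTrained);
        }
        if self.built {
            return Err(ToyError::AlreadyBuilt);
        }
        self.chunks.push(chunk.to_string());
        Ok(())
    }

    /// Step 4: requires a trained codec.
    pub fn build(&mut self) -> Result<(), ToyError> {
        if !self.trained {
            return Err(ToyError::NotTrained);
        }
        self.built = true;
        Ok(())
    }

    /// Reopen the index for insertion; chunks are kept.
    pub fn clear_index(&mut self) {
        self.built = false;
    }

    /// Step 5: requires a built index. Returns positions of matching chunks.
    pub fn search(&self, needle: &str) -> Result<Vec<usize>, ToyError> {
        if !self.built {
            return Err(ToyError::NotBuilt);
        }
        Ok(self
            .chunks
            .iter()
            .enumerate()
            .filter(|(_, c)| c.contains(needle))
            .map(|(i, _)| i)
            .collect())
    }
}
```

In this sketch, calling insert before train, or search before build, returns an error rather than panicking, which matches the `Result` return types listed for the corresponding WarpIndex methods.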

§Memory Layout

After build(), data is organized by centroid:

Centroid 0: [chunk_ids...] [token_indices...] [residuals...]
Centroid 1: [chunk_ids...] [token_indices...] [residuals...]
...
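
One way to arrive at a centroid-organized layout like this is a counting sort over centroid assignments, which yields a per-centroid size and offset into flat arrays. The sketch below is an assumption about how such compaction could work, not the crate's build() implementation; `Compacted`, `compact`, and the `(centroid, chunk_id, token_index)` tuple input are hypothetical, and residuals are omitted for brevity.

```rust
/// Hypothetical compacted layout: flat arrays grouped by centroid.
pub struct Compacted {
    pub offsets: Vec<usize>,      // start of each centroid's run
    pub sizes: Vec<usize>,        // number of tokens per centroid
    pub chunk_ids: Vec<u32>,      // chunk id of each token, grouped by centroid
    pub token_indices: Vec<u32>,  // token position within its chunk
}

/// Group `(centroid, chunk_id, token_index)` triples by centroid using a counting sort.
/// Assumption: every centroid id is less than `num_centroids`.
pub fn compact(tokens: &[(usize, u32, u32)], num_centroids: usize) -> Compacted {
    // Pass 1: count tokens per centroid.
    let mut sizes = vec![0usize; num_centroids];
    for &(c, _, _) in tokens {
        sizes[c] += 1;
    }

    // Prefix sum turns counts into starting offsets.
    let mut offsets = vec![0usize; num_centroids];
    let mut running = 0;
    for c in 0..num_centroids {
        offsets[c] = running;
        running += sizes[c];
    }

    // Pass 2: scatter each token into its centroid's run, preserving input order.
    let mut cursor = offsets.clone();
    let mut chunk_ids = vec![0u32; tokens.len()];
    let mut token_indices = vec![0u32; tokens.len()];
    for &(c, chunk, tok) in tokens {
        let slot = cursor[c];
        chunk_ids[slot] = chunk;
        token_indices[slot] = tok;
        cursor[c] += 1;
    }

    Compacted { offsets, sizes, chunk_ids, token_indices }
}

impl Compacted {
    /// Contiguous slice of chunk ids for one centroid.
    pub fn centroid_chunk_ids(&self, c: usize) -> &[u32] {
        &self.chunk_ids[self.offsets[c]..self.offsets[c] + self.sizes[c]]
    }
}
```

Because each centroid's tokens are contiguous, a search that probes a handful of centroids reads a few dense slices instead of chasing scattered per-chunk data.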

Implementations§

impl WarpIndex

pub fn new(config: WarpIndexConfig) -> Self

Create a new WARP index with the given configuration.

pub fn config(&self) -> &WarpIndexConfig

Get the index configuration.

pub fn codec(&self) -> Option<&ResidualCodec>

Get the trained codec (if any).

pub fn is_trained(&self) -> bool

Check if the codec has been trained.

pub fn is_built(&self) -> bool

Check if the index has been built.

pub fn num_chunks(&self) -> usize

Get the number of indexed chunks.

pub fn num_tokens(&self) -> usize

Get the number of indexed tokens.

pub fn is_empty(&self) -> bool

Check if the index is empty.

pub fn get_chunk(&self, id: &ChunkId) -> Option<&Chunk>

Get a chunk by ID.

pub fn memory_usage(&self) -> usize

Get memory usage in bytes (approximate).

pub fn train(&mut self, samples: &[MultiVectorEmbedding]) -> Result<()>

Train the codec from sample embeddings.

§Arguments

  • samples - Sample multi-vector embeddings for training

§Errors

Returns an error if:

  • Not enough samples for training
  • Configuration is invalid

pub fn insert(&mut self, chunk: Chunk, embedding: MultiVectorEmbedding) -> Result<()>

Insert a chunk with its token embeddings.

The chunk will be stored in pending state until build() is called.

§Errors

Returns an error if:

  • Codec has not been trained
  • Index has already been built (call clear_index() first)

pub fn build(&mut self) -> Result<()>

Build the index for efficient search.

This compacts all pending embeddings into a centroid-organized IVF structure optimized for cache-efficient search.

§Errors

Returns an error if the codec has not been trained.

pub fn clear_index(&mut self)

Clear the built index to allow new insertions.

Chunks are preserved, but the IVF structure is cleared. Call build() again after inserting new chunks.

pub fn search(&self, query: &MultiVectorEmbedding, search_config: &WarpSearchConfig) -> Result<Vec<(ChunkId, f32)>>

Search for relevant chunks using MaxSim scoring.

§Arguments

  • query - Query multi-vector embedding
  • search_config - Search parameters

§Returns

Vector of (ChunkId, score) pairs sorted by score descending.

§Errors

Returns an error if the index has not been built.
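
For reference, MaxSim scores a chunk by taking, for each query token, the maximum dot product against any of the chunk's tokens, and summing those maxima over the query. The brute-force sketch below illustrates the scoring rule only; it works on uncompressed vectors and scans every chunk, whereas the index presumably scores from its compressed, centroid-organized data. The function names `maxsim` and `rank` and the `u32` ids are hypothetical.

```rust
/// MaxSim: sum over query tokens of the best dot product with any document token.
/// Note: an empty `doc` yields negative infinity for a non-empty query.
pub fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| q.iter().zip(d).map(|(a, b)| a * b).sum::<f32>())
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

/// Score every document and return `(id, score)` pairs sorted by score descending.
pub fn rank(query: &[Vec<f32>], docs: &[(u32, Vec<Vec<f32>>)]) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = docs
        .iter()
        .map(|(id, d)| (*id, maxsim(query, d)))
        .collect();
    // total_cmp gives a total order on f32, so NaN scores cannot panic the sort.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

The result shape matches the documented return value of search: a vector of (id, score) pairs with the highest score first.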

pub fn centroid_size(&self, centroid_id: usize) -> usize

Get centroid size (number of tokens assigned).

pub fn centroid_offset(&self, centroid_id: usize) -> usize

Get centroid offset in the compacted arrays.

Trait Implementations§

impl Clone for WarpIndex

fn clone(&self) -> WarpIndex

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for WarpIndex

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for WarpIndex

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for WarpIndex

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.
