pub struct SearchIndex {
    pub chunks: Vec<CodeChunk>,
    pub hidden_dim: usize,
    /* private fields */
}
Pre-computed embedding matrix for fast re-ranking.
Stores all chunk embeddings as a contiguous [num_chunks, hidden_dim]
ndarray matrix. Re-ranking is a single BLAS matrix-vector multiply.
When constructed with a cascade_dim, also stores a truncated and
re-normalized [num_chunks, cascade_dim] matrix for two-phase MRL
cascade search: fast pre-filter at reduced dimension, then full-dim
re-rank of the top candidates.
Fields§
chunks: Vec<CodeChunk>
All chunks with metadata.
hidden_dim: usize
Hidden dimension size.
Implementations§
impl SearchIndex
pub fn new(
    chunks: Vec<CodeChunk>,
    raw_embeddings: &[Vec<f32>],
    cascade_dim: Option<usize>,
) -> Self
Build an index from embed_all output.
Flattens the per-chunk embedding vectors into a contiguous Array2
for BLAS-accelerated matrix-vector products at query time.
When cascade_dim is Some(d), also builds a truncated and
L2-re-normalized [N, d] matrix for two-phase MRL cascade search.
The truncated dimension is clamped to hidden_dim.
§Panics
Panics if the flattened embedding data cannot form a valid
[num_chunks, hidden_dim] matrix (should never happen when
embeddings come from embed_all).
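The flattening and truncation steps described above can be sketched in plain Rust. This is a minimal illustration, not the crate's code: it uses a flat row-major `Vec<f32>` in place of an ndarray `Array2`, the helper names `flatten` and `truncate_and_renormalize` are hypothetical, and it returns `None` on ragged input where `new` is documented to panic.

```rust
/// Flatten per-chunk embeddings into one row-major buffer of shape [n, dim].
/// Returns None if the rows do not all share the same length.
/// (Hypothetical helper; the real constructor panics instead.)
fn flatten(rows: &[Vec<f32>]) -> Option<(Vec<f32>, usize)> {
    let dim = rows.first().map_or(0, |r| r.len());
    if rows.iter().any(|r| r.len() != dim) {
        return None;
    }
    let flat: Vec<f32> = rows.iter().flat_map(|r| r.iter().copied()).collect();
    Some((flat, dim))
}

/// Keep the first `d` columns of every row (clamped to `dim`) and rescale
/// each truncated row to unit L2 norm. Returns the new buffer together with
/// the clamped dimension. (Hypothetical helper.)
fn truncate_and_renormalize(flat: &[f32], dim: usize, d: usize) -> (Vec<f32>, usize) {
    let d = d.min(dim);
    let mut out = Vec::new();
    if dim == 0 {
        return (out, d);
    }
    for row in flat.chunks(dim) {
        let head = &row[..d];
        let norm = head.iter().map(|x| x * x).sum::<f32>().sqrt();
        // Leave all-zero rows as zeros rather than dividing by zero.
        let scale = if norm > 0.0 { 1.0 / norm } else { 0.0 };
        out.extend(head.iter().map(|x| x * scale));
    }
    (out, d)
}
```

Storing both matrices contiguously keeps each row adjacent in memory, which is what lets a query be scored against every chunk in one pass.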
pub fn rank(&self, query_embedding: &[f32], threshold: f32) -> Vec<(usize, f32)>
Rank all chunks against a query embedding.
Returns (chunk_index, similarity_score) pairs sorted by descending
score, filtered by threshold.
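As a rough sketch of what such a ranking pass involves, the free function below scores every row of a flat row-major matrix against the query, drops low scores, and sorts. It is an assumption-laden stand-in: plain loops replace the BLAS matrix-vector multiply, the free-function signature is hypothetical, and treating `threshold` as an inclusive lower bound is a guess that the documentation does not confirm.

```rust
/// Score every row of a row-major [n, dim] matrix against `query`, keep
/// scores at or above `threshold`, and sort by descending score.
/// (Hypothetical stand-in for the method; inclusive threshold is assumed.)
fn rank(flat: &[f32], dim: usize, query: &[f32], threshold: f32) -> Vec<(usize, f32)> {
    assert_eq!(query.len(), dim, "query must match the index dimension");
    if dim == 0 {
        return Vec::new();
    }
    let mut hits: Vec<(usize, f32)> = flat
        .chunks(dim)
        // One dot product per row; together these form the matrix-vector product.
        .map(|row| row.iter().zip(query).map(|(a, b)| a * b).sum::<f32>())
        .enumerate()
        .filter(|&(_, score)| score >= threshold)
        .collect();
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits
}
```

When rows and query are L2-normalized, each dot product equals the cosine similarity, so the threshold can be read as a minimum cosine score.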
pub fn rank_turboquant(
    &self,
    query_embedding: &[f32],
    top_k: usize,
    threshold: f32,
) -> Vec<(usize, f32)>
TurboQuant-accelerated ranking: compressed approximate scan → exact re-rank.
- Estimate inner products for all vectors via TurboQuant (~5× faster than BLAS).
- Take the top pre_filter_k approximate candidates.
- Re-rank with exact FP32 dot products on the full embedding matrix.
Falls back to Self::rank when no compressed index is available.
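The three steps above follow a general "approximate scan, then exact re-rank" pattern, which the sketch below illustrates. It does not implement TurboQuant: a simple int8 scalar quantizer is swapped in as the compressed representation purely for illustration, and the function names, the explicit `pre_filter_k` parameter, and the inclusive threshold are all assumptions.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Stand-in compressor (NOT TurboQuant): map values in [-1, 1] to i8 codes.
fn quantize(v: &[f32]) -> Vec<i8> {
    v.iter()
        .map(|&x| (x.clamp(-1.0, 1.0) * 127.0).round() as i8)
        .collect()
}

/// Two-stage ranking over a row-major [n, dim] matrix and its i8 codes.
fn rank_two_stage(
    flat: &[f32],
    codes: &[i8],
    dim: usize,
    query: &[f32],
    pre_filter_k: usize,
    top_k: usize,
    threshold: f32,
) -> Vec<(usize, f32)> {
    if dim == 0 {
        return Vec::new();
    }
    // 1. Approximate score for every row, computed from the compressed codes.
    let mut approx: Vec<(usize, f32)> = codes
        .chunks(dim)
        .map(|row| {
            row.iter()
                .zip(query)
                .map(|(&c, &q)| c as f32 / 127.0 * q)
                .sum::<f32>()
        })
        .enumerate()
        .collect();
    // 2. Keep only the best `pre_filter_k` approximate candidates.
    approx.sort_by(|a, b| b.1.total_cmp(&a.1));
    approx.truncate(pre_filter_k);
    // 3. Re-score the survivors exactly in f32, then filter, sort, and cut.
    let mut exact: Vec<(usize, f32)> = approx
        .into_iter()
        .map(|(i, _)| (i, dot(&flat[i * dim..(i + 1) * dim], query)))
        .filter(|&(_, score)| score >= threshold)
        .collect();
    exact.sort_by(|a, b| b.1.total_cmp(&a.1));
    exact.truncate(top_k);
    exact
}
```

The exact pass touches only `pre_filter_k` rows, so the final scores carry no quantization error; the approximation can only affect which candidates reach that pass.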
pub fn rank_cascade(
    &self,
    query_embedding: &[f32],
    top_k: usize,
    threshold: f32,
) -> Vec<(usize, f32)>
Two-phase MRL cascade ranking: fast pre-filter then full re-rank.
- Layer-norms the query over its full dimension, truncates it to truncated_dim, L2-normalizes it, and computes dot products against the truncated matrix to find the top pre_filter_k candidates.
- Re-ranks those candidates using full-dimension dot products.
Falls back to Self::rank when no truncated matrix is available.
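The two phases can be sketched as follows. This is an illustrative reading of the description, with several assumptions: the layer norm has no learned gain or bias and uses an epsilon of 1e-5, the full-dimension re-rank uses the original (un-normalized) query, `pre_filter_k` is passed explicitly, and every function name here is hypothetical.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Parameter-free layer norm: zero mean, unit variance (eps = 1e-5 assumed).
fn layer_norm(v: &[f32]) -> Vec<f32> {
    if v.is_empty() {
        return Vec::new();
    }
    let n = v.len() as f32;
    let mean = v.iter().sum::<f32>() / n;
    let var = v.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + 1e-5).sqrt();
    v.iter().map(|x| (x - mean) * inv_std).collect()
}

/// Phase-1 query: layer-norm at full dimension, truncate, then L2-normalize.
fn prefilter_query(query: &[f32], truncated_dim: usize) -> Vec<f32> {
    let mut q = layer_norm(query);
    q.truncate(truncated_dim);
    let norm = q.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in q.iter_mut() {
            *x /= norm;
        }
    }
    q
}

/// Cascade ranking over a full [n, dim] matrix and a truncated
/// [n, truncated_dim] matrix, both row-major.
fn rank_cascade(
    full: &[f32],
    dim: usize,
    truncated: &[f32],
    truncated_dim: usize,
    query: &[f32],
    pre_filter_k: usize,
    top_k: usize,
    threshold: f32,
) -> Vec<(usize, f32)> {
    if dim == 0 || truncated_dim == 0 {
        return Vec::new();
    }
    // Phase 1: cheap scores at the reduced dimension.
    let q_small = prefilter_query(query, truncated_dim);
    let mut coarse: Vec<(usize, f32)> = truncated
        .chunks(truncated_dim)
        .map(|row| dot(row, &q_small))
        .enumerate()
        .collect();
    coarse.sort_by(|a, b| b.1.total_cmp(&a.1));
    coarse.truncate(pre_filter_k);
    // Phase 2: full-dimension dot products for the surviving candidates only.
    let mut hits: Vec<(usize, f32)> = coarse
        .into_iter()
        .map(|(i, _)| (i, dot(&full[i * dim..(i + 1) * dim], query)))
        .filter(|&(_, score)| score >= threshold)
        .collect();
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits.truncate(top_k);
    hits
}
```

The pre-filter cost scales with `truncated_dim` rather than `hidden_dim`, while the returned scores still come from full-dimension dot products.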
pub fn embedding(&self, idx: usize) -> Option<Vec<f32>>
Return a clone of the embedding vector for chunk idx.
Returns None if idx is out of bounds.
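For a flat row-major buffer, a bounds-checked row lookup of this kind might look like the sketch below. The free-function form and its parameters are hypothetical; the real method reads from the index's private matrix.

```rust
/// Copy row `idx` out of a row-major [n, dim] buffer, or return None when
/// the row does not exist. (Hypothetical stand-in for the method.)
fn embedding(flat: &[f32], dim: usize, idx: usize) -> Option<Vec<f32>> {
    if dim == 0 {
        return None;
    }
    // Checked arithmetic avoids overflow for very large indices.
    let start = idx.checked_mul(dim)?;
    let end = start.checked_add(dim)?;
    flat.get(start..end).map(|row| row.to_vec())
}
```

Returning an owned `Vec<f32>` costs one allocation per call, in exchange for a result that does not borrow from the index.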
pub fn find_duplicates(
    &self,
    threshold: f32,
    max_pairs: usize,
) -> Vec<(usize, usize, f32)>
Find duplicate or near-duplicate chunks by pairwise cosine similarity.
Computes embeddings @ embeddings.T (a single BLAS GEMM) to get all
pairwise similarities, then extracts pairs above threshold from the
upper triangle (avoiding self-matches and symmetric duplicates).
Returns (chunk_a, chunk_b, similarity) sorted by descending similarity.
Each pair appears only once (a < b).
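The upper-triangle extraction can be sketched without a GEMM by looping over pairs with a < b directly. This is a simplified stand-in under stated assumptions: rows are already L2-normalized so the dot product equals cosine similarity, the threshold is inclusive, and `max_pairs` is applied by truncating after the sort. None of these details are confirmed by the documentation.

```rust
/// Find pairs of rows in a row-major [n, dim] matrix whose dot product is at
/// or above `threshold`. Rows are assumed unit-normalized, so the dot product
/// equals cosine similarity. (Hypothetical stand-in; the documented method
/// computes all similarities with a single GEMM.)
fn find_duplicates(
    flat: &[f32],
    dim: usize,
    threshold: f32,
    max_pairs: usize,
) -> Vec<(usize, usize, f32)> {
    if dim == 0 {
        return Vec::new();
    }
    let rows: Vec<&[f32]> = flat.chunks(dim).collect();
    let mut pairs = Vec::new();
    for a in 0..rows.len() {
        // Starting at a + 1 visits only the strict upper triangle, which
        // skips self-matches and the mirrored (b, a) entries.
        for b in (a + 1)..rows.len() {
            let sim: f32 = rows[a].iter().zip(rows[b]).map(|(x, y)| x * y).sum();
            if sim >= threshold {
                pairs.push((a, b, sim));
            }
        }
    }
    pairs.sort_by(|p, q| q.2.total_cmp(&p.2));
    pairs.truncate(max_pairs);
    pairs
}
```

Either way the work is quadratic in the number of chunks; a GEMM computes the same similarities far more efficiently at the cost of materializing the full n × n matrix.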
pub fn truncated_dim(&self) -> Option<usize>
The truncated dimension used for cascade pre-filtering, if enabled.
Auto Trait Implementations§
impl Freeze for SearchIndex
impl RefUnwindSafe for SearchIndex
impl Send for SearchIndex
impl Sync for SearchIndex
impl Unpin for SearchIndex
impl UnsafeUnpin for SearchIndex
impl UnwindSafe for SearchIndex
Blanket Implementations§
impl<T> ArchivePointee for T
type ArchivedMetadata = ()
fn pointer_metadata(
    _: &<T as ArchivePointee>::ArchivedMetadata,
) -> <T as Pointee>::Metadata
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Downcast for T
where
    T: Any,
fn into_any(self: Box<T>) -> Box<dyn Any>
Converts Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>, which can then be
downcast into Box<dyn ConcreteType> where ConcreteType implements Trait.
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Converts Rc<Trait> (where Trait: Downcast) to Rc<Any>, which can then be further
downcast into Rc<ConcreteType> where ConcreteType implements Trait.
fn as_any(&self) -> &(dyn Any + 'static)
Converts &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot
generate &Any’s vtable from &Trait’s.
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
Converts &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot
generate &mut Any’s vtable from &mut Trait’s.
impl<T> DowncastSend for T
impl<T> DowncastSync for T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
impl<T> LayoutRaw for T
fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>
impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool
fn resolve_niched(out: Place<NichedOption<T, N1>>)
Writes data to out indicating that a T is niched.