pub struct SearchIndex { /* private fields */ }
Inverted word index for fast entity search.
For each entity, we tokenize its name, type, and observations, then store a mapping from each token to the set of matching entity indices.
Uses a flat Vec<(StrId, u32)> sorted by (token, entity_idx)
for cache-friendly lookups via binary search.
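The layout described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: plain u32 values stand in for StrId, and the helper names build_entries and lookup_token are hypothetical.

```rust
/// Sort and deduplicate raw (token, entity_idx) pairs so that all entries
/// for one token sit next to each other in memory.
/// Token ids are plain `u32` here, a stand-in for the crate's `StrId`.
fn build_entries(mut entries: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    entries.sort_unstable();
    entries.dedup();
    entries
}

/// Return the entity indices stored for `token`, using two binary searches
/// to find the contiguous run of entries with that token.
fn lookup_token(entries: &[(u32, u32)], token: u32) -> Vec<u32> {
    // First position whose token is >= the target.
    let start = entries.partition_point(|&(t, _)| t < token);
    // First position whose token is > the target.
    let end = entries.partition_point(|&(t, _)| t <= token);
    entries[start..end].iter().map(|&(_, idx)| idx).collect()
}
```

Because the pairs are sorted by (token, entity_idx), the matching entries form one contiguous slice, which is what makes an exact-token lookup cheap.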
Implementations§
impl SearchIndex
pub fn new() -> Self
pub fn clear(&mut self)
pub const fn len(&self) -> usize
pub const fn is_empty(&self) -> bool
pub fn index_entity(
    &mut self,
    interner: &mut StringInterner,
    entity_idx: u32,
    name: StrId,
    entity_type: StrId,
    observations: &[StrId],
)
Index a single entity by its name, type, and observations.
All strings must already be interned.
entity_idx is the position in the entity storage vec.
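The exact tokenization rules are not documented here. As a rough sketch of what tokenizing a name, type, or observation might look like, the hypothetical helper below splits on non-alphanumeric characters and lowercases each word. Both choices are assumptions for illustration only.

```rust
/// Split `text` into lowercase word tokens.
/// Assumption: any non-alphanumeric character is a separator.
/// The real index may use different rules.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|word| !word.is_empty())
        .map(|word| word.to_lowercase())
        .collect()
}
```

Lowercasing at index time is one common way to get the case-insensitive matching that search advertises, since the query then only needs to be lowercased once.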
pub fn index_additional(
    &mut self,
    interner: &mut StringInterner,
    entity_idx: u32,
    texts: &[StrId],
)
Incrementally index additional strings (e.g. newly added observations) for an entity that is already indexed, without removing and rebuilding its existing entries (P3). Token entries that already exist are deduped during the merge, so calling this with text that overlaps existing tokens is safe.
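One simple way to get this merge-with-dedup behaviour on a sorted flat vec is sketched below. It is an assumption about the approach, not the crate's code: token ids are plain u32 stand-ins for StrId, and merge_additional is a hypothetical name.

```rust
/// Add `(token, entity_idx)` pairs for an already-indexed entity.
/// Re-sorting and deduplicating keeps the vec in (token, entity_idx) order
/// and makes overlapping tokens harmless.
fn merge_additional(entries: &mut Vec<(u32, u32)>, entity_idx: u32, tokens: &[u32]) {
    entries.extend(tokens.iter().map(|&token| (token, entity_idx)));
    entries.sort_unstable();
    entries.dedup();
}
```

A real implementation could merge two sorted runs instead of re-sorting the whole vec. The sketch favours brevity over that optimization.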
pub fn remove_entity(&mut self, entity_idx: u32)
Remove all entries for a given entity, typically before re-indexing it.
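On a flat vec of (token, entity_idx) pairs, removal can be sketched with a single retain pass. The helper name remove_entity_entries is hypothetical and u32 token ids stand in for StrId.

```rust
/// Drop every entry that points at `entity_idx`.
/// `retain` preserves the relative order of the survivors, so a sorted
/// vec stays sorted and no re-sort is needed afterwards.
fn remove_entity_entries(entries: &mut Vec<(u32, u32)>, entity_idx: u32) {
    entries.retain(|&(_, idx)| idx != entity_idx);
}
```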
pub fn search(&self, query: &str, interner: &StringInterner) -> Vec<u32>
Search for entities whose name/type/observation tokens match query
case-insensitively by prefix ("cof" matches "coffee").
Note: this is an O(n) scan over every index entry. The binary-search step only narrows exact-token hits, and the subsequent prefix scan already covers those (an exact match is also a prefix match), so the scan dominates. Do not read the binary search as making this sublinear.
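The linear prefix scan can be sketched as below. This is an illustration under stated assumptions: token_text is a hypothetical lookup table standing in for StringInterner, token ids are plain u32 indices into it, and tokens are assumed to have been lowercased when they were indexed.

```rust
/// Return the distinct entity indices that have at least one token
/// starting with `query` (compared case-insensitively).
fn prefix_search(entries: &[(u32, u32)], token_text: &[&str], query: &str) -> Vec<u32> {
    let needle = query.to_lowercase();
    // Visit every entry once: this is the O(n) part.
    let mut hits: Vec<u32> = entries
        .iter()
        .filter(|&&(token, _)| token_text[token as usize].starts_with(needle.as_str()))
        .map(|&(_, idx)| idx)
        .collect();
    // An entity can match through several tokens, so deduplicate.
    hits.sort_unstable();
    hits.dedup();
    hits
}
```

Every entry is inspected regardless of the query, which is why the cost grows with the size of the index and not with the number of matches.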
pub fn search_ranked(
    &self,
    query: &str,
    interner: &StringInterner,
) -> Vec<(u32, u32)>
Like search, but returns (entity_idx, score) pairs sorted by
descending score (then ascending idx for stability). score is the
number of indexed-token hits the entity accumulated for the query,
a cheap relevance proxy so callers can surface the best matches first.
The scan is a single linear pass over the flat entries vec (no
per-entity allocation until the final compaction), keeping it
cache-friendly. A small Vec<(idx, score)> is gathered then sorted.
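The scoring and ordering described above can be sketched as follows. As before, this is a hedged illustration and not the crate's code: token_text is a hypothetical stand-in for StringInterner, token ids are u32 indices into it, and tokens are assumed to be stored lowercased.

```rust
/// Return (entity_idx, score) pairs, best score first, ties broken by
/// ascending entity index. The score counts matching index entries.
fn prefix_search_ranked(
    entries: &[(u32, u32)],
    token_text: &[&str],
    query: &str,
) -> Vec<(u32, u32)> {
    let needle = query.to_lowercase();
    // Single linear pass: collect one hit per matching entry.
    let mut hits: Vec<u32> = entries
        .iter()
        .filter(|&&(token, _)| token_text[token as usize].starts_with(needle.as_str()))
        .map(|&(_, idx)| idx)
        .collect();
    hits.sort_unstable();

    // Compact runs of equal indices into (idx, count) pairs.
    let mut ranked: Vec<(u32, u32)> = Vec::new();
    for idx in hits {
        match ranked.last_mut() {
            Some(last) if last.0 == idx => last.1 += 1,
            _ => ranked.push((idx, 1)),
        }
    }

    // Descending score, then ascending index.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked
}
```

Breaking ties by ascending index makes the output deterministic, so repeated queries against an unchanged index return results in the same order.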
Trait Implementations§
impl Clone for SearchIndex
fn clone(&self) -> SearchIndex
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.