pub struct SparseInvertedIndex { /* private fields */ }
A sparse inverted index over SparseEmbed values.
Build incrementally via Self::new + Self::add, or in bulk
via Self::build_from_repo. Query via Self::search.
Posting lists are stored as HashMap<u32, Vec<Posting>> (keyed by token id),
where every Vec<Posting> is sorted by NodeId ASC for
deterministic tie-break behaviour matching the rest of mnem-core’s
indexes.
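The layout described above can be sketched with a small toy store. This is only an illustration under stated assumptions: `NodeId` is modelled as `u64`, `Posting` as a `(node, weight)` struct, and `PostingLists`, `push`, `sort_all` and `nodes_for` are hypothetical names, not part of mnem-core.

```rust
use std::collections::HashMap;

// Hypothetical stand-in: the real NodeId type is not shown on this page.
type NodeId = u64;

// Hypothetical posting shape: one document node and its weight for a token.
#[derive(Debug, Clone, PartialEq)]
struct Posting {
    node: NodeId,
    weight: f32,
}

/// Toy posting-list store: token id -> postings for that token.
#[derive(Default)]
struct PostingLists {
    lists: HashMap<u32, Vec<Posting>>,
}

impl PostingLists {
    /// Append one posting to the list for `token_id`.
    fn push(&mut self, token_id: u32, node: NodeId, weight: f32) {
        self.lists
            .entry(token_id)
            .or_default()
            .push(Posting { node, weight });
    }

    /// Sort every posting list by node id, ascending.
    /// Running this twice leaves the lists unchanged.
    fn sort_all(&mut self) {
        for list in self.lists.values_mut() {
            list.sort_by_key(|p| p.node);
        }
    }

    /// Node ids stored for `token_id`, in list order (empty if the token is unknown).
    fn nodes_for(&self, token_id: u32) -> Vec<NodeId> {
        self.lists
            .get(&token_id)
            .map(|list| list.iter().map(|p| p.node).collect())
            .unwrap_or_default()
    }
}
```

Keeping each list ordered by node id means that any pass over a list visits documents in the same order on every run, which is what makes tie-breaking reproducible.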
Implementations§
impl SparseInvertedIndex
pub fn new(vocab_id: impl Into<String>) -> Self
Construct an empty index bound to vocab_id. Embeds passed to
Self::add whose own vocab_id differs from the index’s are silently
skipped, mirroring BruteForceVectorIndex behaviour for
cross-model documents.
pub fn add(&mut self, node: NodeId, embed: &SparseEmbed)
Feed one (node, sparse_embed) pair. The pair is silently skipped
when the embed’s vocab_id differs from the index’s or when the
embed has no non-zero entries.
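The two skip rules can be illustrated with a toy model. Everything here is an assumption made for the sketch: `ToyEmbed` and `ToyIndex` are hypothetical stand-ins for SparseEmbed and SparseInvertedIndex, the vocabulary names are invented, and the `bool` return exists only so the skip is observable (the documented add returns nothing). The sketch also assumes that entries with an explicit weight of 0.0 do not count as non-zero.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real NodeId type.
type NodeId = u64;

/// Hypothetical stand-in for SparseEmbed: a vocabulary tag plus (token_id, weight) pairs.
struct ToyEmbed {
    vocab_id: String,
    entries: Vec<(u32, f32)>,
}

/// Hypothetical stand-in for the index: token id -> (node, weight) postings.
struct ToyIndex {
    vocab_id: String,
    postings: HashMap<u32, Vec<(NodeId, f32)>>,
}

impl ToyIndex {
    fn new(vocab_id: impl Into<String>) -> Self {
        Self {
            vocab_id: vocab_id.into(),
            postings: HashMap::new(),
        }
    }

    /// Returns true when the embed was indexed and false when it was skipped.
    fn add(&mut self, node: NodeId, embed: &ToyEmbed) -> bool {
        // Rule 1: embeds from another vocabulary are ignored.
        if embed.vocab_id != self.vocab_id {
            return false;
        }
        // Rule 2: embeds without any non-zero weight are ignored.
        let nonzero: Vec<(u32, f32)> = embed
            .entries
            .iter()
            .copied()
            .filter(|&(_, weight)| weight != 0.0)
            .collect();
        if nonzero.is_empty() {
            return false;
        }
        for (token, weight) in nonzero {
            self.postings.entry(token).or_default().push((node, weight));
        }
        true
    }
}
```

Because a skip is silent, a caller that mixes embedding models sees fewer indexed nodes rather than an error, so counting accepted inserts (as the toy `bool` allows) is one way to notice a misconfigured vocab_id during testing.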
pub fn finalize(&mut self)
Finalise the index: sort each posting list by NodeId ASC so
search results tie-break deterministically. Call once after
all add() calls; idempotent.
pub fn search(
    &self,
    query: &SparseEmbed,
    k: usize,
) -> Result<Vec<VectorHit>, Error>
Search the index for the top-k documents by sparse-dot-product
score against query. Returns VectorHit values (the same shape the
dense index returns, so callers can fuse results without a custom type).
On vocab_id mismatch, returns an empty vec: the caller
receives no scores to fuse, the same semantics as a disjoint
vocabulary.
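A minimal sketch of this kind of scoring, under stated assumptions: `ToyEmbed` and `ToyIndex` are hypothetical stand-ins, hits are plain `(node, score)` tuples rather than VectorHit, the function returns a bare `Vec` instead of a `Result`, and ties are broken by ascending node id through an explicit sort. The real implementation may accumulate and rank differently.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real NodeId type.
type NodeId = u64;

/// Hypothetical stand-in for SparseEmbed.
struct ToyEmbed {
    vocab_id: String,
    entries: Vec<(u32, f32)>,
}

/// Hypothetical stand-in for the index: token id -> (node, weight) postings.
struct ToyIndex {
    vocab_id: String,
    postings: HashMap<u32, Vec<(NodeId, f32)>>,
}

impl ToyIndex {
    /// Top-k nodes by sparse dot product with `query`.
    fn search(&self, query: &ToyEmbed, k: usize) -> Vec<(NodeId, f32)> {
        // A query from another vocabulary shares no tokens by definition.
        if query.vocab_id != self.vocab_id {
            return Vec::new();
        }
        // Accumulate query_weight * doc_weight for every token the query contains.
        let mut scores: HashMap<NodeId, f32> = HashMap::new();
        for &(token, q_weight) in &query.entries {
            if let Some(list) = self.postings.get(&token) {
                for &(node, d_weight) in list {
                    *scores.entry(node).or_insert(0.0) += q_weight * d_weight;
                }
            }
        }
        // Highest score first; equal scores fall back to ascending node id.
        let mut hits: Vec<(NodeId, f32)> = scores.into_iter().collect();
        hits.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        hits.truncate(k);
        hits
    }
}
```

Only the posting lists for tokens present in the query are visited, so the cost depends on the query's non-zero entries and the lengths of their lists rather than on the total number of indexed nodes.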
pub fn build_from_repo(
    repo: &ReadonlyRepo,
    vocab_id: impl Into<String>,
) -> Result<Self, Error>
Build an index from all nodes in the current commit whose
sparse_embed field matches vocab_id. Requires the nodes to
have been indexed by an adapter at write time.
§Errors
- RepoError::Uninitialized if the repo has no head commit.
- Store / codec errors while walking the Prolly tree.
Trait Implementations§
impl Clone for SparseInvertedIndex
fn clone(&self) -> SparseInvertedIndex
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.