pub struct InvertedIndex { /* private fields */ }
Inverted index for lexical search
Implementations§
impl InvertedIndex
pub fn new(config: BM25Config) -> Self
Create a new inverted index
pub fn with_positions(self) -> Self
Enable position storage (for phrase queries)
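As a rough illustration of why phrase queries need stored positions, the sketch below keeps token positions per document and checks whether the terms of a phrase occur consecutively. It is not this crate's implementation: `PositionalIndex`, `add`, `positions`, and `phrase_match` are hypothetical names, `u32` stands in for `DocId`, and lowercased whitespace tokenization is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical positional index: term -> doc id -> token positions.
#[derive(Default)]
struct PositionalIndex {
    postings: HashMap<String, HashMap<u32, Vec<usize>>>,
}

impl PositionalIndex {
    /// Record every token of `text` together with its position in the document.
    fn add(&mut self, doc_id: u32, text: &str) {
        for (pos, tok) in text.split_whitespace().enumerate() {
            self.postings
                .entry(tok.to_lowercase())
                .or_default()
                .entry(doc_id)
                .or_default()
                .push(pos);
        }
    }

    /// Positions of `term` in `doc_id`, or an empty slice if there are none.
    fn positions(&self, term: &str, doc_id: u32) -> &[usize] {
        self.postings
            .get(term)
            .and_then(|docs| docs.get(&doc_id))
            .map(|v| v.as_slice())
            .unwrap_or(&[])
    }

    /// True if the (already lowercased) terms of `phrase` occur
    /// consecutively, in order, somewhere in `doc_id`.
    fn phrase_match(&self, doc_id: u32, phrase: &[&str]) -> bool {
        let Some((first, rest)) = phrase.split_first() else {
            return false;
        };
        self.positions(first, doc_id).iter().any(|&start| {
            rest.iter()
                .enumerate()
                .all(|(i, term)| self.positions(term, doc_id).contains(&(start + i + 1)))
        })
    }
}

fn main() {
    let mut index = PositionalIndex::default();
    index.add(1, "the quick brown fox");
    index.add(2, "brown quick fox");
    println!("doc 1 matches: {}", index.phrase_match(1, &["quick", "brown"]));
    println!("doc 2 matches: {}", index.phrase_match(2, &["quick", "brown"]));
}
```

Without positions, both documents would look identical for the terms "quick" and "brown"; only the stored offsets distinguish the phrase from a bag-of-words match.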
pub fn add_document(&self, text: &str) -> DocId
Index a document
Returns the assigned document ID.
pub fn add_document_with_id(&self, doc_id: DocId, text: &str)
Index a document with a specific ID
pub fn clear(&self)
Remove every document, returning the index to its freshly-created state while preserving the BM25 configuration and position-storage setting.
pub fn rebuild_from_documents<'a, I>(&self, documents: I)
Rebuild the entire index from an authoritative (doc_id, text) source.
§Durability contract (Task 7)
This index is an in-memory derived structure: it holds no WAL and is not itself crash-durable. The committed document store is the source of truth, and lexical search agrees with it only up to the last rebuild. The supported recovery model is therefore:
- Documents commit through the durable storage path (WAL/MVCC).
- On restart, the lexical index is reconstructed from the committed document store via this method: an O(corpus) bounded, deterministic pass (it clears first, so the result is a pure function of the input, independent of any prior in-memory state).
- Only then is lexical search served.
Because scores are order-independent (IDF is derived from (df, N) and avgdl from running totals, never cached), a rebuilt index is byte-for-byte equivalent in ranking to the index that originally indexed the same documents, so a query returns the same result set across a crash/restart boundary relative to one committed snapshot.
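The order-independence argument can be checked with a small, self-contained sketch. It does not use this crate: `ToyIndex`, `add`, `rebuild`, and `score` are hypothetical names, `u32` stands in for `DocId`, and the BM25 variant (IDF as ln(1 + (N - df + 0.5) / (df + 0.5)) with k1 = 1.2 and b = 0.75) is an assumption, since the exact formula is not stated here. The point is only that when scoring reads nothing but integer counts (df, N, total length), an index rebuilt from the documents scores identically to one filled in a different order.

```rust
use std::collections::HashMap;

/// Hypothetical toy index. Only integer statistics are stored, so scores
/// do not depend on the order in which documents were added.
#[derive(Default)]
struct ToyIndex {
    /// term -> (doc id -> term frequency)
    postings: HashMap<String, HashMap<u32, u32>>,
    /// doc id -> document length in tokens
    doc_len: HashMap<u32, u32>,
    /// running total of all document lengths
    total_len: u64,
}

impl ToyIndex {
    /// Index one document (assumes `doc_id` has not been added before).
    fn add(&mut self, doc_id: u32, text: &str) {
        let mut len = 0u32;
        for tok in text.split_whitespace() {
            let per_doc = self.postings.entry(tok.to_lowercase()).or_default();
            *per_doc.entry(doc_id).or_insert(0) += 1;
            len += 1;
        }
        self.doc_len.insert(doc_id, len);
        self.total_len += u64::from(len);
    }

    /// Clear first, then re-index from an authoritative (doc_id, text) source.
    fn rebuild<'a, I>(&mut self, documents: I)
    where
        I: IntoIterator<Item = (u32, &'a str)>,
    {
        *self = ToyIndex::default();
        for (id, text) in documents {
            self.add(id, text);
        }
    }

    /// BM25 score of one document for a single term. IDF and avgdl are
    /// recomputed from the current counts on every call, never cached.
    fn score(&self, term: &str, doc_id: u32) -> f32 {
        const K1: f32 = 1.2;
        const B: f32 = 0.75;
        let Some(docs) = self.postings.get(term) else { return 0.0 };
        let Some(&tf) = docs.get(&doc_id) else { return 0.0 };
        let n = self.doc_len.len() as f32;
        let df = docs.len() as f32;
        let avgdl = self.total_len as f32 / n;
        let dl = self.doc_len[&doc_id] as f32;
        let idf = (1.0 + (n - df + 0.5) / (df + 0.5)).ln();
        let tf = tf as f32;
        idf * tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * dl / avgdl))
    }
}

fn main() {
    let docs = [(1, "rust search engine"), (2, "rust index"), (3, "lexical search")];

    // "Live" index filled in reverse order, as if built incrementally.
    let mut live = ToyIndex::default();
    for &(id, text) in docs.iter().rev() {
        live.add(id, text);
    }

    // Index reconstructed from the authoritative document list.
    let mut rebuilt = ToyIndex::default();
    rebuilt.rebuild(docs);

    // Bit-identical scores despite the different insertion order.
    assert_eq!(live.score("rust", 1).to_bits(), rebuilt.score("rust", 1).to_bits());
    println!("score(rust, doc 1) = {}", rebuilt.score("rust", 1));
}
```

A design that cached per-term IDF or accumulated avgdl as a floating-point running mean could drift with insertion order; deriving both from integer totals at query time avoids that.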
pub fn add_document_tokens(&self, tokens: &[String]) -> DocId
Index a document from tokens
pub fn add_document_tokens_with_id(&self, doc_id: DocId, tokens: &[String])
Index a document from tokens with a specific ID
pub fn remove_document(&self, doc_id: DocId) -> bool
Remove a document from the index
pub fn search(&self, query: &str, limit: usize) -> Vec<(DocId, f32)>
Search the index
Returns document IDs with scores, sorted by score descending.
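The shape of the result, a list of (id, score) pairs capped at `limit` and ordered by descending score, can be sketched as follows. This is an illustration rather than the crate's code: `top_k` is a hypothetical helper, `u32` stands in for `DocId`, and breaking ties by ascending document ID is an assumption, since the tie-breaking rule is not documented here.

```rust
/// Hypothetical helper: keep the `limit` best hits, highest score first.
/// Ties are broken by ascending document id so the order is deterministic.
fn top_k(mut hits: Vec<(u32, f32)>, limit: usize) -> Vec<(u32, f32)> {
    // `total_cmp` gives a total order on f32, so the sort cannot panic on NaN.
    hits.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    hits.truncate(limit);
    hits
}

fn main() {
    let hits = vec![(7, 0.42), (3, 1.87), (9, 0.42), (1, 0.05)];
    for (doc_id, score) in top_k(hits, 3) {
        println!("doc {doc_id}: {score:.2}");
    }
}
```

Callers that consume the `Vec<(DocId, f32)>` returned by `search` can rely on the first element being the best match, and on receiving at most `limit` entries.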
pub fn search_tokens(&self, query_tokens: &[String], limit: usize) -> Vec<(DocId, f32)>
Search with a pre-tokenized query
pub fn get_posting_list(&self, term: &str) -> Option<PostingList>
Get posting list for a term
pub fn num_documents(&self) -> usize
Get document count
pub fn vocab_size(&self) -> usize
Get vocabulary size
pub fn get_document_info(&self, doc_id: DocId) -> Option<DocumentInfo>
Get document info
pub fn has_document(&self, doc_id: DocId) -> bool
Check if a document exists
Auto Trait Implementations§
impl !Freeze for InvertedIndex
impl !RefUnwindSafe for InvertedIndex
impl Send for InvertedIndex
impl Sync for InvertedIndex
impl Unpin for InvertedIndex
impl UnsafeUnpin for InvertedIndex
impl UnwindSafe for InvertedIndex
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.