pub struct BM25Scorer { /* private fields */ }
BM25 scorer for a document collection
Implementations
impl BM25Scorer
pub fn new(config: BM25Config) -> Self
Create a new BM25 scorer
pub fn build<I, D, T>(documents: I, config: BM25Config) -> Self
Build the scorer from a collection of documents
pub fn avg_doc_len(&self) -> f32
Average document length, derived from running totals.
Computed on read, so it can never drift out of sync with the corpus: avgdl feeds the score of every term and changes on every insert and delete.
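The "running totals" idea can be sketched as follows. This is a minimal illustration, not the crate's implementation: `CorpusStats` is a hypothetical stand-in type, and returning 0.0 for an empty corpus is an assumption. Only the names `num_docs` and `total_len` come from the documentation on this page.

```rust
/// Hypothetical stand-in for the scorer's corpus statistics.
#[derive(Default)]
pub struct CorpusStats {
    pub num_docs: usize,
    pub total_len: usize,
}

impl CorpusStats {
    /// Record one more document of `doc_len` tokens.
    pub fn add(&mut self, doc_len: usize) {
        self.num_docs += 1;
        self.total_len += doc_len;
    }

    /// Forget one document of `doc_len` tokens.
    pub fn remove(&mut self, doc_len: usize) {
        self.num_docs = self.num_docs.saturating_sub(1);
        self.total_len = self.total_len.saturating_sub(doc_len);
    }

    /// Derived on every read, so there is no cached value to go stale.
    /// Assumption: an empty corpus reports 0.0.
    pub fn avg_doc_len(&self) -> f32 {
        if self.num_docs == 0 {
            0.0
        } else {
            self.total_len as f32 / self.num_docs as f32
        }
    }
}
```

Because only two integers are stored, inserts and deletes stay O(1) and the average costs a single division when read.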
pub fn config(&self) -> BM25Config
The scoring configuration this scorer was built with.
pub fn idf(&self, term: &str) -> f32
Get IDF for a term.
Computed lazily from the current (df, N) so it is always consistent
with the live corpus; unknown terms use df = 0 (maximum IDF). Terms
whose IDF falls below min_idf contribute nothing.
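A rough sketch of lazy IDF is shown below. The page does not state which IDF formula the crate uses, so the smoothed variant `ln(1 + (N - df + 0.5) / (df + 0.5))` is an assumption, as are the `IdfTable` type, the `effective_idf` helper, and the choice to apply the `min_idf` cutoff in a separate step.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in holding just what IDF needs.
pub struct IdfTable {
    pub num_docs: usize,
    pub doc_freqs: HashMap<String, usize>,
    pub min_idf: f32,
}

impl IdfTable {
    /// Computed from the live (df, N) on every call; nothing is cached.
    /// Unknown terms fall back to df = 0, which yields the largest value.
    pub fn idf(&self, term: &str) -> f32 {
        let n = self.num_docs as f32;
        let df = self.doc_freqs.get(term).copied().unwrap_or(0) as f32;
        // Assumed formula: a commonly used smoothed BM25 IDF.
        (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
    }

    /// Assumed cutoff: terms below `min_idf` contribute nothing.
    pub fn effective_idf(&self, term: &str) -> f32 {
        let idf = self.idf(term);
        if idf < self.min_idf {
            0.0
        } else {
            idf
        }
    }
}
```

With this formula, a term that appears in every document gets an IDF close to zero, so a small `min_idf` filters out such near-stopwords.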
pub fn score<I, T>( &self, query_terms: I, doc_terms: &[T], doc_len: usize, ) -> f32
Score a document for a query
pub fn score_with_tf( &self, query_terms: &[String], doc_tf: &HashMap<String, usize>, doc_len: usize, ) -> f32
Score a document given precomputed term frequencies
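For orientation, scoring from precomputed term frequencies might look like the sketch below. It uses the textbook BM25 formula, which this page does not confirm as the crate's exact math. The free function `bm25_score` and its extra parameters (`avg_doc_len`, `k1`, `b`, and the `idf` closure) are hypothetical; the real scorer presumably takes these from its own state and `BM25Config`.

```rust
use std::collections::HashMap;

/// Textbook BM25 over a precomputed term-frequency map (illustrative only).
pub fn bm25_score(
    query_terms: &[String],
    doc_tf: &HashMap<String, usize>,
    doc_len: usize,
    avg_doc_len: f32,
    k1: f32,
    b: f32,
    idf: impl Fn(&str) -> f32,
) -> f32 {
    // Length normalisation; assumption: skip it when the corpus is empty.
    let norm = if avg_doc_len > 0.0 {
        1.0 - b + b * doc_len as f32 / avg_doc_len
    } else {
        1.0
    };
    query_terms
        .iter()
        .map(|term| {
            let tf = doc_tf.get(term).copied().unwrap_or(0) as f32;
            if tf == 0.0 {
                // Query terms absent from the document add nothing.
                return 0.0;
            }
            idf(term.as_str()) * tf * (k1 + 1.0) / (tf + k1 * norm)
        })
        .sum()
}
```

Passing the term-frequency map in directly means the document's tokens are counted once, at indexing time, instead of on every query.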
pub fn score_with_tf_u32( &self, query_terms: &[String], doc_tf: &HashMap<String, u32>, doc_len: usize, ) -> f32
Score a document directly from a u32-valued term-frequency map.
Uses the same math as score_with_tf, but lets callers whose postings
already store u32 frequencies (such as the inverted index) score
without cloning the whole term_freqs map into a usize-valued copy on
every query.
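The cost being avoided can be illustrated with two hypothetical helpers (neither is part of the crate): `widen_tf` performs the per-query copy that a usize-only API would force, while `tf_lookup` reads the u32 count in place and widens only the values a query actually touches.

```rust
use std::collections::HashMap;

/// The copy a usize-only API would force: clones every key and
/// rebuilds the whole map, once per query.
pub fn widen_tf(doc_tf: &HashMap<String, u32>) -> HashMap<String, usize> {
    doc_tf
        .iter()
        .map(|(term, &count)| (term.clone(), count as usize))
        .collect()
}

/// Direct lookup: no allocation, converts a single count on demand.
pub fn tf_lookup(doc_tf: &HashMap<String, u32>, term: &str) -> f32 {
    doc_tf.get(term).copied().unwrap_or(0) as f32
}
```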
pub fn add_document<I, T>(&mut self, tokens: I)
Update stats when adding a document
pub fn remove_document<'a, I>(&mut self, unique_terms: I, doc_len: usize)
where
    I: IntoIterator<Item = &'a str>,
Update stats when removing a document.
Inverse of add_document: pass the document’s
unique terms and its token length. Keeps num_docs, total_len, and
doc_freqs consistent so IDF and avgdl never drift under deletion, and
drops terms whose document frequency reaches zero (no vocabulary leak).
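The add/remove symmetry described above could be sketched like this. `Stats` is a hypothetical stand-in, and its `add_document` takes `&str` tokens for simplicity, which may differ from the crate's generic bounds. The field names `num_docs`, `total_len`, and `doc_freqs` follow the documentation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the scorer's bookkeeping.
#[derive(Default)]
pub struct Stats {
    pub num_docs: usize,
    pub total_len: usize,
    pub doc_freqs: HashMap<String, usize>,
}

impl Stats {
    /// Counts every token toward length, but each distinct term once toward df.
    pub fn add_document<'a, I: IntoIterator<Item = &'a str>>(&mut self, tokens: I) {
        let mut seen = HashSet::new();
        let mut len = 0;
        for tok in tokens {
            len += 1;
            if seen.insert(tok) {
                *self.doc_freqs.entry(tok.to_string()).or_insert(0) += 1;
            }
        }
        self.num_docs += 1;
        self.total_len += len;
    }

    /// Inverse of `add_document`: the caller supplies the unique terms and length.
    pub fn remove_document<'a, I: IntoIterator<Item = &'a str>>(
        &mut self,
        unique_terms: I,
        doc_len: usize,
    ) {
        for term in unique_terms {
            let now_zero = match self.doc_freqs.get_mut(term) {
                Some(df) => {
                    *df = df.saturating_sub(1);
                    *df == 0
                }
                None => false,
            };
            if now_zero {
                // Drop the entry so the vocabulary does not grow without bound.
                self.doc_freqs.remove(term);
            }
        }
        self.num_docs = self.num_docs.saturating_sub(1);
        self.total_len = self.total_len.saturating_sub(doc_len);
    }
}
```

In this sketch the caller must pass each term once; passing a term twice would decrement its document frequency twice.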
Auto Trait Implementations
impl Freeze for BM25Scorer
impl RefUnwindSafe for BM25Scorer
impl Send for BM25Scorer
impl Sync for BM25Scorer
impl Unpin for BM25Scorer
impl UnsafeUnpin for BM25Scorer
impl UnwindSafe for BM25Scorer
Blanket Implementations
impl<T> Allocation for T
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.