pub struct WandData {
    pub total_docs: u64,
    pub total_tokens: u64,
    pub avg_doc_len: f32,
    pub bm25_k1: f32,
    pub bm25_b: f32,
    /* private fields */
}
Collection-level WAND data
Contains pre-computed statistics needed for efficient WAND query processing.
This data is typically computed offline using hermes-tool term-stats and
loaded at index open time.
Fields§
total_docs: u64
Total number of documents in the collection
total_tokens: u64
Total number of tokens across all documents
avg_doc_len: f32
Average document length (tokens per document)
bm25_k1: f32
BM25 k1 parameter used for computing upper bounds
bm25_b: f32
BM25 b parameter used for computing upper bounds
Implementations§
impl WandData
pub fn from_json_file<P: AsRef<Path>>(path: P) -> Result<Self>
Load WAND data from a JSON file
pub fn from_json_reader<R: Read>(reader: R) -> Result<Self>
Load WAND data from a JSON reader
pub fn from_json_bytes(bytes: &[u8]) -> Result<Self>
Load WAND data from JSON bytes
pub fn to_json_file<P: AsRef<Path>>(&self, path: P) -> Result<()>
Save WAND data to a JSON file
pub fn to_json_writer<W: Write>(&self, writer: W) -> Result<()>
Write WAND data to a JSON writer
pub fn get_idf(&self, field: &str, term: &str) -> Option<f32>
Get IDF for a term in a field
Returns None if the term is not in the pre-computed data. In that case, compute IDF on the fly from the segment’s document count and the term’s document frequency.
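The fallback described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `IdfTable`, its `by_field_term` map, and `idf_with_fallback` are hypothetical stand-ins (the real `WandData` keeps its term data in private fields), and the natural logarithm is assumed for the BM25 IDF formula.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the pre-computed table; the real `WandData`
/// keeps this in private fields whose layout is not documented here.
struct IdfTable {
    by_field_term: HashMap<(String, String), f32>,
}

impl IdfTable {
    /// Same contract as the documented `get_idf`: `None` for unknown terms.
    fn get_idf(&self, field: &str, term: &str) -> Option<f32> {
        self.by_field_term
            .get(&(field.to_owned(), term.to_owned()))
            .copied()
    }
}

/// Use the pre-computed IDF when present; otherwise derive it from the
/// segment's document count and the term's document frequency.
fn idf_with_fallback(
    table: &IdfTable,
    field: &str,
    term: &str,
    segment_docs: u64,
    segment_df: u32,
) -> f32 {
    table.get_idf(field, term).unwrap_or_else(|| {
        let n = segment_docs as f32;
        let df = segment_df as f32;
        // BM25 IDF formula quoted for `compute_idf` on this page
        // (natural log assumed).
        ((n - df + 0.5) / (df + 0.5)).ln()
    })
}
```

With the real type, the same shape would presumably be a `get_idf(..)` call followed by `unwrap_or_else` around a segment-level computation.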
pub fn get_term_info(&self, field: &str, term: &str) -> Option<&TermWandInfo>
Get full term info for a term in a field
pub fn get_upper_bound(&self, field: &str, term: &str) -> Option<f32>
Get upper bound score for a term
pub fn compute_idf(&self, df: u32) -> f32
Compute IDF for a term given its document frequency
Uses the BM25 IDF formula: log((N - df + 0.5) / (df + 0.5)), where N is the total number of documents and df is the term’s document frequency.
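As a rough sketch of the formula, the standalone function below takes the document count explicitly instead of reading it from `self`. The name `bm25_idf` is illustrative, and using the natural logarithm is an assumption, since the page only says "log".

```rust
/// BM25 IDF: ln((N - df + 0.5) / (df + 0.5)).
/// `total_docs` plays the role of N; the natural log is an assumption.
fn bm25_idf(total_docs: u64, df: u32) -> f32 {
    let n = total_docs as f32;
    let df = df as f32;
    ((n - df + 0.5) / (df + 0.5)).ln()
}
```

Note that this form of the formula is zero when a term appears in exactly half of the documents and negative when it appears in more than half, so callers that need a non-negative weight may have to clamp the result.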
pub fn compute_upper_bound(&self, max_tf: u32, idf: f32) -> f32
Compute upper bound score for a term given max_tf and IDF
Uses conservative length normalization (assumes shortest possible document)
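One way such a bound could be computed is sketched below, using the standard BM25 term-score formula. This is an assumption about the approach, not the crate's code: `bm25_score`, `bm25_upper_bound`, and the `min_len_ratio` parameter (shortest document length divided by the average length) are hypothetical names, and `k1` and `b` are passed explicitly rather than read from `bm25_k1` and `bm25_b`.

```rust
/// BM25 term score for term frequency `tf` in a document whose length is
/// `len_ratio` times the average document length.
fn bm25_score(tf: u32, idf: f32, k1: f32, b: f32, len_ratio: f32) -> f32 {
    if tf == 0 {
        return 0.0;
    }
    let tf = tf as f32;
    // Length normalization: smaller for shorter documents.
    let norm = 1.0 - b + b * len_ratio;
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}

/// Conservative upper bound: evaluate the score at `max_tf` and at the
/// smallest length ratio any document can have. `min_len_ratio` is a
/// hypothetical parameter; 0.0 is the most pessimistic choice.
/// Assumes `idf >= 0`; a negative IDF would invert the bound.
fn bm25_upper_bound(max_tf: u32, idf: f32, k1: f32, b: f32, min_len_ratio: f32) -> f32 {
    bm25_score(max_tf, idf, k1, b, min_len_ratio)
}
```

Because the score grows with term frequency and shrinks with document length, evaluating it at the largest term frequency and the shortest length yields a value that no real document should exceed, which is the property WAND pruning relies on.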
Trait Implementations§
impl<'de> Deserialize<'de> for WandData
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations§
impl Freeze for WandData
impl RefUnwindSafe for WandData
impl Send for WandData
impl Sync for WandData
impl Unpin for WandData
impl UnwindSafe for WandData
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.