Module ranking

Structs

PrecomputedBm25Params
Parameters for BM25 calculation with precomputed IDF values.
RankingParams
Parameters for document ranking.
TfDfResult
Represents the result of term frequency and document frequency computation.

Functions

compute_avgdl
Computes the average document length.
compute_tf_df_from_tokenized
Computes term frequencies (TF) for each document, document frequencies (DF) for each term, and document lengths from pre-tokenized content.
extract_query_terms
Extracts unique terms from a query expression.
get_stemmer
Returns a reference to the global stemmer instance.
precompute_idfs
Precomputes IDF values for a set of terms.
preprocess_text_with_filename
Preprocesses text together with its filename for search by tokenizing and removing duplicates. Used for filename matching: the filename and its directory structure are added to the tokens.
rank_documents
score_expr_bm25_optimized
Recursively computes a document's "ES-like BM25 bool query" score from the AST using precomputed IDF values.
tokenize
Tokenizes text into lowercase words by splitting on whitespace and non-alphanumeric characters, removes stop words, and applies stemming. Also splits camelCase/PascalCase identifiers.
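
The functions above describe a typical BM25 pipeline: count term and document frequencies from tokenized documents, compute the average document length, precompute IDF values, then score each document. The following is a minimal sketch of that flow, not this module's implementation. The names tf_df, avgdl, idf, and bm25_term, their signatures, and the parameter values k1 and b are all hypothetical; the IDF formula is the Lucene-style smoothed variant, which is an assumption about what "ES-like" means here.

```rust
use std::collections::HashMap;

/// Hypothetical helper: returns per-document term frequencies, per-term
/// document frequencies, and document lengths for pre-tokenized documents.
fn tf_df(
    docs: &[Vec<String>],
) -> (Vec<HashMap<String, usize>>, HashMap<String, usize>, Vec<usize>) {
    let mut tfs = Vec::with_capacity(docs.len());
    let mut dfs: HashMap<String, usize> = HashMap::new();
    let mut lengths = Vec::with_capacity(docs.len());
    for tokens in docs {
        // Term frequency: how often each term occurs in this document.
        let mut tf: HashMap<String, usize> = HashMap::new();
        for t in tokens {
            *tf.entry(t.clone()).or_insert(0) += 1;
        }
        // Document frequency: each distinct term counts once per document.
        for term in tf.keys() {
            *dfs.entry(term.clone()).or_insert(0) += 1;
        }
        lengths.push(tokens.len());
        tfs.push(tf);
    }
    (tfs, dfs, lengths)
}

/// Hypothetical helper: average document length, 0.0 for an empty corpus.
fn avgdl(lengths: &[usize]) -> f64 {
    if lengths.is_empty() {
        return 0.0;
    }
    lengths.iter().sum::<usize>() as f64 / lengths.len() as f64
}

/// Assumed IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
fn idf(n_docs: usize, df: usize) -> f64 {
    let n = n_docs as f64;
    let df = df as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// Textbook BM25 contribution of a single term to a single document.
fn bm25_term(tf: usize, idf: f64, doc_len: usize, avgdl: f64, k1: f64, b: f64) -> f64 {
    if tf == 0 || avgdl == 0.0 {
        return 0.0;
    }
    let tf = tf as f64;
    // Length normalization: longer-than-average documents are penalized.
    let norm = 1.0 - b + b * (doc_len as f64 / avgdl);
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}
```

Because the IDF depends only on the corpus and the term, it can be computed once per query term and reused for every document, which is presumably the motivation for precomputing it.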

Type Aliases

QueryTokenMap
Maps unique query tokens (Strings) to unique u8 indices.
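
As an illustration of what such a mapping could look like, the sketch below assigns each distinct token the next free u8 index in first-seen order. The alias name TokenIndexMap, the HashMap<String, u8> representation, and the build_token_index function are assumptions made for this example; the actual definition of QueryTokenMap is not shown on this page.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Assumed shape of the alias, inferred from its one-line description.
type TokenIndexMap = HashMap<String, u8>;

/// Hypothetical builder: assigns indices 0, 1, 2, ... to distinct tokens in
/// first-seen order. Returns None if more than 256 distinct tokens are given,
/// because the next index would no longer fit in a u8.
fn build_token_index(tokens: &[&str]) -> Option<TokenIndexMap> {
    let mut map = TokenIndexMap::new();
    for &t in tokens {
        if map.contains_key(t) {
            continue; // Duplicate token: keep its existing index.
        }
        let idx = u8::try_from(map.len()).ok()?;
        map.insert(t.to_string(), idx);
    }
    Some(map)
}
```

A u8 index caps a query at 256 distinct tokens, which suggests the indices are meant for compact per-document bookkeeping such as small arrays or bitsets, though that is a guess.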