Module ranking

Structs§

PrecomputedBm25Params: Parameters for BM25 calculation with precomputed IDF values
RankingParams: Parameters for document ranking
TfDfResult: Represents the result of term frequency and document frequency computation

Functions§

compute_avgdl: Computes the average document length.
compute_tf_df_from_tokenized: Computes term frequencies (TF) for each document, document frequencies (DF) for each term, and document lengths from pre-tokenized content.
extract_query_terms: Extracts unique terms from a query expression
get_stemmer: Returns a reference to the global stemmer instance
precompute_idfs: Precomputes IDF values for a set of terms
preprocess_text_with_filename: Preprocesses text together with its filename for search by tokenizing and removing duplicates. This is used for filename matching: it adds the filename and its directory structure to the tokens.
rank_documents
score_expr_bm25_optimized: Recursively computes a document’s “ES-like BM25 bool query” score from the AST using precomputed IDF values.
tokenize: Tokenizes text into lowercase words by splitting on whitespace and non-alphanumeric characters, removes stop words, and applies stemming. Also splits camelCase/PascalCase identifiers.
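
The summaries above describe a pipeline: count TF and DF from pre-tokenized documents, average the document lengths, precompute IDF per term, then score. The listing does not show the real signatures or formulas, so the following is only a minimal sketch of how those steps might fit together. Every name ending in `_sketch`, the `TfDfSketch` alias, the Lucene-style IDF formula, and the `k1`/`b` parameters are assumptions made for illustration, not this module's API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a TF/DF result:
/// (per-document term frequencies, document frequencies, document lengths).
type TfDfSketch = (Vec<HashMap<String, usize>>, HashMap<String, usize>, Vec<usize>);

/// Counts TF per document, DF per term, and the token length of each document.
fn tf_df_sketch(docs: &[Vec<String>]) -> TfDfSketch {
    let mut tfs = Vec::with_capacity(docs.len());
    let mut df: HashMap<String, usize> = HashMap::new();
    let mut lengths = Vec::with_capacity(docs.len());
    for tokens in docs {
        let mut tf: HashMap<String, usize> = HashMap::new();
        for token in tokens {
            *tf.entry(token.clone()).or_insert(0) += 1;
        }
        // Each distinct term in a document counts once toward its DF.
        for term in tf.keys() {
            *df.entry(term.clone()).or_insert(0) += 1;
        }
        lengths.push(tokens.len());
        tfs.push(tf);
    }
    (tfs, df, lengths)
}

/// Average document length; returns 0.0 for an empty corpus.
fn avgdl_sketch(lengths: &[usize]) -> f64 {
    if lengths.is_empty() {
        return 0.0;
    }
    lengths.iter().sum::<usize>() as f64 / lengths.len() as f64
}

/// Assumed Lucene-style IDF: ln(1 + (N - df + 0.5) / (df + 0.5)).
fn idf_sketch(df: usize, n_docs: usize) -> f64 {
    let (df, n) = (df as f64, n_docs as f64);
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of a single term in a single document.
/// `k1` and `b` are the usual BM25 tuning parameters (assumed here).
fn bm25_term_sketch(tf: usize, idf: f64, doc_len: usize, avgdl: f64, k1: f64, b: f64) -> f64 {
    let tf = tf as f64;
    // Skip length normalization when the average length is zero.
    let ratio = if avgdl > 0.0 { doc_len as f64 / avgdl } else { 1.0 };
    let norm = 1.0 - b + b * ratio;
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}
```

A caller would run `tf_df_sketch` once per corpus, feed the lengths to `avgdl_sketch`, compute `idf_sketch` once per query term, and sum `bm25_term_sketch` over the query terms present in each document. Computing IDF up front, as `precompute_idfs` suggests, keeps the per-document scoring loop free of repeated logarithms.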