Crate rankops


Operations on ranked lists: fuse multiple retrievers, then rerank.

Pairs with rankfns (scoring kernels). Combine results from multiple retrievers (BM25, dense, sparse) and rerank with MaxSim (ColBERT), diversity (MMR/DPP), or Matryoshka.

use rankops::rrf;

let bm25 = vec![("d1", 12.5), ("d2", 11.0)];
let dense = vec![("d2", 0.9), ("d3", 0.8)];
let fused = rrf(&bm25, &dense);
// d2 ranks highest (appears in both lists)
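To see why d2 comes out on top, here is a minimal standalone sketch of the reciprocal-rank idea. It assumes 1-based ranks and uses the k = 60 default listed for rrf; the name rrf_sketch is hypothetical and this is not the crate's implementation, only an illustration of the scoring rule.

```rust
use std::collections::HashMap;

/// Hypothetical illustration of reciprocal rank fusion (not the crate's code).
/// Each list contributes 1 / (k + rank) per document, with rank starting at 1
/// (an assumption); the input scores themselves are ignored.
fn rrf_sketch(lists: &[&[(&str, f32)]], k: f32) -> Vec<(String, f32)> {
    let mut scores: HashMap<String, f32> = HashMap::new();
    for list in lists {
        for (position, (id, _score)) in list.iter().enumerate() {
            let rank = (position + 1) as f32;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f32)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With the two lists from the example, d2 collects a contribution from both retrievers (1/62 + 1/61), while d1 and d3 each collect only one, so d2 is ranked first regardless of the raw score scales.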

§Fusion Algorithms

| Function | Uses Scores | Best For |
|---|---|---|
| rrf | No | Incompatible score scales |
| isr | No | When lower ranks matter more |
| combsum | Yes | Similar scales, trust scores |
| combmnz | Yes | Reward overlap between lists |
| borda | No | Simple voting |
| weighted | Yes | Custom retriever weights |
| dbsf | Yes | Different score distributions |
| condorcet | No | Pairwise voting, outlier-robust |
| copeland | No | Net pairwise wins, more discriminative than Condorcet |
| median_rank | No | Median rank across lists, outlier-robust |
| combmax | Yes | At least one retriever likes it |
| combmin | Yes | All retrievers must agree (conservative) |
| combmed | Yes | Median score, robust to outliers |

All have *_multi variants for 3+ lists.
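The score-based rows differ from the rank-based ones in that they consume the retriever scores after normalization. The sketch below illustrates CombSUM (sum of min-max normalized scores) and CombMNZ (that sum multiplied by the number of lists containing the document). All function names are hypothetical, and mapping a constant-score list to 1.0 is an assumption of this sketch; the crate's own handling may differ.

```rust
use std::collections::HashMap;

/// Min-max normalize one list into [0, 1].
/// Assumption: a list whose scores are all equal maps every score to 1.0.
fn min_max(list: &[(&str, f32)]) -> Vec<(String, f32)> {
    let min = list.iter().map(|(_, s)| *s).fold(f32::INFINITY, f32::min);
    let max = list.iter().map(|(_, s)| *s).fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    list.iter()
        .map(|(id, s)| {
            let norm = if range > 0.0 { (*s - min) / range } else { 1.0 };
            (id.to_string(), norm)
        })
        .collect()
}

/// Per document: (sum of normalized scores, number of lists containing it).
fn accumulate(lists: &[&[(&str, f32)]]) -> HashMap<String, (f32, u32)> {
    let mut acc: HashMap<String, (f32, u32)> = HashMap::new();
    for list in lists {
        for (id, norm) in min_max(list) {
            let entry = acc.entry(id).or_insert((0.0, 0));
            entry.0 += norm;
            entry.1 += 1;
        }
    }
    acc
}

/// Sort by descending score, breaking ties by id for determinism.
fn sorted(mut fused: Vec<(String, f32)>) -> Vec<(String, f32)> {
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

/// Hypothetical CombSUM: sum of normalized scores.
fn combsum_sketch(lists: &[&[(&str, f32)]]) -> Vec<(String, f32)> {
    sorted(accumulate(lists).into_iter().map(|(id, (sum, _))| (id, sum)).collect())
}

/// Hypothetical CombMNZ: normalized sum times the overlap count.
fn combmnz_sketch(lists: &[&[(&str, f32)]]) -> Vec<(String, f32)> {
    sorted(
        accumulate(lists)
            .into_iter()
            .map(|(id, (sum, hits))| (id, sum * hits as f32))
            .collect(),
    )
}
```

Because CombMNZ multiplies by the overlap count, a document found by two retrievers has its summed score doubled, which is the "reward overlap" behavior named in the table.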

§Diversity Reranking

| Function | Description |
|---|---|
| mmr | Maximal Marginal Relevance (Carbonell & Goldstein, 1998) |
| mmr_with_matrix | MMR with precomputed similarity matrix |
| mmr_embeddings | MMR with embedding vectors (computes cosine similarity) |

MMR balances relevance and diversity via a tunable λ parameter.
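The trade-off can be sketched as a greedy loop over a precomputed similarity matrix. This follows the standard Carbonell & Goldstein formulation, in which each step picks the candidate maximizing λ · relevance − (1 − λ) · (maximum similarity to anything already selected). The function name and signature are hypothetical and are not taken from the crate; whether the crate weights λ the same way is an assumption.

```rust
/// Hypothetical greedy MMR over a precomputed similarity matrix.
/// `relevance[i]` is the relevance of candidate i, `sim[i][j]` the similarity
/// between candidates i and j. Returns the indices of up to `k` selected items.
fn mmr_sketch(relevance: &[f32], sim: &[Vec<f32>], lambda: f32, k: usize) -> Vec<usize> {
    let mut selected: Vec<usize> = Vec::new();
    let mut remaining: Vec<usize> = (0..relevance.len()).collect();

    while selected.len() < k && !remaining.is_empty() {
        let mut best_pos = 0;
        let mut best_score = f32::NEG_INFINITY;
        for (pos, &cand) in remaining.iter().enumerate() {
            // Redundancy penalty: similarity to the closest already-selected item
            // (zero while nothing has been selected yet).
            let max_sim = if selected.is_empty() {
                0.0
            } else {
                selected
                    .iter()
                    .map(|&s| sim[cand][s])
                    .fold(f32::NEG_INFINITY, f32::max)
            };
            let score = lambda * relevance[cand] - (1.0 - lambda) * max_sim;
            if score > best_score {
                best_score = score;
                best_pos = pos;
            }
        }
        selected.push(remaining.remove(best_pos));
    }
    selected
}
```

Under this formulation, λ = 1.0 reduces to ranking by relevance alone, while smaller values increasingly penalize near-duplicates of items that were already picked.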

§Performance Notes

OpenSearch benchmarks on BEIR show RRF scoring ~3-4% lower NDCG than score-based fusion (CombSUM) while running ~1-2% faster. RRF excels when score scales are incompatible or unknown. See the OpenSearch RRF blog.

Re-exports§

pub use validate::validate;
pub use validate::validate_bounds;
pub use validate::validate_finite_scores;
pub use validate::validate_no_duplicates;
pub use validate::validate_non_negative_scores;
pub use validate::validate_sorted;
pub use validate::ValidationResult;

Modules§

adapt
Adapters for converting retriever outputs (distances, similarities, logits) into rankops input format.
diagnostics
Fusion diagnostics (complementarity, overlap, score distributions): decide whether to fuse, and which method to use.
dp_topk
Differentiable top-k selection via dynamic programming on a smooth semiring.
explain
Explainability module for debugging and analysis.
metrics
Ranking evaluation metrics: MRR, NDCG, Hits@k, Precision@k, Recall@k, and more.
optimize
Optimization module exports.
pipeline
Composable fusion pipeline (normalize, fuse, rerank, evaluate) and multi-query fusion.
prelude
Prelude for common imports.
rerank
Reranking: MaxSim (ColBERT), cosine similarity, diversity (MMR/DPP), Matryoshka, scoring, quantization.
validate
Validation utilities for fusion results.

Structs§

AdditiveMultiTaskConfig
Configuration for additive multi-task fusion.
ConsensusReport
Analyze consensus patterns across retrievers.
Explanation
Explanation of how a fused score was computed.
FusedResult
A fused result with full provenance information for debugging and analysis.
FusionConfig
Shared configuration for Borda, CombSUM, and CombMNZ fusion.
MmrConfig
Maximal Marginal Relevance (MMR) configuration.
OptimizeConfig
Optimization configuration for hyperparameter search.
OptimizedParams
Optimized parameters from hyperparameter search.
RetrieverId
Retriever identifier for explainability.
RetrieverStats
Attribution statistics for each retriever.
RrfConfig
RRF configuration.
SourceContribution
Contribution from a single retriever to a document’s final score.
StandardizedConfig
Configuration for standardization-based fusion.
WeightedConfig
Weighted fusion configuration.

Enums§

FusionError
Errors that can occur during fusion.
FusionMethod
Unified fusion method for dispatching to different algorithms.
Normalization
Score normalization methods.
OptimizeMetric
Metric to optimize during hyperparameter search.
ParamGrid
Parameter grid for optimization.

Functions§

additive_multi_task
Additive multi-task fusion (ResFlow-style).
additive_multi_task_multi
Additive multi-task fusion for 3+ weighted lists.
additive_multi_task_with_config
Additive multi-task fusion with configuration.
analyze_consensus
Analyze consensus across fused results, identifying high-agreement and single-source items.
attribute_top_k
Attribute top-k results to retrievers.
borda
Borda count voting — position-based scoring.
borda_multi
Borda count for 3+ result lists.
borda_with_config
Borda count with configuration.
combanz
CombANZ: average of non-zero scores.
combanz_multi
CombANZ for 3+ result lists.
combmax
CombMAX: maximum score across all lists.
combmax_multi
CombMAX for 3+ result lists.
combmed
CombMED: median score across all lists.
combmed_multi
CombMED for 3+ result lists.
combmin
CombMIN: minimum score across all lists.
combmin_multi
CombMIN for 3+ result lists.
combmnz
Normalized sum × overlap count (CombMNZ).
combmnz_explain
CombMNZ with explainability.
combmnz_multi
CombMNZ for 3+ result lists.
combmnz_with_config
CombMNZ with configuration.
combsum
Sum of min-max normalized scores (CombSUM).
combsum_explain
CombSUM with explainability.
combsum_multi
CombSUM for 3+ result lists.
combsum_with_config
CombSUM with configuration.
condorcet
Condorcet fusion (pairwise comparison voting).
condorcet_multi
Condorcet for 3+ result lists.
copeland
Copeland fusion – pairwise net wins across ranked lists.
copeland_multi
Copeland fusion for 3+ result lists.
dbsf
Distribution-Based Score Fusion (DBSF).
dbsf_explain
DBSF with explainability.
dbsf_multi
DBSF for 3+ result lists.
dbsf_with_config
DBSF with configuration.
evaluate_metric
Evaluate a ranked list using the specified metric.
hit_rate
Hit Rate (Success@k).
isr
Inverse Square Root rank fusion with default config (k=1).
isr_multi
ISR for 3+ result lists.
isr_with_config
ISR with custom configuration.
map
Mean Average Precision (MAP).
map_at_k
Mean Average Precision at k (MAP@k).
median_rank
Median Rank Aggregation.
median_rank_multi
Median Rank Aggregation for 3+ result lists.
mmr
Maximal Marginal Relevance reranking.
mmr_embeddings
MMR for embedding-based retrieval.
mmr_with_matrix
MMR with precomputed similarity matrix.
mrr
Mean Reciprocal Rank.
ndcg_at_k
Normalized Discounted Cumulative Gain at k.
normalize_scores
Normalize a list of scores using the specified method.
optimize_fusion
Optimize fusion hyperparameters using grid search.
precision_at_k
Precision at k.
rbc
Rank-Biased Centroids (RBC) fusion.
rbc_multi
RBC for 3+ result lists with custom persistence.
recall_at_k
Recall at k.
rrf
Reciprocal Rank Fusion (RRF) with default configuration (k=60).
rrf_explain
RRF with explainability: returns full provenance for each result.
rrf_multi
RRF for 3+ result lists.
rrf_weighted
Weighted RRF: per-retriever weights applied to rank-based scores.
rrf_with_config
RRF with custom configuration.
standardized
Standardization-based fusion (ERANK-style).
standardized_multi
Standardized fusion for 3+ result lists.
standardized_with_config
Standardized fusion with configuration.
weighted
Weighted score fusion with configurable retriever trust.
weighted_multi
Weighted fusion for 3+ result lists.
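The rank-based functions above (rrf, isr, borda) differ mainly in how a rank is turned into a score contribution. The sketch below compares the three mappings. The formulas are the commonly cited ones (1 / (k + rank), 1 / sqrt(k + rank), and list length minus position) and are assumptions about what the crate computes; all names are hypothetical.

```rust
/// Reciprocal-rank weight in the style of RRF: 1 / (k + rank), rank starting at 1.
fn rrf_weight(rank: usize, k: f32) -> f32 {
    1.0 / (k + rank as f32)
}

/// Inverse-square-root weight in the style of ISR: 1 / sqrt(k + rank).
fn isr_weight(rank: usize, k: f32) -> f32 {
    1.0 / (k + rank as f32).sqrt()
}

/// Borda points under one common convention (an assumption):
/// the item at 0-based `position` in a list of `list_len` items gets
/// `list_len - position` points.
fn borda_points(position: usize, list_len: usize) -> f32 {
    list_len.saturating_sub(position) as f32
}

/// Share of the rank-1 weight that is still left at `rank`.
/// Values closer to 1.0 mean lower-ranked items keep more influence.
fn retention(weight: impl Fn(usize) -> f32, rank: usize) -> f32 {
    weight(rank) / weight(1)
}
```

For the same k, the square-root form decays more slowly than the reciprocal form, so items further down a list retain more of their influence, which matches the "when lower ranks matter more" guidance given for isr.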

Type Aliases§

Qrels
Relevance judgments (qrels) for a query.
Result
Result type for fusion operations.
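As a rough illustration of how relevance judgments feed the evaluation metrics listed under Functions, the sketch below computes precision, recall, reciprocal rank, and NDCG for a single query. The QrelsSketch alias (document id mapped to a graded relevance) is a hypothetical stand-in, not the crate's Qrels definition, and the linear-gain NDCG variant is an assumption; the crate may use a different gain function.

```rust
use std::collections::HashMap;

/// Hypothetical qrels shape: document id -> graded relevance (0 = not relevant).
type QrelsSketch = HashMap<String, u32>;

/// Relevance grade of `id`, treating unjudged documents as not relevant.
fn rel(qrels: &QrelsSketch, id: &str) -> u32 {
    qrels.get(id).copied().unwrap_or(0)
}

/// Fraction of the top-k results that are relevant.
fn precision_at_k_sketch(ranked: &[&str], qrels: &QrelsSketch, k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let hits = ranked.iter().take(k).filter(|id| rel(qrels, id) > 0).count();
    hits as f64 / k as f64
}

/// Fraction of all relevant documents that appear in the top-k results.
fn recall_at_k_sketch(ranked: &[&str], qrels: &QrelsSketch, k: usize) -> f64 {
    let total = qrels.values().filter(|&&grade| grade > 0).count();
    if total == 0 {
        return 0.0;
    }
    let hits = ranked.iter().take(k).filter(|id| rel(qrels, id) > 0).count();
    hits as f64 / total as f64
}

/// Reciprocal rank of the first relevant result for one query
/// (MRR is the mean of this value over queries).
fn reciprocal_rank_sketch(ranked: &[&str], qrels: &QrelsSketch) -> f64 {
    ranked
        .iter()
        .position(|id| rel(qrels, id) > 0)
        .map_or(0.0, |pos| 1.0 / (pos + 1) as f64)
}

/// Discounted cumulative gain with linear gain: grade / log2(position + 1).
fn dcg(gains: impl Iterator<Item = u32>) -> f64 {
    gains
        .enumerate()
        .map(|(i, grade)| f64::from(grade) / ((i + 2) as f64).log2())
        .sum()
}

/// NDCG@k: DCG of the ranking divided by the DCG of an ideal ordering.
fn ndcg_at_k_sketch(ranked: &[&str], qrels: &QrelsSketch, k: usize) -> f64 {
    let actual = dcg(ranked.iter().take(k).map(|id| rel(qrels, id)));
    let mut ideal_gains: Vec<u32> = qrels.values().copied().collect();
    ideal_gains.sort_unstable_by(|a, b| b.cmp(a));
    let ideal = dcg(ideal_gains.into_iter().take(k));
    if ideal > 0.0 { actual / ideal } else { 0.0 }
}
```

Each function takes a ranked list of ids, so the output of any fusion method can be scored after dropping its fused scores.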