pub fn rrf<I: Clone + Eq + Hash>(
results_a: &[(I, f32)],
results_b: &[(I, f32)],
) -> Vec<(I, f32)>Expand description
Reciprocal Rank Fusion of two result lists with default config (k=60).
Formula: score(d) = Σ 1/(k + rank) where rank is 0-indexed.
Why RRF? Different retrievers use incompatible score scales (BM25: 0-100, dense: 0-1). RRF solves this by ignoring scores entirely and using only rank positions. The reciprocal formula ensures:
- Top positions dominate (rank 0 gets 1/60 = 0.017, rank 5 gets 1/65 = 0.015)
- Multiple list agreement is rewarded (documents appearing in both lists score higher)
- No normalization needed (works with any score distribution)
When to use: Hybrid search with incompatible score scales, zero-configuration needs. When NOT to use: When score scales are compatible, CombSUM achieves ~3-4% better NDCG.
Use rrf_with_config to customize the k parameter (lower k = more top-heavy).
§Duplicate Document IDs
If a document ID appears multiple times in the same list, all occurrences contribute
to the RRF score based on their respective ranks. For example, if “d1” appears at
rank 0 and rank 5 in list A, its contribution from list A is 1/(k+0) + 1/(k+5).
This differs from some implementations that take only the first occurrence.
§Complexity
O(n log n) where n = |a| + |b| (dominated by final sort).
§Input Validation
This function does not validate inputs. For validation, use validate()
after fusion. Edge cases handled:
- Empty lists: Returns items from non-empty list(s)
- k=0: Returns empty Vec (use
validate()to catch this) - Non-finite scores: Ignored (RRF is rank-based)
§Example
use rankops::rrf;
let sparse = vec![("d1", 0.9), ("d2", 0.5)];
let dense = vec![("d2", 0.8), ("d3", 0.3)];
let fused = rrf(&sparse, &dense);
assert_eq!(fused[0].0, "d2"); // appears in both lists (consensus)Reciprocal Rank Fusion (RRF) with default configuration (k=60).
RRF is the recommended fusion method when combining rankings with incompatible score scales. It uses only rank positions, ignoring score magnitudes entirely.
§Formula
RRF(d) = Σ 1/(k + rank_r(d)) where:
k= smoothing constant (default: 60)rank_r(d)= position of document d in ranking r (0-indexed)
§Why RRF?
- Score Scale Independent: Works with any scoring system (BM25: 0-100, embeddings: 0-1)
- Robust: Handles missing documents gracefully (documents not in a list contribute 0)
- Effective: Proven to outperform individual rankers in hybrid search
- Fast: O(n log n) complexity where n = total unique documents
§Arguments
results_a- First ranked list:Vec<(document_id, score)>results_b- Second ranked list:Vec<(document_id, score)>
Note: Scores are ignored; only rank positions matter.
§Returns
Fused ranking sorted by RRF score (descending). Documents appearing in both lists rank higher than those appearing in only one list.
§Example
use rankops::rrf;
// BM25 results (high scores = better)
let bm25 = vec![
("doc1", 12.5),
("doc2", 11.0),
("doc3", 10.0),
];
// Dense embedding results (different scale: 0-1)
let dense = vec![
("doc2", 0.9),
("doc3", 0.8),
("doc1", 0.7),
];
// RRF ignores scores, uses only rank positions
let fused = rrf(&bm25, &dense);
// doc2 ranks highest (appears in both lists at high positions)
// doc1 and doc3 follow (appear in both lists but at different positions)§Performance
Time complexity: O(n log n) where n = |results_a| + |results_b| (dominated by final sort). For typical workloads (100-1000 items per list), fusion completes in <1ms.
§When to Use
- ✅ Combining BM25 + dense embeddings (different scales)
- ✅ Combining sparse + dense retrieval
- ✅ Unknown or incompatible score scales
- ✅ Need robust fusion without normalization tuning
§When NOT to Use
- ❌ Scores are already normalized and comparable (use
combsumorweighted) - ❌ You trust score magnitudes (use score-based fusion)
- ❌ Need fine-grained control over retriever importance (use
rrf_weighted)