pub fn rrf_fusion(
dense_results: &[(u64, f32)],
sparse_results: &[(u64, f32)],
k: u32,
top_n: usize,
) -> Vec<FusionResult>Expand description
Reciprocal Rank Fusion (RRF) algorithm.
Combines two ranked lists into a single ranking using the formula:
score(d) = sum(1 / (k + rank_i(d))) for each list i.
§Algorithm
- Build a map of document IDs to their ranks in each list
- For each unique document, compute RRF score
- Sort by descending RRF score
- Return top N results
§Arguments
dense_results- Results from dense search as (id, score) tuples, ordered by descending score. Scores are not used; only rank matters.sparse_results- Results from sparse search as (id, score) tuples, ordered by descending score. Scores are not used; only rank matters.k- RRF smoothing parameter (default 60). Higher values give more weight to documents that appear lower in the ranking.top_n- Number of results to return.
§Returns
Vec of FusionResult containing fused results sorted by descending
RRF score. Includes rank information from both lists.
§Performance
- Time: O(d + s + u log u) where d = |dense|, s = |sparse|, u = unique IDs
- Space: O(u) for the score map
§Reference
Cormack, G.V., Clarke, C.L.A., and Buettcher, S. (2009). “Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods” SIGIR 2009.
§Example
use edgevec::hybrid::rrf_fusion;
// Dense results (ordered by relevance)
let dense = vec![(1, 0.95), (2, 0.80), (3, 0.75)];
// Sparse results (ordered by BM25 score)
let sparse = vec![(2, 5.5), (4, 4.2), (1, 3.8)];
// Fuse with k=60
let results = rrf_fusion(&dense, &sparse, 60, 10);
// ID 2 appears in both lists (rank 2 + rank 1), should score high
// ID 1 appears in both lists (rank 1 + rank 3)
// ID 3 only in dense (rank 3)
// ID 4 only in sparse (rank 2)
assert_eq!(results[0].id, 2); // Best combined rank