Skip to main content

rrf_fusion

Function rrf_fusion 

Source
pub fn rrf_fusion(
    dense_results: &[(u64, f32)],
    sparse_results: &[(u64, f32)],
    k: u32,
    top_n: usize,
) -> Vec<FusionResult>
Expand description

Reciprocal Rank Fusion (RRF) algorithm.

Combines two ranked lists into a single ranking using the formula: score(d) = sum(1 / (k + rank_i(d))) for each list i.

§Algorithm

  1. Build a map of document IDs to their ranks in each list
  2. For each unique document, compute RRF score
  3. Sort by descending RRF score
  4. Return top N results

§Arguments

  • dense_results - Results from dense search as (id, score) tuples, ordered by descending score. Scores are not used; only rank matters.
  • sparse_results - Results from sparse search as (id, score) tuples, ordered by descending score. Scores are not used; only rank matters.
  • k - RRF smoothing parameter (default 60). Higher values give more weight to documents that appear lower in the ranking.
  • top_n - Number of results to return.

§Returns

Vec of FusionResult containing fused results sorted by descending RRF score. Includes rank information from both lists.

§Performance

  • Time: O(d + s + u log u) where d = |dense|, s = |sparse|, u = unique IDs
  • Space: O(u) for the score map

§Reference

Cormack, G.V., Clarke, C.L.A., and Buettcher, S. (2009). “Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods” SIGIR 2009.

§Example

use edgevec::hybrid::rrf_fusion;

// Dense results (ordered by relevance)
let dense = vec![(1, 0.95), (2, 0.80), (3, 0.75)];

// Sparse results (ordered by BM25 score)
let sparse = vec![(2, 5.5), (4, 4.2), (1, 3.8)];

// Fuse with k=60
let results = rrf_fusion(&dense, &sparse, 60, 10);

// ID 2 appears in both lists (rank 2 + rank 1), should score high
// ID 1 appears in both lists (rank 1 + rank 3)
// ID 3 only in dense (rank 3)
// ID 4 only in sparse (rank 2)
assert_eq!(results[0].id, 2); // Best combined rank