Module search_reranking


Post-RRF Reranking Pipeline for code-aware search.

Scientific foundations:

  • Cormack et al. (SIGIR 2009): RRF as unsupervised fusion baseline
  • Carbonell & Goldstein (SIGIR 1998): MMR diversity via file-saturation decay
  • CoRNStack (ICLR 2025): Definition-boost + noise filtering for code
  • SACL (EMNLP 2025): Query-type-adaptive weighting + path enrichment
  • SweRank (2025): Multi-stage retrieve-then-rerank for code localization
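For context, the fusion step that this pipeline runs after can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion as described by Cormack et al., where each document scores the sum of 1 / (k + rank) over the input rankings. The function name `rrf_fuse` and its signature are assumptions made for this example and are not part of the module's API.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
/// `k` is the smoothing constant; Cormack et al. use k = 60.
/// (Hypothetical helper, not a function exported by this module.)
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
            let rank = i as f64 + 1.0;
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by id for deterministic output.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Passing one BM25 ranking and one dense ranking to `rrf_fuse` yields a single fused list, which is the kind of input the reranking stages would then adjust.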

Pipeline order (applied after RRF fusion):

  1. Definition Boost — chunks defining the queried symbol rank higher
  2. File Coherence — files with multiple relevant chunks get boosted
  3. Noise Penalties — test/legacy/compat paths get penalized
  4. MMR Diversity — exponential decay per file prevents single-file dominance
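The four stages above can be sketched roughly as follows. This is a simplified illustration, not the module's implementation: the `Hit` struct, the function `rerank_sketch`, every constant, and the substring-based path matching are assumptions chosen for the example. The diversity step uses a plain per-file exponential decay, as the stage description suggests, rather than a full MMR computation.

```rust
use std::collections::HashMap;

/// Hypothetical fused search hit; the module's real chunk type is not shown here.
#[derive(Debug, Clone)]
struct Hit {
    path: String,
    defines: Option<String>, // symbol defined by this chunk, if any
    score: f64,              // RRF score coming into the pipeline
}

// Illustrative constants only; the module's actual values are not documented here.
const DEFINITION_BOOST: f64 = 1.5;
const COHERENCE_BOOST: f64 = 1.1;
const NOISE_PENALTY: f64 = 0.5;
const FILE_DECAY: f64 = 0.7;

fn rerank_sketch(mut hits: Vec<Hit>, queried_symbol: &str) -> Vec<Hit> {
    // 1. Definition boost: chunks that define the queried symbol rank higher.
    for h in hits.iter_mut() {
        if h.defines.as_deref() == Some(queried_symbol) {
            h.score *= DEFINITION_BOOST;
        }
    }

    // 2. File coherence: files contributing more than one chunk get boosted.
    let mut per_file: HashMap<String, usize> = HashMap::new();
    for h in &hits {
        *per_file.entry(h.path.clone()).or_insert(0) += 1;
    }
    for h in hits.iter_mut() {
        if per_file[&h.path] > 1 {
            h.score *= COHERENCE_BOOST;
        }
    }

    // 3. Noise penalties: test/legacy/compat paths are down-weighted.
    for h in hits.iter_mut() {
        let p = h.path.to_lowercase();
        if p.contains("test") || p.contains("legacy") || p.contains("compat") {
            h.score *= NOISE_PENALTY;
        }
    }

    // 4. Diversity: the n-th hit already seen from a file is scaled by FILE_DECAY^n.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    let mut seen: HashMap<String, i32> = HashMap::new();
    for h in hits.iter_mut() {
        let n = seen.entry(h.path.clone()).or_insert(0);
        h.score *= FILE_DECAY.powi(*n);
        *n += 1;
    }
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits
}
```

Under these assumed constants, a second chunk from an already-ranked file can fall below a single chunk from another file even after the coherence boost, which is the single-file dominance that stage 4 is meant to prevent.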

Enums

QueryType

Functions

classify_query
Classify a search query as Symbol, NL, or Architecture.
rerank_pipeline
Apply the full post-RRF reranking pipeline.
resolve_weights
Resolve the BM25 and dense-retrieval weights from the query type. Returns (bm25_weight, dense_weight).
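
To illustrate how query classification might feed weight resolution, here is a hedged sketch. The real signatures of classify_query and resolve_weights are not shown on this page, so the names `QueryKind`, `classify_sketch`, and `resolve_weights_sketch`, the keyword heuristics, and the weight values are all assumptions for this example. The only detail taken from the documentation is the three-way split (Symbol, NL, Architecture) and the (bm25_weight, dense_weight) return order.

```rust
/// Hypothetical stand-in for the module's QueryType enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QueryKind {
    Symbol,
    NaturalLanguage, // documented as "NL"
    Architecture,
}

/// Classify a query with simple surface heuristics (assumed, not the module's rules).
fn classify_sketch(query: &str) -> QueryKind {
    let q = query.trim();
    let lower = q.to_lowercase();
    // A single token with identifier-like punctuation or inner capitals
    // is treated as a symbol lookup.
    let single_token = !q.is_empty() && !q.contains(char::is_whitespace);
    let looks_like_identifier = q.contains("::")
        || q.contains('_')
        || q.chars().skip(1).any(|c| c.is_uppercase());
    let architecture_keywords = ["architecture", "overview", "design", "data flow"];

    if single_token && looks_like_identifier {
        QueryKind::Symbol
    } else if architecture_keywords.iter().any(|kw| lower.contains(*kw)) {
        QueryKind::Architecture
    } else {
        QueryKind::NaturalLanguage
    }
}

/// Return (bm25_weight, dense_weight). The values are illustrative assumptions:
/// lexical matching is favored for symbols, dense retrieval for prose queries.
fn resolve_weights_sketch(kind: QueryKind) -> (f64, f64) {
    match kind {
        QueryKind::Symbol => (0.7, 0.3),
        QueryKind::NaturalLanguage => (0.3, 0.7),
        QueryKind::Architecture => (0.5, 0.5),
    }
}
```

In this sketch a query such as `rerank_pipeline` is classified as a symbol and weighted toward BM25, while a free-form question is weighted toward the dense retriever.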