Post-RRF Reranking Pipeline for code-aware search.
Scientific foundations:
- Cormack et al. (SIGIR 2009): RRF as unsupervised fusion baseline
- Carbonell & Goldstein (SIGIR 1998): MMR diversity via file-saturation decay
- CoRNStack (ICLR 2025): Definition-boost + noise filtering for code
- SACL (EMNLP 2025): Query-type-adaptive weighting + path enrichment
- SweRank (2025): Multi-stage retrieve-then-rerank for code localization
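Because every stage below runs on RRF-fused scores, a minimal sketch of Reciprocal Rank Fusion may help fix ideas. It uses the standard formula, score(d) = Σ 1 / (k + rank(d)), with ranks starting at 1. The function name `rrf_fuse`, its signature, and the tie-breaking rule are assumptions made for illustration and are not taken from this module.

```rust
use std::collections::HashMap;

/// Hypothetical helper: fuse several ranked lists with Reciprocal Rank Fusion.
/// Each document receives 1 / (k + rank) from every list it appears in,
/// where rank is its 1-based position in that list.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            // `i` is 0-based, so the 1-based rank is i + 1.
            *scores.entry((*doc).to_string()).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by document id (an assumption)
    // so the output order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse(&[bm25_ranking, dense_ranking], 60.0)` would yield one fused list; `k = 60` is the value commonly used with RRF, and whether this crate uses it is not stated here.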
Pipeline order (applied after RRF fusion):
- Definition Boost — chunks defining the queried symbol rank higher
- File Coherence — files with multiple relevant chunks get boosted
- Noise Penalties — test/legacy/compat paths get penalized
- MMR Diversity — exponential decay per file prevents single-file dominance
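The four stages above can be sketched as a single pass over scored chunks. This is an illustrative approximation, not the crate's implementation: the `Chunk` struct, the `rerank_sketch` function, every constant, and the substring-based noise check are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical chunk record; the real pipeline's types are not shown here.
#[derive(Debug, Clone)]
pub struct Chunk {
    pub path: String,
    pub defines: Option<String>, // symbol this chunk defines, if any
    pub score: f64,              // fused RRF score entering the pipeline
}

// Placeholder constants chosen for illustration only.
const DEFINITION_BOOST: f64 = 1.5;
const COHERENCE_BONUS: f64 = 0.1;
const NOISE_PENALTY: f64 = 0.5;
const FILE_DECAY: f64 = 0.7;

pub fn rerank_sketch(mut chunks: Vec<Chunk>, queried_symbol: &str) -> Vec<Chunk> {
    // 1. Definition boost: chunks defining the queried symbol rank higher.
    for c in chunks.iter_mut() {
        if c.defines.as_deref() == Some(queried_symbol) {
            c.score *= DEFINITION_BOOST;
        }
    }
    // 2. File coherence: boost files that contribute several chunks.
    let mut per_file: HashMap<String, usize> = HashMap::new();
    for c in &chunks {
        *per_file.entry(c.path.clone()).or_insert(0) += 1;
    }
    for c in chunks.iter_mut() {
        let extra = (per_file[&c.path] - 1) as f64;
        c.score *= 1.0 + COHERENCE_BONUS * extra;
    }
    // 3. Noise penalties: crude substring check for test/legacy/compat paths.
    for c in chunks.iter_mut() {
        let p = c.path.to_lowercase();
        if ["test", "legacy", "compat"].iter().any(|m| p.contains(m)) {
            c.score *= NOISE_PENALTY;
        }
    }
    // 4. Diversity: the n-th chunk (0-based) already seen from a file is
    //    scaled by FILE_DECAY^n, so one file cannot dominate the top ranks.
    chunks.sort_by(|a, b| b.score.total_cmp(&a.score));
    let mut seen: HashMap<String, i32> = HashMap::new();
    for c in chunks.iter_mut() {
        let n = seen.entry(c.path.clone()).or_insert(0);
        c.score *= FILE_DECAY.powi(*n);
        *n += 1;
    }
    chunks.sort_by(|a, b| b.score.total_cmp(&a.score));
    chunks
}
```

The ordering of the stages matters in this sketch: the decay in step 4 is applied to scores that already include boosts and penalties, so a strongly boosted definition chunk keeps its lead while later chunks from the same file are pushed down.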
Enums§
Functions§
- classify_query - Classify a search query as Symbol, NL, or Architecture.
- rerank_pipeline - Apply the full post-RRF reranking pipeline.
- resolve_weights - Resolve BM25 vs Dense weight based on query type. Returns (bm25_weight, dense_weight).
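To illustrate how query classification could feed weight resolution, here is a hedged sketch. The real signatures of `classify_query` and `resolve_weights` are not shown on this page, so the `QueryKind` enum, the `_sketch` function names, the keyword heuristics, and the weight values are all assumptions; only the three categories and the `(bm25_weight, dense_weight)` return shape come from the descriptions above.

```rust
/// Hypothetical stand-in for the module's query-type enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueryKind {
    Symbol,
    NaturalLanguage,
    Architecture,
}

/// Toy heuristic: a single whitespace-free token is treated as a symbol,
/// a query containing a structural keyword as architecture, and anything
/// else as natural language.
pub fn classify_query_sketch(query: &str) -> QueryKind {
    let q = query.trim();
    if !q.is_empty() && !q.contains(char::is_whitespace) {
        return QueryKind::Symbol;
    }
    let lower = q.to_lowercase();
    const ARCH_WORDS: [&str; 4] = ["architecture", "overview", "design", "structure"];
    if ARCH_WORDS.iter().any(|w| lower.contains(w)) {
        return QueryKind::Architecture;
    }
    QueryKind::NaturalLanguage
}

/// Returns (bm25_weight, dense_weight). The values are placeholders:
/// lexical matching is favored for symbols, dense retrieval for prose.
pub fn resolve_weights_sketch(kind: QueryKind) -> (f64, f64) {
    match kind {
        QueryKind::Symbol => (0.7, 0.3),
        QueryKind::NaturalLanguage => (0.3, 0.7),
        QueryKind::Architecture => (0.5, 0.5),
    }
}
```

Under these assumptions, `resolve_weights_sketch(classify_query_sketch("rerank_pipeline"))` favors BM25, since an exact identifier is more likely to match lexically than semantically.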