Module ranking

Composable ranking layers for search results.

§Why this module exists

Before this refactor, ranking logic was scattered across four call sites with bespoke layer combinations:

| Call site | PageRank | Path penalty | Threshold | TopK |
|---|---|---|---|---|
| CLI run_oneshot (indexed) | | | inside search() | inside search() |
| CLI run_oneshot (stateless) | | | inside search() | inside search() |
| MCP search_code | | | inside search() | inside search() |
| LSP nav/symbols | ✅ (α=0.3 hardcoded) | | inside search() | inside search() |
| RipvecIndex::search | ✅ (optional) | | inside reranker | inside reranker |

Three concrete bugs landed today because of this scatter: (1) PageRank was silently absent from the CLI; (2) PageRank lookups hit zero entries due to a path-rooting mismatch, the same bug present in every call site that used boost_with_pagerank before today’s fix; (3) the path penalty regex matched the corpus-root prefix when invoked from CWD-rooted chunk paths.

The fix: a single RankingLayer trait that each call site composes into a pipeline. Layers are independently testable, the pipeline shape at each call site is explicit, and adding a new ranking signal (e.g., recency, file-saturation diversification) is a single new impl RankingLayer.
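To make the composition idea concrete, here is a minimal sketch of what a trait-based pipeline could look like. The `apply` signature, the `CodeChunk` field, and the `VendoredPenalty`, `KeepFirst`, and `run_pipeline` names are all hypothetical stand-ins chosen for illustration; the crate's real `RankingLayer` and `apply_chain` may be shaped differently.

```rust
/// Stand-in for the crate's chunk metadata (the field is an assumption).
pub struct CodeChunk {
    pub file_path: String,
}

/// Hypothetical trait shape: one method that edits the scored list in place.
pub trait RankingLayer {
    fn apply(&self, results: &mut Vec<(usize, f32)>, chunks: &[CodeChunk]);
}

/// Hypothetical new signal: demote chunks whose path contains `marker`.
pub struct VendoredPenalty {
    pub marker: &'static str,
    pub factor: f32,
}

impl RankingLayer for VendoredPenalty {
    fn apply(&self, results: &mut Vec<(usize, f32)>, chunks: &[CodeChunk]) {
        for (idx, score) in results.iter_mut() {
            if chunks[*idx].file_path.contains(self.marker) {
                *score *= self.factor;
            }
        }
        // Scores changed, so restore descending order for downstream layers.
        results.sort_by(|a, b| b.1.total_cmp(&a.1));
    }
}

/// Hypothetical truncation layer: keeps the first `k` entries of a sorted list.
pub struct KeepFirst(pub usize);

impl RankingLayer for KeepFirst {
    fn apply(&self, results: &mut Vec<(usize, f32)>, _chunks: &[CodeChunk]) {
        results.truncate(self.0);
    }
}

/// Run each layer in order; the slice itself documents the pipeline shape.
pub fn run_pipeline(
    layers: &[Box<dyn RankingLayer>],
    results: &mut Vec<(usize, f32)>,
    chunks: &[CodeChunk],
) {
    for layer in layers {
        layer.apply(results, chunks);
    }
}
```

Under these assumptions a call site would build its own `Vec<Box<dyn RankingLayer>>`, so the set and order of layers is visible in one place instead of being buried inside `search()`.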

§Convention

Layers operate on Vec<(chunk_idx, score)> with a parallel &[CodeChunk] for metadata lookup. Layers MAY:

  • Mutate scores in place (boost / penalty layers).
  • Reorder the vec (sort layers — most boost layers re-sort internally so downstream layers see descending order).
  • Drop entries (threshold / topK layers).

When a layer reorders, it MUST leave the vec sorted descending by score so downstream layers (especially threshold + topK) operate on a meaningful ordering.
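The three permitted operations and the sorted-descending invariant can be sketched with plain functions. This is an illustration only: the `Scored` alias and every function name below are hypothetical, and a slice of path strings stands in for the parallel `&[CodeChunk]`.

```rust
/// (chunk_idx, score) pairs; `chunk_idx` indexes the parallel metadata slice.
type Scored = (usize, f32);

/// Restore the invariant: descending by score.
fn sort_desc(results: &mut [Scored]) {
    results.sort_by(|a, b| b.1.total_cmp(&a.1));
}

/// Boost/penalty-style step: mutate scores in place, then re-sort.
/// `paths` is an assumed stand-in for the chunk metadata lookup.
fn boost_where(results: &mut Vec<Scored>, paths: &[&str], needle: &str, factor: f32) {
    for (idx, score) in results.iter_mut() {
        if paths[*idx].contains(needle) {
            *score *= factor;
        }
    }
    sort_desc(results);
}

/// Threshold-style step: drop low scores; `retain` preserves relative order.
fn drop_below(results: &mut Vec<Scored>, min_score: f32) {
    results.retain(|&(_, score)| score >= min_score);
}

/// TopK-style step: truncate; only meaningful on an already sorted list.
fn keep_top(results: &mut Vec<Scored>, k: usize) {
    results.truncate(k);
}

/// Check the invariant that downstream layers rely on.
fn is_sorted_desc(results: &[Scored]) -> bool {
    results.windows(2).all(|w| w[0].1 >= w[1].1)
}
```

Because `retain` and `truncate` never reorder, only the score-mutating step needs to re-sort, which is why the drop-style steps can safely run last.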

Structs§

CrossEncoderRerank
Cross-encoder rerank layer.
PageRankBoost
Multiplicative PageRank boost using the sigmoid-on-percentile curve from crate::hybrid::pagerank_boost_factor.
PathPenalty
Multiplicative path-shape penalty for test files, examples, etc.
Threshold
Drop items with score below min_score. Preserves ordering.
TopK
Truncate to the top k items. The caller is responsible for ensuring the list is sorted descending by score before this layer runs; most boost layers re-sort internally, so the typical pipeline order is boosts, then Threshold, then TopK.
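For readers unfamiliar with the idea behind PageRankBoost, the following is a rough sketch of one way a multiplicative sigmoid-on-percentile boost could be computed. It is not the body of crate::hybrid::pagerank_boost_factor: the function names, the `steepness` parameter, the centering at the 50th percentile, and the role given to `alpha` are all assumptions made for illustration.

```rust
/// Logistic function mapping any real number into (0, 1).
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Fraction of values in `all` that are strictly below `value` (0.0 if empty).
fn percentile(value: f32, all: &[f32]) -> f32 {
    if all.is_empty() {
        return 0.0;
    }
    let below = all.iter().filter(|&&v| v < value).count();
    below as f32 / all.len() as f32
}

/// Hypothetical boost factor in (1.0, 1.0 + alpha).
/// Close to 1.0 for low percentiles, close to 1.0 + alpha for high ones.
/// `steepness` (assumed) controls how sharply the curve rises around 0.5.
fn boost_factor(pct: f32, alpha: f32, steepness: f32) -> f32 {
    let p = pct.clamp(0.0, 1.0);
    1.0 + alpha * sigmoid(steepness * (p - 0.5))
}

/// Multiply a score by the factor; multiplication keeps unboosted scores comparable.
fn boosted(score: f32, pct: f32, alpha: f32, steepness: f32) -> f32 {
    score * boost_factor(pct, alpha, steepness)
}
```

Working on percentiles rather than raw PageRank values keeps the boost insensitive to the heavy-tailed distribution PageRank scores usually have, and the sigmoid caps the effect so a single highly linked file cannot dominate the ranking.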

Traits§

RankingLayer
A composable layer in the ranking pipeline.

Functions§

apply_chain
Apply a sequence of ranking layers in order.