Composable ranking layers for search results.
§Why this module exists
Before this refactor, ranking logic was scattered across four call sites with bespoke layer combinations:
| Call site | PageRank | Path penalty | Threshold | TopK |
|---|---|---|---|---|
| CLI run_oneshot (indexed) | ❌ | ❌ | inside search() | inside search() |
| CLI run_oneshot (stateless) | ❌ | ❌ | inside search() | inside search() |
| MCP search_code | ✅ | ❌ | inside search() | inside search() |
| LSP nav/symbols | ✅ (α=0.3 hardcoded) | ❌ | inside search() | inside search() |
| RipvecIndex::search | ✅ (optional) | ✅ | inside reranker | inside reranker |
Three concrete bugs landed today because of this scatter: (1)
PageRank was silently absent from the CLI; (2) PageRank lookups hit
zero entries due to a path-rooting mismatch, a bug present
in every call site that used boost_with_pagerank before today’s
fix; (3) the path penalty regex matched the corpus-root prefix when
invoked with CWD-rooted chunk paths.
The fix: a single RankingLayer trait that each call site
composes into a pipeline. Layers are independently testable, the
pipeline shape at each call site is explicit, and adding a new
ranking signal (e.g., recency, file-saturation diversification)
is a single new impl RankingLayer.
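As a rough illustration of the composition idea, the sketch below shows one possible shape for the trait and a two-layer pipeline. The method name `apply`, its signature, the `usize`/`f32` element types, the `path` field on `CodeChunk`, and the `demo_pipeline` helper are all assumptions made for this example; the real definitions in this module may differ.

```rust
/// Stand-in for the crate's chunk metadata type (assumed field).
#[derive(Debug, Clone)]
pub struct CodeChunk {
    pub path: String,
}

/// Assumed trait shape: a layer may mutate, reorder, or drop entries.
pub trait RankingLayer {
    fn apply(&self, results: &mut Vec<(usize, f32)>, chunks: &[CodeChunk]);
}

/// Drops entries scoring below `min_score`; `retain` keeps relative order.
pub struct Threshold {
    pub min_score: f32,
}

impl RankingLayer for Threshold {
    fn apply(&self, results: &mut Vec<(usize, f32)>, _chunks: &[CodeChunk]) {
        results.retain(|&(_, score)| score >= self.min_score);
    }
}

/// Keeps the first `k` entries; expects input sorted descending by score.
pub struct TopK {
    pub k: usize,
}

impl RankingLayer for TopK {
    fn apply(&self, results: &mut Vec<(usize, f32)>, _chunks: &[CodeChunk]) {
        results.truncate(self.k);
    }
}

/// Runs each layer in order over the same result vector.
pub fn apply_chain(
    layers: &[Box<dyn RankingLayer>],
    results: &mut Vec<(usize, f32)>,
    chunks: &[CodeChunk],
) {
    for layer in layers {
        layer.apply(results, chunks);
    }
}

/// Hypothetical call site: the pipeline shape is visible in one place.
pub fn demo_pipeline() -> Vec<(usize, f32)> {
    let chunks: Vec<CodeChunk> = (0..4)
        .map(|i| CodeChunk { path: format!("src/file_{i}.rs") })
        .collect();
    // Already sorted descending, as Threshold and TopK expect.
    let mut results = vec![(2, 0.9), (0, 0.7), (3, 0.4), (1, 0.1)];
    let layers: Vec<Box<dyn RankingLayer>> = vec![
        Box::new(Threshold { min_score: 0.3 }),
        Box::new(TopK { k: 2 }),
    ];
    apply_chain(&layers, &mut results, &chunks);
    results
}
```

Under these assumptions, a new ranking signal would be one more `impl RankingLayer` pushed into the `layers` vector, with no change to the call site beyond that line.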
§Convention
Layers operate on Vec<(chunk_idx, score)> with a parallel
&[CodeChunk] for metadata lookup. Layers MAY:
- Mutate scores in place (boost / penalty layers).
- Reorder the vec (sort layers — most boost layers re-sort internally so downstream layers see descending order).
- Drop entries (threshold / topK layers).
When a layer reorders, it MUST leave the vec sorted descending by score so downstream layers (especially threshold + topK) operate on a meaningful ordering.
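The re-sort convention can be sketched with a small penalty layer. This is only an illustration: `TestPathPenalty`, its `factor` field, the substring check on `tests/` (used here in place of the module's regex), the trait signature, and the `demo_penalty` helper are hypothetical and not taken from the actual implementation.

```rust
/// Stand-in for the crate's chunk metadata type (assumed field).
#[derive(Debug, Clone)]
pub struct CodeChunk {
    pub path: String,
}

/// Assumed trait shape, repeated here so the example is self-contained.
pub trait RankingLayer {
    fn apply(&self, results: &mut Vec<(usize, f32)>, chunks: &[CodeChunk]);
}

/// Hypothetical penalty layer: scales scores of chunks under a `tests/` directory.
pub struct TestPathPenalty {
    pub factor: f32,
}

impl RankingLayer for TestPathPenalty {
    fn apply(&self, results: &mut Vec<(usize, f32)>, chunks: &[CodeChunk]) {
        for (idx, score) in results.iter_mut() {
            // Skip indices with no matching chunk rather than panicking.
            let Some(chunk) = chunks.get(*idx) else { continue };
            if chunk.path.starts_with("tests/") || chunk.path.contains("/tests/") {
                *score *= self.factor;
            }
        }
        // Scores changed, so restore descending order for downstream layers.
        results.sort_by(|a, b| b.1.total_cmp(&a.1));
    }
}

/// The test file starts on top and ends below the source file.
pub fn demo_penalty() -> Vec<(usize, f32)> {
    let chunks = vec![
        CodeChunk { path: "tests/search_test.rs".to_string() },
        CodeChunk { path: "src/search.rs".to_string() },
    ];
    let mut results = vec![(0, 0.8), (1, 0.6)];
    TestPathPenalty { factor: 0.5 }.apply(&mut results, &chunks);
    results
}
```

Sorting with `f32::total_cmp` avoids the `partial_cmp` unwrap that NaN scores would otherwise force; whether the real layers handle NaN this way is not stated in the source.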
Structs§
- CrossEncoderRerank: Cross-encoder rerank layer.
- PageRankBoost: Multiplicative PageRank boost using the sigmoid-on-percentile curve from crate::hybrid::pagerank_boost_factor.
- PathPenalty: Multiplicative path-shape penalty for test files, examples, etc.
- Threshold: Drop items with score below min_score. Preserves ordering.
- TopK: Truncate to the top k items. Caller is responsible for ensuring the list is sorted descending by score before this layer runs; most boost layers re-sort internally so the typical pipeline order is boosts... → Threshold → TopK.
Traits§
- RankingLayer: A composable layer in the ranking pipeline.
Functions§
- apply_chain: Apply a sequence of ranking layers in order.