Agent-facing retrieval: compose structured filters, dense vector similarity, and learned-sparse retrieval under a token budget.
The indexes in crate::index each answer one question in
isolation: “which nodes carry label X”, “which are semantically
close to this embedding”, “which fire on this sparse query”. Real
agents need all three at once and cannot afford to overflow their
LLM context window.
Retriever is the composition layer. It:
- Collects candidate node IDs from each ranker (vector, sparse).
- Fuses ranked lists with Reciprocal Rank Fusion (RRF, k=60).
- Gates fused candidates through label / property filters.
- Renders each surviving node to a compact text form.
- Greedily packs results in RRF-rank order until the caller’s token budget is exhausted (rank-order skip: if a node does not fit, move on; never reorder to exploit slack).
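The fuse-then-pack steps above can be sketched as follows. This is an illustrative stand-in, not the crate's code: the real entry points are fusion::reciprocal_rank_fusion and the Retriever builder, and the names rrf_fuse / pack_by_budget plus the token counts fed to the packer are assumptions for the sketch.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked candidate-ID lists.
/// Sketch only; k = 60 matches the constant named above, ranks are 1-based.
fn rrf_fuse(lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in lists {
        for (rank, id) in list.iter().enumerate() {
            // score(id) = sum over lists of 1 / (k + rank)
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (rank as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Deterministic order: score desc, node id asc on ties.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then(a.0.cmp(&b.0)));
    fused
}

/// Greedy rank-order packing: walk candidates in fused order, keep an item
/// only if it fits the remaining budget, and never reorder to exploit slack.
/// Returns (kept ids, tokens_used, dropped count).
fn pack_by_budget(rendered: &[(u64, usize)], budget: usize) -> (Vec<u64>, usize, usize) {
    let (mut kept, mut used, mut dropped) = (Vec::new(), 0usize, 0usize);
    for &(id, tokens) in rendered {
        if used + tokens <= budget {
            kept.push(id);
            used += tokens;
        } else {
            dropped += 1; // rank-order skip: do not look ahead for a smaller fit
        }
    }
    (kept, used, dropped)
}
```

Note that the rank-order skip trades packing efficiency for determinism: a later, smaller node can still be admitted if it fits, but candidates are never reshuffled to squeeze out the last few tokens.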
The return value (RetrievalResult) carries both the packed
items and cost metadata (tokens_used, dropped, candidates_seen)
so callers can detect “the budget was tight and we left good stuff
out” without a second round-trip.
§Determinism
All upstream rankers return hits in (score desc, node_id asc)
order, RRF is a pure function of ranks, and rendering is a pure
function of the node. Two independent processes with the same repo
head and the same Retriever configuration produce byte-
identical RetrievalResult instances. This is the property that
lets agent replay and regression tests work.
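The total ordering contract can be made concrete with a small sketch; the Hit struct and sort_hits function here are hypothetical, not the crate's types. The point is that the tie-break on node id makes the comparator total, so equal scores never leave the order up to the sort's whims.

```rust
/// Hypothetical ranker hit; the crate's actual hit types are not shown here.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    node_id: u64,
    score: f64,
}

/// Ordering contract for ranker output: higher score first, ties broken by
/// ascending node id. Because every pair of hits compares unambiguously,
/// two processes sorting the same hits emit the same sequence.
fn sort_hits(hits: &mut [Hit]) {
    hits.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap()
            .then(a.node_id.cmp(&b.node_id))
    });
}
```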
§Example
let result = repo
    .retrieve()
    .label("Document")
    .vector("openai:text-embedding-3-small", embedding)
    .token_budget(2000)
    .execute()?;
println!(
    "packed {} nodes in {}/{} tokens, {} dropped",
    result.items.len(),
    result.tokens_used,
    result.tokens_budget,
    result.dropped,
);
for item in &result.items {
    println!("{}", item.rendered);
}

Re-exports§
pub use community_filter::CommunityFilterCfg;
pub use community_filter::CommunityId;
pub use community_filter::CommunityLookup;
pub use community_filter::apply_community_filter;
pub use fusion::convex_min_max_fusion;
pub use fusion::reciprocal_rank_fusion;
pub use fusion::score_normalized_fusion;
pub use fusion::weighted_reciprocal_rank_fusion;
pub use retriever::Retriever;
pub use types::FusionStrategy;
pub use types::GraphExpand;
pub use types::GraphExpandDirection;
pub use types::GraphExpandMode;
pub use types::Lane;
pub use types::RetrievalResult;
pub use types::RetrievedItem;
pub use types::TemporalFilter;
pub use warnings::WARNINGS_CAP;
pub use warnings::Warning;
pub use warnings::WarningCode;
pub use warnings::cap_warnings;
Modules§
- community_filter - Community expander stage for the retrieval pipeline (experiment E1).
- fusion - Rank-list fusion functions plus the prefetch_and_filter helper that threads a candidate set through label / property / temporal gates before ranker scoring.
- retriever - Retriever struct, Debug impl, and the builder + execute implementation.
- session_reservoir - Session-reservoir helper for gap 01 (agent-hop incentive).
- types - Retrieval result / config types: Lane, RetrievedItem, RetrievalResult, GraphExpand, GraphExpandDirection, TemporalFilter, FusionStrategy.
- warnings - Gap 14: warnings[] structural diagnostics for /v1/retrieve.
Structs§
- HeuristicEstimator - Byte / character heuristic tuned for modern LLM tokenizers.
Constants§
- DEFAULT_RENDER_SUMMARY_CAP_CHARS - Character cap applied to summary + context_sentence in render_node. An unbounded summary silently consumed the entire token budget on a single oversized node, producing zero-recall retrieves; capping the render-time string protects the budget packer without losing the underlying data on the node itself.
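The cap described above is a render-time truncation, not a data mutation; a minimal sketch, assuming a cap value of 512 (the constant's actual value is not shown in this doc) and a hypothetical cap_chars helper:

```rust
/// Assumed cap value for illustration; the real constant may differ.
const SUMMARY_CAP_CHARS: usize = 512;

/// Truncate on a char boundary so the capped string stays valid UTF-8.
/// The node's stored summary is untouched; only the rendered text shrinks.
fn cap_chars(s: &str, cap: usize) -> &str {
    match s.char_indices().nth(cap) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}
```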
Traits§
- TokenEstimator - A byte/char counter that approximates an LLM tokenizer.
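A byte/char estimator of this shape can be sketched as follows. The trait's method name and the 4-chars-per-token ratio are assumptions: modern BPE tokenizers average roughly four characters per token on English prose, but the crate's HeuristicEstimator may be tuned differently.

```rust
/// Sketch of the trait per the listing above; the method name is assumed.
trait TokenEstimator {
    fn estimate(&self, text: &str) -> usize;
}

/// Cheap, tokenizer-free estimate: chars / 4, rounded up so short
/// non-empty strings still cost at least one token.
struct FourCharsPerToken;

impl TokenEstimator for FourCharsPerToken {
    fn estimate(&self, text: &str) -> usize {
        (text.chars().count() + 3) / 4
    }
}
```

Overestimating slightly is the safer failure mode here: the budget packer drops a marginal node rather than overflowing the caller's context window.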
Functions§
- render_node - Render a Node to a compact, deterministic text representation suitable for LLM consumption.
- render_node_with_adjacency - Like render_node but augments the output with two graph adjacency blocks derived from repo's current commit: