Module retrieve

Agent-facing retrieval: compose structured filters, dense vector similarity, and learned-sparse retrieval under a token budget.

The indexes in crate::index each answer one question in isolation: “which nodes carry label X”, “which are semantically close to this embedding”, “which fire on this sparse query”. Real agents need all three at once and cannot afford to overflow their LLM context window.

Retriever is the composition layer. It:

  1. Collects candidate node IDs from each ranker (vector, sparse).
  2. Fuses ranked lists with Reciprocal Rank Fusion (RRF, k=60).
  3. Gates fused candidates through label / property filters.
  4. Renders each surviving node to a compact text form.
  5. Greedily packs results in RRF-rank order until the caller’s token budget is exhausted (rank-order skip: if a node does not fit, move on; never reorder to exploit slack).

The return value (RetrievalResult) carries both the packed items and cost metadata (tokens_used, dropped, candidates_seen) so callers can detect “the budget was tight and we left good stuff out” without a second round-trip.
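Step 2's fusion can be sketched as follows. This is a minimal, self-contained illustration of Reciprocal Rank Fusion as described above (score = sum over lists of 1/(k + rank), k = 60), not the crate's `fusion::reciprocal_rank_fusion` itself; the `u64` node IDs and the two input lists are assumptions for the example.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked candidate lists.
/// `k` damps the contribution of deep ranks; this module uses k = 60.
fn rrf(lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in lists {
        for (rank, &id) in list.iter().enumerate() {
            // RRF contribution: 1 / (k + rank), with rank 1-based.
            *scores.entry(id).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Deterministic output order: score desc, then node id asc.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then(a.0.cmp(&b.0)));
    fused
}

fn main() {
    let vector_hits = vec![1, 2, 3]; // ranked output of the vector lane
    let sparse_hits = vec![3, 1, 4]; // ranked output of the sparse lane
    let fused = rrf(&[vector_hits, sparse_hits], 60.0);
    // Node 1 appears near the top of both lists, so it fuses highest.
    println!("{:?}", fused);
}
```

Because the fused score depends only on ranks, not raw scores, the two lanes need no score calibration against each other.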

§Determinism

All upstream rankers return hits in (score desc, node_id asc) order, RRF is a pure function of ranks, and rendering is a pure function of the node. Two independent processes with the same repo head and the same Retriever configuration therefore produce byte-identical RetrievalResult instances. This property is what makes agent replay and regression tests work.
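The (score desc, node_id asc) ordering contract can be sketched as a comparator; `Hit` here is a hypothetical stand-in for the crate's real hit type, used only to illustrate the tie-break.

```rust
/// Hypothetical hit record, standing in for the crate's ranker output.
#[derive(Debug, Clone)]
struct Hit {
    node_id: u64,
    score: f64,
}

/// Sort hits by score descending, breaking ties by node_id ascending.
/// Ties never depend on input or hash order, so two processes over the
/// same repo head agree on the exact byte sequence of results.
fn sort_hits(hits: &mut [Hit]) {
    hits.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap()
            .then(a.node_id.cmp(&b.node_id))
    });
}

fn main() {
    let mut hits = vec![
        Hit { node_id: 7, score: 0.5 },
        Hit { node_id: 3, score: 0.5 },
        Hit { node_id: 9, score: 0.9 },
    ];
    sort_hits(&mut hits);
    // 9 first (highest score), then 3 before 7 (node_id tie-break).
    println!("{:?}", hits);
}
```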

§Example

// `repo` is an open repository handle; `embedding` is the query's
// dense vector, computed elsewhere.
let result = repo
    .retrieve()
    .label("Document")
    .vector("openai:text-embedding-3-small", embedding)
    .token_budget(2000)
    .execute()?;

println!(
    "packed {} nodes in {}/{} tokens, {} dropped",
    result.items.len(),
    result.tokens_used,
    result.tokens_budget,
    result.dropped,
);
for item in &result.items {
    println!("{}", item.rendered);
}
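Step 5's greedy rank-order packing can be sketched as below. This is an illustration of the skip rule only, not the crate's real packer: `estimate_tokens` is a hypothetical chars/4 heuristic standing in for the `TokenEstimator` trait, and the candidate strings are invented.

```rust
/// Rough stand-in for a token estimator: ~4 characters per token.
fn estimate_tokens(rendered: &str) -> usize {
    (rendered.len() + 3) / 4
}

/// Walk candidates in fused-rank order; skip anything that does not fit,
/// never reorder to exploit slack. Returns (packed, tokens_used, dropped).
fn pack<'a>(rendered: &[&'a str], budget: usize) -> (Vec<&'a str>, usize, usize) {
    let mut packed = Vec::new();
    let (mut used, mut dropped) = (0usize, 0usize);
    for &text in rendered {
        let cost = estimate_tokens(text);
        if used + cost <= budget {
            used += cost;
            packed.push(text);
        } else {
            dropped += 1; // rank-order skip: move on to the next candidate
        }
    }
    (packed, used, dropped)
}

fn main() {
    let candidates = ["short", "a much longer rendered node body...", "tiny"];
    let (packed, used, dropped) = pack(&candidates, 4);
    // The oversized middle candidate is skipped; the later small one
    // still lands because skipping preserves rank order without reordering.
    println!("packed {} items in {} tokens, {} dropped", packed.len(), used, dropped);
}
```

Skipping rather than stopping is what makes `dropped` meaningful: a nonzero count tells the caller the budget was the binding constraint.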

Re-exports§

pub use community_filter::CommunityFilterCfg;
pub use community_filter::CommunityId;
pub use community_filter::CommunityLookup;
pub use community_filter::apply_community_filter;
pub use fusion::convex_min_max_fusion;
pub use fusion::reciprocal_rank_fusion;
pub use fusion::score_normalized_fusion;
pub use fusion::weighted_reciprocal_rank_fusion;
pub use retriever::Retriever;
pub use types::FusionStrategy;
pub use types::GraphExpand;
pub use types::GraphExpandDirection;
pub use types::GraphExpandMode;
pub use types::Lane;
pub use types::RetrievalResult;
pub use types::RetrievedItem;
pub use types::TemporalFilter;
pub use warnings::WARNINGS_CAP;
pub use warnings::Warning;
pub use warnings::WarningCode;
pub use warnings::cap_warnings;

Modules§

community_filter
Community expander stage for the retrieval pipeline (experiment E1).
fusion
Rank-list fusion functions + the prefetch_and_filter helper that threads a candidate set through label / property / temporal gates before ranker scoring.
retriever
Retriever struct, Debug impl, and the builder + execute implementation.
session_reservoir
Session-reservoir helper for gap 01 (agent-hop incentive).
types
Retrieval result / config types: Lane, RetrievedItem, RetrievalResult, GraphExpand, GraphExpandDirection, TemporalFilter, FusionStrategy.
warnings
Gap 14 - warnings[] structural diagnostics for /v1/retrieve.

Structs§

HeuristicEstimator
Byte / character heuristic tuned for modern LLM tokenizers.

Constants§

DEFAULT_RENDER_SUMMARY_CAP_CHARS
Character cap applied to summary + context_sentence in render_node. An unbounded summary silently consumed the entire token budget on a single oversized node, producing zero-recall retrieves; capping the render-time string protects the budget packer without losing the underlying data on the node itself.

Traits§

TokenEstimator
A byte/char counter that approximates an LLM tokenizer.
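A minimal estimator of this shape might look like the following sketch. The trait signature and the 4-chars-per-token constant are assumptions for illustration; the crate's actual `TokenEstimator` and `HeuristicEstimator` may differ.

```rust
/// Hypothetical shape of a byte/char token estimator.
trait TokenEstimator {
    fn estimate(&self, text: &str) -> usize;
}

/// Chars-per-token heuristic: modern LLM tokenizers average roughly
/// four characters per token on English prose, so chars / 4 (rounded up)
/// is a cheap, deterministic over-approximation for budget packing.
struct HeuristicEstimator {
    chars_per_token: f64,
}

impl TokenEstimator for HeuristicEstimator {
    fn estimate(&self, text: &str) -> usize {
        (text.chars().count() as f64 / self.chars_per_token).ceil() as usize
    }
}

fn main() {
    let est = HeuristicEstimator { chars_per_token: 4.0 };
    // 11 characters -> ceil(11 / 4) = 3 estimated tokens.
    println!("{}", est.estimate("hello world"));
}
```

Counting `chars` rather than bytes keeps the estimate stable for non-ASCII text, at the cost of a small bias versus real tokenizers.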

Functions§

render_node
Render a Node to a compact, deterministic text representation suitable for LLM consumption.
render_node_with_adjacency
Like render_node but augments the output with two graph adjacency blocks derived from repo’s current commit: