Module context


Conversational-context query enrichment (the query side of retrieval).

A vague follow-up prompt (“now do the other one”, “fix that”) carries little signal on its own, so the bi-encoder retrieves poorly and the cross-encoder has nothing to disambiguate against. The turns before it usually do carry the intent. This module turns a session’s recent-prompt window into two enrichment signals, both gated on how vague the current prompt is (crate::rank::context_weight):

  • a context vector (vector) blended into stage-1 scoring (crate::rank::rank_all_ctx), and
  • an enriched reranker query (rerank_query) — the cross-encoder reads text, not vectors, so the recent window is prepended to the prompt.

Both are inert unless the feature is enabled (context_depth > 0 and context_weight > 0.0), so the default path pays nothing.

Functions§

code_terms
Ecosystem terms implied by code files referenced in text (a prompt and/or recent-window turns), via ext_terms. Order-preserving, de-duplicated.
file_ids
Skill ids implied by file references in text (a prompt and/or recent-window turns): scans whitespace-separated tokens for a trailing .<ext> and maps each known extension through ext_skill. This is the directly attributable context signal: a file attached or named now is unambiguous in a way a vague prompt is not, so unlike the dense context vector it is not gated on prompt vagueness. De-duplicated.
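
A minimal sketch of the token scan described for file_ids. The extension table, the function names ending in `_sketch`, the trailing-punctuation trimming, and the first-seen ordering are all assumptions for illustration; the real mapping lives in ext_skill and is not shown here.

```rust
/// Hypothetical extension table standing in for `ext_skill`.
fn ext_skill_sketch(ext: &str) -> Option<&'static str> {
    match ext {
        "rs" => Some("rust"),
        "py" => Some("python"),
        "ts" => Some("typescript"),
        _ => None,
    }
}

/// Scan whitespace-separated tokens for a trailing `.<ext>` and collect the
/// mapped skill ids, de-duplicated in first-seen order.
fn file_ids_sketch(text: &str) -> Vec<&'static str> {
    let mut ids = Vec::new();
    for token in text.split_whitespace() {
        // Assumption: strip trailing punctuation such as a comma or quote.
        let token = token.trim_end_matches(|c: char| !c.is_alphanumeric());
        if let Some((stem, ext)) = token.rsplit_once('.') {
            if stem.is_empty() {
                continue; // a bare ".rs" names no file
            }
            if let Some(id) = ext_skill_sketch(&ext.to_ascii_lowercase()) {
                if !ids.contains(&id) {
                    ids.push(id);
                }
            }
        }
    }
    ids
}
```

Because the scan looks only at file-like tokens, a vague prompt with no file reference yields an empty list rather than a guess.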
project_terms
Ecosystem terms implied by the project manifest(s) found in cwd or any ancestor directory (up to PROJECT_WALK_LEVELS). Performs cheap exists() stats only; order-preserving and de-duplicated (most specific term first); empty when cwd is empty or no known manifest is found. Resolve against the installed library with skills_for_terms.
rerank_query
The reranker query for a prompt whose best stage-1 self-cosine is prompt_top. The recent-window text is prepended (so the cross-encoder reads the conversation, including any named file) when context applies this turn — either the prompt is vague enough that crate::rank::context_weight is positive, or a file was referenced (file_present) and the file channel is on. Otherwise the bare prompt is returned unchanged, so a confident, file-free prompt is never muddied by stale context.
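
The decision described for rerank_query can be sketched as below. The signature is hypothetical (the real one is not shown here): `context_weight` stands in for the value crate::rank::context_weight would produce for this prompt, `recent` is assumed to be ordered oldest first, and joining turns with a newline is an assumption about how the window is prepended.

```rust
/// Sketch only: return the prompt with the recent window prepended when
/// context applies this turn, otherwise the bare prompt.
fn rerank_query_sketch(
    prompt: &str,
    recent: &[&str],       // recent-window turns, oldest first (assumption)
    context_weight: f32,   // stand-in for crate::rank::context_weight's result
    file_present: bool,    // a file was referenced
    file_channel_on: bool, // the file channel is enabled
) -> String {
    let context_applies = context_weight > 0.0 || (file_present && file_channel_on);
    if !context_applies || recent.is_empty() {
        // Confident, file-free prompt: leave it untouched.
        return prompt.to_string();
    }
    // The cross-encoder reads text, so the window goes in front of the prompt.
    let mut parts: Vec<&str> = recent.to_vec();
    parts.push(prompt);
    parts.join("\n")
}
```

With a positive weight, `"fix that"` is read together with the preceding turns; with a zero weight and no file reference, the same call returns the prompt unchanged.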
skills_for_terms
Resolve ecosystem terms against the installed library: every index entry whose keywords (which include its name tokens) or description mention a term maps to that term. Returns skill id → the matched term (for evidence display; the first matching term in terms order wins, so callers should pass the most specific term first). Matching is token-exact after norm_token normalization — “uv” matches a uv keyword or “uv” in the description prose, never a substring — and deliberately generous beyond that: this feeds the ambient channel, which stays cosine-gated in crate::rank::rank_all_ctx, so a spurious term match costs nothing unless the skill was already near-plausible for the prompt.
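
A sketch of token-exact matching with first-term-wins, under stated assumptions: the `Entry` struct, the `_sketch` function names, and the normalization (lowercasing and trimming non-alphanumeric edges) are illustrative stand-ins, since the real index type and norm_token are not shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical index entry; field names follow the description above.
struct Entry {
    id: &'static str,
    keywords: Vec<&'static str>,
    description: &'static str,
}

/// Stand-in for `norm_token`: lowercase and trim non-alphanumeric edges.
fn norm_token_sketch(t: &str) -> String {
    t.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase()
}

/// Map each skill id to the first term (in caller order) that matches one of
/// its keyword or description tokens exactly after normalization.
fn skills_for_terms_sketch(index: &[Entry], terms: &[&str]) -> BTreeMap<&'static str, String> {
    let mut out = BTreeMap::new();
    for entry in index {
        let tokens: Vec<String> = entry
            .keywords
            .iter()
            .copied()
            .chain(entry.description.split_whitespace())
            .map(norm_token_sketch)
            .collect();
        // Whole-token comparison: "uv" never matches inside "uvicorn".
        if let Some(term) = terms
            .iter()
            .find(|t| tokens.contains(&norm_token_sketch(t)))
        {
            out.insert(entry.id, (*term).to_string());
        }
    }
    out
}
```

Passing the most specific term first matters because only one matched term is kept per skill for evidence display.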
vector
Build a single context vector from the recent-prompt window: a recency-weighted mean of the per-prompt embeddings (geometric decay, most-recent weight 1.0). None when the feature is off or the window is empty. Embeds the whole window in one batch. The result need not be normalized — crate::rank::cosine normalizes both operands.
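
The recency-weighted mean can be sketched as follows. This is an illustration rather than the module's implementation: the function names, the oldest-first ordering of `embeddings`, and the `decay` parameter are assumptions, and the embedding step itself is omitted.

```rust
/// Recency-weighted mean of per-prompt embeddings (oldest first).
/// The most recent embedding gets weight 1.0, the one before it `decay`,
/// then `decay^2`, and so on. Returns None for an empty window.
fn context_vector_sketch(embeddings: &[Vec<f32>], decay: f32) -> Option<Vec<f32>> {
    let dim = embeddings.first()?.len();
    let mut sum = vec![0.0f32; dim];
    let mut total = 0.0f32;
    let mut w = 1.0f32;
    // Walk from most recent to oldest so the weight decays geometrically.
    for emb in embeddings.iter().rev() {
        for (s, x) in sum.iter_mut().zip(emb) {
            *s += w * x;
        }
        total += w;
        w *= decay;
    }
    Some(sum.into_iter().map(|s| s / total).collect())
}

/// Plain cosine similarity; it normalizes both operands, which is why the
/// context vector does not have to be unit length.
fn cosine_sketch(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}
```

For a two-turn window `[[1, 0], [0, 1]]` with `decay = 0.5`, the weights are 0.5 and 1.0, so the mean is `[1/3, 2/3]`. Scaling that vector by any positive factor leaves its cosine against a query unchanged.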