
Module speculation

Paper 3 — speculative tool-call dispatcher.

Given an EnrichmentPlan from the planner, spawns each prefetch as a Tokio task, caps concurrency per rate_limit_host (so two providers hitting the same domain share the budget), waits up to prefetch_timeout_ms for results to land, and aborts everything still pending on session shutdown.

Design

  • Dispatcher trait: PrefetchDispatcher abstracts the actual tools/call path so tests can plug in a mock without pulling in MCP transport. The real impl wraps the server’s handler and is wired in SessionPipeline.
  • Per-host concurrency cap: a Mutex<HashMap<host, in_flight>> tracks in-flight prefetches per rate-limit host; the dispatcher refuses to schedule a call when the cap is hit. A host of None means no limit (local tool).
  • Bounded synchronous wait: SpeculationEngine::wait_within blocks for at most prefetch_timeout_ms, collecting the results that landed in time; anything still pending keeps running in the background and lands later via the dedup cache.
  • Cascade cancellation: SpeculationEngine::shutdown (also called from Drop) aborts every pending task, so no orphaned IO is left behind.
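The per-host cap described above can be illustrated with a small standard-library sketch. This is not the module's actual HostBudget: the `try_acquire` and `in_flight` methods, the `Permit` guard, and the single shared `cap` value are all assumptions made for illustration. It only shows the idea of counting in-flight calls per host behind a Mutex and refusing new work once the cap is reached.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical per-host in-flight counter; cloning shares the same map.
#[derive(Clone)]
pub struct HostBudget {
    cap: usize,
    in_flight: Arc<Mutex<HashMap<String, usize>>>,
}

/// Hypothetical guard: holding it occupies one slot, dropping it frees the slot.
pub struct Permit {
    host: Option<String>,
    in_flight: Arc<Mutex<HashMap<String, usize>>>,
}

impl HostBudget {
    pub fn new(cap: usize) -> Self {
        HostBudget { cap, in_flight: Arc::new(Mutex::new(HashMap::new())) }
    }

    /// Returns `None` when the host is already at its cap.
    /// A `None` host is treated as unlimited (for example, a local tool).
    pub fn try_acquire(&self, host: Option<&str>) -> Option<Permit> {
        let Some(host) = host else {
            return Some(Permit { host: None, in_flight: Arc::clone(&self.in_flight) });
        };
        let mut map = self.in_flight.lock().unwrap();
        let count = map.entry(host.to_string()).or_insert(0);
        if *count >= self.cap {
            return None; // cap hit: refuse to schedule
        }
        *count += 1;
        Some(Permit { host: Some(host.to_string()), in_flight: Arc::clone(&self.in_flight) })
    }

    /// Current number of in-flight calls for `host`.
    pub fn in_flight(&self, host: &str) -> usize {
        self.in_flight.lock().unwrap().get(host).copied().unwrap_or(0)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Unlimited permits never touched the map, so there is nothing to release.
        if let Some(host) = &self.host {
            let mut map = self.in_flight.lock().unwrap();
            if let Some(count) = map.get_mut(host) {
                *count = count.saturating_sub(1);
            }
        }
    }
}
```

Because the counter is keyed by host rather than by provider, two providers that resolve to the same rate-limit host draw from the same budget. Tying the release to a guard's Drop is one way to make sure an aborted task still gives its slot back.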

Telemetry counters (prefetch_dispatched, prefetch_won_race, prefetch_wasted) are updated by the caller; this module only reports the outcomes.
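The bounded wait can be sketched in the same spirit. The example below swaps Tokio tasks for standard-library threads and an mpsc channel so that it stays self-contained; the `Outcome` enum, the free function `wait_within`, and the `(delay, body)` job shape are hypothetical and do not reflect the real SpeculationEngine::wait_within signature or the real PrefetchOutcome variants. It shows only the deadline logic: collect whatever lands before the timeout and report the rest as still pending.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical outcome type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Landed { id: usize, body: String },
    StillPending { id: usize },
}

/// Spawns one worker per `(delay, body)` job and waits at most `timeout`
/// for results. Workers that miss the deadline keep running in the background.
pub fn wait_within(jobs: Vec<(Duration, String)>, timeout: Duration) -> Vec<Outcome> {
    let (tx, rx) = mpsc::channel();
    let total = jobs.len();
    for (id, (delay, body)) in jobs.into_iter().enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(delay); // stands in for the real tool call
            let _ = tx.send((id, body)); // receiver may be gone after the deadline
        });
    }
    drop(tx); // only the workers hold senders now

    let deadline = Instant::now() + timeout;
    let mut landed: Vec<Option<String>> = vec![None; total];
    let mut remaining = total;
    while remaining > 0 {
        // Time left until the shared deadline; zero once it has passed.
        let left = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(left) {
            Ok((id, body)) => {
                landed[id] = Some(body);
                remaining -= 1;
            }
            Err(_) => break, // deadline reached, or every worker has finished
        }
    }

    landed
        .into_iter()
        .enumerate()
        .map(|(id, slot)| match slot {
            Some(body) => Outcome::Landed { id, body },
            None => Outcome::StillPending { id },
        })
        .collect()
}
```

The deadline is computed once and shared across all receives, so the total wait stays bounded no matter how many results arrive. A caller could then walk the returned outcomes to update its own counters, which matches the split described above where this module only reports and the caller counts.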

Structs

HostBudget
Per-host in-flight counter. Cheap clone (Arc).
PrefetchRequest
One unit of work the engine decides about. Public so the host can produce the list (combining EnrichmentPlan.calls with extracted args from projection) before handing it to the engine.
SpeculationEngine
Per-turn speculation engine. One instance per SessionPipeline; holds the JoinSet and the host budget. Dropping the engine triggers shutdown.

Enums

PrefetchError
PrefetchOutcome
Outcome of a single prefetch task as observed by SpeculationEngine::wait_within.
SkipReason

Traits

PrefetchDispatcher
Abstracts how a prefetch is actually executed. The real impl wraps the MCP server’s tools/call handler; the test impl returns a canned body or an error after an optional sleep.
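As a rough illustration of the test double described here, the sketch below defines a simplified, synchronous stand-in trait and a mock that returns a canned body or an error after an optional sleep. The names `SyncDispatcher`, `MockDispatcher`, `MockError`, and `run_prefetch`, as well as the `(tool, args)` method signature, are assumptions for this example; the real PrefetchDispatcher is presumably async and its signature is not shown on this page.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical error type for this sketch; not the module's PrefetchError.
#[derive(Debug, PartialEq)]
pub enum MockError {
    Failed(String),
}

/// Simplified synchronous stand-in for a dispatcher trait.
pub trait SyncDispatcher {
    fn dispatch(&self, tool: &str, args: &str) -> Result<String, MockError>;
}

/// Test double: returns a canned body or an error, after an optional delay.
pub struct MockDispatcher {
    pub response: Result<String, String>,
    pub delay: Option<Duration>,
}

impl SyncDispatcher for MockDispatcher {
    fn dispatch(&self, _tool: &str, _args: &str) -> Result<String, MockError> {
        if let Some(delay) = self.delay {
            thread::sleep(delay); // simulate a slow upstream
        }
        self.response.clone().map_err(MockError::Failed)
    }
}

/// Code under test depends only on the trait, so no transport is needed.
pub fn run_prefetch(
    dispatcher: &dyn SyncDispatcher,
    tool: &str,
    args: &str,
) -> Result<String, MockError> {
    dispatcher.dispatch(tool, args)
}
```

Keeping the engine generic over a trait like this means timeout and error paths can be exercised in tests by adjusting `delay` and `response`, without starting an MCP server.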