Struct EnrichmentEffectiveness

pub struct EnrichmentEffectiveness {
    pub total_prefetches: u32,
    pub cited_prefetches: u32,
    pub total_declines: u32,
    pub late_invoked_after_decline: u32,
    pub cost_overrun_count: u32,
    pub total_predictions: u32,
    pub net_prediction_error_tokens: i64,
    pub inference_calls_saved_prefetch: u32,
    pub inference_calls_saved_dedup: u32,
    pub inference_calls_saved_fail_fast: u32,
    pub inference_tokens_saved: u64,
    pub prefetch_dispatched: u32,
    pub prefetch_won_race: u32,
    pub prefetch_wasted: u32,
}

Aggregate scoring of how well the Paper 3 enrichment planner served the agent during a session. Populated by the live pipeline (counters) plus the offline post-pass (cited_* numbers, see P-3-08).

Three primary rates the operator reads:

Prefetch hit rate — fraction of planner-prefetched calls whose content was textually cited by the LLM in the next 1–3 turns. The north-star efficiency number; target ≥ 60%.
Decline recall loss — fraction of declined candidates the LLM ended up calling itself within the next 5 turns. Higher means the planner is too greedy. Target ≤ 10%.
Cost overrun rate — fraction of admitted calls whose actual tokens_baseline exceeded the predicted cost by ≥ 30%. Drives refresh of cost_model.typical_kb priors. Target ≤ 15%.
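
The three rates above are each a ratio of two counters on this struct. The following is a minimal sketch of that arithmetic, not the crate's implementation: `RateCounters` and `ratio` are hypothetical stand-ins, and returning `None` on a zero denominator is an assumption inferred from the `Option<f32>` return types of the rate methods.

```rust
/// Hypothetical stand-in holding only the six counters behind the three rates.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct RateCounters {
    pub total_prefetches: u32,
    pub cited_prefetches: u32,
    pub total_declines: u32,
    pub late_invoked_after_decline: u32,
    pub cost_overrun_count: u32,
    pub total_predictions: u32,
}

/// Divides two counters. Returns `None` when the denominator is zero
/// (assumed behaviour), so "no data yet" stays distinct from a real 0% rate.
fn ratio(numerator: u32, denominator: u32) -> Option<f32> {
    if denominator == 0 {
        None
    } else {
        Some(numerator as f32 / denominator as f32)
    }
}

impl RateCounters {
    /// cited_prefetches / total_prefetches (target: at least 0.60).
    pub fn prefetch_hit_rate(&self) -> Option<f32> {
        ratio(self.cited_prefetches, self.total_prefetches)
    }

    /// late_invoked_after_decline / total_declines (target: at most 0.10).
    pub fn decline_recall_loss(&self) -> Option<f32> {
        ratio(self.late_invoked_after_decline, self.total_declines)
    }

    /// cost_overrun_count / total_predictions (target: at most 0.15).
    pub fn cost_overrun_rate(&self) -> Option<f32> {
        ratio(self.cost_overrun_count, self.total_predictions)
    }
}
```

Under these assumptions, a session that has not yet been through the offline post-pass reports a prefetch hit rate of `Some(0.0)` once any prefetch exists, because cited_prefetches stays 0 until the post-pass runs; that value should not be read as a measured result.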

And the operator-facing ROI counters:

inference_calls_saved_* — number of LLM round-trips the planner short-circuited, broken into three buckets so the contribution of each mechanism stays visible: prefetch (cited speculative calls), dedup (Paper 2 L0 hits — tool body replaced with a near-ref hint so the LLM never sees the full payload), and fail_fast (e.g. ToolSearch self-loop blocked after fail_fast_after_n).
inference_tokens_saved — sum of tokens_baseline from those short-circuited calls. The headline “we saved this much context” number for tune analyze.

Token savings vs a no-planner baseline is the roll-up “did the enricher pay for itself” answer; it lives in the corpus-replay validation harness (Paper 3 §Validation strategy), not on this summary, because it requires running the same session both with and without the planner. This struct carries only the per-session counters that drive the three rates above.

Fields§

§total_prefetches: u32

Number of calls the planner pre-fetched.

§cited_prefetches: u32

Of total_prefetches, the count whose content was cited by the LLM in the next 1–3 turns. Filled in by the offline post-pass; stays 0 until the post-pass has run.

§total_declines: u32

Number of candidates the planner declined for any reason.

§late_invoked_after_decline: u32

Of total_declines, the count where the LLM later issued the declined tool itself within the next 5 turns. Lower is better.

§cost_overrun_count: u32

Number of admitted calls whose actual tokens_baseline exceeded the planner’s prediction by ≥ 30%.

§total_predictions: u32

Total admitted calls (denominator for cost_overrun_rate).

§net_prediction_error_tokens: i64

Signed sum of the per-call prediction error (predicted vs. actual) in tokens. Useful for diagnosing systematic under- or over-estimation.
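
As an illustration of how the three cost-prediction counters relate, here is a hedged sketch of a per-call update. `PredictionCounters` and `record` are hypothetical names, the `u32` token type is an assumption, and so is the sign convention (predicted minus actual), which the documentation above does not specify.

```rust
/// Hypothetical stand-in for the cost-prediction counters.
#[derive(Debug, Default, PartialEq)]
pub struct PredictionCounters {
    pub total_predictions: u32,
    pub cost_overrun_count: u32,
    pub net_prediction_error_tokens: i64,
}

impl PredictionCounters {
    /// Records one admitted call; `predicted` and `actual` are token counts.
    pub fn record(&mut self, predicted: u32, actual: u32) {
        self.total_predictions += 1;

        // "Exceeded by >= 30%" in integer arithmetic: actual >= 1.3 * predicted.
        // The strict `a > p` guard keeps a 0-vs-0 call from counting as an overrun.
        let (p, a) = (u64::from(predicted), u64::from(actual));
        if a > p && a * 10 >= p * 13 {
            self.cost_overrun_count += 1;
        }

        // Assumed sign convention: predicted - actual, so a negative running
        // total means the planner under-estimates on balance.
        self.net_prediction_error_tokens += i64::from(predicted) - i64::from(actual);
    }
}
```

Because the error is signed, over- and under-estimates cancel: a net value near zero can hide large per-call errors, which is why the overrun count is tracked separately.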

§inference_calls_saved_prefetch: u32

LLM tool-uses avoided because the planner pre-fetched the content and the model cited it in the next 1–3 turns. Counted only when PipelineEvent::cited_in_next_n_turns is Some(true).

§inference_calls_saved_dedup: u32

LLM tool-uses avoided because L0 dedup replaced the response with a near-ref hint. Counted on every event with is_dedup_hit = true.

§inference_calls_saved_fail_fast: u32

LLM tool-uses avoided because crate::enrichment short-circuited a fail_fast_after_n loop (e.g. ToolSearch returning 0 bytes twice in a row). Incremented from the planner side via Self::record_fail_fast_skip.

§inference_tokens_saved: u64

Sum of baseline tokens from all three saved-call buckets. The “we saved this much context” headline for tune analyze.
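
The counting rules for the three saved-call buckets can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: `Event` is a trimmed-down, hypothetical stand-in for `PipelineEvent`, `Saved` is a hypothetical stand-in for the four fields above, the `u32` type of `tokens_baseline` is assumed, and the sketch assumes each event falls into at most one bucket.

```rust
/// Hypothetical, trimmed-down stand-in for `PipelineEvent`.
pub struct Event {
    pub cited_in_next_n_turns: Option<bool>,
    pub is_dedup_hit: bool,
    pub tokens_baseline: u32, // assumed type
}

/// Hypothetical stand-in for the saved-call counters.
#[derive(Debug, Default, PartialEq)]
pub struct Saved {
    pub prefetch: u32,
    pub dedup: u32,
    pub fail_fast: u32,
    pub tokens: u64,
}

impl Saved {
    /// Folds one pipeline event into the prefetch and dedup buckets.
    pub fn accumulate(&mut self, ev: &Event) {
        if ev.cited_in_next_n_turns == Some(true) {
            // Prefetch bucket: only confirmed citations count.
            self.prefetch += 1;
            self.tokens += u64::from(ev.tokens_baseline);
        } else if ev.is_dedup_hit {
            // Dedup bucket: every dedup hit counts.
            self.dedup += 1;
            self.tokens += u64::from(ev.tokens_baseline);
        }
    }

    /// Fail-fast skips come from the planner side, not from events. The call
    /// never ran, so the predicted cost stands in for the baseline (assumed).
    pub fn record_fail_fast_skip(&mut self, predicted_cost_tokens: u32) {
        self.fail_fast += 1;
        self.tokens += u64::from(predicted_cost_tokens);
    }

    /// Sum of the three buckets.
    pub fn total_calls_saved(&self) -> u32 {
        self.prefetch + self.dedup + self.fail_fast
    }
}
```

Keeping the buckets separate rather than storing only the total is what lets the contribution of each mechanism stay visible in the report.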

§prefetch_dispatched: u32

Number of speculative tool calls the host actually dispatched out-of-band. A subset of total_prefetches: it counts the prefetches the host successfully scheduled, not merely the plans the planner produced.

§prefetch_won_race: u32

Of prefetch_dispatched, the count where the prefetch result landed in the dedup cache before the LLM asked for the same tool, so the LLM’s call collapsed to an L0 hit. The other axis of “did the speculation pay off” — independent of textual citation.

§prefetch_wasted: u32

Prefetches the LLM never asked for in the same session. Wasted API quota / dollars; high values trigger R7’s per-tool auto-disable in tune analyze.
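
A minimal sketch of how the three dispatch counters could feed the race-win and waste rates. `PrefetchRace` is a hypothetical stand-in; using prefetch_dispatched as the denominator of both rates, and returning `None` when nothing was dispatched, are assumptions rather than documented behaviour.

```rust
/// Hypothetical stand-in for the three speculative-dispatch counters.
#[derive(Debug, Default, PartialEq)]
pub struct PrefetchRace {
    pub prefetch_dispatched: u32,
    pub prefetch_won_race: u32,
    pub prefetch_wasted: u32,
}

impl PrefetchRace {
    /// The host scheduled one speculative call.
    pub fn record_prefetch_dispatched(&mut self) {
        self.prefetch_dispatched += 1;
    }

    /// The prefetched result reached the dedup cache before the LLM asked.
    pub fn record_prefetch_won_race(&mut self) {
        self.prefetch_won_race += 1;
    }

    /// The LLM never asked for this prefetched result during the session.
    pub fn record_prefetch_wasted(&mut self) {
        self.prefetch_wasted += 1;
    }

    /// won_race / dispatched; `None` when nothing was dispatched (assumed).
    pub fn prefetch_race_win_rate(&self) -> Option<f32> {
        Self::ratio(self.prefetch_won_race, self.prefetch_dispatched)
    }

    /// wasted / dispatched; the denominator is an assumption.
    pub fn prefetch_waste_rate(&self) -> Option<f32> {
        Self::ratio(self.prefetch_wasted, self.prefetch_dispatched)
    }

    fn ratio(numerator: u32, denominator: u32) -> Option<f32> {
        if denominator == 0 {
            None
        } else {
            Some(numerator as f32 / denominator as f32)
        }
    }
}
```

The two rates need not sum to 1: a dispatched prefetch can lose the race and still be requested later, in which case it is neither a win nor wasted.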

Implementations§

impl EnrichmentEffectiveness

pub fn prefetch_hit_rate(&self) -> Option<f32>

pub fn decline_recall_loss(&self) -> Option<f32>

pub fn cost_overrun_rate(&self) -> Option<f32>

pub fn total_calls_saved(&self) -> u32

pub fn accumulate(&mut self, ev: &PipelineEvent)

pub fn record_fail_fast_skip(&mut self, predicted_cost_tokens: u32)

pub fn record_prefetch_dispatched(&mut self)

pub fn record_prefetch_won_race(&mut self)

pub fn record_prefetch_wasted(&mut self)

pub fn prefetch_race_win_rate(&self) -> Option<f32>

pub fn prefetch_waste_rate(&self) -> Option<f32>

pub fn report(&self) -> String

Trait Implementations§

impl Clone for EnrichmentEffectiveness

fn clone(&self) -> EnrichmentEffectiveness

fn clone_from(&mut self, source: &Self)

impl Debug for EnrichmentEffectiveness

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for EnrichmentEffectiveness

fn default() -> EnrichmentEffectiveness

impl<'de> Deserialize<'de> for EnrichmentEffectiveness

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>,

impl PartialEq for EnrichmentEffectiveness

fn eq(&self, other: &EnrichmentEffectiveness) -> bool

fn ne(&self, other: &Rhs) -> bool

impl Serialize for EnrichmentEffectiveness

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error> where __S: Serializer,

impl StructuralPartialEq for EnrichmentEffectiveness

Auto Trait Implementations§

impl Freeze for EnrichmentEffectiveness

impl RefUnwindSafe for EnrichmentEffectiveness

impl Send for EnrichmentEffectiveness

impl Sync for EnrichmentEffectiveness

impl Unpin for EnrichmentEffectiveness

impl UnsafeUnpin for EnrichmentEffectiveness

impl UnwindSafe for EnrichmentEffectiveness

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> PolicyExt for T where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P> where T: Policy<B, E>, P: Policy<B, E>,

fn or<P, B, E>(self, other: P) -> Or<T, P> where T: Policy<B, E>, P: Policy<B, E>,

impl<T> Same for T

type Output = T

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self> where S: Into<Dispatch>,

fn with_current_subscriber(self) -> WithDispatch<Self>

impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>,
