Module confidence

v0.7.0 Form 5 (issue #758): auto-confidence, shadow-mode, freshness-decay, and calibration tooling. Closes the FORM 5 PARTIAL audit finding by adding deterministic auto-derivation, opt-in shadow-mode telemetry, half-life-driven freshness decay, and a per-source baseline calibration sweep on top of the legacy caller-provided confidence field.

The Batman 6-form audit (PR #753, docs/internal/batman-framework-audit.md) found Form 5 PARTIAL: the memories.confidence REAL column had existed since schema v2 and recall ranking consumed it (+ confidence * 2.0 in the FTS5 score expression at src/storage/mod.rs), but the surrounding pipeline was missing the four substrate-honesty surfaces a “first-class confidence” claim requires:

Automatic assignment. Every caller value was taken at face value; no source-age decay, atom-derivation bump, or prior-corroboration boost ever rewrote it.
Shadow-mode telemetry. No mechanism existed to compare a caller-provided value against a derived one on a live workload.
Calibration. No per-namespace / per-source-role baseline was ever computed from observed samples.
Freshness decay. An old fact at confidence=0.9 ranked identically to a fresh fact at the same value, despite human memory and downstream LLM reasoning both treating recency as a trust signal.

This module is the Rust-side closeout. The schema half lives in migrations/sqlite/0033_v07_form5_confidence_calibration.sql and migrations/postgres/0020_v07_form5_confidence_calibration.sql.

§Surface

derive — deterministic auto-derivation from row signals. Opt-in via AI_MEMORY_AUTO_CONFIDENCE=1.
[shadow::observe] — writes per-recall samples to confidence_shadow_observations when AI_MEMORY_CONFIDENCE_SHADOW=1. Audit-honest: the caller value is still the one used downstream; shadow never silently overrides.
[decay::decayed] — exponential freshness decay (exp(-age / half_life)); operator opts in with AI_MEMORY_CONFIDENCE_DECAY=1 or per-namespace confidence_decay_half_life_days policy.
[calibrate::calibrate_from_shadow] — computes per-source baselines from the shadow-observations table. Driven by the ai-memory calibrate confidence CLI and the memory_calibrate_confidence MCP tool.
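
The decay envelope quoted above, exp(-age / half_life), can be sketched as follows. This is a minimal illustration and not the module's actual code: the names decayed_sketch and HALF_LIFE_DAYS_SKETCH, the f64 types, the clamping to [0, 1], and the handling of a non-positive half-life are all assumptions made for the example.

```rust
/// Hypothetical constant mirroring the documented 30-day default half-life.
const HALF_LIFE_DAYS_SKETCH: f64 = 30.0;

/// Applies the documented envelope `exp(-age / half_life)` to a confidence value.
/// Assumption: ages and half-lives are expressed in days as `f64`.
fn decayed_sketch(confidence: f64, age_days: f64, half_life_days: f64) -> f64 {
    // Assumption: a non-positive half-life means "no decay configured",
    // so the caller value is returned unchanged.
    if half_life_days <= 0.0 {
        return confidence;
    }
    // Assumption: a negative age (e.g. clock skew) is treated as zero.
    let age = age_days.max(0.0);
    // Scale by the exponential envelope and keep the result inside [0, 1].
    (confidence * (-age / half_life_days).exp()).clamp(0.0, 1.0)
}
```

For instance, decayed_sketch(0.9, 0.0, HALF_LIFE_DAYS_SKETCH) leaves a brand-new row at 0.9, while the same call with a larger age returns a strictly smaller value, so an old fact no longer ties with a fresh one. With the formula exactly as written, a row whose age equals half_life keeps e^-1 (about 36.8%) of its confidence rather than 50%; whether the real implementation rescales by ln 2 is not stated here.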

Modules§

calibrate: v0.7.0 Form 5 — calibration sweep.
decay: v0.7.0 Form 5 — freshness-decay updater.
shadow: v0.7.0 Form 5 — shadow-mode telemetry pipeline.

Structs§

DeriveContext: Context the [derive] engine consults at the moment it computes a fresh confidence value.

Constants§

DEFAULT_HALF_LIFE_DAYS: Default half-life (in days) for the freshness-decay envelope. 30 days mirrors a working agent’s “this month vs. last month” salience window; long-tier rows that survive a month already have meaningful corroboration through the access_count promotion loop, so the half-life acts as a soft floor rather than a hard expiry.
ENV_AUTO_CONFIDENCE: Environment-variable opt-in for the auto-derive engine. When unset or any value other than "1", [derive] returns the caller’s confidence verbatim — preserving the v0.6.x contract.

Functions§

auto_confidence_enabled: Returns true when ENV_AUTO_CONFIDENCE is set to "1". Centralised so the recall path, store path, and tests all read the same flag.
derive: Deterministically derive a confidence value from row signals.
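
The opt-in gate described for ENV_AUTO_CONFIDENCE and auto_confidence_enabled might look roughly like the sketch below. It assumes the constant holds the variable name AI_MEMORY_AUTO_CONFIDENCE (inferred from the Surface section); flag_is_on, auto_confidence_enabled_sketch, and derive_sketch are hypothetical names, and the closure stands in for whatever row signals the real derive consults.

```rust
use std::env;

/// Assumption: the constant holds the variable name shown in the Surface section.
const ENV_AUTO_CONFIDENCE: &str = "AI_MEMORY_AUTO_CONFIDENCE";

/// Pure helper: only the exact string "1" counts as opted in.
fn flag_is_on(raw: Option<&str>) -> bool {
    raw == Some("1")
}

/// Single place that reads the environment, so every call site sees the same flag.
fn auto_confidence_enabled_sketch() -> bool {
    flag_is_on(env::var(ENV_AUTO_CONFIDENCE).ok().as_deref())
}

/// Hypothetical gate: when the flag is off, the caller's value passes through
/// verbatim; when it is on, the derived value is computed and used instead.
fn derive_sketch(caller_confidence: f64, enabled: bool, derived: impl FnOnce() -> f64) -> f64 {
    if enabled {
        derived()
    } else {
        caller_confidence
    }
}
```

A call site could then write derive_sketch(row_confidence, auto_confidence_enabled_sketch(), || compute_from_signals()), where compute_from_signals is a placeholder. Keeping the string comparison in a pure helper lets tests cover values such as "true" or an unset variable without mutating the process environment.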