Module calibration

v0.34: Calibration scoring over resolved predictions.

A Prediction carries the predictor’s confidence in the expected outcome (a number on [0, 1]). When a Resolution records what actually happened, the resolver also records matched_expected (a bool). Together those two facts let us compute, per actor:

Hit rate: fraction of resolved predictions that matched.
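Brier score: mean of (confidence - matched)^2 across the resolved subset, where matched ∈ {0, 1}. Lower is better. Brier = 0 means every prediction was fully confident and correct; 0.25 is what a constant confidence of 0.5 scores, the usual chance-level baseline for a binary outcome; 1.0 is maximally wrong (fully confident and always wrong).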
Log score: mean of log(p), where p is the probability assigned to the actual outcome: confidence if the prediction matched, 1 - confidence otherwise. We clip p to [1e-9, 1 - 1e-9] to avoid -∞. Higher (closer to 0) is better.
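
As an illustration of the three definitions above, here is a minimal sketch that scores a slice of resolved predictions. The `Resolved` and `Scores` types and the `score` function are hypothetical stand-ins written for this example; they are not the module's `CalibrationRecord` or its real function signatures, which are not shown on this page.

```rust
/// Hypothetical flattened view of one resolved prediction. The real
/// `Prediction` and `Resolution` types are not reproduced here.
#[derive(Debug, Clone, Copy)]
pub struct Resolved {
    /// Stated confidence in the expected outcome, in [0, 1].
    pub confidence: f64,
    /// Whether the expected outcome actually happened.
    pub matched_expected: bool,
}

/// Hypothetical summary; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Scores {
    pub n: usize,
    pub hit_rate: f64,
    pub brier: f64,
    pub log_score: f64,
}

/// Clipping bound from the module description.
const EPS: f64 = 1e-9;

/// Scores the resolved subset. Returns `None` when nothing is resolved,
/// since the means are undefined for an empty set.
pub fn score(resolved: &[Resolved]) -> Option<Scores> {
    if resolved.is_empty() {
        return None;
    }
    let n = resolved.len() as f64;
    let mut hits = 0usize;
    let mut brier_sum = 0.0;
    let mut log_sum = 0.0;
    for r in resolved {
        let outcome = if r.matched_expected { 1.0 } else { 0.0 };
        if r.matched_expected {
            hits += 1;
        }
        // Brier term: squared gap between confidence and the 0/1 outcome.
        brier_sum += (r.confidence - outcome).powi(2);
        // Probability the predictor assigned to what actually happened.
        let p = if r.matched_expected {
            r.confidence
        } else {
            1.0 - r.confidence
        };
        // Clip before taking the log so a confident miss stays finite.
        log_sum += p.clamp(EPS, 1.0 - EPS).ln();
    }
    Some(Scores {
        n: resolved.len(),
        hit_rate: hits as f64 / n,
        brier: brier_sum / n,
        log_score: log_sum / n,
    })
}
```

In this sketch, an actor who always states 0.5 and is right half the time gets a Brier score of 0.25 and a log score of ln(0.5), while a single fully confident miss scores a Brier of 1.0 and, thanks to the clip, a finite log score.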

These are derived signals: they are never written to disk and are always recomputed from the canonical predictions and resolutions collections. That keeps the kernel ledger as the single source of truth and avoids stale calibration caches.

Calibration is the move that makes Vela an epistemic ledger rather than a knowledge graph: every actor accumulates a public, reproducible track record of how well their stated beliefs match reality.

Structs§

CalibrationRecord: Per-actor calibration summary computed over the resolved subset of the actor’s predictions.
ExpirationReport: v0.40.1: report from one expiration pass.

Functions§

calibration_for_actor: Convenience: calibration for a single actor.
calibration_records: Compute calibration records for every actor that has at least one prediction in the frontier.
expire_overdue_predictions: v0.40.1: walk every prediction in the project and mark as expired_unresolved any prediction whose resolves_by is in the past and that has no associated Resolution. Emits one prediction.expired_unresolved event per newly expired prediction.
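
The expiration pass described for expire_overdue_predictions can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Status`, `Prediction`, and `Event` types, the `u64` timestamp, the set of resolved ids, and the name `expire_overdue_sketch` are hypothetical, and the real function returns an `ExpirationReport` whose fields are not shown on this page.

```rust
use std::collections::HashSet;

/// Hypothetical lifecycle states; the real status representation is not shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Open,
    ExpiredUnresolved,
}

/// Hypothetical minimal prediction record.
#[derive(Debug, Clone)]
pub struct Prediction {
    pub id: String,
    /// Assumed here to be a Unix timestamp in seconds.
    pub resolves_by: u64,
    pub status: Status,
}

/// Hypothetical event record emitted by the pass.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub kind: &'static str,
    pub prediction_id: String,
}

/// Marks overdue, unresolved predictions as expired and returns one event
/// per prediction that was newly expired by this call.
pub fn expire_overdue_sketch(
    predictions: &mut [Prediction],
    resolved_ids: &HashSet<String>,
    now: u64,
) -> Vec<Event> {
    let mut events = Vec::new();
    for p in predictions.iter_mut() {
        let overdue = p.resolves_by < now;
        let unresolved = !resolved_ids.contains(&p.id);
        // Only still-open predictions transition, so a repeated pass
        // does not emit duplicate events.
        if overdue && unresolved && p.status == Status::Open {
            p.status = Status::ExpiredUnresolved;
            events.push(Event {
                kind: "prediction.expired_unresolved",
                prediction_id: p.id.clone(),
            });
        }
    }
    events
}
```

Checking the current status before transitioning is what makes the sketch idempotent: running it twice with the same `now` yields events only on the first pass, which matches the "per newly expired prediction" wording.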