
Module calibration


C3 (#158) — adversarial calibration loop.

Closed-loop calibration that drives the synthetic engine’s tunable knobs until a chosen gap metric (measured against a reference corpus or against a target value) converges. Builds on crate::enhancement::AutoTuner for patch proposals and crate::behavioral_fidelity::compute_report for the loss signal. The “adversarial” framing matches a GAN-style dynamic (generator config θ vs discriminator gap-metric L), but with a fixed analytic discriminator instead of a co-trained model, which makes the loop straightforward to reason about and debug.

See docs/design/2026-05-27-c3-adversarial-calibration-design.md for the broader plan and the validation strategy.

This module ships Piece 1 of the C3 plan:

Pieces 2 (iteration controller), 3 (history persistence), 4 (datasynth-data calibrate CLI), and 5 (safety rails) land in follow-up commits.

Structs§

CalibrationConfig
Loop-level config.
CalibrationHistory
Persistable snapshot of a CalibrationLoop’s trajectory.
CalibrationKnob
One tunable engine parameter.
CalibrationLoop
The iteration controller. Owns the knobs, objective, config, and the accumulated history.
CalibrationObjective
One iterable target: which scalar drives the loop + optional convergence threshold.
ClipCounts
EvaluatorError
Generic error wrapper so the iteration loop can propagate underlying engine + IO failures without the loop being tied to any one error type.
GreedyKnobProposer
Coordinate-descent proposer: pick the next knob round-robin and propose its current value ± max_step, stepping in the direction that last improved the loss (default +). After a step that didn’t improve, flip the direction for that knob. Returns None only when every knob has tried both directions without improvement, which makes the loop stop with ProposerExhausted.
KnobClipDiagnostics
Per-knob diagnostics — currently just clip counts.
OscillationDetector
Detect oscillation on a specific knob across recent steps.
ProposedPatch
What a proposer suggests: a knob to change and what value to try.
RoundRobinProposer
Stateless proposer that just steps the next knob in round-robin order by +max_step each time. Mostly useful as a smoke test fixture (or for the mock-evaluator unit tests).
StepReport
One step’s outcome. Persisted to the history so a long-running loop can resume after interruption.
WallClockBudget
Wall-clock budget: wraps an Instant start time and a budget duration. The loop calls .expired() between steps and breaks when it returns true.
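
The coordinate-descent behaviour described for GreedyKnobProposer can be illustrated with a minimal sketch. Everything below is hypothetical and simplified for illustration: `Knob`, `Direction`, `GreedySketch`, `feedback`, and `minimise_demo` are not the crate's types or signatures, knobs are reduced to plain `f64` values, and bounds are ignored.

```rust
/// Hypothetical, simplified knob: one f64 value with a fixed step size.
#[derive(Debug, Clone)]
struct Knob {
    value: f64,
    max_step: f64,
}

/// Per-knob search state (illustrative only).
#[derive(Debug, Clone, Copy)]
struct Direction {
    sign: f64,             // +1.0 or -1.0; starts at +1.0 (the default direction)
    failed_directions: u8, // consecutive non-improving directions tried, 0..=2
}

struct GreedySketch {
    next: usize,
    dirs: Vec<Direction>,
}

impl GreedySketch {
    fn new(n_knobs: usize) -> Self {
        let start = Direction { sign: 1.0, failed_directions: 0 };
        Self { next: 0, dirs: vec![start; n_knobs] }
    }

    /// Returns (knob index, candidate value), or None once every knob has
    /// failed to improve in both directions.
    fn propose(&mut self, knobs: &[Knob]) -> Option<(usize, f64)> {
        let n = knobs.len();
        for offset in 0..n {
            let i = (self.next + offset) % n;
            if self.dirs[i].failed_directions < 2 {
                self.next = (i + 1) % n; // continue round-robin after this knob
                let step = self.dirs[i].sign * knobs[i].max_step;
                return Some((i, knobs[i].value + step));
            }
        }
        None
    }

    /// Record whether the last step on knob `i` improved the loss.
    fn feedback(&mut self, i: usize, improved: bool) {
        let d = &mut self.dirs[i];
        if improved {
            d.failed_directions = 0; // keep going the same way
        } else {
            d.sign = -d.sign; // flip direction for this knob
            d.failed_directions += 1;
        }
    }
}

/// Minimise (x - 3)^2 + (y + 1)^2 from (0, 0) with unit steps.
fn minimise_demo() -> (Vec<f64>, f64) {
    let loss = |k: &[Knob]| (k[0].value - 3.0).powi(2) + (k[1].value + 1.0).powi(2);
    let mut knobs = vec![
        Knob { value: 0.0, max_step: 1.0 },
        Knob { value: 0.0, max_step: 1.0 },
    ];
    let mut proposer = GreedySketch::new(knobs.len());
    let mut best = loss(&knobs);
    while let Some((i, candidate)) = proposer.propose(&knobs) {
        let previous = knobs[i].value;
        knobs[i].value = candidate;
        let current = loss(&knobs);
        let improved = current < best;
        if improved {
            best = current;
        } else {
            knobs[i].value = previous; // reject the step
        }
        proposer.feedback(i, improved);
    }
    (knobs.iter().map(|k| k.value).collect(), best)
}
```

In this toy run the search stops at (3.0, -1.0) with loss 0.0, after both knobs have failed to improve in each direction. Resetting the failure count on any improvement is one possible reading of "tried both directions without improvement"; the real proposer may track this differently.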

Enums§

HistoryError
Errors from loading a saved history.
KnobBounds
Allowable range for a knob’s value.
KnobClipResult
Outcome of CalibrationKnob::clip: whether the proposed value was inside bounds, or was clamped to a bound.
KnobValue
One value a knob can hold.
ObjectiveMetric
What we’re minimising.
RollbackPolicy
What to do when a step makes the multi-seed mean loss WORSE than the best-seen value (by more than the noise floor).
StepOutcome
What the loop did with the proposed step.
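
As a rough illustration of two of the enums above, the sketch below shows clamping a proposed value to a range and testing whether a step counts as a regression. The names `Bounds`, `ClipResult`, `clip`, and `is_regression`, along with the variant names and the choice of an inclusive `f64` range, are assumptions made for this example; the actual variants of KnobBounds, KnobClipResult, and RollbackPolicy are not shown on this page.

```rust
/// Hypothetical bounds: an inclusive floating-point range.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: f64,
    max: f64,
}

/// Hypothetical clip outcome; the real enum's variants may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ClipResult {
    InBounds(f64),
    ClampedToMin(f64),
    ClampedToMax(f64),
}

/// Clamp `proposed` into `bounds` and report which case applied.
/// Note: a NaN input compares false on both checks and passes through as InBounds.
fn clip(bounds: Bounds, proposed: f64) -> ClipResult {
    if proposed < bounds.min {
        ClipResult::ClampedToMin(bounds.min)
    } else if proposed > bounds.max {
        ClipResult::ClampedToMax(bounds.max)
    } else {
        ClipResult::InBounds(proposed)
    }
}

/// True when the multi-seed mean loss is worse than the best-seen loss
/// by more than the noise floor. An empty slice is never a regression.
fn is_regression(seed_losses: &[f64], best_seen: f64, noise_floor: f64) -> bool {
    if seed_losses.is_empty() {
        return false;
    }
    let mean = seed_losses.iter().sum::<f64>() / seed_losses.len() as f64;
    mean > best_seen + noise_floor
}
```

For example, with bounds 0.0 to 1.0 a proposal of 3.0 is clamped to the upper bound, and seed losses of 1.2 and 1.4 against a best-seen loss of 1.0 with a noise floor of 0.1 count as a regression, which is the situation a rollback policy would then act on.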

Constants§

HISTORY_SCHEMA_VERSION
Current persistence schema version. Bump on any breaking change to StepReport / KnobValue / CalibrationObjective serde shape. load rejects mismatched versions.
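
The version gate described above might look roughly like the following. This is a sketch under stated assumptions: `SCHEMA_VERSION` and its value, `SavedHistory`, `LoadError`, and `validate` are hypothetical stand-ins, and the deserialization step that the real load performs is omitted.

```rust
/// Hypothetical stand-in for HISTORY_SCHEMA_VERSION; the real value is not shown here.
const SCHEMA_VERSION: u32 = 1;

/// Hypothetical error type; the real HistoryError variants may differ.
#[derive(Debug, PartialEq)]
enum LoadError {
    VersionMismatch { found: u32, expected: u32 },
}

/// Hypothetical, heavily reduced history snapshot.
#[derive(Debug)]
struct SavedHistory {
    schema_version: u32,
    losses: Vec<f64>,
}

/// Accept a decoded snapshot only if its version matches the current one.
fn validate(history: SavedHistory) -> Result<SavedHistory, LoadError> {
    if history.schema_version != SCHEMA_VERSION {
        return Err(LoadError::VersionMismatch {
            found: history.schema_version,
            expected: SCHEMA_VERSION,
        });
    }
    Ok(history)
}
```

Rejecting on any mismatch, rather than attempting a migration, keeps a resumed loop from misreading step reports written under an older shape.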

Traits§

Evaluator
Runs a generation and evaluation for a given knob state and seed.
Proposer
Strategy for choosing the next knob + value to try.
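
To show how the two traits could fit together, here is a minimal sketch of a loop driven by a mock evaluator and a round-robin proposer, with a wall-clock check between steps. The trait method signatures, `Budget`, `QuadraticMock`, `RoundRobin`, and `run` are all assumptions for illustration; the crate's actual Evaluator and Proposer signatures are not shown on this page, and knobs are reduced to plain `f64` values.

```rust
use std::time::{Duration, Instant};

/// Hypothetical evaluator shape: knob values and a seed in, scalar loss out.
trait Evaluator {
    fn evaluate(&mut self, knobs: &[f64], seed: u64) -> Result<f64, String>;
}

/// Hypothetical proposer shape: picks a knob index and a value to try.
trait Proposer {
    fn propose(&mut self, knobs: &[f64]) -> Option<(usize, f64)>;
}

/// Wall-clock budget: a start time plus an allowed duration.
struct Budget {
    start: Instant,
    limit: Duration,
}

impl Budget {
    fn expired(&self) -> bool {
        self.start.elapsed() >= self.limit
    }
}

/// Mock evaluator: squared distance to a fixed target; ignores the seed.
struct QuadraticMock {
    target: Vec<f64>,
}

impl Evaluator for QuadraticMock {
    fn evaluate(&mut self, knobs: &[f64], _seed: u64) -> Result<f64, String> {
        if knobs.len() != self.target.len() {
            return Err("knob count mismatch".to_string());
        }
        Ok(knobs.iter().zip(&self.target).map(|(k, t)| (k - t).powi(2)).sum::<f64>())
    }
}

/// Steps the next knob by +step on each call, in round-robin order.
struct RoundRobin {
    next: usize,
    step: f64,
}

impl Proposer for RoundRobin {
    fn propose(&mut self, knobs: &[f64]) -> Option<(usize, f64)> {
        if knobs.is_empty() {
            return None;
        }
        let i = self.next % knobs.len();
        self.next = i + 1;
        Some((i, knobs[i] + self.step))
    }
}

/// Propose, evaluate, keep the step only if the loss improves.
fn run(
    evaluator: &mut dyn Evaluator,
    proposer: &mut dyn Proposer,
    mut knobs: Vec<f64>,
    max_steps: usize,
    budget: &Budget,
) -> Result<(Vec<f64>, f64), String> {
    let mut best = evaluator.evaluate(&knobs, 0)?;
    for step in 0..max_steps {
        if budget.expired() {
            break; // checked between steps, never mid-evaluation
        }
        let (i, candidate) = match proposer.propose(&knobs) {
            Some(patch) => patch,
            None => break, // proposer has nothing left to try
        };
        let previous = knobs[i];
        knobs[i] = candidate;
        let loss = evaluator.evaluate(&knobs, step as u64)?;
        if loss < best {
            best = loss;
        } else {
            knobs[i] = previous; // revert a non-improving step
        }
    }
    Ok((knobs, best))
}
```

Starting from `[0.0, 0.0]` with a target of `[2.0, 1.0]` and a step of 1.0, ten steps are enough for this toy loop to settle on the target with loss 0.0. Keeping the evaluator behind a trait is what allows a mock like this to stand in for a full generation and evaluation run in unit tests.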