The core optimization loop (Phase 3).
High-level flow (per the approved plan):
- Run the agent on the dataset while collecting rich traces.
- Score outputs (mechanical rules + optional LLM-as-Judge).
- Diagnose failures using a strong model + policy + traces + code bundle.
- Generate N targeted candidate fixes (different focus areas).
- Validate candidates safely (cargo check + clippy + smoke tests in a worktree).
- Evaluate survivors on the full dataset.
- Accept only net-positive changes with regression guards + holdout set.
This module is currently a structural skeleton. Real implementations of the individual steps will be filled in as the analysis crate and LLM client mature.
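As an illustration of the last step in the flow above (accept only net-positive changes, with regression guards and a holdout set), here is a minimal sketch. Because the module is still a skeleton, everything in it is an assumption made for the example: the `EvalScores` and `AcceptPolicy` types, their fields, and the `count_regressions` and `should_accept` functions are hypothetical and are not part of this crate's API.

```rust
/// Hypothetical per-example scores from one evaluation pass.
/// (Illustrative only; not a type from this crate.)
#[derive(Debug, Clone)]
pub struct EvalScores {
    /// One score per dataset example, in dataset order.
    pub per_example: Vec<f64>,
}

impl EvalScores {
    /// Mean score; defined as 0.0 for an empty set.
    pub fn mean(&self) -> f64 {
        if self.per_example.is_empty() {
            return 0.0;
        }
        self.per_example.iter().sum::<f64>() / self.per_example.len() as f64
    }
}

/// Hypothetical acceptance thresholds (assumed names and semantics).
#[derive(Debug, Clone)]
pub struct AcceptPolicy {
    /// Minimum improvement in mean score on the main dataset.
    pub min_gain: f64,
    /// Per-example drop larger than this counts as a regression.
    pub regression_tolerance: f64,
    /// Maximum number of regressed examples allowed.
    pub max_regressions: usize,
}

/// Counts examples whose score dropped by more than `tolerance`.
pub fn count_regressions(baseline: &EvalScores, candidate: &EvalScores, tolerance: f64) -> usize {
    baseline
        .per_example
        .iter()
        .zip(&candidate.per_example)
        .filter(|(b, c)| **b - **c > tolerance)
        .count()
}

/// Returns true only if the candidate is net-positive on the main dataset,
/// stays within the regression guard, and does not get worse on the holdout set.
pub fn should_accept(
    policy: &AcceptPolicy,
    base_main: &EvalScores,
    cand_main: &EvalScores,
    base_holdout: &EvalScores,
    cand_holdout: &EvalScores,
) -> bool {
    // Score vectors of different lengths are not comparable; reject.
    if base_main.per_example.len() != cand_main.per_example.len()
        || base_holdout.per_example.len() != cand_holdout.per_example.len()
    {
        return false;
    }
    let gain = cand_main.mean() - base_main.mean();
    let regressions = count_regressions(base_main, cand_main, policy.regression_tolerance);
    let holdout_ok = cand_holdout.mean() >= base_holdout.mean();
    gain >= policy.min_gain && regressions <= policy.max_regressions && holdout_ok
}
```

In this sketch a candidate that raises one example's score at the cost of an equal drop on another is rejected twice over: its mean gain is zero and it trips the regression guard. Keeping the holdout check separate from the main-dataset gain is one way to avoid accepting fixes that only overfit the examples used for diagnosis.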
Structs§
- Candidate - A proposed improvement generated during an optimization iteration.
- ModelProvenance
- OptimizationRun - A single optimization experiment / iteration result.
- OptimizeConfig - Configuration for a single optimization run.
Enums§
Functions§
- mechanical_score - Very rough mechanical scorer for the example agent. Gives higher score if the output is not the echo fallback.
- run_optimization - Placeholder for the full optimization engine. In a real implementation this would orchestrate: