Evolution engine: the glue that turns primitives into a working A/B loop.
Called from cmd_record (after each session is inserted) and from
cmd_roll (manual challenger generation).
Structs§
- DeploymentState - What was deployed at the start of the most recent session.
Functions§
- collect_scores_for_config - Load all signals tied to sessions that ran under config_id, group by session, and collapse each session’s signals into a single 0..=1 score.
- evaluate_promotion - Evaluate the running experiment (if any) against the promotion threshold. Returns the experiment + decision, or None if no experiment is running.
- generate_challenger - Convenience wrapper around generate_challenger_with_picker that uses the default LLM-aware picker. Callers that may not have an LLM should call generate_challenger_with_picker with picker_for_environment(false).
- generate_challenger_with_picker - Generate a challenger from the current champion using one mutator, persist it as an AgentConfig row with role=Challenger, start a new Experiment with traffic_share=0.5 (a proper A/B split: half of new sessions go to each variant, decided by the SessionStart hook), and apply the challenger config to disk via the adapter as the initial deployment for the next session.
- handle_session_start - Decide which variant to deploy for a fresh session, apply that variant’s config to disk via the adapter, and persist the choice in the project’s deployment-state file.
- picker_for_environment - Build the mutator picker, omitting LLM-dependent mutators if no LLM is reachable. Without this, the default 50%-LLM-rewrite weight means roughly half of all challenger generations would silently fail to mutate anything when the user has no Anthropic key and no local Ollama.
- promote_challenger - Promote the challenger: mark experiment as Promoted, swap project’s champion pointer, and re-apply the new champion to disk via the adapter.
- read_deployment_state - Read deployment state for a project, if any.
- resolve_active_deployment - Figure out which variant + config_id a new session should be tagged with.
- should_evolve - Default scheduler: trigger challenger generation when enough sessions have accumulated since the last champion change. Skips if an experiment is already running.
- write_deployment_state - Write deployment state for a project.
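
The sketch below illustrates three steps of the A/B loop described above: collapsing per-session signals into a 0..=1 score, splitting sessions between variants by traffic_share, and the scheduler check. It is not this module's actual code. The Signal and Variant types, the function names collapse_scores, choose_variant and should_evolve_sketch, and the min_sessions parameter are all hypothetical, and the clamped arithmetic mean is an assumed aggregation; the real functions may have different signatures and may weight signals differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical signal row: the session it belongs to plus a raw value.
#[derive(Debug, Clone)]
pub struct Signal {
    pub session_id: u64,
    pub value: f64,
}

/// Group signals by session and collapse each group into one score in 0..=1.
/// Assumption: the collapse is an arithmetic mean clamped to [0, 1].
pub fn collapse_scores(signals: &[Signal]) -> BTreeMap<u64, f64> {
    let mut sums: BTreeMap<u64, (f64, u32)> = BTreeMap::new();
    for s in signals {
        let entry = sums.entry(s.session_id).or_insert((0.0, 0));
        entry.0 += s.value;
        entry.1 += 1;
    }
    sums.into_iter()
        .map(|(id, (sum, n))| (id, (sum / f64::from(n)).clamp(0.0, 1.0)))
        .collect()
}

/// Hypothetical stand-in for the two sides of an experiment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Variant {
    Champion,
    Challenger,
}

/// Pick the variant for a new session. `draw` is a uniform sample in [0, 1)
/// supplied by the caller, so the function stays deterministic and testable.
/// With traffic_share = 0.5, half of the draws select the challenger.
pub fn choose_variant(traffic_share: f64, draw: f64) -> Variant {
    if draw < traffic_share {
        Variant::Challenger
    } else {
        Variant::Champion
    }
}

/// Scheduler check: evolve only when no experiment is running and enough
/// sessions have accumulated since the last champion change.
/// `min_sessions` is an assumed, caller-supplied threshold.
pub fn should_evolve_sketch(
    sessions_since_champion_change: u32,
    min_sessions: u32,
    experiment_running: bool,
) -> bool {
    !experiment_running && sessions_since_champion_change >= min_sessions
}
```

Passing the random draw in as an argument, rather than sampling inside choose_variant, keeps the sketch free of external crates and makes the 50/50 split easy to test.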