Function gen_eval 

pub fn gen_eval(
    worker: RuntimeTask,
    eval: RuntimeTask,
    max_iters: usize,
    extract_skill_on_pass: bool,
) -> WorkflowSpec

The generate→evaluate quality gate expressed as a workflow. A Loop worker node runs the task, re-running it up to max_iters times and stopping early when the worker emits a loop_continue=false self-signal. A Verify eval node follows: it scores the worker’s output against the goal and criteria and emits a structured verdict, using crate::harness::verdict_output_schema as its output_schema.
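
The control flow described above can be modelled in a few lines of plain Rust. This is only an illustrative sketch, not the crate's implementation: WorkerStep, Verdict, run_gen_eval, and the feedback field are hypothetical names introduced here, and closures stand in for the worker and eval RuntimeTask values. It shows the worker loop with its early exit, followed by a single evaluation of the last output.

```rust
/// Hypothetical stand-in for one worker pass: the output text plus the
/// `loop_continue` self-signal described above.
#[derive(Debug, Clone, PartialEq)]
pub struct WorkerStep {
    pub output: String,
    pub loop_continue: bool,
}

/// Hypothetical stand-in for the structured verdict. Only `passed` comes
/// from the description; `feedback` is an assumption for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Verdict {
    pub passed: bool,
    pub feedback: String,
}

/// Runs `worker` up to `max_iters` times, stopping early when it reports
/// `loop_continue == false`, then scores the last output once with `eval`.
/// Returns the last output, the verdict, and the number of worker passes,
/// or `None` when `max_iters` is 0 and the worker never ran.
pub fn run_gen_eval<W, E>(
    mut worker: W,
    eval: E,
    max_iters: usize,
) -> Option<(String, Verdict, usize)>
where
    W: FnMut(usize) -> WorkerStep,
    E: FnOnce(&str) -> Verdict,
{
    let mut last: Option<String> = None;
    let mut iters = 0;
    for i in 0..max_iters {
        let step = worker(i);
        iters = i + 1;
        last = Some(step.output);
        if !step.loop_continue {
            break; // the worker signalled that it is done
        }
    }
    let output = last?;
    // The evaluator sees only the output text, never the worker's reasoning.
    let verdict = eval(&output);
    Some((output, verdict, iters))
}
```

In this model the evaluator runs exactly once, after the loop has finished, which mirrors the Loop-then-Verify shape of the workflow.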

This is the declarative substrate form of the former EvalPipeline (0.5.0 fold, OS-axis #6). The eval node is a Verify agent: role_defaults gives it ReadOnly and ContextInheritance::None, so it does not inherit the worker’s reasoning, which makes it more resistant to bias. It evaluates only the worker’s output, which is carried in through its task goal. The verdict’s passed field is the gate.

For the iterative retry-with-feedback variant, in which the worker is re-run with the eval’s feedback folded into the next attempt, use the SDK HarnessLoop. It drives the loop with the same crate::harness::build_eval_messages and crate::harness::parse_verdict primitives. The kernel Loop re-arms only a single node, so per-iteration eval is necessarily SDK-driven.
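
The retry-with-feedback shape can be sketched the same way. Again this is a hedged illustration and not HarnessLoop itself: EvalResult, retry_with_feedback, and the free-text feedback field are hypothetical, and the closures stand in for whatever the SDK does with build_eval_messages and parse_verdict. The point is only that the evaluator runs inside the loop and its feedback reaches the next worker attempt.

```rust
/// Hypothetical verdict shape: `passed` is the gate, `feedback` is an
/// assumed free-text field that is fed into the next attempt.
#[derive(Debug, Clone, PartialEq)]
pub struct EvalResult {
    pub passed: bool,
    pub feedback: String,
}

/// Re-runs `worker` until `eval` passes or `max_iters` attempts are spent.
/// The first attempt receives `None`; each later attempt receives the
/// previous verdict's feedback. Returns the last output, its verdict, and
/// the number of attempts made, or `None` when `max_iters` is 0.
pub fn retry_with_feedback<W, E>(
    mut worker: W,
    mut eval: E,
    max_iters: usize,
) -> Option<(String, EvalResult, usize)>
where
    W: FnMut(Option<&str>) -> String,
    E: FnMut(&str) -> EvalResult,
{
    let mut feedback: Option<String> = None;
    let mut last: Option<(String, EvalResult, usize)> = None;
    for attempt in 1..=max_iters {
        let output = worker(feedback.as_deref());
        let verdict = eval(&output);
        let passed = verdict.passed;
        // Keep the feedback so the next attempt can use it.
        feedback = Some(verdict.feedback.clone());
        last = Some((output, verdict, attempt));
        if passed {
            break; // gate satisfied, no further attempts needed
        }
    }
    last
}
```

Compared with the single-pass form, the difference is where the evaluator sits: here it is called once per attempt, which is the per-iteration eval that a loop re-arming a single node cannot express.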