pub struct SkillEvaluationConfig {
    pub enabled: bool,
    pub provider: ProviderName,
    pub quality_threshold: f32,
    pub weight_correctness: f32,
    pub weight_reusability: f32,
    pub weight_specificity: f32,
    pub fail_open_on_error: bool,
    pub timeout_ms: u64,
}
External-feedback skill evaluator configuration, nested under [skills.evaluation] in TOML.
When enabled = true, generated SKILL.md files are scored by a critic LLM before being
written to disk. Skills below quality_threshold are rejected.
§Weights
weight_correctness + weight_reusability + weight_specificity must equal 1.0 ± 1e-3.
Starting defaults (0.50 / 0.25 / 0.25) are intuition-based and will be tuned after
real-world telemetry is collected.
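The weight rule above can be sketched as a small validation helper. This is a minimal illustration, not the crate's actual code: the `Weights` struct, its `validate` and `composite` methods, and the constant name are hypothetical, and the composite is assumed to be a plain weighted sum of per-criterion scores in the range 0.0 to 1.0.

```rust
/// Hypothetical stand-in for the three weight fields of `SkillEvaluationConfig`.
#[derive(Debug, Clone, Copy)]
pub struct Weights {
    pub correctness: f32,
    pub reusability: f32,
    pub specificity: f32,
}

/// Tolerance taken from the documented rule: the sum must equal 1.0 ± 1e-3.
const WEIGHT_SUM_TOLERANCE: f32 = 1e-3;

impl Weights {
    /// Returns an error describing the offending sum when the weights
    /// do not add up to 1.0 within the tolerance.
    pub fn validate(&self) -> Result<(), String> {
        let sum = self.correctness + self.reusability + self.specificity;
        if (sum - 1.0).abs() <= WEIGHT_SUM_TOLERANCE {
            Ok(())
        } else {
            Err(format!(
                "weights sum to {}, expected 1.0 within {}",
                sum, WEIGHT_SUM_TOLERANCE
            ))
        }
    }

    /// Assumed composite score: a weighted sum of the per-criterion scores.
    pub fn composite(&self, correctness: f32, reusability: f32, specificity: f32) -> f32 {
        self.correctness * correctness
            + self.reusability * reusability
            + self.specificity * specificity
    }
}
```

With the starting defaults (0.50 / 0.25 / 0.25), per-criterion scores of 0.8, 0.4 and 0.6 would combine to roughly 0.65 under this assumed formula, which clears the default quality_threshold of 0.60.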
§Example (TOML)
[skills.evaluation]
enabled = true
provider = "fast"
quality_threshold = 0.60
fail_open_on_error = true
timeout_ms = 15000
§Fields
enabled: bool
Enable the evaluator gate. Default: false.
provider: ProviderName
Provider name for the critic LLM. Empty = primary provider.
quality_threshold: f32
Minimum composite score required to accept a generated skill. Default: 0.60.
weight_correctness: f32
Weight for correctness in the composite score. Default: 0.50.
weight_reusability: f32
Weight for reusability in the composite score. Default: 0.25.
weight_specificity: f32
Weight for specificity in the composite score. Default: 0.25.
fail_open_on_error: bool
Fail-open policy: accept skill when the evaluator call fails. Default: true.
timeout_ms: u64
Maximum wait for the critic LLM in milliseconds. Default: 15000.
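The interaction between quality_threshold, fail_open_on_error and timeout_ms can be illustrated with a short sketch. The `GatePolicy` struct, the `CriticOutcome` enum and the `accept_skill` function are hypothetical names introduced only for this example. The sketch also assumes that a timeout is treated the same way as any other evaluator failure, which the documentation above does not state explicitly.

```rust
use std::time::Duration;

/// Hypothetical subset of `SkillEvaluationConfig` relevant to the accept/reject decision.
#[derive(Debug, Clone)]
pub struct GatePolicy {
    pub quality_threshold: f32,
    pub fail_open_on_error: bool,
    pub timeout_ms: u64,
}

impl Default for GatePolicy {
    /// Mirrors the documented defaults: 0.60, true, 15000.
    fn default() -> Self {
        Self {
            quality_threshold: 0.60,
            fail_open_on_error: true,
            timeout_ms: 15_000,
        }
    }
}

impl GatePolicy {
    /// Converts the millisecond setting into a `Duration` for use with a timer.
    pub fn timeout(&self) -> Duration {
        Duration::from_millis(self.timeout_ms)
    }
}

/// Hypothetical result of one critic call.
pub enum CriticOutcome {
    /// The critic returned a composite score.
    Scored(f32),
    /// The call errored or (by assumption) exceeded the timeout.
    Failed,
}

/// Returns true when the generated skill should be written to disk.
pub fn accept_skill(policy: &GatePolicy, outcome: &CriticOutcome) -> bool {
    match outcome {
        // Skills below the threshold are rejected; equal or above is accepted.
        CriticOutcome::Scored(score) => *score >= policy.quality_threshold,
        // On failure, the fail-open flag decides.
        CriticOutcome::Failed => policy.fail_open_on_error,
    }
}
```

Under these assumptions, the default policy accepts a skill when the critic is unreachable. Setting fail_open_on_error = false turns the gate into a hard requirement, so a skill is rejected when the evaluator call fails.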