Structs
- BenchmarkCase - A single benchmark case.
- BenchmarkSet - A set of benchmark cases loaded from a TOML file.
- CaseScore - Score for a single benchmark case.
- ConfigSnapshot - Snapshot of all tunable parameters for a single experiment arm.
- EvalReport - Aggregate evaluation report returned by Evaluator::evaluate.
- Evaluator - Evaluates a subject model against a benchmark dataset using an LLM judge.
- ExperimentEngine - Autonomous parameter-tuning engine.
- ExperimentResult
- ExperimentSessionReport - Final report produced by ExperimentEngine::run.
- GenerationOverrides - Partial LLM generation parameter overrides for experiment variation injection.
- GridStep - Systematic grid sweep: iterates each parameter through its discrete steps, skipping values already visited.
- JudgeOutput - Structured output returned by the judge LLM.
- Neighborhood - Perturbation strategy around the current baseline.
- ParameterRange - A continuous or discrete range for a single tunable parameter.
- Random - Uniform random sampling within parameter bounds.
- SearchSpace - The set of parameter ranges that define the experiment search space.
- Variation
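
The grid sweep described for GridStep (step each parameter through its discrete values and skip anything already visited) can be sketched roughly as follows. This is only an illustration under stated assumptions: DiscreteRange and next_grid_variation are hypothetical stand-ins written for this example, not the crate's actual ParameterRange or GridStep definitions, whose fields and signatures are not shown on this page.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a discrete parameter range
/// (NOT the crate's `ParameterRange`; its real fields are not documented here).
struct DiscreteRange {
    name: &'static str,
    min: f64,
    max: f64,
    step: f64,
}

impl DiscreteRange {
    /// Enumerate the discrete values from `min` to `max`, inclusive.
    fn steps(&self) -> Vec<f64> {
        let count = ((self.max - self.min) / self.step).round() as usize;
        (0..=count).map(|i| self.min + i as f64 * self.step).collect()
    }
}

/// Return the next unvisited (parameter name, value) pair, sweeping the
/// parameters in order. Visited entries are keyed by (parameter index,
/// step index) so that no floating-point values need to be compared.
fn next_grid_variation(
    ranges: &[DiscreteRange],
    visited: &mut HashSet<(usize, usize)>,
) -> Option<(&'static str, f64)> {
    for (p, range) in ranges.iter().enumerate() {
        for (i, value) in range.steps().into_iter().enumerate() {
            // `insert` returns true only if the key was not present yet.
            if visited.insert((p, i)) {
                return Some((range.name, value));
            }
        }
    }
    None // every grid point has been visited
}

fn main() {
    let ranges = [
        DiscreteRange { name: "temperature", min: 0.0, max: 1.0, step: 0.5 },
        DiscreteRange { name: "top_p", min: 0.8, max: 1.0, step: 0.1 },
    ];
    let mut visited = HashSet::new();
    while let Some((name, value)) = next_grid_variation(&ranges, &mut visited) {
        println!("{name} = {value:.1}");
    }
}
```

Keying the visited set by integer step index rather than by the value itself is a choice made for this sketch; it avoids hashing or comparing floats, which Rust's standard HashSet does not support directly.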
Enums
- EvalError - Errors that can occur during benchmark evaluation.
- ExperimentSource
- ParameterKind
- VariationValue
Traits
- VariationGenerator - A strategy for generating parameter variations one at a time.
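
To illustrate what "generating parameter variations one at a time" might look like, here is a minimal sketch of a strategy trait with one deterministic neighborhood-style implementation. Everything in it is an assumption made for illustration: the trait name VariationStrategy, its next_variation method, and the AlternatingNeighborhood type are hypothetical, and the real VariationGenerator trait may have a different signature and operate on richer types than a single f64.

```rust
/// Hypothetical trait shape, for illustration only; the crate's actual
/// `VariationGenerator` signature is not shown on this page and may differ.
trait VariationStrategy {
    /// Propose the next candidate near `baseline`, or `None` when exhausted.
    fn next_variation(&mut self, baseline: f64) -> Option<f64>;
}

/// Deterministic perturbation around a baseline: proposes
/// baseline + delta, baseline - delta, baseline + 2*delta, and so on,
/// clamped to [min, max], for at most `limit` proposals.
struct AlternatingNeighborhood {
    delta: f64,
    min: f64,
    max: f64,
    emitted: u32,
    limit: u32,
}

impl VariationStrategy for AlternatingNeighborhood {
    fn next_variation(&mut self, baseline: f64) -> Option<f64> {
        if self.emitted >= self.limit {
            return None;
        }
        // Distance multiplier grows every two proposals: 1, 1, 2, 2, ...
        let k = (self.emitted / 2 + 1) as f64;
        // Even-numbered proposals go up, odd-numbered proposals go down.
        let sign = if self.emitted % 2 == 0 { 1.0 } else { -1.0 };
        self.emitted += 1;
        Some((baseline + sign * k * self.delta).clamp(self.min, self.max))
    }
}

fn main() {
    // A trait object lets the caller swap strategies without changing the loop.
    let mut strategy: Box<dyn VariationStrategy> = Box::new(AlternatingNeighborhood {
        delta: 0.25,
        min: 0.0,
        max: 1.0,
        emitted: 0,
        limit: 4,
    });
    while let Some(candidate) = strategy.next_variation(0.5) {
        println!("candidate = {candidate}");
    }
}
```

Returning Option from the method mirrors the iterator convention in Rust: the caller pulls one variation per call and stops when the strategy reports that it has nothing left to propose.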