Module experiments

Structs

BenchmarkCase
A single benchmark case.
BenchmarkSet
A set of benchmark cases loaded from a TOML file.
CaseScore
Score for a single benchmark case.
ConfigSnapshot
Snapshot of all tunable parameters for a single experiment arm.
EvalReport
Aggregate evaluation report returned by Evaluator::evaluate.
Evaluator
Evaluates a subject model against a benchmark dataset using an LLM judge.
ExperimentEngine
Autonomous parameter-tuning engine.
ExperimentResult
ExperimentSessionReport
Final report produced by ExperimentEngine::run.
GenerationOverrides
Partial LLM generation parameter overrides for experiment variation injection.
GridStep
Systematic grid sweep: iterates each parameter through its discrete steps, skipping those already visited.
JudgeOutput
Structured output returned by the judge LLM.
Neighborhood
Perturbation strategy around the current baseline.
ParameterRange
A continuous or discrete range for a single tunable parameter.
Random
Uniform random sampling within parameter bounds.
SearchSpace
The set of parameter ranges that define the experiment search space.
Variation

Enums

EvalError
Errors that can occur during benchmark evaluation.
ExperimentSource
ParameterKind
VariationValue

Traits

VariationGenerator
A strategy for generating parameter variations one at a time.
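
The one-at-a-time generation pattern described for VariationGenerator and GridStep can be illustrated with a minimal, self-contained sketch. This is not the crate's actual API: the names `Generator`, `Candidate`, `GridSweep`, `next_candidate`, and `mark_visited` are all hypothetical, and the real trait and struct signatures may differ. It only assumes what the summaries above state, namely that a grid sweep walks each parameter through its discrete steps and skips entries that were already visited.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a single proposed change:
/// one parameter name and one candidate value.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub parameter: String,
    pub value: f64,
}

/// Hypothetical strategy trait: yields one candidate per call,
/// or `None` once the strategy is exhausted.
pub trait Generator {
    fn next_candidate(&mut self) -> Option<Candidate>;
}

/// Hypothetical grid sweep over discrete steps per parameter.
pub struct GridSweep {
    /// (parameter name, discrete steps) in sweep order.
    steps: Vec<(String, Vec<f64>)>,
    param_idx: usize,
    step_idx: usize,
    /// Visited (parameter, value) pairs; floats are keyed by their bit pattern.
    visited: HashSet<(String, u64)>,
}

impl GridSweep {
    pub fn new(steps: Vec<(String, Vec<f64>)>) -> Self {
        GridSweep { steps, param_idx: 0, step_idx: 0, visited: HashSet::new() }
    }

    /// Records a pair as already tried so the sweep will skip it.
    pub fn mark_visited(&mut self, parameter: &str, value: f64) {
        self.visited.insert((parameter.to_string(), value.to_bits()));
    }
}

impl Generator for GridSweep {
    fn next_candidate(&mut self) -> Option<Candidate> {
        while self.param_idx < self.steps.len() {
            let (name, values) = &self.steps[self.param_idx];
            if self.step_idx >= values.len() {
                // Current parameter exhausted: move on to the next one.
                self.param_idx += 1;
                self.step_idx = 0;
                continue;
            }
            let value = values[self.step_idx];
            self.step_idx += 1;
            // `insert` returns false when the pair was already present.
            if self.visited.insert((name.clone(), value.to_bits())) {
                return Some(Candidate { parameter: name.clone(), value });
            }
        }
        None
    }
}
```

Calling `next_candidate` in a loop until it returns `None` drains the sweep. Keying the visited set on `f64::to_bits` sidesteps the fact that `f64` does not implement `Hash`, at the cost of treating `0.0` and `-0.0` as distinct values.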