Experiment engine for adaptive agent behavior testing and hyperparameter tuning.
Re-exports§
pub use benchmark::BenchmarkCase;
pub use benchmark::BenchmarkSet;
pub use engine::ExperimentEngine;
pub use engine::ExperimentSessionReport;
pub use error::EvalError;
pub use evaluator::CaseScore;
pub use evaluator::EvalReport;
pub use evaluator::Evaluator;
pub use evaluator::JudgeOutput;
pub use generator::VariationGenerator;
pub use grid::GridStep;
pub use neighborhood::Neighborhood;
pub use random::Random;
pub use search_space::ParameterRange;
pub use search_space::SearchSpace;
pub use snapshot::ConfigSnapshot;
pub use types::ExperimentResult;
pub use types::ExperimentSource;
pub use types::ParameterKind;
pub use types::Variation;
pub use types::VariationValue;
Modules§
- benchmark
- Benchmark dataset types and TOML loading.
- engine
- Experiment engine — core async loop for autonomous parameter tuning.
- error
- Error types for the experiments module.
- evaluator
- LLM-as-judge evaluator for benchmark datasets.
- generator
- VariationGenerator trait for parameter variation strategies.
- grid
- Grid sweep strategy for parameter variation.
- neighborhood
- Neighborhood perturbation strategy for parameter variation.
- random
- Uniform random sampling strategy for parameter variation.
- search_space
- Search space definition for parameter variation experiments.
- snapshot
- Config snapshot for a single experiment arm.
- types
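
The grid, neighborhood, and random modules above name three common ways to propose parameter values. The sketch below illustrates those general techniques over a single numeric range. It does not use this crate's API: `Bounds`, `grid_sweep`, `neighborhood_candidates`, `Lcg`, and `random_sample` are hypothetical names invented for illustration, and the real `GridStep`, `Neighborhood`, `Random`, and `ParameterRange` types may behave differently.

```rust
/// Hypothetical stand-in for a numeric parameter range (not the crate's `ParameterRange`).
#[derive(Debug, Clone, Copy)]
struct Bounds {
    min: f64,
    max: f64,
}

/// Grid sweep: `steps` evenly spaced values from `min` to `max`, both inclusive.
fn grid_sweep(bounds: Bounds, steps: usize) -> Vec<f64> {
    match steps {
        0 => Vec::new(),
        1 => vec![bounds.min],
        _ => {
            let width = (bounds.max - bounds.min) / (steps - 1) as f64;
            (0..steps).map(|i| bounds.min + width * i as f64).collect()
        }
    }
}

/// Neighborhood perturbation: one candidate `delta` below and one `delta`
/// above the current value, each clamped to stay inside the bounds.
fn neighborhood_candidates(bounds: Bounds, current: f64, delta: f64) -> Vec<f64> {
    vec![
        (current - delta).clamp(bounds.min, bounds.max),
        (current + delta).clamp(bounds.min, bounds.max),
    ]
}

/// Small linear congruential generator so the sketch needs no external crates.
/// Not suitable for anything beyond illustration.
struct Lcg(u64);

impl Lcg {
    /// Returns a value in the half-open interval [0, 1).
    fn next_unit(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits so the quotient is strictly below 1.0.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Uniform random sampling: `n` independent draws from [min, max).
fn random_sample(bounds: Bounds, rng: &mut Lcg, n: usize) -> Vec<f64> {
    (0..n)
        .map(|_| bounds.min + (bounds.max - bounds.min) * rng.next_unit())
        .collect()
}
```

Under these assumptions, a grid sweep covers the range exhaustively at a fixed resolution, neighborhood perturbation explores locally around a known-good value, and random sampling trades coverage guarantees for cheap exploration of wide ranges.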
Structs§
- GenerationOverrides
- Partial LLM generation parameter overrides for experiment variation injection.