//! Inference-only agent interfaces and fitness-evaluation traits.
//!
//! Three core traits, plus the [`Metric`] data type and the
//! [`MetricsProvider`] trait that reports it:
//!
//! - [`BenchableAgent`]: frozen policies (RL-style) that consume an
//!   observation and produce an action. RL training/replay state must be
//!   stripped in a `Frozen*Policy` adapter before benchmarking.
//! - [`FitnessEvaluable`]: the optimizer-on-landscape path, used when the
//!   "environment" is a pure fitness function (Rastrigin, Ackley, MPB).
//! - [`Landscape`]: a self-evaluating numerical landscape; collapses the
//!   evaluator/landscape split when the landscape *is* the fitness
//!   function. Consumed by `rlevo-evolution`'s `FromLandscape` adapter.
//!
//! [`Metric`] and [`MetricsProvider`] live here because [`BenchableAgent`]
//! returns `Vec<Metric>` from its optional `emit_metrics` hook; co-locating
//! them keeps the trait self-contained, with no dependency on the harness
//! crate.
use Rng;
/// Method-specific signal emitted by an agent or aggregator at trial
/// boundaries.
///
/// Each variant carries a `name` that identifies the metric (e.g.
/// `"q_loss"`, `"policy_entropy"`). Names are free-form strings; the harness
/// records them verbatim without normalisation.
/// Trait implemented by agents (and internal collectors) that can report
/// method-specific metrics at trial boundaries.
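///
/// A minimal sketch of how a named metric and a reporting hook could fit
/// together. The variant names (`Scalar`, `Counter`), the
/// `SketchMetricsProvider` trait, its `emit_metrics` method, and `ToyAgent`
/// are hypothetical stand-ins for illustration only; the real `Metric`
/// variants and `MetricsProvider` signature may differ.
///
/// ```rust
/// // Hypothetical stand-in for `Metric`: every variant carries a free-form `name`.
/// #[derive(Debug, Clone, PartialEq)]
/// pub enum SketchMetric {
///     Scalar { name: String, value: f64 },
///     Counter { name: String, value: u64 },
/// }
///
/// impl SketchMetric {
///     // Returns the name exactly as supplied, with no normalisation.
///     pub fn name(&self) -> &str {
///         match self {
///             SketchMetric::Scalar { name, .. } | SketchMetric::Counter { name, .. } => name,
///         }
///     }
/// }
///
/// // Hypothetical stand-in for `MetricsProvider`.
/// pub trait SketchMetricsProvider {
///     // Called at a trial boundary; returns whatever the implementor tracked.
///     fn emit_metrics(&self) -> Vec<SketchMetric>;
/// }
///
/// // Toy agent that remembers its last loss and a step count.
/// pub struct ToyAgent {
///     pub last_q_loss: f64,
///     pub steps: u64,
/// }
///
/// impl SketchMetricsProvider for ToyAgent {
///     fn emit_metrics(&self) -> Vec<SketchMetric> {
///         vec![
///             SketchMetric::Scalar { name: "q_loss".to_string(), value: self.last_q_loss },
///             SketchMetric::Counter { name: "steps".to_string(), value: self.steps },
///         ]
///     }
/// }
/// ```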
/// Minimal inference interface required by an external evaluator.
///
/// Implementors must be deterministic given a fixed RNG state and must not
/// mutate learnable parameters. Internal RNG state (e.g. for stochastic
/// policies) may be mutated, which is why `act` takes `&mut self`.
///
/// The `rng` argument is owned by the harness so reproducibility is
/// guaranteed regardless of the agent's internal RNG discipline.
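///
/// The contract above can be sketched with a frozen epsilon-greedy policy.
/// Everything named here (`SketchRng`, `SplitMix64`, `SketchAgent`,
/// `FrozenEpsGreedy`) is a hypothetical stand-in, and the `act` signature
/// is an assumption rather than the actual trait definition. A small
/// hand-rolled generator replaces the harness RNG so the sketch needs no
/// external crates.
///
/// ```rust
/// // Hypothetical stand-in for the RNG handle the harness passes in.
/// pub trait SketchRng {
///     fn next_u64(&mut self) -> u64;
/// }
///
/// // SplitMix64: a tiny deterministic generator, used only for this sketch.
/// pub struct SplitMix64(pub u64);
///
/// impl SketchRng for SplitMix64 {
///     fn next_u64(&mut self) -> u64 {
///         self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
///         let mut z = self.0;
///         z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
///         z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
///         z ^ (z >> 31)
///     }
/// }
///
/// // Hypothetical inference-only interface: observation in, action out.
/// pub trait SketchAgent {
///     type Observation;
///     type Action;
///     fn act(&mut self, obs: &Self::Observation, rng: &mut dyn SketchRng) -> Self::Action;
/// }
///
/// // Frozen policy: the value table `q` is never written during `act`.
/// pub struct FrozenEpsGreedy {
///     pub q: Vec<Vec<f64>>, // q[state][action], fixed after training
///     pub epsilon: f64,     // exploration probability
///     pub calls: u64,       // non-learnable bookkeeping; the reason for `&mut self`
/// }
///
/// impl SketchAgent for FrozenEpsGreedy {
///     type Observation = usize;
///     type Action = usize;
///
///     fn act(&mut self, obs: &usize, rng: &mut dyn SketchRng) -> usize {
///         self.calls += 1;
///         let row = &self.q[*obs];
///         // Map the top 53 random bits to a uniform f64 in [0, 1).
///         let u = (rng.next_u64() >> 11) as f64 / (1u64 << 53) as f64;
///         if u < self.epsilon {
///             // Explore: pick an action index from the harness RNG.
///             (rng.next_u64() % row.len() as u64) as usize
///         } else {
///             // Exploit: index of the first maximal entry in the row.
///             let mut best = 0;
///             for (i, v) in row.iter().enumerate() {
///                 if *v > row[best] {
///                     best = i;
///                 }
///             }
///             best
///         }
///     }
/// }
/// ```
///
/// Because all randomness in this sketch is drawn from the `rng` argument,
/// two agents driven by identically seeded generators produce the same
/// action sequence.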
/// Evaluates an optimizer against a fitness landscape.
///
/// Used when the benchmark *is* the fitness function (e.g. Rastrigin) rather
/// than a stateful `Environment`. The `Evaluator::run_optimizer_trial` path
/// in `rlevo-benchmarks` consumes this trait.
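///
/// A rough sketch of the evaluator-plus-marker arrangement, assuming a
/// minimisation convention. `SketchFitnessEvaluable`, `RastriginMarker`,
/// `RastriginEvaluator`, and `best_fitness` are hypothetical names, and the
/// associated types are an assumption; the real trait and
/// `Evaluator::run_optimizer_trial` may be shaped differently.
///
/// ```rust
/// // Marker type naming the landscape; it carries no evaluation logic.
/// pub struct RastriginMarker;
///
/// // Hypothetical evaluator trait: scores an individual against a landscape.
/// pub trait SketchFitnessEvaluable {
///     type Individual;
///     type Landscape;
///     fn evaluate(&self, individual: &Self::Individual, landscape: &Self::Landscape) -> f64;
/// }
///
/// // Separate evaluator holding the Rastrigin constant A (commonly 10).
/// pub struct RastriginEvaluator {
///     pub a: f64,
/// }
///
/// impl SketchFitnessEvaluable for RastriginEvaluator {
///     type Individual = Vec<f64>;
///     type Landscape = RastriginMarker;
///
///     // f(x) = A*n + sum_i (x_i^2 - A*cos(2*pi*x_i)); the minimum is 0 at the origin.
///     fn evaluate(&self, x: &Vec<f64>, _landscape: &RastriginMarker) -> f64 {
///         let n = x.len() as f64;
///         let sum: f64 = x
///             .iter()
///             .map(|xi| xi * xi - self.a * (2.0 * std::f64::consts::PI * xi).cos())
///             .sum();
///         self.a * n + sum
///     }
/// }
///
/// // Lowest fitness among the candidates, or `None` if there are none.
/// pub fn best_fitness<E: SketchFitnessEvaluable>(
///     evaluator: &E,
///     landscape: &E::Landscape,
///     candidates: &[E::Individual],
/// ) -> Option<f64> {
///     candidates
///         .iter()
///         .map(|c| evaluator.evaluate(c, landscape))
///         .fold(None, |best, f| match best {
///             None => Some(f),
///             Some(b) => Some(b.min(f)),
///         })
/// }
/// ```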
/// Self-evaluating numerical fitness landscape.
///
/// Implementors carry both the parameters of the landscape (dimension,
/// constants) and the scalar `f(x)` evaluation. Use this when the
/// landscape *is* the fitness function — Sphere, Ackley, Rastrigin — so
/// callers do not need to define a separate evaluator alongside a marker
/// landscape type as `FitnessEvaluable` requires.
///
/// `rlevo-evolution`'s `FromLandscape` adapter wraps any `Landscape` into
/// a `BatchFitnessFn<B, Tensor<B, 2>>`, mirroring the row-by-row host
/// evaluation that `FromFitnessEvaluable` performs.
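///
/// A minimal sketch of a self-evaluating landscape and of row-by-row
/// evaluation over a population. `SketchLandscape`, `ShiftedSphere`, and
/// `evaluate_rows` are hypothetical; the method names are assumptions, and
/// a flat row-major `&[f64]` buffer stands in for the `Tensor<B, 2>` that
/// the real adapter handles.
///
/// ```rust
/// // Hypothetical stand-in for `Landscape`: parameters and f(x) in one type.
/// pub trait SketchLandscape {
///     fn dim(&self) -> usize;
///     fn evaluate(&self, x: &[f64]) -> f64;
/// }
///
/// // Sphere function centred at `shift` on every axis.
/// pub struct ShiftedSphere {
///     pub dim: usize,
///     pub shift: f64,
/// }
///
/// impl SketchLandscape for ShiftedSphere {
///     fn dim(&self) -> usize {
///         self.dim
///     }
///
///     // f(x) = sum_i (x_i - shift)^2; panics if `x` has the wrong length.
///     fn evaluate(&self, x: &[f64]) -> f64 {
///         assert_eq!(x.len(), self.dim, "point has the wrong dimension");
///         x.iter().map(|xi| (xi - self.shift).powi(2)).sum()
///     }
/// }
///
/// // Evaluates each row of a flat row-major population on the host.
/// // Trailing values that do not fill a whole row are ignored, and a
/// // landscape with `dim() == 0` makes `chunks_exact` panic.
/// pub fn evaluate_rows<L: SketchLandscape>(landscape: &L, population: &[f64]) -> Vec<f64> {
///     population
///         .chunks_exact(landscape.dim())
///         .map(|row| landscape.evaluate(row))
///         .collect()
/// }
/// ```
///
/// No separate evaluator or marker type is needed here: the landscape value
/// itself is passed wherever a fitness function is expected.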