//! Multi-armed bandit family — stationary, contextual, non-stationary, and
//! adversarial.
//!
//! All four variants share the [`KArmedBanditAction<K>`] action type and a
//! common configuration shape (`max_steps`, `seed`, plus per-variant knobs).
//! See the per-module docs for the precise reward dynamics.
//!
//! | Module | Environment | Const generics | Reward |
//! |---|---|---|---|
//! | [`k_armed`] | [`KArmedBandit<K>`] | `K` | `N(q*(a), 1)`; means fixed |
//! | [`contextual`] | [`ContextualBandit<C, K>`] | `C` (contexts), `K` | `N(q*(c, a), 1)`; means fixed |
//! | [`non_stationary`] | [`NonStationaryBandit<K>`] | `K` | `N(q*(a), 1)`; `q*(a)` random-walks each step |
//! | [`adversarial`] | [`AdversarialBandit<K>`] | `K` | Deterministic periodic schedule in `[0, amplitude]` |
//!
//! The canonical 10-armed testbed from Sutton & Barto §2 is exposed as the
//! [`TenArmedBandit`] type alias, so existing consumers (the `tabular_bandit`
//! benchmark example, the `ten_armed_bandit_training` example, the
//! `bench::suites` factory) are unaffected by the generalisation.
pub use ;
pub use ;
pub use ;
pub use ;
/// Canonical Sutton & Barto §2 ten-armed testbed.
pub type TenArmedBandit = KArmedBandit<10>;
/// Action type for the canonical ten-armed bandit.
pub type TenArmedBanditAction = KArmedBanditAction<10>;
/// State type for the canonical ten-armed bandit.
pub type TenArmedBanditState = KArmedBanditState;
/// Observation type for the canonical ten-armed bandit.
pub type TenArmedBanditObservation = KArmedBanditObservation;
/// Configuration type for the canonical ten-armed bandit.
pub type TenArmedBanditConfig = KArmedBanditConfig;