//! Policy interface for reinforcement learning.
//!
//! This module defines the core interface for policies in reinforcement learning.
//! A policy represents a decision-making strategy that maps observations to actions,
//! which can be either deterministic or stochastic.
use crate::Env;
use anyhow::Result;
use serde::de::DeserializeOwned;
use std::path::Path;
/// A policy that maps observations to actions in a reinforcement learning environment.
///
/// This trait defines the interface for policies, which are the core decision-making
/// components in reinforcement learning. A policy can be:
/// - Deterministic: Always returns the same action for a given observation
/// - Stochastic: Returns actions sampled from a probability distribution
///
/// # Type Parameters
///
/// * `E` - The environment type that this policy operates on
///
/// # Examples
///
/// A simple deterministic policy might look like:
/// ```ignore
/// struct SimplePolicy;
///
/// impl<E: Env> Policy<E> for SimplePolicy {
///     fn sample(&mut self, _obs: &E::Obs) -> E::Act {
///         // Always return the same action, regardless of the observation
///         E::Act::default()
///     }
/// }
/// ```
///
/// A stochastic policy might look like:
/// ```ignore
/// struct StochasticPolicy;
///
/// impl<E: Env> Policy<E> for StochasticPolicy {
///     fn sample(&mut self, obs: &E::Obs) -> E::Act {
///         // Sample an action from a probability distribution
///         // conditioned on the observation
///         E::Act::random()
///     }
/// }
/// ```
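// A minimal sketch of the `Policy` trait, matching the `sample` signature
// used in the examples above; any further provided methods are omitted.
pub trait Policy<E: Env> {
    /// Samples an action for the given observation.
    ///
    /// A deterministic policy returns the same action for a given
    /// observation; a stochastic policy draws from a distribution
    /// conditioned on it.
    fn sample(&mut self, obs: &E::Obs) -> E::Act;
}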
/// A trait for objects that can be configured and built from configuration files.
///
/// This trait provides a standardized way to create objects from configuration
/// parameters, either directly or from YAML files. It is commonly used for
/// creating policies, environments, and other components of a reinforcement
/// learning system.
///
/// # Associated Types
///
/// * `Config` - The configuration type that can be deserialized from YAML
///
/// # Examples
///
/// ```ignore
/// #[derive(Clone, Deserialize)]
/// struct MyConfig {
///     learning_rate: f32,
///     hidden_size: usize,
/// }
///
/// struct MyObject {
///     config: MyConfig,
/// }
///
/// impl Configurable for MyObject {
///     type Config = MyConfig;
///
///     fn build(config: Self::Config) -> Self {
///         Self { config }
///     }
/// }
///
/// // Build from a YAML file
/// let obj = MyObject::build_from_path("config.yaml")?;
/// ```
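// A minimal sketch of the `Configurable` trait, matching the documentation
// above; the exact trait bounds and the `build_from_path` signature are
// assumptions, and the YAML-loading default implementation is omitted.
pub trait Configurable {
    /// Configuration type; `DeserializeOwned` allows it to be loaded
    /// from a YAML file.
    type Config: Clone + DeserializeOwned;

    /// Builds the object directly from a configuration value.
    fn build(config: Self::Config) -> Self;

    /// Builds the object from a YAML configuration file at `path`.
    fn build_from_path(path: impl AsRef<Path>) -> Result<Self>
    where
        Self: Sized;
}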