pub struct Experience<O, A> {
    pub observation: O,
    pub action: A,
    pub reward: f64,
    pub next_observation: O,
    pub status: EpisodeStatus,
}
A single transition: (s, a, r, s', status).
The fundamental unit of experience stored in replay buffers and used
for agent updates. Corresponds to one (s, a, r, s', done) tuple in
classical RL literature, but with a richer status field that
distinguishes natural termination from truncation.
Fields
observation: O
The observation at the start of this transition.
action: A
The action taken.
reward: f64
The scalar reward received.
next_observation: O
The observation after taking the action.
status: EpisodeStatus
Whether the episode ended and why.
Algorithms that bootstrap value estimates (DQN, PPO, SAC) must inspect this to handle terminal states correctly:
- Terminated: bootstrap with zero value
- Truncated: bootstrap with V(next_observation)
- Continuing: bootstrap with V(next_observation)
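The three cases above can be sketched as a single `match`. This is a minimal illustration, not the crate's code: the `EpisodeStatus` enum below is a local stand-in that assumes three unit variants named as in the list, and `bootstrap_value` is a hypothetical helper.

```rust
// Local stand-in for the crate's enum (assumed: three unit variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

/// Hypothetical helper: the value to bootstrap from, given the status and
/// a critic's estimate `v_next` of V(next_observation).
pub fn bootstrap_value(status: EpisodeStatus, v_next: f64) -> f64 {
    match status {
        // A natural terminal state has no future return.
        EpisodeStatus::Terminated => 0.0,
        // A time limit or an ongoing episode: the next state still has value.
        EpisodeStatus::Truncated | EpisodeStatus::Continuing => v_next,
    }
}
```

Treating `Truncated` like `Terminated` would bias value estimates downward for states that were only cut off by a time limit, which is the case the status field is meant to separate.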
Implementations
impl<O, A> Experience<O, A>
pub fn new(observation: O, action: A, reward: f64, next_observation: O, status: EpisodeStatus) -> Self
pub fn bootstrap_mask(&self) -> f64
Returns the bootstrap mask: 1.0 if the episode is continuing or was
truncated (the next state's value still counts toward the target), 0.0 if it terminated.
Multiply value estimates by this mask when computing TD targets:
target = reward + gamma * bootstrap_mask() * V(next_obs)
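As a rough sketch of the formula above, the following computes a one-step TD target. The `Experience` and `EpisodeStatus` definitions are local stand-ins written from the documented fields, the body of `bootstrap_mask` is inferred from its description, and `td_target` with its `value_fn` parameter is a hypothetical helper that is not part of the documented API.

```rust
// Local stand-ins for the documented types (assumed shapes).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

#[derive(Debug, Clone)]
pub struct Experience<O, A> {
    pub observation: O,
    pub action: A,
    pub reward: f64,
    pub next_observation: O,
    pub status: EpisodeStatus,
}

impl<O, A> Experience<O, A> {
    /// 1.0 when bootstrapping is allowed, 0.0 on natural termination.
    pub fn bootstrap_mask(&self) -> f64 {
        match self.status {
            EpisodeStatus::Terminated => 0.0,
            EpisodeStatus::Continuing | EpisodeStatus::Truncated => 1.0,
        }
    }
}

/// Hypothetical helper: one-step TD target using a caller-supplied value
/// function `value_fn` as V(next_obs).
pub fn td_target<O, A>(
    exp: &Experience<O, A>,
    gamma: f64,
    value_fn: impl Fn(&O) -> f64,
) -> f64 {
    exp.reward + gamma * exp.bootstrap_mask() * value_fn(&exp.next_observation)
}
```

Because the mask is a plain `f64`, the same expression works unchanged across a batch without branching on the status per transition.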
pub fn map_obs<O2>(self, f: impl Fn(O) -> O2) -> Experience<O2, A>
Map the observation to a different type.
Useful for observation-wrapping layers that preprocess before storage.
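A preprocessing step of this kind might look like the sketch below. The types are local stand-ins, and the `map_obs` body is an assumption based on the documented signature: it is assumed to apply `f` to both `observation` and `next_observation`, which would also explain why the bound is `Fn` rather than `FnOnce`. The `normalize` function is a hypothetical preprocessor.

```rust
// Local stand-ins for the documented types (assumed shapes).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

pub struct Experience<O, A> {
    pub observation: O,
    pub action: A,
    pub reward: f64,
    pub next_observation: O,
    pub status: EpisodeStatus,
}

impl<O, A> Experience<O, A> {
    /// Assumed behaviour: `f` is applied to both observations; every other
    /// field is moved through unchanged.
    pub fn map_obs<O2>(self, f: impl Fn(O) -> O2) -> Experience<O2, A> {
        Experience {
            observation: f(self.observation),
            action: self.action,
            reward: self.reward,
            next_observation: f(self.next_observation),
            status: self.status,
        }
    }
}

/// Hypothetical preprocessor: scale raw 8-bit pixels into [0.0, 1.0].
pub fn normalize(pixels: Vec<u8>) -> Vec<f32> {
    pixels.into_iter().map(|p| f32::from(p) / 255.0).collect()
}
```

With these definitions, `exp.map_obs(normalize)` turns an `Experience<Vec<u8>, A>` into an `Experience<Vec<f32>, A>` before it is pushed into a buffer.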
pub fn map_action<A2>(self, f: impl Fn(A) -> A2) -> Experience<O, A2>
Map the action to a different type.
Trait Implementations
impl<O: Clone, A: Clone> Clone for Experience<O, A>
fn clone(&self) -> Experience<O, A>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.