pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}
Whether an episode is ongoing, has naturally ended, or was cut short.
This distinction is critical for bootstrapping in RL algorithms.
Why this matters
When computing value targets (e.g. TD targets, GAE), the treatment of the final transition depends on why the episode ended:
- Terminated: the agent reached a natural terminal state. The value of the next state is zero, so there is no future reward to bootstrap.
- Truncated: the episode was cut short by something external (e.g. a time limit, the agent going out of bounds). The underlying MDP has not actually terminated; the rollout simply stopped. The value of the next state is generally non-zero and must be bootstrapped from the value function.
Confusing these two is one of the most common bugs in policy gradient implementations. Gymnasium introduced this distinction in v0.26; we encode it correctly from the start.
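The two cases above can be sketched as a one-step TD target. This is a minimal illustration, not the crate's own code: the enum is re-declared locally so the snippet compiles on its own, and `td_target` and its `next_value` parameter (a critic estimate of V(s')) are hypothetical names introduced here.

```rust
// Local stand-in mirroring the documented enum, so the sketch is self-contained.
#[derive(Clone, Debug, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

/// Hypothetical helper: one-step TD target r + gamma * V(s').
/// V(s') is dropped only when the episode terminated naturally;
/// a truncated episode still bootstraps from `next_value`.
pub fn td_target(reward: f64, gamma: f64, next_value: f64, status: &EpisodeStatus) -> f64 {
    match status {
        // Natural termination: no future reward exists.
        EpisodeStatus::Terminated => reward,
        // Ongoing or cut short: the future still has value.
        EpisodeStatus::Continuing | EpisodeStatus::Truncated => reward + gamma * next_value,
    }
}
```

Treating `Truncated` like `Terminated` here would bias value estimates downward near time limits, which is the bug described above.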
Variants
Continuing
The episode is ongoing.
Terminated
The episode reached a natural terminal state (MDP termination).
Bootstrap target: r + gamma * 0 — no future value.
Truncated
The episode was cut short by an external condition (e.g. time limit).
Bootstrap target: r + gamma * V(s'), because the future value is generally non-zero.
Implementations
impl EpisodeStatus
pub fn is_done(&self) -> bool
Returns true if the episode is over for any reason.
Examples found in repository:
    fn step(&mut self, actions: HashMap<u8, u8>)
        -> HashMap<u8, StepResult<PursuitObs, ()>>
    {
        for (&id, &action) in &actions {
            self.pos[id as usize] = Self::clamp_move(self.pos[id as usize], action);
        }

        // Prey moves randomly, bouncing at the walls.
        self.prey = Self::clamp_move(self.prey, self.rng.gen_range(0..2));
        self.step += 1;

        let caught = self.active.iter().any(|&id| self.pos[id as usize] == self.prey);

        let status = if caught {
            EpisodeStatus::Terminated
        } else if self.step >= MAX_STEPS {
            EpisodeStatus::Truncated
        } else {
            EpisodeStatus::Continuing
        };

        // Build results before mutating the active list.
        let results = self.active.iter().map(|&id| {
            let reward = if caught && self.pos[id as usize] == self.prey { 1.0 } else { 0.0 };
            (id, StepResult::new(self.obs(id), reward, status.clone(), ()))
        }).collect();

        if status.is_done() {
            self.active.clear();
        }

        results
    }

    fn reset(&mut self, seed: Option<u64>) -> HashMap<u8, (PursuitObs, ())> {
        if let Some(s) = seed {
            self.rng = SmallRng::seed_from_u64(s);
        }
        self.active = vec![0, 1];
        self.pos = [0, GRID_LEN - 1];
        self.prey = GRID_LEN / 2;
        self.step = 0;
        [0u8, 1u8].iter().map(|&id| (id, (self.obs(id), ()))).collect()
    }

    fn sample_action(&self, _agent: &u8, rng: &mut impl Rng) -> u8 {
        rng.gen_range(0..2)
    }
}

// ── Demo loop ────────────────────────────────────────────────────────────────

fn run_episode(env: &mut Pursuit, rng: &mut SmallRng) -> ([f64; 2], EpisodeStatus, usize) {
    env.reset(None);
    let mut returns = [0.0_f64; 2];
    let mut steps = 0;
    let mut outcome = EpisodeStatus::Continuing;

    while !env.is_done() {
        let actions = env.agents().iter()
            .map(|&id| (id, env.sample_action(&id, rng)))
            .collect();

        let results = env.step(actions);
        steps += 1;

        for (id, result) in &results {
            returns[*id as usize] += result.reward;
            if result.status.is_done() {
                outcome = result.status.clone();
            }
        }
    }

    (returns, outcome, steps)
}
pub fn is_terminal(&self) -> bool
Returns true only for natural MDP termination.
Use this to decide whether to bootstrap the next-state value: when it returns true, the next-state value is zero and must not be bootstrapped.
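As a rough sketch of how `is_terminal` and `is_done` might be used together, the following computes GAE advantages over a rollout that may span several episodes. Everything here is an assumption made for illustration: the enum and its two predicates are re-declared locally (with behaviour matching the descriptions on this page), and `gae` with its separate `values` / `next_values` slices is a hypothetical layout, not this crate's API. `next_values[t]` is taken to be V of the observation produced by step t, before any reset.

```rust
// Local stand-in for the documented enum and two of its predicates.
#[derive(Clone, Debug, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

impl EpisodeStatus {
    pub fn is_done(&self) -> bool {
        !matches!(self, EpisodeStatus::Continuing)
    }
    pub fn is_terminal(&self) -> bool {
        matches!(self, EpisodeStatus::Terminated)
    }
}

/// Hypothetical GAE sketch. All slices have one entry per step:
/// `values[t]` = V(s_t), `next_values[t]` = V(s'_t).
pub fn gae(
    rewards: &[f64],
    values: &[f64],
    next_values: &[f64],
    statuses: &[EpisodeStatus],
    gamma: f64,
    lambda: f64,
) -> Vec<f64> {
    let n = rewards.len();
    assert_eq!(values.len(), n);
    assert_eq!(next_values.len(), n);
    assert_eq!(statuses.len(), n);

    let mut advantages = vec![0.0; n];
    let mut running = 0.0;
    for t in (0..n).rev() {
        // Only natural termination zeroes the bootstrap value.
        let bootstrap = if statuses[t].is_terminal() { 0.0 } else { next_values[t] };
        let delta = rewards[t] + gamma * bootstrap - values[t];
        // Any episode boundary stops the advantage trace from leaking
        // across episodes, whether terminated or truncated.
        if statuses[t].is_done() {
            running = 0.0;
        }
        running = delta + gamma * lambda * running;
        advantages[t] = running;
    }
    advantages
}
```

The split matters: `is_terminal` gates the bootstrap value, while `is_done` gates the accumulation across steps.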
pub fn is_truncated(&self) -> bool
Returns true if the episode was cut short externally.
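When wrapping an environment that reports Gymnasium-style `terminated` / `truncated` booleans, the two flags need to be folded into a single status. The helper below is a hypothetical sketch (`status_from_flags` is not part of the API documented here, and the enum is re-declared locally). It assumes termination takes precedence when both flags are set, mirroring the repository example on this page, where the catch check runs before the step-limit check.

```rust
// Local stand-in mirroring the documented enum.
#[derive(Clone, Debug, PartialEq)]
pub enum EpisodeStatus {
    Continuing,
    Terminated,
    Truncated,
}

/// Hypothetical conversion from two boolean flags to one status.
/// Assumption: if both flags are true, natural termination wins,
/// so the learner does not bootstrap past a true terminal state.
pub fn status_from_flags(terminated: bool, truncated: bool) -> EpisodeStatus {
    match (terminated, truncated) {
        (true, _) => EpisodeStatus::Terminated,
        (false, true) => EpisodeStatus::Truncated,
        (false, false) => EpisodeStatus::Continuing,
    }
}
```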
Trait Implementations
impl Clone for EpisodeStatus
fn clone(&self) -> EpisodeStatus
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.