pub struct Step<E: Env> {
    pub act: E::Act,
    pub obs: E::Obs,
    pub reward: Vec<f32>,
    pub is_terminated: Vec<i8>,
    pub is_truncated: Vec<i8>,
    pub info: E::Info,
    pub init_obs: Option<E::Obs>,
}
Represents a single step in the environment, containing the action taken, the resulting observation, reward, and episode status.
This struct encapsulates all the information produced by an environment
during a single interaction step. It is used to create transitions of the form
(o_t, a_t, o_{t+1}, r_t) for training reinforcement learning agents.
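As a rough illustration of how a step could be turned into such a transition, the sketch below pairs the previous observation with the data carried by a step. The `Env` trait here is a simplified stand-in with only the three associated types, `Step` is a local mirror of the fields listed on this page, and `Transition`, `to_transition`, `ToyEnv`, and `demo_transition` are hypothetical names introduced for the example. Reading only `reward[0]` assumes a single, non-vectorized environment.

```rust
// Simplified stand-in for the crate's `Env` trait (assumption: only the
// associated types that `Step` refers to are modeled here).
pub trait Env {
    type Obs: Clone;
    type Act: Clone;
    type Info;
}

// Local mirror of the documented `Step` fields.
pub struct Step<E: Env> {
    pub act: E::Act,
    pub obs: E::Obs,
    pub reward: Vec<f32>,
    pub is_terminated: Vec<i8>,
    pub is_truncated: Vec<i8>,
    pub info: E::Info,
    pub init_obs: Option<E::Obs>,
}

// Hypothetical record for one transition (o_t, a_t, o_{t+1}, r_t).
pub struct Transition<E: Env> {
    pub obs: E::Obs,      // o_t
    pub act: E::Act,      // a_t
    pub next_obs: E::Obs, // o_{t+1}
    pub reward: f32,      // r_t
}

// Combines the observation seen before acting with the step that followed.
// Assumes a single environment, so only the first reward entry is read.
pub fn to_transition<E: Env>(prev_obs: &E::Obs, step: &Step<E>) -> Transition<E> {
    Transition {
        obs: prev_obs.clone(),
        act: step.act.clone(),
        next_obs: step.obs.clone(),
        reward: step.reward[0],
    }
}

// Toy environment used only for illustration.
pub struct ToyEnv;

impl Env for ToyEnv {
    type Obs = i32;
    type Act = u8;
    type Info = ();
}

pub fn demo_transition() -> (i32, u8, i32, f32) {
    let prev_obs: i32 = 10;
    let step: Step<ToyEnv> = Step {
        act: 1,
        obs: 11,
        reward: vec![0.5],
        is_terminated: vec![0],
        is_truncated: vec![0],
        info: (),
        init_obs: None,
    };
    let t = to_transition(&prev_obs, &step);
    (t.obs, t.act, t.next_obs, t.reward)
}
```

Note that the step itself holds only the observation that follows the action, so the caller has to keep the previous observation around.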
§Type Parameters
E - The environment type that produced this step
§Fields
act - The action taken by the agent
obs - The observation received from the environment
reward - The reward received for the action
is_terminated - Flags indicating if the episode has terminated
is_truncated - Flags indicating if the episode has been truncated
info - Additional environment-specific information
init_obs - The initial observation of the next episode (if applicable)
§Examples
let step = Step::new(
    observation,
    action,
    vec![0.5], // reward
    vec![0],   // not terminated
    vec![0],   // not truncated
    info,
    None,      // no initial observation
);
if step.is_done() {
    // Handle episode completion
}
Fields§
act: E::Act
The action taken by the agent in this step.
obs: E::Obs
The observation received from the environment after taking the action.
reward: Vec<f32>
The reward received for taking the action.
is_terminated: Vec<i8>
Flags indicating if the episode has terminated. A value of 1 indicates termination.
is_truncated: Vec<i8>
Flags indicating if the episode has been truncated. A value of 1 indicates truncation.
info: E::Info
Additional environment-specific information.
init_obs: Option<E::Obs>
The initial observation of the next episode, if applicable. This is used when an episode ends and a new one begins.
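The interplay between the two flag vectors and `init_obs` can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Env` is a simplified stand-in trait, `Step` is a local mirror of the fields above, and `episode_ended`, `next_policy_input`, `ToyEnv`, and `demo_next_inputs` are hypothetical helpers. Inspecting only the first flag assumes a single, non-vectorized environment.

```rust
// Simplified stand-in for the crate's `Env` trait (assumption).
pub trait Env {
    type Obs: Clone;
    type Act;
    type Info;
}

// Local mirror of the documented `Step` fields.
pub struct Step<E: Env> {
    pub act: E::Act,
    pub obs: E::Obs,
    pub reward: Vec<f32>,
    pub is_terminated: Vec<i8>,
    pub is_truncated: Vec<i8>,
    pub info: E::Info,
    pub init_obs: Option<E::Obs>,
}

// True if the first flag marks the episode as terminated or truncated.
// A value of 1 is treated as "set", as stated in the field descriptions.
pub fn episode_ended<E: Env>(step: &Step<E>) -> bool {
    step.is_terminated.first() == Some(&1) || step.is_truncated.first() == Some(&1)
}

// Picks the observation the agent would act on next: the initial observation
// of the new episode if one was provided and the episode ended, otherwise the
// observation carried by the step.
pub fn next_policy_input<E: Env>(step: &Step<E>) -> E::Obs {
    match (&step.init_obs, episode_ended(step)) {
        (Some(init), true) => init.clone(),
        _ => step.obs.clone(),
    }
}

// Toy environment used only for illustration.
pub struct ToyEnv;

impl Env for ToyEnv {
    type Obs = i32;
    type Act = u8;
    type Info = ();
}

pub fn demo_next_inputs() -> (bool, i32, bool, i32) {
    // A step in the middle of an episode: no flags set, no initial observation.
    let mid: Step<ToyEnv> = Step {
        act: 0,
        obs: 5,
        reward: vec![0.0],
        is_terminated: vec![0],
        is_truncated: vec![0],
        info: (),
        init_obs: None,
    };
    // A terminal step that also carries the first observation of the next episode.
    let last: Step<ToyEnv> = Step {
        act: 1,
        obs: 9,
        reward: vec![1.0],
        is_terminated: vec![1],
        is_truncated: vec![0],
        info: (),
        init_obs: Some(0),
    };
    (
        episode_ended(&mid),
        next_policy_input(&mid),
        episode_ended(&last),
        next_policy_input(&last),
    )
}
```

Keeping termination and truncation as separate flags lets a training loop distinguish an episode that reached a terminal state from one that was cut off, for example by a step limit.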
Implementations§
impl<E: Env> Step<E>
pub fn new(
    obs: E::Obs,
    act: E::Act,
    reward: Vec<f32>,
    is_terminated: Vec<i8>,
    is_truncated: Vec<i8>,
    info: E::Info,
    init_obs: Option<E::Obs>,
) -> Self
Constructs a new Step object with the given components.
§Arguments
obs - The observation received from the environment
act - The action taken by the agent
reward - The reward received for the action
is_terminated - Flags indicating episode termination
is_truncated - Flags indicating episode truncation
info - Additional environment-specific information
init_obs - The initial observation of the next episode
§Returns
A new Step object containing all the provided information
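One detail worth noting is that the constructor takes `obs` before `act`, whereas the struct declares `act` before `obs`. The sketch below shows what a constructor with this signature might look like. It assumes a simplified stand-in `Env` trait and a local mirror of `Step`; the body of `new` is a guess at a plain field-by-field construction, and `ToyEnv` and `demo_new` are hypothetical names for the example.

```rust
// Simplified stand-in for the crate's `Env` trait (assumption).
pub trait Env {
    type Obs;
    type Act;
    type Info;
}

// Local mirror of the documented `Step` fields.
pub struct Step<E: Env> {
    pub act: E::Act,
    pub obs: E::Obs,
    pub reward: Vec<f32>,
    pub is_terminated: Vec<i8>,
    pub is_truncated: Vec<i8>,
    pub info: E::Info,
    pub init_obs: Option<E::Obs>,
}

impl<E: Env> Step<E> {
    // Same parameter order as the documented signature: `obs` first, then `act`.
    pub fn new(
        obs: E::Obs,
        act: E::Act,
        reward: Vec<f32>,
        is_terminated: Vec<i8>,
        is_truncated: Vec<i8>,
        info: E::Info,
        init_obs: Option<E::Obs>,
    ) -> Self {
        Step {
            act,
            obs,
            reward,
            is_terminated,
            is_truncated,
            info,
            init_obs,
        }
    }
}

// Toy environment with distinct observation and action types, so that
// swapping the first two arguments would not compile.
pub struct ToyEnv;

impl Env for ToyEnv {
    type Obs = [f32; 2];
    type Act = usize;
    type Info = ();
}

pub fn demo_new() -> ([f32; 2], usize, usize, bool) {
    let step: Step<ToyEnv> = Step::new(
        [0.0, 1.0], // observation
        2,          // action
        vec![0.5],  // reward
        vec![0],    // not terminated
        vec![0],    // not truncated
        (),         // info
        None,       // no initial observation
    );
    (step.obs, step.act, step.reward.len(), step.init_obs.is_none())
}
```

When `Obs` and `Act` happen to be the same type, the compiler cannot catch swapped arguments, so the order deserves a second look at call sites.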