pub struct Step<E: Env> {
    pub act: E::Act,
    pub obs: E::Obs,
    pub reward: Vec<f32>,
    pub is_terminated: Vec<i8>,
    pub is_truncated: Vec<i8>,
    pub info: E::Info,
    pub init_obs: Option<E::Obs>,
}
Represents a single step in the environment, containing the action taken, the resulting observation, reward, and episode status.
This struct encapsulates all the information produced by an environment during a single interaction step. It is used to create transitions of the form (o_t, a_t, o_{t+1}, r_t) for training reinforcement learning agents.
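As a rough illustration of how a step might be turned into such a transition, the sketch below pairs the observation the agent saw before acting with the contents of the step. `SimpleStep`, `Transition`, and `to_transition` are hypothetical stand-ins that are not part of this crate, and the concrete types (`f32` observations, `i64` actions) are chosen only to keep the example self-contained. It assumes `obs` holds o_{t+1}, as the field description suggests, and that o_t is carried over from the previous step.

```rust
// Hypothetical stand-in for the fields of `Step` used in this sketch.
#[derive(Clone, Debug, PartialEq)]
struct SimpleStep {
    act: i64,
    obs: f32,
    reward: Vec<f32>,
}

// One transition (o_t, a_t, o_{t+1}, r_t).
#[derive(Debug, PartialEq)]
struct Transition {
    obs: f32,
    act: i64,
    next_obs: f32,
    reward: f32,
}

// Pairs the observation seen before acting with the step that the
// environment produced in response to the action.
fn to_transition(prev_obs: f32, step: &SimpleStep) -> Transition {
    Transition {
        obs: prev_obs,
        act: step.act,
        next_obs: step.obs,
        // Assumes a single environment, so only the first reward is read.
        reward: step.reward[0],
    }
}
```

Indexing `reward[0]` panics on an empty vector; a replay buffer that handles several environments at once would more likely iterate over the whole vector.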
§Type Parameters
- E: The environment type that produced this step
§Fields
- act: The action taken by the agent
- obs: The observation received from the environment
- reward: The reward received for the action
- is_terminated: Flags indicating if the episode has terminated
- is_truncated: Flags indicating if the episode has been truncated
- info: Additional environment-specific information
- init_obs: The initial observation of the next episode (if applicable)
§Examples
let step = Step::new(
    observation,
    action,
    vec![0.5], // reward
    vec![0],   // not terminated
    vec![0],   // not truncated
    info,
    None,      // no initial observation
);
if step.is_done() {
    // Handle episode completion
}
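The `is_done()` call above is not described in this section. Because the flags are stored as vectors, one plausible reading is that each index corresponds to one environment and an episode counts as done when either flag equals 1. The sketch below shows that check on plain slices. `done_flags` is a hypothetical helper, not the crate's implementation.

```rust
// Hypothetical helper: marks index i as done when either the termination
// flag or the truncation flag at that index equals 1.
fn done_flags(is_terminated: &[i8], is_truncated: &[i8]) -> Vec<bool> {
    is_terminated
        .iter()
        .zip(is_truncated)
        .map(|(&terminated, &truncated)| terminated == 1 || truncated == 1)
        .collect()
}
```

`zip` stops at the shorter slice, so mismatched lengths are silently cut off rather than reported; a stricter version could assert that both slices have the same length.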
Fields§
act: E::Act
The action taken by the agent in this step.
obs: E::Obs
The observation received from the environment after taking the action.
reward: Vec<f32>
The reward received for taking the action.
is_terminated: Vec<i8>
Flags indicating if the episode has terminated. A value of 1 indicates termination.
is_truncated: Vec<i8>
Flags indicating if the episode has been truncated. A value of 1 indicates truncation.
info: E::Info
Additional environment-specific information.
init_obs: Option<E::Obs>
The initial observation of the next episode, if applicable. This is used when an episode ends and a new one begins.
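One way to read the `init_obs` field is that, after an episode ends, the agent should act on the reset observation rather than on the final observation of the finished episode. The sketch below expresses that choice with a hypothetical helper, `next_input`, which is not part of this crate and works on any observation type.

```rust
// Hypothetical helper: returns the observation to act on next. When the
// step carries a reset observation it takes precedence; otherwise the
// observation produced by the step itself is used.
fn next_input<'a, O>(obs: &'a O, init_obs: &'a Option<O>) -> &'a O {
    init_obs.as_ref().unwrap_or(obs)
}
```

Borrowing instead of cloning keeps the helper usable with observation types that are expensive to copy, such as image tensors.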
Implementations§
impl<E: Env> Step<E>
pub fn new(
    obs: E::Obs,
    act: E::Act,
    reward: Vec<f32>,
    is_terminated: Vec<i8>,
    is_truncated: Vec<i8>,
    info: E::Info,
    init_obs: Option<E::Obs>,
) -> Self
Constructs a new Step object with the given components.
§Arguments
- obs: The observation received from the environment
- act: The action taken by the agent
- reward: The reward received for the action
- is_terminated: Flags indicating episode termination
- is_truncated: Flags indicating episode truncation
- info: Additional environment-specific information
- init_obs: The initial observation of the next episode
§Returns
A new Step object containing all the provided information.