Trait LearningAgent

pub trait LearningAgent<E: Environment>: Checkpointable {
    // Required methods
    fn act(&mut self, obs: &E::Observation, mode: ActMode) -> E::Action;
    fn observe(&mut self, experience: Experience<E::Observation, E::Action>);
    fn total_steps(&self) -> usize;

    // Provided methods
    fn episode_extras(&self) -> HashMap<String, f64> { ... }
    fn on_episode_start(&mut self) { ... }
}

An agent that can act, learn from experience, and report training stats.

Implemented by all algorithm agents (DqnAgent, PpoAgent, etc.). The agent owns its exploration RNG internally, so no external RNG is needed at call sites.

§Episode extras

Algorithms should maintain internal aggregators (e.g. Mean, Std, Max from crate::stats) over per-step values during each episode, reset them at episode start, and report summaries via episode_extras. These are merged into crate::stats::EpisodeRecord::extras automatically by crate::training::TrainingSession.

Example extras a DQN agent might report:

{ "epsilon": 0.12, "loss_mean": 0.043, "loss_std": 0.012, "loss_max": 0.21 }
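The aggregate-reset-report cycle described above can be sketched as follows. This is a minimal illustration only: `LossStats`, `ToyAgent`, and `record_loss` are hypothetical names, and `LossStats` is a hand-rolled stand-in for the `Mean`, `Std`, and `Max` aggregators in crate::stats, whose actual API is not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical running aggregator (assumption: not the real crate::stats API).
#[derive(Default)]
struct LossStats {
    count: u64,
    sum: f64,
    sum_sq: f64,
    max: f64,
}

impl LossStats {
    /// Fold one per-step value into the running totals.
    fn push(&mut self, x: f64) {
        self.count += 1;
        self.sum += x;
        self.sum_sq += x * x;
        self.max = if self.count == 1 { x } else { self.max.max(x) };
    }

    fn mean(&self) -> f64 {
        if self.count == 0 { 0.0 } else { self.sum / self.count as f64 }
    }

    /// Population standard deviation of the values pushed so far.
    fn std(&self) -> f64 {
        if self.count == 0 {
            return 0.0;
        }
        let m = self.mean();
        (self.sum_sq / self.count as f64 - m * m).max(0.0).sqrt()
    }

    fn reset(&mut self) {
        *self = Self::default();
    }
}

/// Hypothetical agent showing only the bookkeeping, not the learning.
struct ToyAgent {
    epsilon: f64,
    loss: LossStats,
}

impl ToyAgent {
    /// Mirrors `on_episode_start`: clear the per-episode aggregators.
    fn on_episode_start(&mut self) {
        self.loss.reset();
    }

    /// Would be called from inside `observe` after each gradient step.
    fn record_loss(&mut self, loss: f64) {
        self.loss.push(loss);
    }

    /// Mirrors `episode_extras`: summarize the episode as a flat map.
    fn episode_extras(&self) -> HashMap<String, f64> {
        HashMap::from([
            ("epsilon".to_string(), self.epsilon),
            ("loss_mean".to_string(), self.loss.mean()),
            ("loss_std".to_string(), self.loss.std()),
            ("loss_max".to_string(), self.loss.max),
        ])
    }
}
```

Keeping only running sums means the agent never stores per-step values, so the cost of reporting extras stays constant regardless of episode length.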

Required Methods§

fn act(&mut self, obs: &E::Observation, mode: ActMode) -> E::Action

Select an action for obs according to mode.

fn observe(&mut self, experience: Experience<E::Observation, E::Action>)

Record a transition and update the agent’s internal state.

fn total_steps(&self) -> usize

Total number of observe calls since construction.

Provided Methods§

fn episode_extras(&self) -> HashMap<String, f64>

Per-episode aggregates of step-level values, reported at episode end.

The default implementation returns an empty map. Algorithms override this to expose training dynamics (loss statistics, epsilon, etc.).

fn on_episode_start(&mut self)

Called by crate::training::TrainingSession at the start of each episode so the agent can reset its per-episode aggregators.
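The call order implied by the method descriptions above can be sketched as a single-episode driver. Everything here is a simplified stand-in written for illustration: the `Environment` methods (`reset`, `step`), the `ActMode` variants, the `Experience` fields, and the `run_episode`, `Countdown`, and `CountingAgent` names are assumptions rather than the crate's actual definitions, and the `Checkpointable` supertrait is omitted. The real loop lives in crate::training::TrainingSession and may differ.

```rust
use std::collections::HashMap;

// Assumption: simplified environment interface, not the crate's real trait.
trait Environment {
    type Observation: Clone;
    type Action: Clone;
    fn reset(&mut self) -> Self::Observation;
    /// Returns (next observation, reward, done).
    fn step(&mut self, action: &Self::Action) -> (Self::Observation, f64, bool);
}

// Assumption: hypothetical variants; the real ActMode is not shown on this page.
#[derive(Clone, Copy)]
enum ActMode {
    Explore,
    Exploit,
}

// Assumption: hypothetical field layout for a transition.
struct Experience<O, A> {
    obs: O,
    action: A,
    reward: f64,
    next_obs: O,
    done: bool,
}

// Same method set as the trait above, minus the Checkpointable bound.
trait LearningAgent<E: Environment> {
    fn act(&mut self, obs: &E::Observation, mode: ActMode) -> E::Action;
    fn observe(&mut self, experience: Experience<E::Observation, E::Action>);
    fn total_steps(&self) -> usize;
    fn episode_extras(&self) -> HashMap<String, f64> {
        HashMap::new()
    }
    fn on_episode_start(&mut self) {}
}

/// Runs one training episode; returns the episode return and the agent's extras.
fn run_episode<E, A>(env: &mut E, agent: &mut A) -> (f64, HashMap<String, f64>)
where
    E: Environment,
    A: LearningAgent<E>,
{
    agent.on_episode_start(); // reset per-episode aggregators first
    let mut obs = env.reset();
    let mut episode_return = 0.0;
    loop {
        let action = agent.act(&obs, ActMode::Explore);
        let (next_obs, reward, done) = env.step(&action);
        episode_return += reward;
        agent.observe(Experience { obs, action, reward, next_obs: next_obs.clone(), done });
        obs = next_obs;
        if done {
            break;
        }
    }
    (episode_return, agent.episode_extras()) // read extras at episode end
}

/// Toy environment: ends after three steps, reward 1.0 per step.
struct Countdown {
    left: u32,
}

impl Environment for Countdown {
    type Observation = u32;
    type Action = ();
    fn reset(&mut self) -> u32 {
        self.left = 3;
        self.left
    }
    fn step(&mut self, _action: &()) -> (u32, f64, bool) {
        self.left -= 1;
        (self.left, 1.0, self.left == 0)
    }
}

/// Toy agent: counts `observe` calls and relies on the provided defaults.
struct CountingAgent {
    steps: usize,
}

impl LearningAgent<Countdown> for CountingAgent {
    fn act(&mut self, _obs: &u32, _mode: ActMode) -> () {}
    fn observe(&mut self, _experience: Experience<u32, ()>) {
        self.steps += 1;
    }
    fn total_steps(&self) -> usize {
        self.steps
    }
}
```

Because `CountingAgent` overrides neither provided method, `run_episode` gets an empty extras map from it, while `total_steps` keeps growing across episodes since it counts `observe` calls since construction. The driver takes the agent as a generic parameter rather than a trait object, which also works for traits that are not dyn compatible.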

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§

impl<E, Enc, Act, B> LearningAgent<E> for PpoAgent<E, Enc, Act, B>
where
    E: Environment,
    E::Observation: Clone + Send + Sync + 'static,
    E::Action: Clone + Send + Sync + 'static,
    Enc: ObservationEncoder<E::Observation, B> + ObservationEncoder<E::Observation, B::InnerBackend>,
    Act: DiscreteActionMapper<E::Action>,
    B: AutodiffBackend,

impl<E, Enc, Act, B, Buf> LearningAgent<E> for DqnAgent<E, Enc, Act, B, Buf>
where
    E: Environment,
    E::Observation: Clone + Send + Sync + 'static,
    E::Action: Clone + Send + Sync + 'static,
    Enc: ObservationEncoder<E::Observation, B> + ObservationEncoder<E::Observation, B::InnerBackend>,
    Act: DiscreteActionMapper<E::Action>,
    B: AutodiffBackend,
    Buf: ReplayBuffer<E::Observation, E::Action>,
