pub struct TrainingSession<E: Environment, A> { /* private fields */ }
A self-contained, loop-agnostic training coordinator.
TrainingSession wires together a LearningAgent, an optional
TrainingRun, and a StatsTracker. It is driven purely by incoming
data – it does not own a training loop. Feed it experiences and episode
boundaries from wherever your loop lives: a plain for loop, Bevy’s ECS,
or anything else.
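The data-driven design can be illustrated with a stand-in type. The sketch below is not the crate's implementation: `MiniSession`, its fields, and `run_plain_loop` are hypothetical names that only mimic the shape of `observe`, `on_episode`, and `is_done`, to show that the caller owns the loop while the session merely reacts to what it is fed.

```rust
/// Hypothetical stand-in for a session; NOT the crate's TrainingSession.
/// It only counts steps and episodes so the control flow is visible.
pub struct MiniSession {
    total_steps: usize,
    episodes: usize,
    max_steps: Option<usize>,
}

impl MiniSession {
    pub fn new() -> Self {
        Self { total_steps: 0, episodes: 0, max_steps: None }
    }

    /// Builder-style step limit, mirroring the shape of `with_max_steps`.
    pub fn with_max_steps(mut self, n: usize) -> Self {
        self.max_steps = Some(n);
        self
    }

    /// Called once per environment step by whoever owns the loop.
    pub fn observe(&mut self) {
        self.total_steps += 1;
    }

    /// Called once per episode boundary.
    pub fn on_episode(&mut self) {
        self.episodes += 1;
    }

    /// True once the step limit (if any) has been reached.
    pub fn is_done(&self) -> bool {
        self.max_steps.map_or(false, |limit| self.total_steps >= limit)
    }

    pub fn total_steps(&self) -> usize { self.total_steps }
    pub fn episodes(&self) -> usize { self.episodes }
}

/// A plain loop that drives the session; the session never loops itself.
/// Episodes are a fixed `episode_len` steps long, an assumption made
/// purely for illustration.
pub fn run_plain_loop(max_steps: usize, episode_len: usize) -> MiniSession {
    assert!(episode_len > 0, "episode_len must be positive");
    let mut session = MiniSession::new().with_max_steps(max_steps);
    'training: loop {
        for _ in 0..episode_len {
            session.observe();
            if session.is_done() {
                break 'training;
            }
        }
        session.on_episode();
    }
    session
}
```

The same three calls could be issued from an ECS system or an async task instead of `run_plain_loop`; nothing in the stand-in depends on where the loop lives.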
§Usage
let mut session = TrainingSession::new(agent)
    .with_run(TrainingRun::create("cartpole", "v1")?)
    .with_max_steps(200_000)
    .with_checkpoint_freq(10_000);
// Each environment step:
session.observe(experience);
// Each episode end:
session.on_episode(total_reward, steps, status, env_extras);
if session.is_done() { break; }
Implementations§
impl<E, A> TrainingSession<E, A> where
E: Environment,
E::Observation: Clone + Send + Sync + 'static,
E::Action: Clone + Send + Sync + 'static,
A: LearningAgent<E>,
pub fn new(agent: A) -> Self
Create a session with no run attached.
Stats are tracked in memory but nothing is persisted. Attach a run with
with_run to enable checkpointing and JSONL logging.
pub fn with_run(self, run: TrainingRun) -> Self
Attach a TrainingRun for checkpointing and JSONL episode logging.
pub fn with_max_steps(self, n: usize) -> Self
Maximum number of steps before is_done() returns true. Default: no limit.
pub fn with_checkpoint_freq(self, freq: usize) -> Self
Checkpoint frequency in steps. Default: 10_000.
pub fn with_keep_checkpoints(self, keep: usize) -> Self
Number of numbered checkpoints to retain on disk. Default: 3.
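How the frequency and retention settings might interact can be sketched as follows. This is an assumption about the behavior, not something these docs confirm: it supposes a checkpoint is written whenever the step count is a positive multiple of the frequency, and that only the most recent `keep` numbered checkpoints survive pruning. `is_milestone` and `retained_checkpoints` are hypothetical helpers, not part of the crate.

```rust
/// Assumed rule: a checkpoint is due when `step` is a positive multiple of `freq`.
pub fn is_milestone(step: usize, freq: usize) -> bool {
    freq > 0 && step > 0 && step % freq == 0
}

/// Step numbers of the checkpoints assumed to remain on disk after pruning,
/// oldest first.
pub fn retained_checkpoints(total_steps: usize, freq: usize, keep: usize) -> Vec<usize> {
    if freq == 0 {
        return Vec::new();
    }
    // Number of milestones hit so far.
    let written = total_steps / freq;
    // Index (1-based) of the oldest checkpoint that is still kept.
    let first_kept = written.saturating_sub(keep) + 1;
    (first_kept..=written).map(|i| i * freq).collect()
}
```

Under these assumptions, with the defaults of 10_000 and 3, a session at 45_000 steps would hold the checkpoints written at steps 20_000, 30_000, and 40_000.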
pub fn with_stats(self, stats: StatsTracker) -> Self
Replace the default StatsTracker with a custom one.
pub fn act(&mut self, obs: &E::Observation, mode: ActMode) -> E::Action
Select an action for obs according to mode.
pub fn observe(&mut self, experience: Experience<E::Observation, E::Action>)
Record a transition. If a step milestone is hit, saves a checkpoint and prunes old ones.
pub fn on_episode(
    &mut self,
    total_reward: f64,
    steps: usize,
    status: EpisodeStatus,
    env_extras: HashMap<String, f64>,
)
Record an episode boundary.
Merges agent and environment extras into the record, updates stats, and appends to the training JSONL log (if a run is attached).
env_extras should come from crate::traits::EpisodeStats::episode_extras
if the environment implements it; otherwise pass an empty map.
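The merge step described above can be pictured with plain `HashMap`s. This is a minimal sketch, not the crate's code: `merge_extras` and `no_extras` are hypothetical helpers, and the collision policy (environment values overwrite agent values) is an assumption, since these docs do not say which side wins.

```rust
use std::collections::HashMap;

/// Merge agent-side and environment-side extras into one map.
/// Assumption: on a key collision the environment value wins; the crate's
/// actual policy is not stated in these docs.
pub fn merge_extras(
    agent_extras: HashMap<String, f64>,
    env_extras: HashMap<String, f64>,
) -> HashMap<String, f64> {
    let mut merged = agent_extras;
    // `extend` inserts every pair, overwriting keys that already exist.
    merged.extend(env_extras);
    merged
}

/// What a caller might pass when the environment reports no extras.
pub fn no_extras() -> HashMap<String, f64> {
    HashMap::new()
}
```

Passing an empty map leaves the agent-side values untouched, which matches the "empty map otherwise" guidance for environments without episode extras.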
pub fn on_episode_start(&mut self)
Signal the start of a new episode (resets per-episode agent aggregators).
pub fn total_steps(&self) -> usize
Total environment steps observed so far.
pub fn steps_per_sec(&self) -> f64
Average environment steps per wall-clock second since the session was created.
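A wall-clock average of this kind is typically a step counter divided by the time elapsed since a stored `Instant`. The sketch below shows one way to compute it; `average_rate` and `RateMeter` are hypothetical names, and returning 0.0 for a zero-length interval is a choice made here, not documented behavior of the crate.

```rust
use std::time::{Duration, Instant};

/// Average steps per second over `elapsed`. Returns 0.0 for a zero-length
/// interval to avoid dividing by zero (a choice made for this sketch).
pub fn average_rate(total_steps: usize, elapsed: Duration) -> f64 {
    let secs = elapsed.as_secs_f64();
    if secs > 0.0 {
        total_steps as f64 / secs
    } else {
        0.0
    }
}

/// Hypothetical throughput meter that starts timing at construction.
pub struct RateMeter {
    started: Instant,
    total_steps: usize,
}

impl RateMeter {
    pub fn new() -> Self {
        Self { started: Instant::now(), total_steps: 0 }
    }

    /// Count one environment step.
    pub fn record_step(&mut self) {
        self.total_steps += 1;
    }

    /// Average rate since the meter was created.
    pub fn steps_per_sec(&self) -> f64 {
        average_rate(self.total_steps, self.started.elapsed())
    }
}
```

Because the average spans the whole lifetime of the meter, pauses such as evaluation runs or checkpoint writes lower the reported rate.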
pub fn on_eval_episode(&self, record: &EpisodeRecord)
Log an eval episode to the run (if attached).
pub fn maybe_save_best(&mut self, mean_reward: f64)
Save best.mpk if mean_reward exceeds the best seen so far.
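The "exceeds the best seen so far" rule can be sketched as a small tracker. `BestTracker` and `should_save` are hypothetical names; the strict comparison follows the wording above, while treating the first value as an improvement and ignoring NaN are assumptions of this sketch rather than documented behavior.

```rust
/// Hypothetical tracker for the best mean reward seen so far.
pub struct BestTracker {
    best: Option<f64>,
}

impl BestTracker {
    pub fn new() -> Self {
        Self { best: None }
    }

    /// Returns true when `mean_reward` strictly exceeds the previous best
    /// (or when there is no previous best). That is the point at which a
    /// caller would write the best-model file.
    pub fn should_save(&mut self, mean_reward: f64) -> bool {
        // Assumption: NaN never counts as an improvement.
        if mean_reward.is_nan() {
            return false;
        }
        let improved = match self.best {
            None => true,
            Some(prev) => mean_reward > prev,
        };
        if improved {
            self.best = Some(mean_reward);
        }
        improved
    }

    /// Best mean reward recorded so far, if any.
    pub fn best(&self) -> Option<f64> {
        self.best
    }
}
```

With a strict comparison, a tie with the current best does not trigger another save, so repeated evaluations at the same score leave the existing file alone.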
pub fn stats_summary(&self) -> HashMap<String, f64>
Current stats summary.
pub fn run(&self) -> Option<&TrainingRun>
Read-only access to the run (if attached).
pub fn into_agent(self) -> A
Consume the session and return the inner agent.
pub fn eval_report(&self, n_episodes: usize) -> EvalReport
Snapshot the current stats as an EvalReport.
Auto Trait Implementations§
impl<E, A> Freeze for TrainingSession<E, A> where
A: Freeze,
impl<E, A> !RefUnwindSafe for TrainingSession<E, A>
impl<E, A> Send for TrainingSession<E, A>
impl<E, A> Sync for TrainingSession<E, A>
impl<E, A> Unpin for TrainingSession<E, A>
impl<E, A> UnsafeUnpin for TrainingSession<E, A> where
A: UnsafeUnpin,
impl<E, A> !UnwindSafe for TrainingSession<E, A>
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.