pub struct TrainingRun {
pub metadata: RunMetadata,
/* private fields */
}
Manages the on-disk artefacts for a single training run.
Directory layout:
runs/<name>/<version>/<YYYYMMDD_HHMMSS>/
    metadata.json ← name, version, step counts, timestamps
    config.json ← serialized hyperparams (written by caller)
    checkpoints/
        step_<N>.mpk ← periodic checkpoints
        latest.mpk ← symlink-equivalent: overwritten each checkpoint
        best.mpk ← best eval-reward checkpoint
    train_episodes.jsonl ← one EpisodeRecord per line (training)
    eval_episodes.jsonl ← one tagged EpisodeRecord per line (eval)
TrainingRun is not generic over the neural network backend. It manages
directories and JSON; the caller (e.g. DqnTrainer) handles actual
network serialization by using the paths returned by the checkpoint methods.
Usage
// Start a new run
let run = TrainingRun::create("cartpole", "v1")?;
run.write_config(&(&config, &encoder, &mapper))?;
// During training
run.log_train_episode(&episode_record)?;
run.update_metadata(total_steps, total_episodes)?;
// (save network to run.checkpoint_path(step) yourself)
// Resume
let run = TrainingRun::resume("runs/cartpole/v1")?; // picks latest
Fields
metadata: RunMetadata
Loaded/created metadata.
Implementations
impl TrainingRun
pub fn create(
    name: impl Into<String>,
    version: impl Into<String>,
) -> Result<Self>
Create a brand-new run directory under runs/<name>/<version>/<timestamp>/.
Returns an error if the directory cannot be created or metadata cannot be written.
pub fn resume(base_path: impl AsRef<Path>) -> Result<Self>
Resume the most recent run found under base_path.
base_path can be:
- An exact run directory (runs/cartpole/v1/20260322_120000) – used directly.
- A name/version directory (runs/cartpole/v1) – picks the lexicographically latest subdirectory (timestamps sort correctly).
- A name directory (runs/cartpole) – picks latest version, then latest run.
Returns an error if no run is found or metadata.json is missing/corrupt.
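The resolution rules above can be sketched with the standard library alone. This is a minimal illustration and not the crate's actual implementation: `latest_subdir` and `resolve_run_dir` are hypothetical names, a run directory is assumed to be recognisable by the presence of metadata.json, and lexicographic ordering is assumed at the version level as well as the timestamp level.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: the subdirectory of `dir` whose name sorts last.
fn latest_subdir(dir: &Path) -> io::Result<Option<PathBuf>> {
    let mut best: Option<PathBuf> = None;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if !path.is_dir() {
            continue;
        }
        // Compare directory names lexicographically; YYYYMMDD_HHMMSS
        // timestamps sort chronologically under this ordering.
        let is_later = match &best {
            Some(current) => path.file_name() > current.file_name(),
            None => true,
        };
        if is_later {
            best = Some(path);
        }
    }
    Ok(best)
}

/// Hypothetical helper: resolve `base` to a concrete run directory.
/// Assumption: a run directory is one that contains metadata.json.
fn resolve_run_dir(base: &Path) -> io::Result<PathBuf> {
    let not_found = || io::Error::new(io::ErrorKind::NotFound, "no run found");
    let mut current = base.to_path_buf();
    // Descend at most two levels: name -> version -> timestamp.
    for _ in 0..2 {
        if current.join("metadata.json").is_file() {
            return Ok(current);
        }
        current = latest_subdir(&current)?.ok_or_else(not_found)?;
    }
    if current.join("metadata.json").is_file() {
        Ok(current)
    } else {
        Err(not_found())
    }
}
```

Note that plain lexicographic ordering would rank a version named v9 above v10, so zero-padded version names are safer under this assumption.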
pub fn write_config<T: Serialize>(&self, config: &T) -> Result<()>
Write an arbitrary serialisable value to config.json.
Typically called once after create with a tuple of
(&config, &encoder, &action_mapper).
pub fn checkpoint_path(&self, step: usize) -> PathBuf
Path for a numbered checkpoint: checkpoints/step_<N>.mpk.
Pass this to DqnAgent::save (or network.save_file).
pub fn latest_checkpoint_path(&self) -> PathBuf
Path for the rolling “latest” checkpoint: checkpoints/latest.mpk.
Overwrite this on every checkpoint save so users can always resume from the most recent state without knowing the step number.
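One way a caller might keep latest.mpk current is to copy the numbered checkpoint it has just written. The sketch below is an illustration only: `refresh_latest` is a hypothetical helper, and the copy-then-rename step is a design choice made here (so a reader is unlikely to see a half-written file), not something this documentation prescribes.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: refresh `latest.mpk` from a numbered checkpoint
/// that has already been written to `step_file`.
fn refresh_latest(step_file: &Path, latest_file: &Path) -> io::Result<()> {
    // Copy into a sibling temporary file first (latest.mpk.tmp) ...
    let tmp = latest_file.with_extension("mpk.tmp");
    fs::copy(step_file, &tmp)?;
    // ... then rename it over the previous latest.mpk, replacing it.
    fs::rename(&tmp, latest_file)
}
```

With a TrainingRun in hand, the two arguments would be the values returned by checkpoint_path(step) and latest_checkpoint_path().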
pub fn best_checkpoint_path(&self) -> PathBuf
Path for the best-eval-reward checkpoint: checkpoints/best.mpk.
pub fn prune_checkpoints(&self, keep: usize) -> Result<()>
Delete old numbered checkpoints, keeping the keep most recent.
latest.mpk and best.mpk are never deleted.
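The pruning behaviour can be illustrated with a standalone sketch. It is not the crate's implementation: `parse_step` and `prune_numbered` are hypothetical names, and "most recent" is assumed to mean the highest step number rather than the newest file modification time.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Extracts N from a file name of the form `step_<N>.mpk`.
/// Returns None for anything else, including latest.mpk and best.mpk.
fn parse_step(path: &Path) -> Option<usize> {
    let name = path.file_name()?.to_str()?;
    name.strip_prefix("step_")?.strip_suffix(".mpk")?.parse().ok()
}

/// Hypothetical helper: delete all but the `keep` highest-numbered
/// checkpoints in `checkpoint_dir` and return the paths that were removed.
fn prune_numbered(checkpoint_dir: &Path, keep: usize) -> io::Result<Vec<PathBuf>> {
    let mut numbered: Vec<(usize, PathBuf)> = Vec::new();
    for entry in fs::read_dir(checkpoint_dir)? {
        let path = entry?.path();
        if let Some(step) = parse_step(&path) {
            numbered.push((step, path));
        }
    }
    // Sort numerically so that step_10 ranks after step_2.
    numbered.sort_by_key(|(step, _)| *step);
    let remove_count = numbered.len().saturating_sub(keep);
    let mut removed = Vec::new();
    for (_, path) in numbered.into_iter().take(remove_count) {
        fs::remove_file(&path)?;
        removed.push(path);
    }
    Ok(removed)
}
```

Because only names matching step_<N>.mpk are collected, latest.mpk and best.mpk are never candidates for deletion in this sketch.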
pub fn log_train_episode(&self, record: &EpisodeRecord) -> Result<()>
Append an episode record to train_episodes.jsonl.
pub fn log_eval_episode(
    &self,
    record: &EpisodeRecord,
    total_steps: usize,
) -> Result<()>
Append an episode record (tagged with total_steps_at_eval) to eval_episodes.jsonl.
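Both logging methods append one record per line to a JSONL file. The append step on its own might look like the sketch below, which assumes the record has already been serialized to a single-line JSON string. `append_jsonl` is a hypothetical helper. How the eval record carries its total_steps_at_eval tag is not specified here, so that part is not sketched.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: append one already-serialized JSON record as a
/// single line, creating the file if it does not exist yet.
fn append_jsonl(path: &Path, json_line: &str) -> io::Result<()> {
    // JSONL holds exactly one record per line, so reject embedded newlines.
    if json_line.contains('\n') {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "record must not contain a newline",
        ));
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", json_line)
}
```

Opening in append mode leaves earlier records untouched, so a resumed run would keep extending the same file.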
Auto Trait Implementations
impl Freeze for TrainingRun
impl RefUnwindSafe for TrainingRun
impl Send for TrainingRun
impl Sync for TrainingRun
impl Unpin for TrainingRun
impl UnsafeUnpin for TrainingRun
impl UnwindSafe for TrainingRun
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.