Struct TrainingMonitor

pub struct TrainingMonitor { /* private fields */ }

Real-time training health monitor.

Attaches to any optimizer to detect pathological training behavior, including exploding or vanishing gradients, loss divergence, and dead neurons, and to recognize when training has converged. Generates alerts with severity levels and provides actionable suggestions.

Implementations

impl TrainingMonitor

pub fn new() -> Self

Creates a new training monitor with default configuration.

pub fn with_config(config: MonitorConfig) -> Self

Creates a new training monitor with the given configuration.

pub fn record_step(&mut self, loss: f32, grad_norms: &[(&str, f32)], lr: f32)

Records a single training step.

Arguments

loss - The loss value for this step.
grad_norms - Slice of (parameter_name, gradient_norm) pairs.
lr - The current learning rate.
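
As an illustration of the `grad_norms` argument, the following minimal sketch builds `(parameter_name, gradient_norm)` pairs from raw gradient values. The helpers `l2_norm` and `grad_norm_pairs`, the use of the L2 norm, and the named-gradient layout are assumptions made for this example; they are not part of the documented API.

```rust
/// Computes the L2 norm of one parameter's gradient (assumed norm choice).
fn l2_norm(grad: &[f32]) -> f32 {
    grad.iter().map(|g| g * g).sum::<f32>().sqrt()
}

/// Builds `(parameter_name, gradient_norm)` pairs from named gradients.
/// The `(&str, Vec<f32>)` layout is hypothetical; a real framework exposes
/// gradients through its own tensor types.
fn grad_norm_pairs<'a>(grads: &[(&'a str, Vec<f32>)]) -> Vec<(&'a str, f32)> {
    grads.iter().map(|(name, g)| (*name, l2_norm(g))).collect()
}

fn main() {
    let grads = vec![
        ("layer1.weight", vec![3.0_f32, 4.0]),
        ("layer1.bias", vec![0.0_f32, 0.0]),
    ];
    let pairs = grad_norm_pairs(&grads);
    // `&pairs` coerces to `&[(&str, f32)]`, the shape `record_step` expects.
    for (name, norm) in &pairs {
        println!("{name}: {norm}");
    }
}
```

The resulting vector can be borrowed as a slice and passed as the second argument of `record_step`, together with the step's loss and learning rate.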

pub fn check_health(&self) -> HealthReport

Returns a full health report for the current training state.

pub fn is_healthy(&self) -> bool

Returns true if training appears healthy (no critical alerts, loss not diverging).

pub fn alerts(&self) -> &[TrainingAlert]

Returns the accumulated alerts.

pub fn clear_alerts(&mut self)

Clears all accumulated alerts.

pub fn loss_trend(&self) -> LossTrend

Analyzes the loss trajectory over the recent window.

Compares the rolling average of the most recent window to the rolling average of the previous window to classify the trend.
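
The window comparison described above can be sketched roughly as follows. This is not the crate's implementation: the `Trend` enum, the `classify_trend` function, and the `rel_tol` threshold are hypothetical, since this page does not show the variants of `LossTrend` or the thresholds used.

```rust
/// Hypothetical trend classification; the real `LossTrend` variants are not
/// shown on this page.
#[derive(Debug, PartialEq)]
enum Trend {
    Decreasing,
    Increasing,
    Flat,
    Unknown, // not enough history for two full windows
}

fn mean(xs: &[f32]) -> f32 {
    xs.iter().sum::<f32>() / xs.len() as f32
}

/// Compares the mean of the last `window` losses with the mean of the
/// `window` losses before them. `rel_tol` (an assumed parameter) is the
/// relative change below which the trend counts as flat.
fn classify_trend(losses: &[f32], window: usize, rel_tol: f32) -> Trend {
    if window == 0 || losses.len() < 2 * window {
        return Trend::Unknown;
    }
    let n = losses.len();
    let recent = mean(&losses[n - window..]);
    let previous = mean(&losses[n - 2 * window..n - window]);
    // Relative change, guarded against division by zero.
    let change = (recent - previous) / previous.abs().max(f32::EPSILON);
    if change < -rel_tol {
        Trend::Decreasing
    } else if change > rel_tol {
        Trend::Increasing
    } else {
        Trend::Flat
    }
}

fn main() {
    let losses = [4.0_f32, 4.0, 2.0, 2.0];
    println!("{:?}", classify_trend(&losses, 2, 0.05));
}
```

Comparing two window averages rather than two single values smooths out step-to-step noise, at the cost of needing at least two full windows of history before a trend can be reported.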

pub fn suggest_lr(&self) -> Option<f32>

Suggests a learning rate adjustment based on current training dynamics.

Returns None if no adjustment is needed or training has converged.
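
Because the return type is `Option<f32>`, callers need to handle the `None` case explicitly. The sketch below shows one way to do that. It assumes the suggested value is an absolute learning rate rather than a multiplier, which this page does not state; `apply_suggestion`, `min_lr`, and `max_lr` are hypothetical names introduced for the example.

```rust
/// Applies an optional learning-rate suggestion, keeping the current rate
/// when there is none or when the suggestion is not a usable number.
/// `min_lr` and `max_lr` are assumed safety bounds chosen by the caller.
fn apply_suggestion(current_lr: f32, suggestion: Option<f32>, min_lr: f32, max_lr: f32) -> f32 {
    match suggestion {
        Some(lr) if lr.is_finite() && lr > 0.0 => lr.clamp(min_lr, max_lr),
        _ => current_lr,
    }
}

fn main() {
    let lr = 0.1_f32;
    // `None` models "no adjustment needed or training has converged".
    println!("{}", apply_suggestion(lr, None, 1e-6, 1.0));
    println!("{}", apply_suggestion(lr, Some(0.01), 1e-6, 1.0));
}
```

Clamping the suggestion keeps a single unusual value from pushing the optimizer far outside the range the caller considers safe.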

pub fn grad_norm_stats(&self) -> (f32, f32, f32)

Returns (mean, std, max) of gradient norms over the recent window.
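
For reference, a `(mean, std, max)` triple over a window of norms can be computed as in this minimal sketch. The function `norm_stats` is hypothetical, and the choice of population standard deviation and of zeros for an empty window are assumptions; the page does not say which conventions the crate uses.

```rust
/// Returns `(mean, std, max)` of `norms`, or zeros for an empty slice.
/// Uses the population standard deviation (divides by n), an assumption.
fn norm_stats(norms: &[f32]) -> (f32, f32, f32) {
    if norms.is_empty() {
        return (0.0, 0.0, 0.0);
    }
    let n = norms.len() as f32;
    let mean = norms.iter().sum::<f32>() / n;
    let var = norms.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
    let max = norms.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    (mean, var.sqrt(), max)
}

fn main() {
    let (mean, std, max) = norm_stats(&[1.0, 2.0, 3.0]);
    println!("mean={mean} std={std} max={max}");
}
```

A maximum far above the mean typically points at occasional gradient spikes, while a mean close to zero with a small standard deviation is a common sign of vanishing gradients.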

pub fn convergence_score(&self) -> f32

Returns a convergence score between 0.0 and 1.0.

1.0 indicates full convergence (no loss change over the window). 0.0 indicates the loss is still actively changing.
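
One way to obtain a score with these endpoints is to map the relative spread of the windowed losses onto the range 0.0 to 1.0. The sketch below is only an illustration under that assumption: the formula, the function `score_from_window`, and the `scale` parameter are hypothetical and may differ from what the crate computes.

```rust
/// Hypothetical score: 1.0 when all losses in the window are identical,
/// falling to 0.0 once their spread (max - min) reaches `scale` times the
/// mean loss magnitude.
fn score_from_window(window: &[f32], scale: f32) -> f32 {
    if window.len() < 2 {
        return 0.0; // not enough history to claim convergence
    }
    let min = window.iter().copied().fold(f32::INFINITY, f32::min);
    let max = window.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let mean_abs = window.iter().map(|x| x.abs()).sum::<f32>() / window.len() as f32;
    // Guard the denominator so an all-zero window does not divide by zero.
    let rel_spread = (max - min) / (mean_abs * scale).max(f32::EPSILON);
    (1.0 - rel_spread).clamp(0.0, 1.0)
}

fn main() {
    println!("{}", score_from_window(&[0.5, 0.5, 0.5], 0.1));
    println!("{}", score_from_window(&[1.0, 2.0], 0.1));
}
```

Normalizing by the mean loss magnitude makes the score independent of the absolute scale of the loss, so the same threshold can be reused across tasks.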

pub fn summary(&self) -> String

Returns a human-readable summary of the current training state.

Trait Implementations

impl Default for TrainingMonitor

fn default() -> Self

Returns the “default value” for a type.

Auto Trait Implementations

impl UnwindSafe for TrainingMonitor

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.