Optimization algorithms for AxonML neural network training.
The crate defines the Optimizer trait, whose core methods are step, zero_grad, get_lr, and set_lr, and ships implementations of SGD (with momentum and Nesterov), Adam, AdamW (decoupled weight decay), RMSprop, and LAMB (layer-wise adaptive moments for large-batch training). It also provides GradScaler for AMP loss scaling, seven LR schedulers (StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR, OneCycleLR, WarmupLR, ReduceLROnPlateau), and a Training Health Monitor (the health module) for real-time detection of NaNs, exploding gradients, and vanishing gradients, plus loss trend analysis, dead neuron tracking, convergence scoring, and automatic learning rate suggestions.
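The crate's concrete type and method signatures are not reproduced on this page, so the example below is only a minimal, self-contained sketch of how the four Optimizer methods named above might be used, with a toy SGD-with-momentum update over plain f32 slices. The parameter and gradient representation, the constructor arguments, and the Sgd struct itself are illustrative assumptions, not the actual axonml-optim API.

```rust
// Illustrative sketch only: a trait mirroring the documented method names
// (step, zero_grad, get_lr, set_lr) over plain f32 buffers. The real
// axonml-optim signatures may differ.
trait Optimizer {
    /// Apply one parameter update using the supplied gradients.
    fn step(&mut self, params: &mut [f32], grads: &[f32]);
    /// Clear gradients before the next backward pass.
    fn zero_grad(&self, grads: &mut [f32]);
    /// Current learning rate (read by LR schedulers).
    fn get_lr(&self) -> f32;
    /// Set a new learning rate (written by LR schedulers).
    fn set_lr(&mut self, lr: f32);
}

/// Toy SGD with classical momentum: v <- mu * v + g; p <- p - lr * v.
struct Sgd {
    lr: f32,
    momentum: f32,
    velocity: Vec<f32>,
}

impl Sgd {
    fn new(lr: f32, momentum: f32, num_params: usize) -> Self {
        Self { lr, momentum, velocity: vec![0.0; num_params] }
    }
}

impl Optimizer for Sgd {
    fn step(&mut self, params: &mut [f32], grads: &[f32]) {
        for ((p, g), v) in params.iter_mut().zip(grads).zip(self.velocity.iter_mut()) {
            *v = self.momentum * *v + *g;
            *p -= self.lr * *v;
        }
    }

    fn zero_grad(&self, grads: &mut [f32]) {
        for g in grads.iter_mut() {
            *g = 0.0;
        }
    }

    fn get_lr(&self) -> f32 {
        self.lr
    }

    fn set_lr(&mut self, lr: f32) {
        self.lr = lr;
    }
}

fn main() {
    // Minimise f(p) = p^2 for a single parameter; the gradient is 2 * p.
    let mut params = vec![1.0_f32];
    let mut grads = vec![0.0_f32];
    let mut opt = Sgd::new(0.1, 0.9, params.len());

    for step in 0..20 {
        grads[0] = 2.0 * params[0]; // stand-in for a backward pass
        opt.step(&mut params, &grads);
        opt.zero_grad(&mut grads);
        println!("step {step}: p = {:.4}, lr = {}", params[0], opt.get_lr());
    }
}
```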
§File
crates/axonml-optim/src/lib.rs
§Author
Andrew Jewell Sr. — AutomataNexus LLC ORCID: 0009-0005-2158-7060
§Updated
April 14, 2026 11:15 PM EST
§Disclaimer
Use at own risk. This software is provided “as is”, without warranty of any kind, express or implied. The author and AutomataNexus shall not be held liable for any damages arising from the use of this software.
Re-exports§
pub use adam::Adam;
pub use adam::AdamW;
pub use grad_scaler::GradScaler;
pub use grad_scaler::GradScalerState;
pub use health::AlertKind;
pub use health::AlertSeverity;
pub use health::HealthReport;
pub use health::LossTrend;
pub use health::MonitorConfig;
pub use health::TrainingAlert;
pub use health::TrainingMonitor;
pub use lamb::LAMB;
pub use lr_scheduler::CosineAnnealingLR;
pub use lr_scheduler::ExponentialLR;
pub use lr_scheduler::LRScheduler;
pub use lr_scheduler::MultiStepLR;
pub use lr_scheduler::OneCycleLR;
pub use lr_scheduler::ReduceLROnPlateau;
pub use lr_scheduler::StepLR;
pub use lr_scheduler::WarmupLR;
pub use optimizer::Optimizer;
pub use rmsprop::RMSprop;
pub use sgd::SGD;
Modules§
- adam: Adam and AdamW — adaptive moment estimation optimizers.
- grad_scaler: GradScaler — dynamic loss scaling for AMP (mixed-precision) training.
- health: Training Health Monitor — a novel AxonML feature for real-time diagnostics.
- lamb: LAMB — Layer-wise Adaptive Moments for large-batch training.
- lr_scheduler: Learning rate schedulers — seven strategies for LR annealing (see the sketch after this list).
- optimizer: Optimizer trait — the core interface for all gradient-based optimizers.
- prelude: Common imports for optimization.
- rmsprop: RMSprop — root mean square propagation optimizer.
- sgd: SGD — Stochastic Gradient Descent with optional momentum and Nesterov.
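As a companion to the Optimizer sketch above, here is a minimal sketch of a StepLR-style decay rule: the learning rate is multiplied by gamma every step_size epochs, the conventional StepLR behaviour. The struct fields, constructor, and the way the value would be pushed into an optimizer via set_lr are assumptions, not the crate's actual lr_scheduler API.

```rust
// Illustrative sketch only: StepLR-style decay, lr = base_lr * gamma^(epoch / step_size).
// Names and signatures are assumptions about, not copies of, axonml-optim.
struct StepLr {
    base_lr: f64,
    gamma: f64,
    step_size: u32,
    epoch: u32,
}

impl StepLr {
    fn new(base_lr: f64, step_size: u32, gamma: f64) -> Self {
        Self { base_lr, gamma, step_size, epoch: 0 }
    }

    /// Advance one epoch and return the learning rate to apply next.
    fn step(&mut self) -> f64 {
        self.epoch += 1;
        self.base_lr * self.gamma.powi((self.epoch / self.step_size) as i32)
    }
}

fn main() {
    let mut sched = StepLr::new(0.1, 3, 0.5);
    for epoch in 1..=9 {
        let lr = sched.step();
        // In a real training loop this value would be handed to the
        // optimizer, e.g. via a set_lr-style hook.
        println!("epoch {epoch}: lr = {lr:.5}");
    }
}
```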