Crate axonml_optim

Optimization algorithms for AxonML neural network training.

The crate centers on the Optimizer trait, which defines step, zero_grad, get_lr, and set_lr. It provides:

- Optimizers: SGD (with optional momentum and Nesterov acceleration), Adam, AdamW (decoupled weight decay), RMSprop, and LAMB (layer-wise adaptive moments for large-batch training).
- GradScaler for dynamic loss scaling in automatic mixed-precision (AMP) training.
- Seven learning rate schedulers: StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR, OneCycleLR, WarmupLR, and ReduceLROnPlateau.
- A Training Health Monitor (the health module) for real-time detection of NaNs, exploding gradients, and vanishing gradients, plus loss trend analysis, dead neuron tracking, convergence scoring, and automatic learning rate suggestions.
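To make the decoupled weight decay in AdamW concrete, here is a minimal, self-contained sketch of a single AdamW-style parameter update in plain Rust. It is an independent illustration of the standard algorithm, not code from this crate; the names (adamw_step, AdamWState) and the hyperparameter values are hypothetical.

```rust
/// Per-parameter state for the sketch: first/second moment estimates and a step counter.
struct AdamWState {
    m: Vec<f32>, // exponential moving average of gradients
    v: Vec<f32>, // exponential moving average of squared gradients
    t: u64,      // step counter, used for bias correction
}

/// One AdamW update: Adam moment estimates plus *decoupled* weight decay,
/// i.e. the decay term shrinks the parameter directly instead of being
/// folded into the gradient (the key difference from Adam + L2 regularization).
fn adamw_step(
    params: &mut [f32],
    grads: &[f32],
    state: &mut AdamWState,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
    weight_decay: f32,
) {
    state.t += 1;
    let bc1 = 1.0 - beta1.powi(state.t as i32); // bias-correction denominators
    let bc2 = 1.0 - beta2.powi(state.t as i32);
    for i in 0..params.len() {
        state.m[i] = beta1 * state.m[i] + (1.0 - beta1) * grads[i];
        state.v[i] = beta2 * state.v[i] + (1.0 - beta2) * grads[i] * grads[i];
        let m_hat = state.m[i] / bc1;
        let v_hat = state.v[i] / bc2;
        // Decoupled weight decay: applied to the parameter, not the gradient.
        params[i] -= lr * weight_decay * params[i];
        params[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}

fn main() {
    let mut params = vec![1.0_f32, -0.5, 0.25];
    let grads = vec![0.1_f32, -0.2, 0.05];
    let mut state = AdamWState { m: vec![0.0; 3], v: vec![0.0; 3], t: 0 };
    adamw_step(&mut params, &grads, &mut state, 1e-3, 0.9, 0.999, 1e-8, 0.01);
    println!("{:?}", params);
}
```

The only behavioral difference from Adam with L2 regularization is where the decay term enters: here it scales the parameter directly, so the adaptive moment estimates see only the raw gradient.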

§File

crates/axonml-optim/src/lib.rs

§Author

Andrew Jewell Sr. — AutomataNexus LLC. ORCID: 0009-0005-2158-7060

§Updated

April 14, 2026 11:15 PM EST

§Disclaimer

Use at your own risk. This software is provided “as is”, without warranty of any kind, express or implied. The author and AutomataNexus shall not be held liable for any damages arising from the use of this software.

Re-exports§

pub use adam::Adam;
pub use adam::AdamW;
pub use grad_scaler::GradScaler;
pub use grad_scaler::GradScalerState;
pub use health::AlertKind;
pub use health::AlertSeverity;
pub use health::HealthReport;
pub use health::LossTrend;
pub use health::MonitorConfig;
pub use health::TrainingAlert;
pub use health::TrainingMonitor;
pub use lamb::LAMB;
pub use lr_scheduler::CosineAnnealingLR;
pub use lr_scheduler::ExponentialLR;
pub use lr_scheduler::LRScheduler;
pub use lr_scheduler::MultiStepLR;
pub use lr_scheduler::OneCycleLR;
pub use lr_scheduler::ReduceLROnPlateau;
pub use lr_scheduler::StepLR;
pub use lr_scheduler::WarmupLR;
pub use optimizer::Optimizer;
pub use rmsprop::RMSprop;
pub use sgd::SGD;

Modules§

adam
Adam and AdamW — adaptive moment estimation optimizers.
grad_scaler
GradScaler — dynamic loss scaling for AMP (mixed-precision) training.
health
Training Health Monitor — a novel AxonML feature for real-time diagnostics.
lamb
LAMB — Layer-wise Adaptive Moments for large-batch training.
lr_scheduler
Learning rate schedulers — seven strategies for LR annealing (see the sketch after this list).
optimizer
Optimizer trait — the core interface for all gradient-based optimizers.
prelude
Common imports for optimization.
rmsprop
RMSprop — root mean square propagation optimizer.
sgd
SGD — Stochastic Gradient Descent with optional momentum and Nesterov acceleration.
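
As a rough, self-contained illustration of what two of the schedulers listed above compute, the StepLR and CosineAnnealingLR schedules can be written as pure functions of the epoch index. This is not the crate's implementation; the function names and signatures below are invented for the sketch.

```rust
use std::f64::consts::PI;

/// StepLR: multiply the base LR by `gamma` once every `step_size` epochs.
fn step_lr(base_lr: f64, gamma: f64, step_size: u32, epoch: u32) -> f64 {
    base_lr * gamma.powi((epoch / step_size) as i32)
}

/// CosineAnnealingLR: anneal from `base_lr` down to `eta_min` over `t_max`
/// epochs, following half a cosine wave.
fn cosine_annealing_lr(base_lr: f64, eta_min: f64, t_max: u32, epoch: u32) -> f64 {
    eta_min + 0.5 * (base_lr - eta_min) * (1.0 + (PI * epoch as f64 / t_max as f64).cos())
}

fn main() {
    // Print the first ten epochs of each schedule for a base LR of 0.1.
    for epoch in 0..10 {
        println!(
            "epoch {epoch}: step_lr = {:.5}, cosine = {:.5}",
            step_lr(0.1, 0.5, 3, epoch),
            cosine_annealing_lr(0.1, 0.0, 10, epoch)
        );
    }
}
```

Both schedules depend only on the current epoch and a few constants, which is why schedulers can be layered on top of any optimizer that exposes get_lr and set_lr.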