# axonml-optim

## Overview
axonml-optim provides optimization algorithms for training neural networks in the AxonML framework. It includes popular gradient-based optimizers with momentum, adaptive learning rates, and comprehensive learning rate scheduling strategies.
## Features
- SGD - Stochastic Gradient Descent with optional momentum, Nesterov acceleration, weight decay, and dampening
- Adam - Adaptive Moment Estimation with bias correction and optional AMSGrad variant
- AdamW - Adam with decoupled weight decay regularization for improved generalization
- RMSprop - Root Mean Square Propagation with optional momentum and centered gradient normalization
- Learning Rate Schedulers - Comprehensive scheduling including StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR, OneCycleLR, WarmupLR, and ReduceLROnPlateau
- Builder Pattern - Fluent API for configuring optimizer hyperparameters
- Unified Interface - Common `Optimizer` trait for interoperability across all optimizers (see the sketch after this list)
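Because every optimizer implements the shared `Optimizer` trait, training utilities can be written once and reused across SGD, Adam, AdamW, and RMSprop. The sketch below is illustrative only: the trait name comes from the `optimizer` module, but the module paths, generic bound, and `zero_grad`/`step` method names are assumptions, not the crate's confirmed API.

```rust
use axonml_optim::Optimizer;   // path illustrative
use axonml_autograd::Variable; // path illustrative

// Hypothetical helper that works with any optimizer implementing the trait.
fn apply_update<O: Optimizer>(optimizer: &mut O, loss: &Variable) {
    optimizer.zero_grad(); // clear accumulated gradients (assumed method name)
    loss.backward();       // backpropagate through the graph
    optimizer.step();      // apply the parameter update (assumed method name)
}
```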
## Modules

| Module | Description |
|---|---|
| `optimizer` | Core `Optimizer` trait and `ParamState` for parameter state management |
| `sgd` | Stochastic Gradient Descent with momentum and Nesterov acceleration |
| `adam` | Adam and AdamW optimizers with adaptive learning rates |
| `rmsprop` | RMSprop optimizer with optional centering and momentum |
| `lr_scheduler` | Learning rate scheduling strategies for training dynamics |
## Usage

Add to your `Cargo.toml`:

```toml
[dependencies]
axonml-optim = "0.1.0"
```
### Basic Training Loop

```rust
// NOTE: module paths, layer types, and hyperparameter values are illustrative;
// adjust them to match your workspace.
use axonml_optim::*;
use axonml_nn::{Linear, MSELoss, Sequential};
use axonml_autograd::Variable;
use axonml_core::Tensor;

// Create model
let model = Sequential::new()
    .add(Linear::new(784, 128))
    .add(Linear::new(128, 10));

// Create optimizer
let mut optimizer = SGD::new(model.parameters(), 0.01);
let loss_fn = MSELoss::new();

// Training loop (`input` and `target` come from your data pipeline)
for epoch in 0..100 {
    let output = model.forward(&input);           // forward pass
    let loss = loss_fn.forward(&output, &target); // compute loss
    optimizer.zero_grad();                        // clear old gradients
    loss.backward();                              // backpropagate
    optimizer.step();                             // update parameters
}
```
### SGD with Momentum

```rust
use axonml_optim::SGD; // path and hyperparameter values illustrative

// Basic SGD
let mut optimizer = SGD::new(model.parameters(), 0.01);

// SGD with momentum, weight decay, and Nesterov acceleration
let mut optimizer = SGD::new(model.parameters(), 0.01)
    .momentum(0.9)
    .weight_decay(1e-4)
    .nesterov(true);
```
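For reference, momentum SGD keeps a velocity buffer that accumulates past gradients; the widely used form of the update (conventions for dampening and weight decay vary between libraries, so treat this as a sketch rather than the crate's precise formula) is:

```math
v_{t+1} = \mu v_t + g_t, \qquad \theta_{t+1} = \theta_t - \eta\, v_{t+1}
```

With Nesterov acceleration, the parameter step uses the looked-ahead gradient, replacing $v_{t+1}$ with $g_t + \mu v_{t+1}$.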
### Adam with Custom Configuration

```rust
use axonml_optim::{Adam, AdamW}; // paths and hyperparameter values illustrative

// Adam with custom betas
let mut optimizer = Adam::new(model.parameters(), 1e-3)
    .betas(0.9, 0.999)
    .eps(1e-8)
    .weight_decay(1e-2)
    .amsgrad(true);

// AdamW for decoupled weight decay
let mut optimizer = AdamW::new(model.parameters(), 1e-3)
    .weight_decay(1e-2);
```
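These builder options map onto the standard Adam quantities: `betas` are the exponential decay rates of the first and second moment estimates, `eps` is the stabilizing term in the denominator, and `amsgrad` keeps a running maximum of the second moment. The bias-corrected update is:

```math
m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t,\quad
v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2,\quad
\theta_t = \theta_{t-1} - \eta\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
```

with $\hat{m}_t = m_t/(1-\beta_1^t)$ and $\hat{v}_t = v_t/(1-\beta_2^t)$. AdamW differs in where `weight_decay` is applied: it is subtracted directly from the parameters rather than added to the gradient, which decouples regularization strength from the adaptive step size.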
### Learning Rate Scheduling

```rust
use axonml_optim::{CosineAnnealingLR, OneCycleLR, SGD, StepLR}; // paths and argument values illustrative

let mut optimizer = SGD::new(model.parameters(), 0.1);

// Step decay every 10 epochs
let mut scheduler = StepLR::new(10, 0.1);

// Cosine annealing
let mut scheduler = CosineAnnealingLR::new(100);

// One-cycle policy for super-convergence
let mut scheduler = OneCycleLR::new(0.1, 100);

// In training loop
for epoch in 0..epochs {
    // ... train for one epoch ...
    scheduler.step(&mut optimizer); // advance the schedule (exact signature may differ)
}
```
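For choosing the annealing horizon, cosine annealing decays the learning rate from its initial value toward a minimum over `T_max` steps following the usual formula (the minimum learning rate and any restart behaviour depend on the scheduler's configuration):

```math
\eta_t = \eta_{\min} + \tfrac{1}{2}\,(\eta_{\max} - \eta_{\min})\left(1 + \cos\frac{\pi t}{T_{\max}}\right)
```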
### ReduceLROnPlateau

```rust
use axonml_optim::{ReduceLROnPlateau, SGD}; // paths and option values illustrative

let mut optimizer = SGD::new(model.parameters(), 0.1);
// Halve the LR after 5 epochs without improvement
let mut scheduler = ReduceLROnPlateau::with_options(0.5, 5);

// Step with the validation loss for the epoch
scheduler.step_with_metric(&mut optimizer, val_loss);
```
## Tests

Run the test suite:
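Assuming a standard Cargo workspace layout (the package name is taken from the crate name above):

```bash
cargo test -p axonml-optim
```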
## License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.