§yscv-optim
Optimizers, learning rate schedulers, and gradient clipping for neural network training.
```rust
use yscv_optim::*;

let mut optimizer = Adam::new(parameters, 1e-3);
let scheduler = CosineAnnealingLr::new(100, 1e-3, 1e-6);
for epoch in 0..100 {
    optimizer.set_lr(scheduler.get_lr(epoch));
    optimizer.step();
    optimizer.zero_grad();
}
```
§Optimizers (8 + Lookahead meta-optimizer)
Sgd, Adam, AdamW, RmsProp, RAdam, Lars, Lamb, Adagrad, plus Lookahead<O> which wraps any of them.
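The per-step update these optimizers perform can be sketched with plain SGD plus momentum. This standalone snippet is illustrative only: it works on bare slices rather than the crate's parameter types, and the helper name `sgd_momentum_step` is hypothetical, not part of the yscv_optim API.

```rust
// Illustrative heavy-ball SGD update (NOT the yscv_optim API):
//   v <- mu * v + g;   p <- p - lr * v
fn sgd_momentum_step(params: &mut [f64], grads: &[f64], velocity: &mut [f64], lr: f64, mu: f64) {
    for i in 0..params.len() {
        velocity[i] = mu * velocity[i] + grads[i];
        params[i] -= lr * velocity[i];
    }
}

fn main() {
    // Minimize f(p) = p^2, whose gradient is 2p.
    let mut p = [5.0];
    let mut v = [0.0];
    for _ in 0..200 {
        let g = [2.0 * p[0]];
        sgd_momentum_step(&mut p, &g, &mut v, 0.05, 0.9);
    }
    // The iterate oscillates but its envelope shrinks geometrically toward 0.
    assert!(p[0].abs() < 1e-2);
}
```

The adaptive optimizers (Adam, RAdam, Lamb, etc.) layer per-parameter moment estimates on top of this same step/zero_grad loop.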
§LR Schedulers (11)
StepLr, MultiStepLr, ExponentialLr, CosineAnnealingLr, CosineAnnealingWarmRestarts, LinearWarmupLr, PolynomialDecayLr, OneCycleLr, ReduceLrOnPlateau, CyclicLr, LambdaLr.
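As a sketch of the kind of curve these schedulers produce, here is the standard cosine-annealing formula as a free function. It illustrates the math only; the function name and signature are not the crate's `CosineAnnealingLr` API.

```rust
use std::f64::consts::PI;

// Standard cosine annealing: lr(t) = lr_min + (lr_max - lr_min) * (1 + cos(pi * t / T)) / 2.
// Illustrative sketch, not the yscv_optim implementation.
fn cosine_annealing_lr(epoch: usize, t_max: usize, lr_max: f64, lr_min: f64) -> f64 {
    let t = epoch as f64 / t_max as f64;
    lr_min + 0.5 * (lr_max - lr_min) * (1.0 + (PI * t).cos())
}

fn main() {
    // Starts at lr_max, ends at lr_min, passes through the midpoint halfway.
    assert!((cosine_annealing_lr(0, 100, 1e-3, 1e-6) - 1e-3).abs() < 1e-12);
    assert!((cosine_annealing_lr(100, 100, 1e-3, 1e-6) - 1e-6).abs() < 1e-12);
    let mid = cosine_annealing_lr(50, 100, 1e-3, 1e-6);
    assert!((mid - 0.5 * (1e-3 + 1e-6)).abs() < 1e-12);
}
```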
§Gradient Clipping
- clip_grad_norm_ — L2 norm clipping
- clip_grad_value_ — element-wise value clipping
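Conceptually, global-norm clipping computes one L2 norm over all gradients and rescales them in-place if it exceeds the threshold. The sketch below shows that logic on a plain slice; the function name `clip_by_global_norm` is hypothetical and the crate's actual function operates on its own node type.

```rust
// Illustrative global L2-norm gradient clipping (plain slices, not the crate's types).
// Returns the pre-clipping norm so callers can log it.
fn clip_by_global_norm(grads: &mut [f64], max_norm: f64) -> f64 {
    let total_norm = grads.iter().map(|g| g * g).sum::<f64>().sqrt();
    if total_norm > max_norm {
        let scale = max_norm / total_norm;
        for g in grads.iter_mut() {
            *g *= scale;
        }
    }
    total_norm
}

fn main() {
    let mut grads = [3.0, 4.0]; // L2 norm = 5
    let norm = clip_by_global_norm(&mut grads, 1.0);
    assert!((norm - 5.0).abs() < 1e-12);
    // Direction is preserved; magnitude is scaled down to max_norm.
    let clipped = (grads[0] * grads[0] + grads[1] * grads[1]).sqrt();
    assert!((clipped - 1.0).abs() < 1e-12);
}
```

Value clipping is simpler still: each element is clamped independently, which changes the gradient direction, whereas norm clipping preserves it.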
§Tests
76 tests covering optimizer convergence, scheduler curves, and clipping behavior.
Structs§
- Adagrad — Adagrad optimizer with optional L2 weight decay.
- Adam — Adam optimizer with optional L2 weight decay.
- AdamW — AdamW optimizer with decoupled weight decay.
- CosineAnnealingLr — Cosine annealing learning-rate scheduler.
- CosineAnnealingWarmRestarts — Cosine annealing with warm restarts learning-rate scheduler.
- CyclicLr — Cyclic learning-rate scheduler with triangular policy.
- ExponentialLr — Exponential learning-rate scheduler.
- Lamb — Layer-wise Adaptive Moments optimizer for Batch training (LAMB).
- LambdaLr — Lambda learning-rate scheduler.
- Lars — Layer-wise Adaptive Rate Scaling (LARS) optimizer.
- LinearWarmupLr — Linear warmup learning-rate scheduler.
- Lookahead — Lookahead optimizer wrapper.
- MultiStepLr — Multi-step learning-rate scheduler.
- OneCycleLr — One-cycle learning-rate scheduler with linear warmup and linear cooldown.
- PolynomialDecayLr — Polynomial decay learning-rate scheduler.
- RAdam — RAdam (Rectified Adam) optimizer with variance rectification.
- ReduceLrOnPlateau — Reduce learning rate when a metric has stopped improving.
- RmsProp — RMSProp optimizer with optional momentum, weight decay, and centered variance.
- Sgd — Stochastic gradient descent optimizer with optional momentum and weight decay.
- StepLr — Piecewise constant learning-rate scheduler.
Enums§
- OptimError — Errors returned by optimizer configuration and update steps.
Constants§
Traits§
- LearningRate — Shared learning-rate control surface for optimizers.
- LrScheduler — Scheduler abstraction for stateful learning-rate policies.
- StepOptimizer — Trait for optimizers that support a per-parameter step update.
Functions§
- clip_grad_norm_ — Clips the total norm of gradients for the given nodes in-place.
- clip_grad_value_ — Clamps every gradient element to the range [-max_val, max_val] in-place.