# descend
Training infrastructure primitives: optimizers, LR schedules, gradient clipping, early stopping.
## Quickstart
```toml
[dependencies]
descend = "0.1"
```
```rust
use descend::{Adam, CosineAnnealing};

// Adam optimizer
let mut opt = Adam::new(1e-3); // learning rate
let grads = vec![0.1, -0.2, 0.05];
let params = vec![1.0, 2.0, 3.0];
let updated = opt.step(&params, &grads);

// LR schedule
let schedule = CosineAnnealing::new(1e-3, 10_000); // max LR, total steps
let lr = schedule.lr(100); // LR at step 100
```
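For orientation, here is the update rule an Adam step computes, as a standalone sketch using the usual defaults (β1 = 0.9, β2 = 0.999, ε = 1e-8); it illustrates the algorithm, not the crate's internals or defaults:

```rust
/// One Adam step over flat parameter/gradient slices (standalone sketch).
struct AdamState {
    m: Vec<f64>, // first-moment (mean) estimate
    v: Vec<f64>, // second-moment (uncentered variance) estimate
    t: u64,      // step counter
}

fn adam_step(params: &mut [f64], grads: &[f64], s: &mut AdamState, lr: f64) {
    let (b1, b2, eps) = (0.9, 0.999, 1e-8); // common defaults (assumed)
    s.t += 1;
    for i in 0..params.len() {
        s.m[i] = b1 * s.m[i] + (1.0 - b1) * grads[i];
        s.v[i] = b2 * s.v[i] + (1.0 - b2) * grads[i] * grads[i];
        // bias correction compensates for the zero initialization of m and v
        let m_hat = s.m[i] / (1.0 - b1.powi(s.t as i32));
        let v_hat = s.v[i] / (1.0 - b2.powi(s.t as i32));
        params[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}
```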
## Features
- Optimizers: SGD, Adam, AdaGrad, RMSProp, Lion, LAMB
- LR schedules: linear warmup, cosine annealing, one-cycle, step decay, inverse sqrt
- Gradient clipping: by norm and by value
- Early stopping with patience and delta
- Gradient accumulation
- Exponential moving average
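The sketches below illustrate a few of these primitives as plain, self-contained Rust; the names and signatures are illustrative, not the crate's exact API. First, linear warmup into cosine annealing:

```rust
use std::f64::consts::PI;

/// LR at `step`: ramp linearly to `max_lr` over `warmup` steps, then decay
/// along a cosine from `max_lr` to `min_lr` over the remaining steps.
fn warmup_cosine_lr(step: u64, warmup: u64, total: u64, max_lr: f64, min_lr: f64) -> f64 {
    if step < warmup {
        max_lr * (step + 1) as f64 / warmup as f64
    } else {
        let progress = (step - warmup) as f64 / (total - warmup).max(1) as f64;
        min_lr + 0.5 * (max_lr - min_lr) * (1.0 + (PI * progress).cos())
    }
}
```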
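Clipping by norm rescales the whole gradient vector when its global L2 norm exceeds a threshold, preserving its direction; a sketch:

```rust
/// Scale `grads` in place so their global L2 norm is at most `max_norm`;
/// returns the pre-clip norm, which is handy for logging.
fn clip_grad_norm(grads: &mut [f64], max_norm: f64) -> f64 {
    let norm = grads.iter().map(|g| g * g).sum::<f64>().sqrt();
    if norm > max_norm {
        let scale = max_norm / norm;
        for g in grads.iter_mut() {
            *g *= scale;
        }
    }
    norm
}
```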
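Early stopping tracks the best validation loss seen so far and signals a stop after `patience` consecutive checks without an improvement of at least `min_delta`; a sketch:

```rust
/// Stop signal after `patience` checks without improving by `min_delta`.
struct EarlyStopping {
    patience: u32,
    min_delta: f64,
    best: f64,
    bad_checks: u32,
}

impl EarlyStopping {
    fn new(patience: u32, min_delta: f64) -> Self {
        Self { patience, min_delta, best: f64::INFINITY, bad_checks: 0 }
    }

    /// Feed the latest validation loss; returns true when training should stop.
    fn should_stop(&mut self, loss: f64) -> bool {
        if loss < self.best - self.min_delta {
            self.best = loss;
            self.bad_checks = 0;
        } else {
            self.bad_checks += 1;
        }
        self.bad_checks >= self.patience
    }
}
```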
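Gradient accumulation sums gradients across micro-batches and applies one averaged optimizer step, simulating a larger batch than fits in memory; a sketch with a caller-supplied step function:

```rust
/// Add one micro-batch's gradients into `acc`; every `accum_steps` calls,
/// average the buffer, hand it to `apply_step`, and reset.
fn accumulate_then_step(
    acc: &mut [f64],
    micro_grads: &[f64],
    count: &mut usize,
    accum_steps: usize,
    mut apply_step: impl FnMut(&[f64]),
) {
    for (a, g) in acc.iter_mut().zip(micro_grads) {
        *a += g;
    }
    *count += 1;
    if *count == accum_steps {
        for a in acc.iter_mut() {
            *a /= accum_steps as f64; // average over micro-batches
        }
        apply_step(&acc[..]);
        for a in acc.iter_mut() {
            *a = 0.0;
        }
        *count = 0;
    }
}
```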
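An exponential moving average keeps a smoothed shadow copy of the parameters, commonly evaluated or checkpointed instead of the raw weights; a sketch:

```rust
/// shadow ← decay * shadow + (1 − decay) * param, element-wise.
/// Typical decay values are 0.99–0.9999 (assumed, not crate defaults).
fn ema_update(shadow: &mut [f64], params: &[f64], decay: f64) {
    for (s, p) in shadow.iter_mut().zip(params) {
        *s = decay * *s + (1.0 - decay) * p;
    }
}
```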
## License
MIT OR Apache-2.0