Module adam

ADAM (Adaptive Moment Estimation) optimizer

ADAM combines the advantages of two other extensions of stochastic gradient descent: AdaGrad and RMSProp. It computes an adaptive learning rate for each parameter by maintaining exponentially decaying averages of past gradients (the first moment, which acts as momentum) and past squared gradients (the second moment, which scales each parameter's step size).
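
A minimal sketch of a single ADAM update step, written against plain slices rather than this module's types; `adam_step` and its parameter names are illustrative, not part of this crate's API:

```rust
/// One ADAM iteration over a flat parameter vector (illustrative only).
fn adam_step(
    params: &mut [f64],
    grads: &[f64],
    m: &mut [f64], // first-moment estimate (decaying mean of gradients)
    v: &mut [f64], // second-moment estimate (decaying mean of squared gradients)
    t: u64,        // 1-based iteration counter
    lr: f64,       // learning rate, e.g. 1e-3
    beta1: f64,    // first-moment decay rate, typically 0.9
    beta2: f64,    // second-moment decay rate, typically 0.999
    eps: f64,      // numerical-stability constant, typically 1e-8
) {
    let t = t as f64;
    for i in 0..params.len() {
        // Update the biased moment estimates.
        m[i] = beta1 * m[i] + (1.0 - beta1) * grads[i];
        v[i] = beta2 * v[i] + (1.0 - beta2) * grads[i] * grads[i];
        // Bias correction compensates for the zero initialization of m and v.
        let m_hat = m[i] / (1.0 - beta1.powf(t));
        let v_hat = v[i] / (1.0 - beta2.powf(t));
        // Per-parameter step: larger where past gradients have been small and stable.
        params[i] -= lr * m_hat / (v_hat.sqrt() + eps);
    }
}
```

The bias-correction terms matter mostly in the early iterations, while m and v are still close to their zero initialization.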

Structs

AdamOptions
Options for ADAM optimization

Functions

minimize_adam
ADAM optimizer implementation
minimize_adam_with_warmup
ADAM with learning rate warmup
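
Learning rate warmup means starting from a reduced learning rate and ramping it up over the first iterations, which helps while the moment estimates are still poorly calibrated. A sketch of one common linear schedule, assuming nothing about the actual signature of minimize_adam_with_warmup; `warmup_lr` and its parameters are illustrative:

```rust
/// Linearly ramp the learning rate from 0 to `base_lr` over the first
/// `warmup_steps` iterations, then hold it at `base_lr` (illustrative only).
fn warmup_lr(base_lr: f64, step: u64, warmup_steps: u64) -> f64 {
    if warmup_steps == 0 || step >= warmup_steps {
        base_lr
    } else {
        base_lr * (step as f64 + 1.0) / (warmup_steps as f64)
    }
}
```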