pub struct Adam {
pub lr: f32,
pub beta1: f32,
pub beta2: f32,
pub eps: f32,
pub weight_decay: f32,
/* private fields */
}
Bias-corrected first/second moment optimizer.
Per-tensor state: two f32 buffers (m, v) of the same shape as
the parameter.
Fields§
lr: f32
Learning rate. Typical: 1e-3 for from-scratch CNNs, 1e-4
for transformer fine-tuning.
beta1: f32
First-moment EMA decay β₁ ∈ [0, 1). Default 0.9.
beta2: f32
Second-moment EMA decay β₂ ∈ [0, 1). Default 0.999.
eps: f32
Stability constant in the denominator. Default 1e-8.
weight_decay: f32
L2 weight decay coefficient. Folded into the gradient
(the “classic Adam” rule); use crate::AdamW for decoupled
decay. Default 0.0.
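The fields above map onto the textbook Adam update. The following is a minimal, standalone sketch of that rule, not this crate's implementation: `AdamHyper` and `adam_update` are hypothetical names, the moment buffers are passed in explicitly, and placing `eps` outside the square root is an assumption based on the usual formulation.

```rust
/// Hypothetical hyperparameter bundle mirroring the public fields above.
#[derive(Clone, Copy, Debug)]
pub struct AdamHyper {
    pub lr: f32,
    pub beta1: f32,
    pub beta2: f32,
    pub eps: f32,
    pub weight_decay: f32,
}

/// One classic-Adam update for a single tensor at 1-based step `t`.
/// `m` and `v` are the per-tensor moment buffers, same length as `param`.
pub fn adam_update(
    h: AdamHyper,
    t: u64,
    param: &mut [f32],
    grad: &[f32],
    m: &mut [f32],
    v: &mut [f32],
) {
    assert!(t >= 1, "t is 1-based; t = 0 would make the bias correction zero");
    assert_eq!(param.len(), grad.len());
    assert_eq!(param.len(), m.len());
    assert_eq!(param.len(), v.len());

    // Bias-correction denominators 1 - beta^t.
    let bc1 = 1.0 - h.beta1.powf(t as f32);
    let bc2 = 1.0 - h.beta2.powf(t as f32);

    for i in 0..param.len() {
        // L2 decay folded into the gradient (the "classic Adam" rule).
        let g = grad[i] + h.weight_decay * param[i];
        // Exponential moving averages of the gradient and its square.
        m[i] = h.beta1 * m[i] + (1.0 - h.beta1) * g;
        v[i] = h.beta2 * v[i] + (1.0 - h.beta2) * g * g;
        // Bias-corrected moments.
        let m_hat = m[i] / bc1;
        let v_hat = v[i] / bc2;
        // Assumption: eps is added outside the square root.
        param[i] -= h.lr * m_hat / (v_hat.sqrt() + h.eps);
    }
}
```

Because the decay term is added to `g` before the moments are updated, it is rescaled by the adaptive denominator along with the rest of the gradient. Decoupled decay avoids that coupling by applying the decay to the parameter directly.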
Implementations§
impl Adam
pub fn new(lr: f32) -> Self
Construct with the given learning rate and the standard (β₁, β₂, ε) = (0.9, 0.999, 1e-8) defaults.
pub fn with_betas(self, b1: f32, b2: f32) -> Self
Override (β₁, β₂).
pub fn with_weight_decay(self, wd: f32) -> Self
Override the L2 weight-decay coefficient.
pub fn current_step(&self) -> u64
1-based iteration counter. Starts at 1 (so the first call to
step() sees t=1), advances on Optimizer::end_iteration.
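The counter semantics can be illustrated with a small stand-in. `StepCounter` and `steps_seen` below are hypothetical and only model the behaviour described above: every tensor stepped within one iteration sees the same t, and t advances once per `end_iteration`. The sketch also shows why a 1-based counter is convenient for bias correction: at t = 0 the factor 1 − βᵗ would be zero.

```rust
/// Hypothetical stand-in that models only the documented counter behaviour.
pub struct StepCounter {
    t: u64,
}

impl StepCounter {
    /// Starts at 1, so the first step sees t = 1.
    pub fn new() -> Self {
        Self { t: 1 }
    }

    pub fn current_step(&self) -> u64 {
        self.t
    }

    /// Advances the counter once per training iteration.
    pub fn end_iteration(&mut self) {
        self.t += 1;
    }

    /// Bias-correction denominator 1 - beta^t at the current step.
    pub fn bias_correction(&self, beta: f32) -> f32 {
        1.0 - beta.powf(self.t as f32)
    }
}

/// Records the t observed by each per-tensor step across several iterations.
pub fn steps_seen(iterations: usize, tensors_per_iteration: usize) -> Vec<u64> {
    let mut counter = StepCounter::new();
    let mut seen = Vec::new();
    for _ in 0..iterations {
        for _ in 0..tensors_per_iteration {
            // Every tensor in this iteration observes the same t.
            seen.push(counter.current_step());
        }
        counter.end_iteration();
    }
    seen
}
```

With two iterations over three tensors, the recorded values are three 1s followed by three 2s.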
Trait Implementations§
impl Optimizer for Adam
fn step(&mut self, name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32])
fn end_iteration(&mut self)
step], so most implementations leave this a no-op.
fn lr_scale(&self, _name: &str) -> f32
Returns 1.0 for every name. Override when wrapping this crate to
support per-name LR schedules (e.g. embedding-vs-attention
splits, or the Gaussian-splat attribute-typed LR setup). The
CPU impls in this crate currently honor this only when the
caller passes a pre-scaled lr for the relevant call; backends
are encouraged to consult it inside their fused kernel.
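One possible reading of "pre-scaled lr" is that a wrapper multiplies the public `lr` field by `lr_scale(name)` before delegating each `step` call. The sketch below illustrates that pattern under stated assumptions: the `Optimizer` trait is re-declared locally from the signatures shown on this page (the real trait may differ, including whether it supplies default bodies), `Sgd` is a toy stand-in for `Adam`, and `Scaled` with its prefix-matching rule is hypothetical.

```rust
/// Local re-declaration of the trait surface shown on this page.
/// Assumption: the default bodies here are illustrative only.
pub trait Optimizer {
    fn step(&mut self, name: &str, shape: &[usize], param: &mut [f32], grad: &[f32]);
    fn end_iteration(&mut self) {}
    fn lr_scale(&self, _name: &str) -> f32 {
        1.0
    }
}

/// Toy inner optimizer with a public `lr`, standing in for `Adam`.
pub struct Sgd {
    pub lr: f32,
}

impl Optimizer for Sgd {
    fn step(&mut self, _name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32]) {
        for (p, g) in param.iter_mut().zip(grad) {
            *p -= self.lr * g;
        }
    }
}

/// Hypothetical wrapper: scales are looked up by parameter-name prefix.
pub struct Scaled {
    pub inner: Sgd,
    pub base_lr: f32,
    pub scales: Vec<(String, f32)>,
}

impl Optimizer for Scaled {
    fn step(&mut self, name: &str, shape: &[usize], param: &mut [f32], grad: &[f32]) {
        // Pre-scale the learning rate for this call only.
        let scale = self.lr_scale(name);
        self.inner.lr = self.base_lr * scale;
        self.inner.step(name, shape, param, grad);
        // Restore the base rate so later calls start from a known value.
        self.inner.lr = self.base_lr;
    }

    fn end_iteration(&mut self) {
        self.inner.end_iteration();
    }

    fn lr_scale(&self, name: &str) -> f32 {
        // First matching prefix wins; unmatched names keep a scale of 1.0.
        self.scales
            .iter()
            .find(|(prefix, _)| name.starts_with(prefix.as_str()))
            .map(|(_, s)| *s)
            .unwrap_or(1.0)
    }
}
```

Restoring `lr` after each call keeps the wrapper stateless with respect to the scale, so the order in which tensors are stepped does not matter.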