pub struct Sgd {
    pub lr: f32,
    pub momentum: f32,
    pub nesterov: bool,
    pub weight_decay: f32,
    /* private fields */
}
SGD with momentum / Nesterov / L2 weight decay.
All hyperparameters are public so callers can hot-swap them between
iterations (e.g. for a warm-up schedule). State is keyed by
parameter name; the same Sgd instance can drive every tensor in
a model.
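A warm-up schedule of the kind mentioned above could be sketched as follows. The helper `warmup_lr` and its linear ramp are assumptions made for illustration and are not part of this crate; the only thing taken from the documentation is that `lr` is a public field that may be reassigned between iterations.

```rust
/// Hypothetical helper (not part of this crate): linear warm-up that ramps
/// from `base_lr / warmup_steps` up to `base_lr`, then stays at `base_lr`.
pub fn warmup_lr(base_lr: f32, step: usize, warmup_steps: usize) -> f32 {
    if warmup_steps == 0 || step >= warmup_steps {
        // Warm-up disabled or already finished: use the full rate.
        base_lr
    } else {
        // `step` is zero-based, so the first iteration gets 1/warmup_steps.
        base_lr * (step as f32 + 1.0) / warmup_steps as f32
    }
}

// Intended use, assuming `opt` is an `Sgd` value that already exists:
//
//     opt.lr = warmup_lr(base_lr, iteration, 500);
//     // ... call `opt.step(...)` for each tensor, then `opt.end_iteration()`
```

Because the schedule lives outside the optimizer, it can be swapped for any other curve without touching the per-name state held by the `Sgd` instance.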
Fields§
lr: f32
Learning rate. No default; pass it to Sgd::new.
momentum: f32
Polyak momentum coefficient ∈ [0, 1). 0.0 disables momentum
entirely; the per-tensor velocity buffer is still allocated
but unused. Set it via Sgd::with_momentum if you want it on.
nesterov: bool
Use Nesterov-accelerated momentum. Only meaningful when
momentum > 0.
weight_decay: f32
L2 weight decay coefficient λ. Folded into the gradient
before the momentum EMA (classical, not decoupled).
Use crate::AdamW-style decoupling if you need decoupled decay.
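The interaction of the four fields can be sketched for a single tensor as below. This is an illustration under stated assumptions, not the crate's implementation: the function name `sgd_step_sketch` is hypothetical, the velocity buffer is passed in explicitly rather than looked up by parameter name, and the momentum form (v ← μ·v + g, without dampening) is an assumption that the crate may not share. What it does take from the field docs is the order of operations: λ·p is folded into the gradient before the momentum update.

```rust
/// Hypothetical single-tensor SGD step (not the crate's code).
/// Assumes the undampened momentum form v <- mu * v + g.
pub fn sgd_step_sketch(
    param: &mut [f32],
    grad: &[f32],
    velocity: &mut [f32],
    lr: f32,
    momentum: f32,
    nesterov: bool,
    weight_decay: f32,
) {
    assert_eq!(param.len(), grad.len());
    assert_eq!(param.len(), velocity.len());
    for i in 0..param.len() {
        // Classical L2: add lambda * p to the raw gradient first.
        let g = grad[i] + weight_decay * param[i];
        let update = if momentum > 0.0 {
            velocity[i] = momentum * velocity[i] + g;
            if nesterov {
                // Nesterov: look ahead along the freshly updated velocity.
                g + momentum * velocity[i]
            } else {
                velocity[i]
            }
        } else {
            // momentum == 0.0: the velocity buffer is left untouched.
            g
        };
        param[i] -= lr * update;
    }
}
```

Because the decay term passes through the velocity buffer here, its effect is scaled by the momentum history, which is the practical difference from a decoupled scheme.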
Implementations§
Trait Implementations§
impl Optimizer for Sgd
fn step(&mut self, name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32])
fn end_iteration(&mut self)
Advance the global step counter. Most algorithms increment per
call to step, so most implementations leave this a no-op.
fn lr_scale(&self, _name: &str) -> f32
Per-tensor multiplier on the effective learning rate. The default
is 1.0 for every name. Override it when wrapping this crate to
support per-name LR schedules (e.g. embedding-vs-attention
splits, or the Gaussian-splat attribute-typed LR setup). The
CPU impls in this crate currently honor this only when the
caller passes a pre-scaled lr for the relevant call. Backends
are encouraged to consult it inside their fused kernel.
Auto Trait Implementations§
impl Freeze for Sgd
impl RefUnwindSafe for Sgd
impl Send for Sgd
impl Sync for Sgd
impl Unpin for Sgd
impl UnsafeUnpin for Sgd
impl UnwindSafe for Sgd
Blanket Implementations§
impl<T> BorrowMut<T> for T
where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.