pub struct Lion {
    pub lr: f32,
    pub beta1: f32,
    pub beta2: f32,
    pub weight_decay: f32,
    /* private fields */
}
EvoLved sign-momentum optimizer.
Per-tensor state: one f32 buffer (half of Adam’s footprint).
Fields§
lr: f32
Learning rate. Critical: typically 3–10× smaller than the
AdamW LR you’d use on the same model, because sign(·) gives every
coordinate of the update unit magnitude regardless of gradient scale.
beta1: f32
Interpolation coefficient for the update direction (β₁ in
Chen et al.). Default 0.9.
beta2: f32
EMA coefficient for the carried momentum (β₂). Default 0.99.
weight_decay: f32
Decoupled weight-decay coefficient λ. Tune ~3–10× higher than
the AdamW λ you’d pair with the same model, which keeps the
effective decay strength lr · λ comparable given the smaller LR.
Default 0.0.
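The roles of β₁, β₂ and λ are easiest to see in the update rule from Chen et al. The following is a minimal sketch of one Lion step for a single tensor, not this crate's implementation: the free function `lion_update_sketch`, its signature, and the helper `sign` are names assumed here for illustration, and the sketch assumes sign(0) = 0.

```rust
/// Sign with sign(0) = 0. `f32::signum` returns 1.0 for +0.0 and -1.0 for
/// -0.0, so zero is handled explicitly here.
fn sign(x: f32) -> f32 {
    if x > 0.0 {
        1.0
    } else if x < 0.0 {
        -1.0
    } else {
        0.0
    }
}

/// One Lion update for a single tensor (hypothetical helper, not crate API).
/// `momentum` is the single per-tensor f32 state buffer.
fn lion_update_sketch(
    param: &mut [f32],
    grad: &[f32],
    momentum: &mut [f32],
    lr: f32,
    beta1: f32,
    beta2: f32,
    weight_decay: f32,
) {
    assert_eq!(param.len(), grad.len());
    assert_eq!(param.len(), momentum.len());
    for ((p, &g), m) in param.iter_mut().zip(grad).zip(momentum.iter_mut()) {
        // beta1 blends the carried momentum with the fresh gradient to form
        // the update direction.
        let c = beta1 * *m + (1.0 - beta1) * g;
        // Sign step plus decoupled weight decay, both scaled by lr.
        *p -= lr * (sign(c) + weight_decay * *p);
        // beta2 sets the EMA that is stored for the next step.
        *m = beta2 * *m + (1.0 - beta2) * g;
    }
}
```

With `weight_decay = 0.0`, every coordinate with a nonzero direction moves by exactly `lr`, whatever the gradient magnitude, which is why the learning rate is usually chosen smaller than for AdamW.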
Implementations§
impl Lion
pub fn with_betas(self, b1: f32, b2: f32) -> Self
Override (β₁, β₂). They serve different roles — see the struct-level docs.
pub fn with_weight_decay(self, wd: f32) -> Self
Override the decoupled-decay coefficient.
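Both methods take `self` by value and return `Self`, so they can be chained. The sketch below illustrates that pattern on a stand-in type: `LionSketch`, its `new` constructor, and `example_config` are hypothetical, since the real `Lion` has private fields and its constructor is not shown on this page. The override values are arbitrary.

```rust
/// Stand-in for `Lion` holding only the documented public fields.
/// Hypothetical: the real type also has private fields.
#[derive(Debug, Clone, PartialEq)]
pub struct LionSketch {
    pub lr: f32,
    pub beta1: f32,
    pub beta2: f32,
    pub weight_decay: f32,
}

impl LionSketch {
    /// Hypothetical constructor that fills in the documented defaults.
    pub fn new(lr: f32) -> Self {
        Self { lr, beta1: 0.9, beta2: 0.99, weight_decay: 0.0 }
    }

    /// Consumes the value, overrides both betas, and returns it.
    pub fn with_betas(mut self, b1: f32, b2: f32) -> Self {
        self.beta1 = b1;
        self.beta2 = b2;
        self
    }

    /// Consumes the value, overrides the decay coefficient, and returns it.
    pub fn with_weight_decay(mut self, wd: f32) -> Self {
        self.weight_decay = wd;
        self
    }
}

/// Chains the two overrides on top of the defaults.
pub fn example_config() -> LionSketch {
    LionSketch::new(1e-4)
        .with_betas(0.95, 0.98)
        .with_weight_decay(0.1)
}
```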
Trait Implementations§
impl Optimizer for Lion
fn step(&mut self, name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32])
fn end_iteration(&mut self)
Advance the global step counter. Most algorithms increment per
call to step, so most implementations leave this a no-op.
fn lr_scale(&self, _name: &str) -> f32
Per-tensor multiplier on the effective learning rate. Default is
1.0 for every name. Override when wrapping this crate to
support per-name LR schedules (e.g. embedding-vs-attention
splits, or the Gaussian-splat attribute-typed LR setup). The
CPU impls in this crate currently honor this only when the
caller passes a pre-scaled lr for the relevant call;
backends are encouraged to consult it inside their fused
kernel.
Auto Trait Implementations§
impl Freeze for Lion
impl RefUnwindSafe for Lion
impl Send for Lion
impl Sync for Lion
impl Unpin for Lion
impl UnsafeUnpin for Lion
impl UnwindSafe for Lion
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.