pub struct RAdam {
pub lr: f32,
pub beta1: f32,
pub beta2: f32,
pub eps: f32,
pub weight_decay: f32,
/* private fields */
}
Rectified Adam. Per-tensor state: two f32 buffers.
Fields§
lr: f32
Learning rate.
beta1: f32
First-moment EMA decay β₁. Default 0.9.
beta2: f32
Second-moment EMA decay β₂. Default 0.999.
eps: f32
Denominator stability constant. Default 1e-8.
weight_decay: f32
L2 weight-decay coefficient, folded into the gradient as in
classical Adam (not decoupled as in AdamW). Default 0.0.
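To make the roles of these fields concrete, here is a minimal sketch of one Rectified Adam update as described in the published algorithm. It is not taken from this crate's source: the function names `rectification` and `radam_step_sketch`, the explicit step count `t`, and the assumption that the two per-tensor buffers are the first- and second-moment estimates `m` and `v` are all illustrative. The `rho_t > 4` threshold follows the original paper; the crate may use a different cutoff.

```rust
/// Hypothetical helper (not part of this crate): variance rectification
/// factor r_t for step `t` (1-based). Returns `None` while the variance
/// of the adaptive learning rate is considered intractable (rho_t <= 4).
fn rectification(t: u32, beta2: f32) -> Option<f32> {
    let beta2_t = beta2.powi(t as i32);
    let rho_inf = 2.0 / (1.0 - beta2) - 1.0;
    let rho_t = rho_inf - 2.0 * (t as f32) * beta2_t / (1.0 - beta2_t);
    if rho_t > 4.0 {
        let num = (rho_t - 4.0) * (rho_t - 2.0) * rho_inf;
        let den = (rho_inf - 4.0) * (rho_inf - 2.0) * rho_t;
        Some((num / den).sqrt())
    } else {
        None
    }
}

/// Hypothetical sketch of one RAdam update over a single tensor.
/// `m` and `v` stand in for the two per-tensor f32 state buffers.
#[allow(clippy::too_many_arguments)]
fn radam_step_sketch(
    param: &mut [f32],
    grad: &[f32],
    m: &mut [f32],
    v: &mut [f32],
    t: u32,
    lr: f32,
    beta1: f32,
    beta2: f32,
    eps: f32,
    weight_decay: f32,
) {
    assert!(t >= 1, "step count is 1-based");
    let bias1 = 1.0 - beta1.powi(t as i32);
    let bias2 = 1.0 - beta2.powi(t as i32);
    let rect = rectification(t, beta2);

    for i in 0..param.len() {
        // L2 weight decay folded into the gradient (not decoupled).
        let g = grad[i] + weight_decay * param[i];
        // Exponential moving averages of the gradient and its square.
        m[i] = beta1 * m[i] + (1.0 - beta1) * g;
        v[i] = beta2 * v[i] + (1.0 - beta2) * g * g;
        let m_hat = m[i] / bias1;
        let update = match rect {
            // Rectified adaptive step once the variance is tractable.
            Some(r) => r * m_hat / ((v[i] / bias2).sqrt() + eps),
            // Early steps fall back to a bias-corrected momentum step.
            None => m_hat,
        };
        param[i] -= lr * update;
    }
}
```

With the default β₂ of 0.999, the first few steps take the unrectified branch, which is what removes the need for a separate warmup schedule in the original formulation.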
Implementations§
Trait Implementations§
impl Optimizer for RAdam
fn step(&mut self, name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32])
fn end_iteration(&mut self)
Advance the global step counter. Most algorithms increment their
counter on each call to step, so most implementations leave this
as a no-op.
fn lr_scale(&self, _name: &str) -> f32
Per-tensor multiplier on the effective learning rate. Default
is 1.0 for every name. Override when wrapping this crate to
support per-name LR schedules (e.g. embedding-vs-attention
splits, or the Gaussian-splat attribute-typed LR setup). The
CPU impls in this crate currently honor this only when the
caller passes a pre-scaled lr for the relevant call; backends
are encouraged to consult it inside their fused kernel.
Auto Trait Implementations§
impl Freeze for RAdam
impl RefUnwindSafe for RAdam
impl Send for RAdam
impl Sync for RAdam
impl Unpin for RAdam
impl UnsafeUnpin for RAdam
impl UnwindSafe for RAdam
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
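As a usage note for the Optimizer methods listed under Trait Implementations, the following sketch shows one way a caller might drive a training iteration: pre-scale the learning rate by lr_scale for each tensor, call step once per tensor, then call end_iteration once. Everything here is an assumption made for illustration: the `Optimizer` trait is a stand-in that only mirrors the three signatures shown on this page, and `ToySgd`, `Tensor`, and `run_iteration` are hypothetical names that do not exist in the crate.

```rust
/// Stand-in trait mirroring the signatures shown on this page.
/// The real trait may have more methods or different defaults.
trait Optimizer {
    fn step(&mut self, name: &str, shape: &[usize], param: &mut [f32], grad: &[f32]);
    fn end_iteration(&mut self) {}
    fn lr_scale(&self, _name: &str) -> f32 {
        1.0
    }
}

/// Toy optimizer (plain SGD) used only to keep the sketch runnable.
struct ToySgd {
    lr: f32,
}

impl Optimizer for ToySgd {
    fn step(&mut self, _name: &str, _shape: &[usize], param: &mut [f32], grad: &[f32]) {
        for (p, g) in param.iter_mut().zip(grad) {
            *p -= self.lr * g;
        }
    }

    // Hypothetical per-name schedule: embeddings train 10x slower.
    fn lr_scale(&self, name: &str) -> f32 {
        if name.starts_with("embed") { 0.1 } else { 1.0 }
    }
}

/// Hypothetical container for one named parameter tensor.
struct Tensor {
    name: &'static str,
    shape: Vec<usize>,
    param: Vec<f32>,
    grad: Vec<f32>,
}

/// One training iteration with caller-side LR pre-scaling.
fn run_iteration(opt: &mut ToySgd, base_lr: f32, tensors: &mut [Tensor]) {
    for t in tensors.iter_mut() {
        // Pre-scale lr before the call, since a CPU step is assumed
        // not to consult lr_scale on its own.
        opt.lr = base_lr * opt.lr_scale(t.name);
        opt.step(t.name, &t.shape, &mut t.param, &t.grad);
    }
    // Restore the base rate and advance the global step once.
    opt.lr = base_lr;
    opt.end_iteration();
}
```

Restoring `lr` after the loop keeps the per-name scaling from compounding across iterations.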