pub struct KronPsgd {
pub lr: f32,
pub precond_lr: f32,
pub momentum: f32,
pub weight_decay: f32,
pub eps: f32,
pub clip: f32,
/* private fields */
}
Kron-PSGD — Kronecker-factored preconditioned SGD.
Fields§
lr: f32
Learning rate.
precond_lr: f32
Learning rate for the preconditioner update (Lie-group descent on Q_L / Q_R). Default 0.1. Too high ⇒ Q drifts; too low ⇒ preconditioner lags.
momentum: f32
Polyak momentum for the preconditioned-gradient SGD step. Default 0.9.
weight_decay: f32
L2 weight-decay coefficient (folded into the gradient). Default 0.0.
eps: f32
Numerical floor on the preconditioner-update normalizer. Default 1e-8.
clip: f32
Cap on the per-coordinate magnitude of the preconditioned update (defensive: early Q estimates can be ill-conditioned). Default 1.0.
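The weight_decay, clip, momentum and lr fields can be read together as a small update rule. The sketch below is only an illustration under stated assumptions: the helper names fold_weight_decay and clipped_momentum_step are hypothetical and not part of this crate, the preconditioning by Q_L / Q_R is assumed to have happened already (the function receives the preconditioned gradient), and the order "clip, then momentum, then lr" is a guess rather than something this page confirms.

```rust
/// Hypothetical helper (not part of this crate): fold L2 weight decay into
/// the raw gradient, i.e. g + weight_decay * p for every coordinate.
pub fn fold_weight_decay(grad: &[f32], param: &[f32], weight_decay: f32) -> Vec<f32> {
    grad.iter()
        .zip(param)
        .map(|(g, p)| g + weight_decay * p)
        .collect()
}

/// Hypothetical helper (not part of this crate): apply one momentum step to
/// `param` using a gradient that is assumed to be preconditioned already.
/// Assumed order: clip each coordinate, accumulate momentum, scale by lr.
pub fn clipped_momentum_step(
    param: &mut [f32],
    precond_grad: &[f32],
    velocity: &mut [f32],
    lr: f32,
    momentum: f32,
    clip: f32,
) {
    for ((p, g), v) in param
        .iter_mut()
        .zip(precond_grad)
        .zip(velocity.iter_mut())
    {
        // Cap the per-coordinate magnitude at `clip`.
        let clipped = (*g).clamp(-clip, clip);
        // Heavy-ball momentum accumulation.
        *v = momentum * *v + clipped;
        // Descent step along the accumulated direction.
        *p -= lr * *v;
    }
}
```

With clip = 1.0, a preconditioned gradient coordinate of 5.0 contributes only 1.0 to the velocity, which is the defensive behavior the clip field describes for early, poorly conditioned Q estimates.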
Implementations§
Trait Implementations§
impl Optimizer for KronPsgd
fn step(&mut self, name: &str, shape: &[usize], param: &mut [f32], grad: &[f32])
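The step signature takes one tensor at a time: a name, a shape, and flat parameter and gradient slices. A driver loop over the trait methods listed for this impl might look like the sketch below. Everything in it is an assumption made for illustration: the Optimizer trait is a local mirror of the three signatures shown on this page (the real trait may differ), PlainSgd is a hypothetical stand-in used because this page does not show how a KronPsgd is constructed, and Tensor and run_iteration are invented names. The loop pre-scales lr by lr_scale before each call, which is one way to follow the trait's note that the CPU implementations only honor lr_scale when the caller passes a pre-scaled lr.

```rust
/// Local mirror of the three trait methods listed on this page.
/// Assumption: the real trait may declare more items or other defaults.
pub trait Optimizer {
    fn step(&mut self, name: &str, shape: &[usize], param: &mut [f32], grad: &[f32]);
    fn end_iteration(&mut self) {}
    fn lr_scale(&self, _name: &str) -> f32 {
        1.0
    }
}

/// Hypothetical stand-in optimizer (plain SGD), not a type from this crate.
pub struct PlainSgd {
    pub lr: f32,
    pub iterations: usize,
}

impl Optimizer for PlainSgd {
    fn step(&mut self, _name: &str, shape: &[usize], param: &mut [f32], grad: &[f32]) {
        // The flat slices are expected to hold one element per shape entry product.
        debug_assert_eq!(shape.iter().product::<usize>(), param.len());
        for (p, g) in param.iter_mut().zip(grad) {
            *p -= self.lr * g;
        }
    }

    fn end_iteration(&mut self) {
        self.iterations += 1;
    }

    fn lr_scale(&self, name: &str) -> f32 {
        // Hypothetical per-name split: embeddings use a 10x smaller rate.
        if name.starts_with("embed") { 0.1 } else { 1.0 }
    }
}

/// Hypothetical named tensor: name, shape, flat parameters, flat gradients.
pub struct Tensor {
    pub name: String,
    pub shape: Vec<usize>,
    pub param: Vec<f32>,
    pub grad: Vec<f32>,
}

/// One training iteration: step every tensor, then advance the counter once.
pub fn run_iteration(opt: &mut PlainSgd, base_lr: f32, tensors: &mut [Tensor]) {
    for t in tensors.iter_mut() {
        // Pre-scale lr for this call so the per-name multiplier takes effect.
        opt.lr = base_lr * opt.lr_scale(&t.name);
        opt.step(&t.name, &t.shape, &mut t.param, &t.grad);
    }
    // Restore the unscaled rate before closing the iteration.
    opt.lr = base_lr;
    opt.end_iteration();
}
```

Because lr is a public field on KronPsgd as well, the same pre-scaling pattern should carry over once an instance is available, though that has not been verified against the crate.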
fn end_iteration(&mut self)
Advance the global step counter. Most algorithms increment per call to step, so most implementations leave this a no-op.
fn lr_scale(&self, _name: &str) -> f32
Per-tensor multiplier on the effective learning rate. Default is 1.0 for every name. Override when wrapping this crate to support per-name LR schedules (e.g. embedding-vs-attention splits, or the Gaussian-splat attribute-typed LR setup). The CPU impls in this crate currently honor this only when the caller passes a pre-scaled lr for the relevant call; backends are encouraged to consult it inside their fused kernel.
Auto Trait Implementations§
impl Freeze for KronPsgd
impl RefUnwindSafe for KronPsgd
impl Send for KronPsgd
impl Sync for KronPsgd
impl Unpin for KronPsgd
impl UnsafeUnpin for KronPsgd
impl UnwindSafe for KronPsgd
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.