pub struct AdamW<A: Float + ScalarOperand + Debug> { /* private fields */ }
AdamW optimizer
Implements the AdamW optimization algorithm from the paper: “Decoupled Weight Decay Regularization” by Loshchilov and Hutter (2019).
AdamW uses a more principled approach to weight decay compared to standard Adam. The key difference is that weight decay is applied directly to the weights, not within the adaptive learning rate computation.
Formula:
m_t = beta1 * m_{t-1} + (1 - beta1) * g_t
v_t = beta2 * v_{t-1} + (1 - beta2) * g_t^2
m_hat_t = m_t / (1 - beta1^t)
v_hat_t = v_t / (1 - beta2^t)
theta_t = theta_{t-1} * (1 - lr * weight_decay) - lr * m_hat_t / (sqrt(v_hat_t) + epsilon)
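The update rule can be sketched in plain Rust on slices, without the crate's array types. This is an illustration only, not the optirs_core implementation: `AdamWSketch` and its fields are hypothetical names, and the default values are taken from the argument list documented for `new_with_config`.

```rust
/// Hypothetical stand-alone sketch of the AdamW update (not the optirs_core type).
pub struct AdamWSketch {
    pub lr: f64,
    pub beta1: f64,
    pub beta2: f64,
    pub epsilon: f64,
    pub weight_decay: f64,
    m: Vec<f64>, // first moment estimate m_t
    v: Vec<f64>, // second moment estimate v_t
    t: i32,      // number of steps taken so far
}

impl AdamWSketch {
    /// Creates state for `n` parameters with the documented default hyperparameters.
    pub fn new(lr: f64, n: usize) -> Self {
        AdamWSketch {
            lr,
            beta1: 0.9,
            beta2: 0.999,
            epsilon: 1e-8,
            weight_decay: 0.01,
            m: vec![0.0; n],
            v: vec![0.0; n],
            t: 0,
        }
    }

    /// Applies one update and returns the new parameters.
    pub fn step(&mut self, params: &[f64], grads: &[f64]) -> Vec<f64> {
        assert_eq!(params.len(), grads.len());
        assert_eq!(params.len(), self.m.len());
        self.t += 1;
        // Bias-correction denominators: 1 - beta^t.
        let bc1 = 1.0 - self.beta1.powi(self.t);
        let bc2 = 1.0 - self.beta2.powi(self.t);
        let mut out = Vec::with_capacity(params.len());
        for i in 0..params.len() {
            let g = grads[i];
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g;
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g;
            let m_hat = self.m[i] / bc1;
            let v_hat = self.v[i] / bc2;
            // Decoupled decay: shrink the weight directly, then apply the adaptive step.
            let decayed = params[i] * (1.0 - self.lr * self.weight_decay);
            out.push(decayed - self.lr * m_hat / (v_hat.sqrt() + self.epsilon));
        }
        out
    }
}
```

On the first step the bias-corrected moments reduce to m_hat = g and v_hat = g^2, so each parameter with a nonzero gradient moves by roughly lr against the sign of its gradient, while a parameter with zero gradient is only scaled by (1 - lr * weight_decay).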
Note the decoupling of weight decay from the adaptive learning rate computation.
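To make the decoupling concrete, the sketch below contrasts a single first step of Adam with L2 regularization folded into the gradient against the decoupled AdamW form. Both functions are hypothetical helpers written for this illustration, not part of optirs_core, and they rely on the fact that at t = 1 the bias-corrected moments reduce to m_hat = g and v_hat = g^2.

```rust
/// First-step Adam update with L2 regularization added to the gradient
/// (hypothetical helper for illustration).
pub fn adam_l2_first_step(theta: f64, grad: f64, lr: f64, wd: f64, eps: f64) -> f64 {
    // The decay term enters the gradient, so it is rescaled by the adaptive denominator.
    let g = grad + wd * theta;
    theta - lr * g / (g.abs() + eps)
}

/// First-step AdamW update with decoupled weight decay
/// (hypothetical helper for illustration).
pub fn adamw_first_step(theta: f64, grad: f64, lr: f64, wd: f64, eps: f64) -> f64 {
    // The decay multiplies the weight directly and never passes through the denominator.
    let decayed = theta * (1.0 - lr * wd);
    decayed - lr * grad / (grad.abs() + eps)
}
```

With theta = 1, a zero loss gradient, lr = 0.001 and weight_decay = 0.01, the L2 variant moves the weight by about lr (to roughly 0.999) because the small decay gradient is normalized to unit size, whereas the decoupled variant shrinks it only by the factor 1 - lr * weight_decay (to 0.99999).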
§Examples
use scirs2_core::ndarray::Array1;
use optirs_core::optimizers::{AdamW, Optimizer};
// Initialize parameters and gradients
let params = Array1::zeros(5);
let gradients = Array1::from_vec(vec![0.1, 0.2, -0.3, 0.0, 0.5]);
// Create an AdamW optimizer with default hyperparameters
let mut optimizer = AdamW::new(0.001);
// Update parameters
let new_params = optimizer.step(&params, &gradients).unwrap();
Implementations§
impl<A: Float + ScalarOperand + Debug + Send + Sync> AdamW<A>
pub fn new(learning_rate: A) -> Self
Creates a new AdamW optimizer with the given learning rate and default settings
§Arguments
learning_rate - The learning rate for parameter updates
pub fn new_with_config(
    learning_rate: A,
    beta1: A,
    beta2: A,
    epsilon: A,
    weight_decay: A,
) -> Self
Creates a new AdamW optimizer with the full configuration
§Arguments
learning_rate - The learning rate for parameter updates
beta1 - Exponential decay rate for the first moment estimates (default: 0.9)
beta2 - Exponential decay rate for the second moment estimates (default: 0.999)
epsilon - Small constant for numerical stability (default: 1e-8)
weight_decay - Weight decay factor (default: 0.01)
pub fn set_epsilon(&mut self, epsilon: A) -> &mut Self
Sets the epsilon parameter
pub fn get_epsilon(&self) -> A
Gets the epsilon parameter
pub fn set_weight_decay(&mut self, weight_decay: A) -> &mut Self
Sets the weight decay parameter
pub fn get_weight_decay(&self) -> A
Gets the weight decay parameter
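Because both setters return `&mut Self`, calls can be chained on one mutable binding. The sketch below shows that pattern on a hypothetical stand-in type, `Hyper`, which only mirrors the setter and getter signatures listed above; it is not the AdamW type and makes no claim about its internals.

```rust
/// Hypothetical stand-in that mirrors the setter/getter shape documented above.
#[derive(Debug, Clone, Copy)]
pub struct Hyper {
    epsilon: f64,
    weight_decay: f64,
}

impl Hyper {
    /// Starts from the documented defaults for epsilon and weight decay.
    pub fn new() -> Self {
        Hyper { epsilon: 1e-8, weight_decay: 0.01 }
    }

    /// Stores the value and hands back `&mut Self` so another setter can follow.
    pub fn set_epsilon(&mut self, epsilon: f64) -> &mut Self {
        self.epsilon = epsilon;
        self
    }

    pub fn set_weight_decay(&mut self, weight_decay: f64) -> &mut Self {
        self.weight_decay = weight_decay;
        self
    }

    pub fn get_epsilon(&self) -> f64 {
        self.epsilon
    }

    pub fn get_weight_decay(&self) -> f64 {
        self.weight_decay
    }
}

/// Builds a value and adjusts two settings in a single chained statement.
pub fn configured() -> Hyper {
    let mut h = Hyper::new();
    h.set_epsilon(1e-6).set_weight_decay(0.05);
    h
}
```

The chain borrows the binding mutably only for the duration of the statement, so the value can be read or moved afterwards.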
pub fn learning_rate(&self) -> A
Gets the current learning rate
Trait Implementations§
impl<A, D> Optimizer<A, D> for AdamW<A>
fn step(
    &mut self,
    params: &Array<A, D>,
    gradients: &Array<A, D>,
) -> Result<Array<A, D>>
fn get_learning_rate(&self) -> A
fn set_learning_rate(&mut self, learning_rate: A)
Auto Trait Implementations§
impl<A> Freeze for AdamW<A>
where
    A: Freeze,
impl<A> RefUnwindSafe for AdamW<A>
where
    A: RefUnwindSafe,
impl<A> Send for AdamW<A>
where
    A: Send,
impl<A> Sync for AdamW<A>
where
    A: Sync,
impl<A> Unpin for AdamW<A>
where
    A: Unpin,
impl<A> UnwindSafe for AdamW<A>
where
    A: UnwindSafe + RefUnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.