pub struct Adam<B: Backend> {
pub learning_rate: f32,
pub betas: (f32, f32),
pub eps: f32,
pub weight_decay: f32,
pub amsgrad: bool,
pub maximize: bool,
pub m: Vec<Tensor<B>>,
pub v: Vec<Tensor<B>>,
pub vm: Vec<Tensor<B>>,
pub t: usize,
}
Adaptive moment estimation (Adam) optimizer
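For orientation, here is the standard Adam/AMSGrad update rule that the fields below parameterize, with η = learning_rate, (β₁, β₂) = betas, ε = eps, and λ = weight_decay. This is the textbook formulation, included as a reference sketch rather than a line-by-line account of this crate's implementation:

```latex
\begin{aligned}
g_t &= \nabla_\theta f_t(\theta_{t-1}) + \lambda\,\theta_{t-1}
  && \text{(negate $g_t$ if \textit{maximize})} \\
m_t &= \beta_1\, m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2\, v_{t-1} + (1 - \beta_2)\, g_t^{\,2} \\
\hat m_t &= m_t / (1 - \beta_1^{\,t}), \qquad
\hat v_t = v_t / (1 - \beta_2^{\,t}) \\
\tilde v_t &= \max(\tilde v_{t-1}, \hat v_t)
  && \text{(AMSGrad only: running maximum of $\hat v_t$)} \\
\theta_t &= \theta_{t-1} - \eta\, \hat m_t / \bigl(\sqrt{\hat v_t} + \epsilon\bigr)
  && \text{(with $\tilde v_t$ in place of $\hat v_t$ when \textit{amsgrad})}
\end{aligned}
```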
Fields
learning_rate: f32
learning rate (default: 1e-3)
betas: (f32, f32)
coefficients used for computing running averages of gradient and its square (default: (0.9, 0.999))
eps: f32
term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay: f32
weight decay (L2 penalty) (default: 0)
amsgrad: bool
whether to use the AMSGrad variant of this algorithm from the paper On the Convergence of Adam and Beyond (default: false)
maximize: bool
maximize the objective with respect to the params, instead of minimizing (default: false)
m: Vec<Tensor<B>>
first moment estimates: running averages of the gradients, one tensor per parameter
v: Vec<Tensor<B>>
second moment estimates: running averages of the squared gradients, one tensor per parameter
vm: Vec<Tensor<B>>
maximum of past second moment estimates, used only when amsgrad is enabled
t: usize
number of update steps performed so far
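If the crate exposes no dedicated constructor (none is shown on this page), the struct can in principle be assembled from its public fields using the documented defaults. A minimal sketch, assuming that empty state vectors and `t = 0` constitute a valid fresh optimizer:

```rust
// Sketch only, not taken from this crate's docs: it assumes that building the
// struct directly from its public fields, with empty moment vectors and t = 0,
// yields a valid fresh optimizer, and that the documented defaults are wanted.
fn fresh_adam<B: Backend>() -> Adam<B> {
    Adam {
        learning_rate: 1e-3,  // default learning rate
        betas: (0.9, 0.999),  // default running-average coefficients
        eps: 1e-8,            // default numerical-stability term
        weight_decay: 0.0,    // no L2 penalty
        amsgrad: false,       // plain Adam rather than the AMSGrad variant
        maximize: false,      // minimize the objective
        m: Vec::new(),        // first-moment state, empty before the first update
        v: Vec::new(),        // second-moment state
        vm: Vec::new(),       // AMSGrad maximum state (unused while amsgrad = false)
        t: 0,                 // no update steps taken yet
    }
}
```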
Implementations
impl<B: Backend> Adam<B>

pub fn update<'a>(
    &mut self,
    parameters: impl IntoIterator<Item = &'a mut Tensor<B>>,
    gradients: impl IntoIterator<Item = Option<Tensor<B>>>,
) where
    B: 'a,
Updates parameters with gradients. The number of parameters must equal the number of gradients. A gradient may be `None`; such entries are simply skipped.
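A usage sketch of `update` under the constraints just described: one gradient slot per parameter, with `None` entries skipped. The training loop, the origin of `params` and `grads`, and everything outside the single `update` call are assumptions for illustration, not part of this page:

```rust
// Hypothetical training step; only the `update` call reflects the documented API.
fn step<B: Backend>(
    opt: &mut Adam<B>,
    params: &mut [Tensor<B>],
    grads: Vec<Option<Tensor<B>>>,
) {
    // The method requires as many gradients as parameters; `None` gradients
    // are skipped rather than applied.
    assert_eq!(params.len(), grads.len());
    opt.update(params.iter_mut(), grads);
}
```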
Trait Implementations
Auto Trait Implementations
impl<B> Freeze for Adam<B>
impl<B> RefUnwindSafe for Adam<B> where B: RefUnwindSafe
impl<B> Send for Adam<B> where B: Send
impl<B> Sync for Adam<B> where B: Sync
impl<B> Unpin for Adam<B> where B: Unpin
impl<B> UnwindSafe for Adam<B> where B: UnwindSafe
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.