pub struct Nesterov {
pub l0: f64,
pub t: f64,
pub inertia: f64,
}
Nesterov accelerated gradient descent
Like accelerated gradient descent, Nesterov accelerated gradient descent includes a momentum term. In contrast to regular accelerated gradient descent, the gradient is not evaluated at the current position, but at the estimated new one. Source: [G. Hinton’s lecture 6c](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf)
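The look-ahead idea can be illustrated with a minimal sketch for a single scalar parameter. The helper names `nesterov_step` and `minimise_quadratic` are hypothetical and not part of this crate's API; a fixed learning rate is assumed for simplicity, and the crate's own update rule may differ in detail.

```rust
/// One Nesterov step for a scalar parameter (hypothetical helper, not the crate's API).
/// Returns the updated `(position, velocity)` pair.
fn nesterov_step(
    x: f64,
    velocity: f64,
    learning_rate: f64,
    inertia: f64,
    gradient: impl Fn(f64) -> f64,
) -> (f64, f64) {
    // Estimate where the momentum alone would carry the parameter.
    let lookahead = x + inertia * velocity;
    // Evaluate the gradient at the look-ahead point, not at `x`.
    let new_velocity = inertia * velocity - learning_rate * gradient(lookahead);
    (x + new_velocity, new_velocity)
}

/// Minimise f(x) = x^2 (gradient 2x) for a fixed number of steps,
/// using an assumed learning rate of 0.1 and inertia of 0.9.
fn minimise_quadratic(start: f64, steps: usize) -> f64 {
    let (mut x, mut v) = (start, 0.0);
    for _ in 0..steps {
        let (next_x, next_v) = nesterov_step(x, v, 0.1, 0.9, |p| 2.0 * p);
        x = next_x;
        v = next_v;
    }
    x
}
```

Plain momentum would call `gradient(x)` instead of `gradient(lookahead)`; that single line is the whole difference between the two methods in this sketch.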
Fields
l0: f64
Start learning rate.
t: f64
Smaller t will decrease the learning rate faster. After t events the learning rate will be half of l0, after two t events it will be one third of l0, and so on.
inertia: f64
To simulate friction, select a value smaller than 1 (recommended).
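The decay described for `t` (one half of `l0` after `t` events, one third after `2 * t` events) is consistent with the schedule `l0 / (1 + events / t)`. The sketch below assumes that formula; the function `learning_rate` is a hypothetical helper for illustration and is not taken from the crate.

```rust
/// Hypothetical helper: learning rate after `events` training events,
/// assuming the schedule l0 / (1 + events / t) inferred from the field docs.
fn learning_rate(l0: f64, t: f64, events: f64) -> f64 {
    l0 / (1.0 + events / t)
}
```

Under this assumption, with `l0 = 0.3` and `t = 100.0` the rate is 0.3 at the start, 0.15 after 100 events, and 0.1 after 200 events.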
Trait Implementations
impl<'de> Deserialize<'de> for Nesterov
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for Nesterov
impl RefUnwindSafe for Nesterov
impl Send for Nesterov
impl Sync for Nesterov
impl Unpin for Nesterov
impl UnwindSafe for Nesterov
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.