Struct vikos::teacher::Nesterov
pub struct Nesterov { pub l0: f64, pub t: f64, pub inertia: f64, }
Nesterov accelerated gradient descent
Like momentum-based (accelerated) gradient descent, Nesterov accelerated gradient descent includes a momentum term. In contrast to plain momentum, the gradient is not evaluated at the current position but at the estimated new one. Source: G. Hinton's lecture 6c
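The lookahead idea can be illustrated with a minimal one-dimensional sketch in plain Rust (this is not the vikos implementation; the function name and the example cost f(w) = (w − 3)² are chosen purely for illustration):

```rust
// Minimal 1-D sketch of Nesterov accelerated gradient descent,
// minimizing f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
// Illustrative only; not the vikos source.
fn nesterov_minimize(l0: f64, inertia: f64, steps: usize) -> f64 {
    let mut w = 0.0; // model coefficient
    let mut v = 0.0; // momentum term (velocity)
    for _ in 0..steps {
        // The gradient is evaluated at the estimated new position
        // `w + inertia * v`, not at the current position `w`.
        let lookahead = w + inertia * v;
        let grad = 2.0 * (lookahead - 3.0);
        v = inertia * v - l0 * grad;
        w += v;
    }
    w
}

fn main() {
    let w = nesterov_minimize(0.1, 0.9, 200);
    println!("w after 200 steps: {:.4}", w); // approaches the minimum at 3
}
```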
Fields
l0: f64
Start learning rate
t: f64
Smaller t decreases the learning rate faster: after t events the learning rate is half of l0, after 2·t events it is a third of l0, and so on.
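The halving behaviour described for t corresponds to a 1/(1 + n/t) annealing schedule. The exact formula used by the crate is an assumption here, inferred from the documented behaviour:

```rust
// Assumed decay schedule matching the field documentation: after t
// events the rate is l0/2, after 2t events l0/3, and so on.
fn learning_rate(l0: f64, t: f64, num_events: usize) -> f64 {
    l0 / (1.0 + num_events as f64 / t)
}

fn main() {
    let (l0, t) = (0.5, 100.0);
    println!("{}", learning_rate(l0, t, 0));   // l0 itself
    println!("{}", learning_rate(l0, t, 100)); // l0 / 2 after t events
    println!("{}", learning_rate(l0, t, 200)); // l0 / 3 after 2t events
}
```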
inertia: f64
Simulates friction; a value smaller than 1 is recommended.
Trait Implementations
impl<M> Teacher<M> for Nesterov where M: Model
type Training = (usize, Vec<f64>)
Contains state which changes during the training, but is not part of the expertise.
fn new_training(&self, model: &M) -> (usize, Vec<f64>)
Creates an instance holding all mutable state of the algorithm
fn teach_event<Y, C>(&self, training: &mut (usize, Vec<f64>), model: &mut M, cost: &C, features: &M::Input, truth: Y) where C: Cost<Y>, Y: Copy
Changes the model's coefficients so that they minimize the cost function (hopefully)
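A hedged sketch of how the (usize, Vec&lt;f64&gt;) training state could drive such an update, interpreting it as an event counter plus one velocity per coefficient. The struct, the decay form, and the gradient callback are assumptions made for illustration; the actual vikos source may differ:

```rust
// Illustrative only: interprets the training state (usize, Vec<f64>) as
// (event counter, one velocity per coefficient). The decay form
// l0 / (1 + events / t) is assumed from the field documentation.
struct NesterovSketch { l0: f64, t: f64, inertia: f64 }

impl NesterovSketch {
    fn teach_event<G: Fn(&[f64]) -> Vec<f64>>(
        &self,
        training: &mut (usize, Vec<f64>),
        coefficients: &mut [f64],
        gradient_at: G, // gradient of the cost w.r.t. the coefficients
    ) {
        let (events, velocity) = training;
        // Annealed learning rate (assumed form, see the `t` field above).
        let lr = self.l0 / (1.0 + *events as f64 / self.t);
        // Gradient at the estimated new position, not the current one.
        let lookahead: Vec<f64> = coefficients
            .iter()
            .zip(velocity.iter())
            .map(|(c, v)| c + self.inertia * v)
            .collect();
        let grad = gradient_at(&lookahead);
        for ((c, v), g) in coefficients.iter_mut().zip(velocity.iter_mut()).zip(grad) {
            *v = self.inertia * *v - lr * g;
            *c += *v;
        }
        *events += 1;
    }
}

fn main() {
    let teacher = NesterovSketch { l0: 0.1, t: 1000.0, inertia: 0.9 };
    let mut coefficients = vec![0.0, 0.0];
    let mut training = (0usize, vec![0.0; 2]);
    // Minimize (a - 1)^2 + (b + 2)^2; gradient is (2(a - 1), 2(b + 2)).
    for _ in 0..300 {
        teacher.teach_event(&mut training, &mut coefficients, |w| {
            vec![2.0 * (w[0] - 1.0), 2.0 * (w[1] + 2.0)]
        });
    }
    println!("{:?}", coefficients); // approaches [1.0, -2.0]
}
```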