Struct multistochgrad::scsg::StochasticControlledGradientDescent

pub struct StochasticControlledGradientDescent { /* fields omitted */ }

Provides Stochastic Controlled Gradient Descent optimization, based on two papers by Lei and Jordan:

  • "On the Adaptivity of Stochastic Gradient-Based Optimization", arXiv 2019 (SCSG-1)
  • "Less than a Single Pass: Stochastically Controlled Stochastic Gradient", arXiv 2019 (SCSG-2)

The first paper uses the following notation. One iteration j consists of:

  • a large batch of size Bⱼ
  • a number mⱼ of small batches (mini-batches) of size bⱼ
  • a position update with step ηⱼ

The number of mini-batches is a random variable following a geometric law.

The paper establishes rates of convergence that depend on the ratios mⱼ/Bⱼ, bⱼ/mⱼ and ηⱼ/bⱼ and on their products.
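The iteration structure described above can be sketched on a toy problem. This is a minimal illustration, not the crate's implementation: it assumes the SVRG-style control-variate update usually associated with SCSG, a one-dimensional objective whose terms are 0.5 * (x - data[i])^2, a fixed number of mini-batches, and a hand-written `Lcg` generator so that no external crate is needed. The names `Lcg`, `term_gradient` and `scsg_iteration` are hypothetical.

```rust
/// Tiny linear congruential generator, used only so the sketch needs no external crate.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random index in 0..n.
    fn next_index(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % n
    }
}

/// Gradient of the i-th term f_i(x) = 0.5 * (x - data[i])^2.
fn term_gradient(data: &[f64], i: usize, x: f64) -> f64 {
    x - data[i]
}

/// One outer iteration: a large batch of size `large_batch`, then `nb_mini`
/// steps of size `eta`, each using a mini-batch of size `mini_batch`.
fn scsg_iteration(
    data: &[f64],
    x0: f64,
    large_batch: usize,
    nb_mini: usize,
    mini_batch: usize,
    eta: f64,
    rng: &mut Lcg,
) -> f64 {
    let n = data.len();
    // Large-batch gradient estimate at the starting point x0.
    let mut g = 0.0;
    for _ in 0..large_batch {
        g += term_gradient(data, rng.next_index(n), x0);
    }
    g /= large_batch as f64;

    let mut x = x0;
    for _ in 0..nb_mini {
        // Mini-batch gradient difference, corrected by the large-batch gradient.
        let mut v = 0.0;
        for _ in 0..mini_batch {
            let i = rng.next_index(n);
            v += term_gradient(data, i, x) - term_gradient(data, i, x0);
        }
        x -= eta * (v / mini_batch as f64 + g);
    }
    x
}

fn main() {
    let data: Vec<f64> = (0..100).map(|i| i as f64).collect();
    let mut rng = Lcg(42);
    let mut x = 0.0;
    for j in 0..5 {
        x = scsg_iteration(&data, x, 20, 10, 2, 0.5, &mut rng);
        println!("iteration {}: x = {:.3}", j, x);
    }
}
```

On this quadratic toy problem the mini-batch correction cancels the dependence on the sampled term, so each inner step moves x toward the mean of the large batch; on a general objective the mini-batch choice does matter.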

The second paper, “Less than a Single Pass: Stochastically Controlled Stochastic Gradient”, describes a simplified version in which each mini-batch consists of a single term and the number of mini-batches is set to the mean of the corresponding geometric variable.

We adopt a mix of the two papers. Letting the mini-batch size grow a little seems more stable than keeping it at 1, in particular when the initialization of the algorithm varies. Replacing the geometric law by its mean is also clearly more stable, because the law has a large variance.
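To see why the geometric law is noisy, the sketch below computes the mean and standard deviation of a geometric variable N on {0, 1, 2, ...} with P(N = k) = (1 - gamma) * gamma^k. The parameterization and the helper name `geometric_mean_and_std` are assumptions made for illustration; the papers may parameterize the law differently.

```rust
/// Mean and standard deviation of a geometric variable N on {0, 1, 2, ...}
/// with P(N = k) = (1 - gamma) * gamma^k, for 0 < gamma < 1.
fn geometric_mean_and_std(gamma: f64) -> (f64, f64) {
    let mean = gamma / (1.0 - gamma);
    let variance = gamma / ((1.0 - gamma) * (1.0 - gamma));
    (mean, variance.sqrt())
}

fn main() {
    // Choose gamma = m / (m + 1) so that the mean number of mini-batches is m.
    for m in [10.0_f64, 100.0, 1000.0] {
        let (mean, std) = geometric_mean_and_std(m / (m + 1.0));
        println!("mean = {:.1}, standard deviation = {:.1}", mean, std);
    }
}
```

With gamma = m / (m + 1) the mean is m and the standard deviation is sqrt(m * (m + 1)), slightly larger than the mean itself, so a single draw can easily be several times too small or too large.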

If nbterms is the number of terms in the function to minimize and j is the iteration number:

  • Bⱼ evolves as: large_batch_size_init * nbterms * alfa^(2j)
  • mⱼ evolves as: m_zero * nbterms * alfa^(3j/2)
  • bⱼ evolves as: b_0 * alfa^j
  • ηⱼ evolves as: eta_0 / alfa^(j/2)

where alfa (α) is computed to be slightly greater than 1. It is chosen so that B_0 * alfa^(2*nbiter) = 1, where B_0 is the initial fraction large_batch_size_init and nbiter is the number of iterations.

Bⱼ is bounded above by nbterms/10 and bⱼ by nbterms/100.
The mini-batch size must stay small, so b₀ must be small (typically 1 seems OK).
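The schedule above can be written out as follows. This is a sketch under stated assumptions: the `ScsgParams` and `Schedule` types and both functions are hypothetical, B_0 is taken to be large_batch_size_init, b_0 to be mini_batch_size_init and eta_0 to be eta_zero, and the sizes are kept as f64 because the rounding applied by the crate is not described here.

```rust
/// Hypothetical holder for the four tuning parameters.
struct ScsgParams {
    eta_zero: f64,
    m_zero: f64,
    mini_batch_size_init: usize,
    large_batch_size_init: f64,
}

/// Batch sizes and step size at one iteration (kept as f64, rounding not modelled).
#[derive(Debug, Clone, Copy)]
struct Schedule {
    large_batch: f64,
    nb_mini_batch: f64,
    mini_batch: f64,
    step: f64,
}

/// Growth factor alfa solving large_batch_size_init * alfa^(2 * nbiter) = 1.
fn growth_factor(large_batch_size_init: f64, nbiter: usize) -> f64 {
    (1.0 / large_batch_size_init).powf(1.0 / (2.0 * nbiter as f64))
}

/// Values of B_j, m_j, b_j and eta_j at iteration j, with the two upper bounds applied.
fn schedule_at(p: &ScsgParams, nbterms: usize, alfa: f64, j: usize) -> Schedule {
    let n = nbterms as f64;
    let j = j as f64;
    Schedule {
        large_batch: (p.large_batch_size_init * n * alfa.powf(2.0 * j)).min(n / 10.0),
        nb_mini_batch: p.m_zero * n * alfa.powf(1.5 * j),
        mini_batch: (p.mini_batch_size_init as f64 * alfa.powf(j)).min(n / 100.0),
        step: p.eta_zero / alfa.powf(0.5 * j),
    }
}

fn main() {
    let p = ScsgParams {
        eta_zero: 0.5,
        m_zero: 0.002,
        mini_batch_size_init: 1,
        large_batch_size_init: 0.01,
    };
    let nbiter = 50;
    let alfa = growth_factor(p.large_batch_size_init, nbiter);
    println!("alfa = {:.4}", alfa);
    for j in [0, 10, 25, 50] {
        println!("j = {}: {:?}", j, schedule_at(&p, 10_000, alfa, j));
    }
}
```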

Implementations

impl StochasticControlledGradientDescent

pub fn new(
    eta_zero: f64,
    m_zero: f64,
    mini_batch_size_init: usize,
    large_batch_size_init: f64
) -> StochasticControlledGradientDescent

The arguments are:

  • eta_zero: initial value of the step along the gradient; 0.5 is a good default choice.
  • m_zero: a good value is 0.2 * large_batch_size_init, so that mⱼ << Bⱼ.
  • mini_batch_size_init: base value for the mini-batch size; 1 is a good default choice.
  • large_batch_size_init: fraction of nbterms used to initialize the large batch size; a good default value is between 0.01 and 0.02, so that the large batch size begins between 0.01 * nbterms and 0.02 * nbterms.

(see examples)
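As an illustration of the suggested defaults, the sketch below derives the four arguments from a chosen large-batch fraction. The `ScsgArgs` type and its `suggested` function are hypothetical helpers, not part of the crate; only the `new` signature shown above is taken from this page.

```rust
/// Hypothetical holder for the four arguments of `new`, in declaration order.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ScsgArgs {
    eta_zero: f64,
    m_zero: f64,
    mini_batch_size_init: usize,
    large_batch_size_init: f64,
}

impl ScsgArgs {
    /// Applies the defaults suggested above for a given large-batch fraction.
    fn suggested(large_batch_size_init: f64) -> Self {
        ScsgArgs {
            eta_zero: 0.5,
            m_zero: 0.2 * large_batch_size_init,
            mini_batch_size_init: 1,
            large_batch_size_init,
        }
    }
}

fn main() {
    let args = ScsgArgs::suggested(0.02);
    println!("{:?}", args);
    // With the crate in scope, these values would be passed in the same order:
    // StochasticControlledGradientDescent::new(
    //     args.eta_zero,
    //     args.m_zero,
    //     args.mini_batch_size_init,
    //     args.large_batch_size_init,
    // );
}
```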

pub fn seed(&mut self, seed: [u8; 32])

Seeds the random number generator using the supplied seed. This is useful to create reproducible results.
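Since `seed` takes a 32-byte array, a small helper can be convenient for building one from a single integer. The function `seed_from_u64` below is a hypothetical helper, not part of the crate; it simply copies the integer's little-endian bytes into the first 8 bytes and leaves the rest at zero.

```rust
/// Expands a 64-bit value into the 32-byte array expected by `seed`.
fn seed_from_u64(value: u64) -> [u8; 32] {
    let mut seed = [0u8; 32];
    seed[..8].copy_from_slice(&value.to_le_bytes());
    seed
}

fn main() {
    let seed = seed_from_u64(4664397);
    println!("{:?}", seed);
    // With a mutable optimizer `scgd` in scope, this would be applied as:
    // scgd.seed(seed);
}
```

Using the same seed for two runs with identical arguments is what makes their results comparable.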

Trait Implementations

impl<D: Dimension, F: SummationC1<D>> Minimizer<D, F, usize> for StochasticControlledGradientDescent

type Solution = Solution<D>

Type of the solution the Minimizer returns.

fn minimize(
    &self,
    function: &F,
    initial_position: &Array<f64, D>,
    max_iterations: Option<usize>
) -> Solution<D>

Performs the actual minimization and returns a solution. MinimizerArg should provide a number of iterations, a minimum error, or anything else needed by the implemented algorithm.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

pub fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where
    T: ?Sized

pub fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where
    T: ?Sized

pub fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

pub fn from(t: T) -> T

Performs the conversion.

impl<T, U> Into<U> for T where
    U: From<T>, 

pub fn into(self) -> U

Performs the conversion.

impl<T> Pointable for T

pub const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

pub unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

pub unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

pub unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

pub unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T where
    V: MultiLane<T>, 

pub fn vzip(self) -> V