pub struct StochasticControlledGradientDescent { /* private fields */ }
Provides Stochastic Controlled Gradient Descent optimization based on two papers of Lei and Jordan:
- "On the adaptivity of stochastic gradient based optimization", arxiv 2019-2020 (SCSG-1)
- "Less than a single pass: stochastically controlled stochastic gradient", arxiv 2019 (SCSG-2)
Following the notations of the first paper, one iteration j consists of:
- a large batch of size Bⱼ
- a number mⱼ of small batches of size bⱼ
- a position update with step ηⱼ
The number of mini batches is described by a random variable with a geometric law.
The paper establishes rates of convergence depending on the ratios mⱼ/Bⱼ, bⱼ/mⱼ and ηⱼ/bⱼ and their products.
The second paper, "Less than a single pass: stochastically controlled stochastic gradient", describes a simplified version where each mini batch consists of just one term and the number of mini batches is set to the mean of the corresponding geometric variable.
We adopt a mix of the two papers:
- Letting the size of the mini batch grow a little seems more stable than keeping it at 1 (in particular when the initialization of the algorithm varies), but replacing the geometric law by its mean is clearly more stable, due to the large variance of that law.
We choose a fraction of the number of terms in the sum (large_batch_fraction_init) and alfa so that large_batch_fraction_init * alfa^(2*nbiter) = 1.
Then, if nbterms is the number of terms in the function to minimize and j is the iteration number:
- Bⱼ evolves as : large_batch_fraction_init * nbterms * alfa^(2j)
- mⱼ evolves as : m_zero * nbterms * alfa^(3j/2)
- bⱼ evolves as : b_0 * alfa^j
- ηⱼ evolves as : eta_0 / alfa^(j/2)
The evolution of Bⱼ is bounded above by nbterms/10 (this can be modified with Self::set_large_batch_max_fraction()) and that of bⱼ by nbterms/100.
The size of the small batches must stay small, so b₀ must be small (typically 1 seems OK).
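The four schedules above can be sketched as plain arithmetic. The function below is an illustration only, not the crate's internal code: its name, the rounding choices, and the hard caps (nbterms/10 for Bⱼ, nbterms/100 for bⱼ) are assumptions made to mirror the formulas as stated.

```rust
/// Illustrative sketch of the batch-size and step schedules described above.
/// alfa is chosen so that large_batch_fraction_init * alfa^(2*nbiter) = 1.
/// Returns, for each iteration j, the tuple (B_j, m_j, b_j, eta_j).
fn schedules(
    nbterms: usize,
    large_batch_fraction_init: f64,
    m_zero: f64,
    b_0: f64,
    eta_0: f64,
    nbiter: usize,
) -> Vec<(usize, usize, usize, f64)> {
    // solve large_batch_fraction_init * alfa^(2*nbiter) = 1 for alfa
    let alfa = (1.0 / large_batch_fraction_init).powf(1.0 / (2.0 * nbiter as f64));
    let big_batch_max = nbterms as f64 / 10.0; // default cap for B_j
    let small_batch_max = nbterms as f64 / 100.0; // cap for b_j
    (0..nbiter)
        .map(|j| {
            let jf = j as f64;
            // B_j = large_batch_fraction_init * nbterms * alfa^(2j), capped
            let big_b = (large_batch_fraction_init * nbterms as f64 * alfa.powf(2.0 * jf))
                .min(big_batch_max);
            // m_j = m_zero * nbterms * alfa^(3j/2)
            let m = m_zero * nbterms as f64 * alfa.powf(1.5 * jf);
            // b_j = b_0 * alfa^j, capped
            let b = (b_0 * alfa.powf(jf)).min(small_batch_max);
            // eta_j = eta_0 / alfa^(j/2)
            let eta = eta_0 / alfa.powf(jf / 2.0);
            (big_b.round() as usize, m.round() as usize, b.round() as usize, eta)
        })
        .collect()
}
```

For example, with nbterms = 10000, large_batch_fraction_init = 0.02 and nbiter = 150, the large batch starts at 200 terms and grows until it hits the nbterms/10 cap, while the step eta decreases from eta_0.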
Implementations§
impl StochasticControlledGradientDescent
pub fn new(
    eta_zero: f64,
    m_zero: f64,
    mini_batch_size_init: usize,
    large_batch_fraction_init: f64
) -> StochasticControlledGradientDescent
args are:
- eta_zero : initial step size along the gradient; 0.1 is a good default choice.
- m_zero : a good value is 0.2 * large_batch_fraction_init, so that mⱼ << Bⱼ.
- mini_batch_size_init : base value for the size of mini batches; 1 is a good default choice.
- large_batch_fraction_init : fraction of nbterms used to initialize the large batch size; a good default is between 0.01 and 0.02, so that the large batch size starts at 0.01 * nbterms to 0.02 * nbterms.
(see examples)
Examples found in repository
fn main() {
    init_log();
    // check paths for images and labels
    let image_path = PathBuf::from(String::from(IMAGE_FNAME_STR).clone());
    let image_file_res = OpenOptions::new().read(true).open(&image_path);
    if image_file_res.is_err() {
        println!("could not open image file : {:?}", IMAGE_FNAME_STR);
        return;
    }
    let label_path = PathBuf::from(LABEL_FNAME_STR);
    let label_file_res = OpenOptions::new().read(true).open(&label_path);
    if label_file_res.is_err() {
        println!("could not open label file : {:?}", LABEL_FNAME_STR);
        return;
    }
    //
    // load mnist data
    //
    let mnist_data =
        MnistData::new(String::from(IMAGE_FNAME_STR), String::from(LABEL_FNAME_STR)).unwrap();
    let images = mnist_data.get_images();
    let labels = mnist_data.get_labels();
    // nb_images is the length of the third component of the array dimension
    let (nb_row, nb_column, nb_images) = images.dim(); // get tuple from dim method
    assert_eq!(nb_images, labels.shape()[0]); // get slice from shape method...
    // transform into logistic regression
    let mut observations = Vec::<(Array1<f64>, usize)>::with_capacity(nb_images);
    //
    for k in 0..nb_images {
        let mut image = Array1::<f64>::zeros(1 + nb_row * nb_column);
        let mut index = 0;
        image[index] = 1.;
        index += 1;
        for i in 0..nb_row {
            for j in 0..nb_column {
                image[index] = images[[i, j, k]] as f64 / 256.;
                index += 1;
            }
        } // end of for i
        observations.push((image, labels[k] as usize));
    } // end of for k
    //
    let regr_l = LogisticRegression::new(10, observations);
    //
    // minimize
    //
    // step, m_0, b_0, B_0
    let scgd_pb = StochasticControlledGradientDescent::new(
        0.1,   // gradient step at beginning
        0.004, // base factor for number of mini batches
        1,     // base for size of mini batch
        0.02,  // base for large batch size
    );
    // allocate and zero an array with 9 rows (each row corresponds to a class, columns are pixel values)
    let mut initial_position = Array2::<f64>::zeros((9, 1 + nb_row * nb_column));
    // do a bad initialization; filling with 0 is much better!
    initial_position.fill(0.5);
    //
    let nb_iter = 150;
    let solution = scgd_pb.minimize(&regr_l, &initial_position, Some(nb_iter));
    println!(" solution with minimized value = {:2.4E}", solution.value);
    //
    // get image of coefficients to see corresponding images.
    //
    let image_fname = String::from("classe_scsg.img");
    for k in 0..9 {
        let mut k_image_fname: String = image_fname.clone();
        k_image_fname.push_str(&k.to_string());
        let image_path = PathBuf::from(k_image_fname.clone());
        let image_file_res = OpenOptions::new()
            .write(true)
            .create(true)
            .open(&image_path);
        if image_file_res.is_err() {
            println!("could not open image file : {:?}", k_image_fname);
            return;
        }
        //
        let mut out = io::BufWriter::new(image_file_res.unwrap());
        //
        // get a f64 slice to write
        let f64_array_to_write: &[f64] = solution.position.slice(s![k, ..]).to_slice().unwrap();
        let u8_slice = unsafe {
            std::slice::from_raw_parts(
                f64_array_to_write.as_ptr() as *const u8,
                std::mem::size_of::<f64>() * f64_array_to_write.len(),
            )
        };
        out.write_all(u8_slice).unwrap();
        out.flush().unwrap();
        // out.write(&solution.position.slice(s![k, ..])).unwrap();
    }
}
pub fn seed(&mut self, seed: [u8; 32])
Seeds the random number generator using the supplied seed.
This is useful to create reproducible results.
pub fn set_large_batch_max_fraction(&mut self, fraction: f64)
If a larger batch size is needed, the maximum large batch size will be set to nb_terms * fraction (the default for fraction is 0.1).
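The cap amounts to simple arithmetic. The helper below is purely illustrative (its name and truncation behavior are assumptions, not the crate's internals); it just shows what bound the fraction produces.

```rust
/// Illustrative computation of the large-batch cap: nb_terms * fraction.
/// The real implementation may round differently.
fn large_batch_max(nb_terms: usize, fraction: f64) -> usize {
    (nb_terms as f64 * fraction) as usize
}
```

With the default fraction of 0.1 and 10000 terms, the large batch size is thus capped at 1000 terms.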