pub struct LBFGS<L, P, G, F> { /* private fields */ }
Limited-memory BFGS (L-BFGS) method
L-BFGS is an approximation to BFGS which requires only a limited amount of memory. Instead of storing the dense inverse Hessian approximation, only a few vectors which implicitly represent it are stored.
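The "few vectors" are the m most recent pairs of parameter and gradient differences, which are combined via the two-loop recursion (Nocedal & Wright, Algorithm 7.4) to apply the implicit inverse Hessian to the gradient. The following is a minimal, self-contained sketch of that recursion, not this crate's internal implementation; the function names are illustrative:

```rust
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Computes the search direction -H_k * grad from the stored pairs
/// s_i (parameter differences) and y_i (gradient differences),
/// ordered oldest first. This replaces the dense inverse Hessian
/// that full BFGS would store.
fn two_loop_direction(grad: &[f64], s: &[Vec<f64>], y: &[Vec<f64>]) -> Vec<f64> {
    let m = s.len();
    let mut q: Vec<f64> = grad.to_vec();
    let mut alpha = vec![0.0; m];
    // First loop: newest pair to oldest.
    for i in (0..m).rev() {
        let rho = 1.0 / dot(&y[i], &s[i]);
        alpha[i] = rho * dot(&s[i], &q);
        for (qj, yj) in q.iter_mut().zip(&y[i]) {
            *qj -= alpha[i] * yj;
        }
    }
    // Initial Hessian approximation H_0 = gamma * I, with gamma taken
    // from the most recent pair (a common scaling choice).
    let gamma = if m > 0 {
        dot(&s[m - 1], &y[m - 1]) / dot(&y[m - 1], &y[m - 1])
    } else {
        1.0
    };
    let mut r: Vec<f64> = q.iter().map(|qi| gamma * qi).collect();
    // Second loop: oldest pair to newest.
    for i in 0..m {
        let rho = 1.0 / dot(&y[i], &s[i]);
        let beta = rho * dot(&y[i], &r);
        for (rj, sj) in r.iter_mut().zip(&s[i]) {
            *rj += (alpha[i] - beta) * sj;
        }
    }
    // The descent direction is the negative of H_k * grad.
    r.iter().map(|ri| -ri).collect()
}

fn main() {
    // With an empty history the recursion reduces to steepest descent.
    let d = two_loop_direction(&[2.0, -4.0], &[], &[]);
    println!("{:?}", d); // [-2.0, 4.0]
}
```

Note that only 2m vectors (plus a handful of scalars) are needed, versus the n-by-n matrix of full BFGS.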
It requires a line search, and the number of vector pairs to be stored (the history size m) must be set. Additionally, an initial guess for the parameter vector is required, which is to be provided via the configure method of the Executor (see IterState, in particular IterState::param). The initial gradient and cost function value corresponding to the initial parameter vector can be provided in the same way. If these are not provided, they will be computed during initialization of the algorithm.
Two tolerances can be configured, both of which are used in the stopping criteria. One is a tolerance on the gradient (set with with_tolerance_grad): if the norm of the gradient is below this tolerance, the algorithm stops. It defaults to sqrt(EPSILON). The other is a tolerance on the change of the cost function from one iteration to the next: if the change is below this tolerance (default: EPSILON), the algorithm stops. This parameter can be set via with_tolerance_cost.
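Taken together, the two criteria can be sketched as follows. This is a simplified illustration, not this crate's actual termination code; the function and parameter names are hypothetical, and the exact form of the cost-change check may differ:

```rust
/// Simplified stopping check mirroring the two tolerances described
/// above: stop when the gradient norm falls below tol_grad, or when
/// the change in cost between iterations falls below tol_cost.
fn should_stop(grad_norm: f64, prev_cost: f64, cost: f64,
               tol_grad: f64, tol_cost: f64) -> bool {
    grad_norm < tol_grad || (prev_cost - cost).abs() < tol_cost
}

fn main() {
    let eps = f64::EPSILON;
    // Gradient norm already tiny: stop.
    println!("{}", should_stop(1e-12, 1.0, 0.9, eps.sqrt(), eps)); // true
    // Gradient and cost both still changing noticeably: keep going.
    println!("{}", should_stop(0.5, 1.0, 0.9, eps.sqrt(), eps)); // false
}
```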
Orthant-Wise Limited-memory Quasi-Newton (OWL-QN) method
OWL-QN is a method that adapts L-BFGS to L1-regularization. The original L-BFGS requires the loss function to be differentiable and does not support L1-regularization; therefore, this library switches to OWL-QN when L1-regularization is specified. L1-regularization can be enabled via with_l1_regularization.
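The key modification in OWL-QN is replacing the gradient of the non-differentiable L1 term with a pseudo-gradient (Andrew & Gao, 2007): at a zero coordinate, it picks whichever one-sided derivative permits descent, or zero if neither does. A minimal sketch of that computation, illustrative rather than this crate's internals:

```rust
/// Pseudo-gradient of f(x) + c * ||x||_1. At x_i == 0 the L1 term is
/// not differentiable; the pseudo-gradient uses the one-sided
/// derivative that allows a decrease, or 0 if zero is already optimal
/// for that coordinate.
fn pseudo_gradient(x: &[f64], grad: &[f64], c: f64) -> Vec<f64> {
    x.iter()
        .zip(grad)
        .map(|(&xi, &gi)| {
            if xi > 0.0 {
                gi + c
            } else if xi < 0.0 {
                gi - c
            } else if gi + c < 0.0 {
                gi + c // right derivative is negative: moving right descends
            } else if gi - c > 0.0 {
                gi - c // left derivative is positive: moving left descends
            } else {
                0.0 // zero lies in the subdifferential: stay at 0
            }
        })
        .collect()
}

fn main() {
    // With c = 1: at x_i = 0, small gradients (|g| <= c) yield 0,
    // so L1-regularization keeps those coordinates exactly sparse.
    let pg = pseudo_gradient(&[2.0, 0.0, 0.0], &[1.0, 0.5, -3.0], 1.0);
    println!("{:?}", pg); // [2.0, 0.0, -2.0]
}
```

This is why OWL-QN can produce exactly sparse solutions: coordinates whose plain gradient is smaller in magnitude than the L1 coefficient stay pinned at zero.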
TODO: Implement compact representation of BFGS updating (Nocedal/Wright p.230)
Requirements on the optimization problem
The optimization problem is required to implement CostFunction and Gradient.
References
Jorge Nocedal and Stephen J. Wright (2006). Numerical Optimization. Springer. ISBN 0-387-30303-0.
Galen Andrew and Jianfeng Gao (2007). Scalable Training of L1-Regularized Log-Linear Models, International Conference on Machine Learning.
Implementations
impl<L, P, G, F> LBFGS<L, P, G, F> where
F: ArgminFloat,
pub fn with_tolerance_grad(self, tol_grad: F) -> Result<Self, Error>
The algorithm stops if the norm of the gradient is below tol_grad. The provided value must be non-negative. Defaults to sqrt(EPSILON).
Example
let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_grad(1e-6)?;
pub fn with_tolerance_cost(self, tol_cost: F) -> Result<Self, Error>
Sets the tolerance for the stopping criterion based on the change of the cost function. The provided value must be non-negative. Defaults to EPSILON.
Example
let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_tolerance_cost(1e-6)?;
pub fn with_l1_regularization(self, l1_coeff: F) -> Result<Self, Error>
Activates L1-regularization with coefficient l1_coeff. Parameter l1_coeff must be > 0.0.
Example
let lbfgs: LBFGS<_, Vec<f64>, Vec<f64>, f64> = LBFGS::new(linesearch, 3).with_l1_regularization(1.0)?;
Trait Implementations
impl<'de, L, P, G, F> Deserialize<'de> for LBFGS<L, P, G, F> where
L: Deserialize<'de>,
P: Deserialize<'de>,
G: Deserialize<'de>,
F: Deserialize<'de>,
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer. Read more
impl<L, P, G, F> Serialize for LBFGS<L, P, G, F> where
L: Serialize,
P: Serialize,
G: Serialize,
F: Serialize,
impl<O, L, P, G, F> Solver<O, IterState<P, G, (), (), F>> for LBFGS<L, P, G, F> where
O: CostFunction<Param = P, Output = F> + Gradient<Param = P, Gradient = G>,
P: Clone + Debug + SerializeAlias + DeserializeOwnedAlias + ArgminSub<P, P> + ArgminSub<F, P> + ArgminAdd<P, P> + ArgminAdd<F, P> + ArgminDot<G, F> + ArgminMul<F, P> + ArgminMul<P, P> + ArgminMul<G, P> + ArgminL1Norm<F> + ArgminSignum + ArgminZeroLike + ArgminMinMax,
G: Clone + Debug + SerializeAlias + DeserializeOwnedAlias + ArgminL2Norm<F> + ArgminSub<G, G> + ArgminAdd<G, G> + ArgminAdd<P, G> + ArgminDot<G, F> + ArgminDot<P, F> + ArgminMul<F, G> + ArgminMul<F, P> + ArgminZeroLike + ArgminMinMax,
L: Clone + LineSearch<P, F> + Solver<LineSearchProblem<O, P, G, F>, IterState<P, G, (), (), F>>,
F: ArgminFloat,
fn init(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), F>
) -> Result<(IterState<P, G, (), (), F>, Option<KV>), Error>
Initializes the algorithm. Read more
fn next_iter(
&mut self,
problem: &mut Problem<O>,
state: IterState<P, G, (), (), F>
) -> Result<(IterState<P, G, (), (), F>, Option<KV>), Error>
fn terminate(&mut self, state: &IterState<P, G, (), (), F>) -> TerminationReason
Used to implement stopping criteria, in particular criteria which are not covered by terminate_internal. Read more
fn terminate_internal(&mut self, state: &I) -> TerminationReason
Checks whether basic termination reasons apply. Read more
Auto Trait Implementations
impl<L, P, G, F> RefUnwindSafe for LBFGS<L, P, G, F> where
F: RefUnwindSafe,
G: RefUnwindSafe,
L: RefUnwindSafe,
P: RefUnwindSafe,
impl<L, P, G, F> Send for LBFGS<L, P, G, F> where
F: Send,
G: Send,
L: Send,
P: Send,
impl<L, P, G, F> Sync for LBFGS<L, P, G, F> where
F: Sync,
G: Sync,
L: Sync,
P: Sync,
impl<L, P, G, F> Unpin for LBFGS<L, P, G, F> where
F: Unpin,
G: Unpin,
L: Unpin,
P: Unpin,
impl<L, P, G, F> UnwindSafe for LBFGS<L, P, G, F> where
F: UnwindSafe,
G: UnwindSafe,
L: UnwindSafe,
P: UnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value. Read more