LbfgsbConfig

Struct LbfgsbConfig 

pub struct LbfgsbConfig {
    pub memory_size: usize,
    pub obj_tol: f64,
    pub step_size_tol: f64,
    pub c1: f64,
    pub c2: f64,
    pub fd_epsilon: f64,
    pub fd_min_step: f64,
    pub initial_step: f64,
    pub max_line_search_iters: usize,
    pub boundary_tol: f64,
}

Configuration parameters for L-BFGS-B optimization.

This struct contains all parameters that control the behavior of the L-BFGS-B algorithm. All parameters have sensible defaults suitable for most optimization problems.

§Core Parameters

The most important parameters for typical usage are:

  • memory_size: Controls memory usage vs convergence speed trade-off
  • obj_tol and step_size_tol: Control convergence criteria
  • c1 and c2: Control line search behavior

§Numerical Parameters

Parameters related to finite difference gradients and numerical stability:

  • fd_epsilon and fd_min_step: Control gradient approximation accuracy
  • boundary_tol: Controls handling of bound constraints

§Example

use cmaes_lbfgsb::lbfgsb_optimize::LbfgsbConfig;
 
// Basic configuration (often sufficient)
let config = LbfgsbConfig::default();
 
// High-precision configuration
let precise_config = LbfgsbConfig {
    memory_size: 20,
    obj_tol: 1e-12,
    step_size_tol: 1e-12,
    ..Default::default()
};
 
// Configuration for noisy functions
let robust_config = LbfgsbConfig {
    c1: 1e-3,
    c2: 0.8,
    fd_epsilon: 1e-6,
    max_line_search_iters: 30,
    ..Default::default()
};

Fields§

§memory_size: usize

Memory size for L-BFGS: the number of past correction vectors (built from recent position and gradient changes) kept to approximate the Hessian.

Default: 5

Typical range: 3-20

Trade-offs:

  • Larger values: Better approximation of Hessian, faster convergence, more memory
  • Smaller values: Less memory usage, more robust to non-quadratic functions

Guidelines:

  • Small problems (< 100 parameters): 5-10 is usually sufficient
  • Large problems (> 1000 parameters): 10-20 can help convergence
  • Noisy functions: Use smaller values (3-7) for more robustness
  • Very smooth functions: Can benefit from larger values (15-20)

Memory usage: Each stored vector uses O(n) memory, where n is the problem dimension, so the full history costs O(memory_size × n).
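
As a rough illustration of that cost, the sketch below estimates the size of the history. It assumes the common L-BFGS layout of two f64 vectors of length n per stored entry (one position difference and one gradient difference). The helper name history_bytes is hypothetical, and the crate's actual storage may differ.

```rust
use std::mem::size_of;

/// Rough estimate of the bytes used by the L-BFGS history.
/// Assumption: each of the `memory_size` entries stores two f64
/// vectors of length `n`. Illustration only, not the crate's API.
fn history_bytes(memory_size: usize, n: usize) -> usize {
    2 * memory_size * n * size_of::<f64>()
}
```

Under that assumption, the default memory_size of 5 on a 1000-parameter problem needs 80000 bytes, which is negligible next to the cost of most objective functions.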

§obj_tol: f64

Tolerance for relative function improvement (convergence criterion).

Default: 1e-8

Typical range: 1e-12 to 1e-4

The algorithm terminates when the relative change in objective value falls below this threshold: |f_old - f_new| / max(|f_old|, |f_new|, 1.0) < obj_tol
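
The test above can be sketched as a small function. The name objective_converged is hypothetical; it only mirrors the formula as written and is not taken from the crate.

```rust
/// Relative-improvement test: |f_old - f_new| / max(|f_old|, |f_new|, 1.0) < obj_tol.
/// Hypothetical helper mirroring the documented formula.
fn objective_converged(f_old: f64, f_new: f64, obj_tol: f64) -> bool {
    // The floor of 1.0 makes the test behave like an absolute
    // tolerance when the objective values are close to zero.
    let scale = f_old.abs().max(f_new.abs()).max(1.0);
    (f_old - f_new).abs() / scale < obj_tol
}
```

Because of the 1.0 floor in the denominator, objectives whose magnitude is far below 1 are effectively judged by absolute change, which is worth keeping in mind when choosing obj_tol for a rescaled problem.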

Guidelines:

  • High precision needed: Use 1e-12 to 1e-10
  • Standard precision: Use 1e-8 to 1e-6
  • Fast approximate solutions: Use 1e-4 to 1e-2
  • Noisy functions: Use larger values to avoid premature termination

§step_size_tol: f64

Tolerance for step size norm (convergence criterion).

Default: 1e-9

Typical range: 1e-12 to 1e-6

The algorithm terminates when ||step|| < step_size_tol, indicating that parameter changes have become negligibly small.
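
A minimal sketch of this check follows. It assumes the norm is the Euclidean norm of the difference between consecutive iterates, which the source does not state; step_converged is a hypothetical name.

```rust
/// Step-size test: ||x_new - x_old|| < step_size_tol.
/// Assumption: Euclidean norm. Hypothetical helper, not the crate's API.
fn step_converged(x_old: &[f64], x_new: &[f64], step_size_tol: f64) -> bool {
    let norm_sq: f64 = x_old
        .iter()
        .zip(x_new)
        .map(|(a, b)| (b - a) * (b - a))
        .sum();
    norm_sq.sqrt() < step_size_tol
}
```

The test is absolute, not relative, which is why the guidelines below tie the tolerance to the scale of the parameters.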

Guidelines:

  • Should typically be smaller than obj_tol
  • For parameters with scale ~1: Use default value
  • For very small parameters: Scale proportionally
  • For very large parameters: May need to increase the tolerance

§c1: f64

First Wolfe condition parameter (sufficient decrease, Armijo condition).

Default: 1e-4

Typical range: 1e-5 to 1e-2

Controls the required decrease in objective function for accepting a step. The condition is: f(x + α*d) ≤ f(x) + c1*α*∇f(x)ᵀd
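
The condition can be written as a one-line predicate. The sketch below is illustrative only; armijo_holds and its argument names are hypothetical, with dir_deriv standing for ∇f(x)ᵀd, which is negative for a descent direction.

```rust
/// Sufficient-decrease (Armijo) test:
/// f(x + alpha * d) <= f(x) + c1 * alpha * grad_f(x)^T d.
/// `dir_deriv` is grad_f(x)^T d (negative for a descent direction).
/// Hypothetical helper, not the crate's API.
fn armijo_holds(f_x: f64, f_trial: f64, alpha: f64, dir_deriv: f64, c1: f64) -> bool {
    f_trial <= f_x + c1 * alpha * dir_deriv
}
```

For f(x) = x² at x = 1 with d = -2 (so ∇f(x)ᵀd = -4), the step α = 0.9 gives f = 0.64. That step passes with c1 = 1e-4 but fails with c1 = 0.5, because the right-hand side falls to -0.8.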

Trade-offs:

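  • Smaller values: Less stringent decrease requirement, almost any step that lowers the objective is accepted, fewer rejected steps
  • Larger values: More stringent requirement, each accepted step must deliver a larger share of the predicted decrease, which can force shorter steps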

Guidelines:

  • Well-conditioned problems: Can use larger values (1e-3 to 1e-2)
  • Ill-conditioned problems: Use smaller values (1e-5 to 1e-4)
  • Noisy functions: Use smaller values for stability

Must satisfy: 0 < c1 < c2 < 1

§c2: f64

Second Wolfe condition parameter (curvature condition).

Default: 0.9

Typical range: 0.1 to 0.9

Controls the required change in gradient for accepting a step. The condition is: |∇f(x + α*d)ᵀd| ≤ c2*|∇f(x)ᵀd|
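
As a sketch, the curvature test compares the directional derivative at the trial point with the one at the starting point. The helper below is hypothetical and simply restates the inequality above.

```rust
/// Strong Wolfe curvature test:
/// |grad_f(x + alpha * d)^T d| <= c2 * |grad_f(x)^T d|.
/// `dir_deriv_0` and `dir_deriv_trial` are the directional derivatives
/// at the start point and at the trial point. Hypothetical helper.
fn strong_curvature_holds(dir_deriv_0: f64, dir_deriv_trial: f64, c2: f64) -> bool {
    dir_deriv_trial.abs() <= c2 * dir_deriv_0.abs()
}
```

For f(x) = x² at x = 1 with d = -2, the directional derivative along the ray is -4(1 - 2α). A tiny step α = 0.01 leaves it at -3.92 and is rejected with c2 = 0.9, while α = 0.1 (derivative -3.2) is accepted with c2 = 0.9 but not with c2 = 0.1.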

Trade-offs:

  • Smaller values: More stringent curvature requirement, closer to an exact line search, more line search iterations
  • Larger values: Less stringent requirement, a wider range of steps is accepted, fewer line search iterations

Guidelines:

  • Newton-like methods: Use large values (0.9) to allow long steps
  • Gradient descent-like: Use smaller values (0.1-0.5) for more careful steps
  • Default 0.9: Good for L-BFGS as it allows the algorithm to take longer steps

Must satisfy: 0 < c1 < c2 < 1
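
The source does not say whether the library checks this ordering, so it may be worth validating a custom configuration before use. The helper below is a hypothetical sketch of such a check.

```rust
/// Checks the ordering 0 < c1 < c2 < 1 required of the Wolfe parameters.
/// Hypothetical helper; whether the crate validates this is not documented here.
fn wolfe_params_valid(c1: f64, c2: f64) -> bool {
    // NaN fails every comparison, so NaN inputs are reported as invalid.
    0.0 < c1 && c1 < c2 && c2 < 1.0
}
```

The documented defaults (c1 = 1e-4, c2 = 0.9) pass this check.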

§fd_epsilon: f64

Base step size for finite difference gradient estimation.

Default: 1e-8

Typical range: 1e-12 to 1e-4

The actual step size used is max(fd_epsilon * |x_i|, fd_min_step) for each parameter. This provides relative scaling for different parameter magnitudes.
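
The sketch below shows how such a step could feed a gradient estimate. The step rule follows the formula above; the use of forward differences is an assumption (the source does not say which difference scheme is used), and both function names are hypothetical.

```rust
/// Per-parameter step: max(fd_epsilon * |x_i|, fd_min_step).
fn fd_step(x_i: f64, fd_epsilon: f64, fd_min_step: f64) -> f64 {
    (fd_epsilon * x_i.abs()).max(fd_min_step)
}

/// Forward-difference gradient estimate (assumed scheme, for illustration).
fn forward_diff_gradient<F>(f: F, x: &[f64], fd_epsilon: f64, fd_min_step: f64) -> Vec<f64>
where
    F: Fn(&[f64]) -> f64,
{
    let f0 = f(x);
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let h = fd_step(x[i], fd_epsilon, fd_min_step);
        probe[i] = x[i] + h;
        grad.push((f(&probe[..]) - f0) / h);
        probe[i] = x[i]; // restore before perturbing the next coordinate
    }
    grad
}
```

With the defaults, a parameter of magnitude 100 is perturbed by about 1e-6, while a parameter at exactly 0 falls back to fd_min_step = 1e-12.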

Trade-offs:

  • Smaller values: Less truncation error, but more round-off (cancellation) error in f64 arithmetic
  • Larger values: More truncation error, but more robust to noise in the objective

Guidelines:

  • Smooth functions: Can use smaller values (1e-10 to 1e-8)
  • Noisy functions: Use larger values (1e-6 to 1e-4)
  • Mixed scales: Ensure fd_min_step handles small parameters appropriately

§fd_min_step: f64

Minimum step size for finite difference gradient estimation.

Default: 1e-12

Typical range: 1e-15 to 1e-8

Ensures that finite difference steps don’t become too small for parameters near zero, which would lead to poor gradient estimates.

Guidelines:

  • Should be much smaller than typical parameter values
  • Consider the scale of your smallest meaningful parameter changes
  • Too small: Risk numerical precision issues
  • Too large: Poor gradient estimates for small parameters

§initial_step: f64

Initial step size for line search.

Default: 1.0

Typical range: 0.1 to 10.0

The line search starts with this step size and adjusts based on the Wolfe conditions. For L-BFGS, starting with 1.0 often works well as the algorithm approximates Newton steps.

Guidelines:

  • Well-conditioned problems: 1.0 is usually optimal
  • Ill-conditioned problems: May benefit from smaller initial steps (0.1-0.5)
  • Functions with large gradients: Consider smaller values
  • Functions with small gradients: Consider larger values

§max_line_search_iters: usize

Maximum number of line search iterations per optimization step.

Default: 20

Typical range: 10-50

Controls how much effort is spent finding a good step size. If the maximum is reached, the algorithm takes the best step found so far.
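
To show how initial_step, c1 and max_line_search_iters interact, here is a deliberately simplified sketch. It is a plain backtracking search that checks only the sufficient-decrease condition and halves the step on failure. The library's line search also enforces the curvature condition, so this is an assumption-laden illustration with hypothetical names, not its implementation.

```rust
/// Simplified backtracking search along a descent direction.
/// `phi(alpha)` evaluates f(x + alpha * d); `dir_deriv` is grad_f(x)^T d.
/// Returns (step, objective value). Illustration only.
fn backtracking_search<F: Fn(f64) -> f64>(
    phi: F,
    f_x: f64,
    dir_deriv: f64,
    initial_step: f64,
    c1: f64,
    max_line_search_iters: usize,
) -> (f64, f64) {
    let mut alpha = initial_step;
    // Best (step, value) seen so far; a zero step leaves f unchanged.
    let mut best = (0.0, f_x);
    for _ in 0..max_line_search_iters {
        let f_trial = phi(alpha);
        if f_trial < best.1 {
            best = (alpha, f_trial);
        }
        if f_trial <= f_x + c1 * alpha * dir_deriv {
            return (alpha, f_trial); // sufficient decrease reached
        }
        alpha *= 0.5; // halving is an arbitrary choice for this sketch
    }
    best // iteration budget exhausted: fall back to the best step seen
}
```

For φ(α) = (1 - 2α)² with f(x) = 1 and ∇f(x)ᵀd = -4, starting from initial_step = 1.0 the search rejects α = 1.0 and accepts α = 0.5. With a budget of a single iteration it falls back to the zero step, since no trial improved on f(x).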

Trade-offs:

  • Larger values: More accurate line search, potentially faster overall convergence
  • Smaller values: Less time per iteration, may need more iterations overall

Guidelines:

  • Smooth functions: 10-20 iterations usually sufficient
  • Difficult functions: May need 30-50 iterations
  • Time-critical applications: Use smaller values (5-10)

§boundary_tol: f64

Tolerance for gradient projection to zero at boundaries.

Default: 1e-14

Typical range: 1e-16 to 1e-10

When a parameter is at a bound and a descent step (along the negative gradient) would push it past that bound, the corresponding gradient component is projected to zero. This tolerance determines how close to a bound a parameter must be to count as “at” it.
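
The projection can be sketched as follows. The function name, the slice-based bounds and the exact comparison against boundary_tol are assumptions for illustration; the source only describes the behavior in words.

```rust
/// Zeroes gradient components that would push a bounded parameter
/// outside its bounds. A minimizing step moves along -grad, so a positive
/// component at a lower bound (or a negative one at an upper bound) is removed.
/// Hypothetical helper, not the crate's API.
fn project_gradient(
    x: &[f64],
    grad: &[f64],
    lower: &[f64],
    upper: &[f64],
    boundary_tol: f64,
) -> Vec<f64> {
    x.iter()
        .enumerate()
        .map(|(i, &xi)| {
            let at_lower = xi - lower[i] <= boundary_tol;
            let at_upper = upper[i] - xi <= boundary_tol;
            if (at_lower && grad[i] > 0.0) || (at_upper && grad[i] < 0.0) {
                0.0
            } else {
                grad[i]
            }
        })
        .collect()
}
```

Components that point back into the feasible region are left untouched, so a parameter sitting on a bound can still move away from it.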

Guidelines:

  • Should be much smaller than the expected precision of your solution
  • Too small: Parameters may never be considered exactly at bounds
  • Too large: May incorrectly project gradients for parameters near bounds
  • Consider the scale of your parameter bounds when setting this

Trait Implementations§

impl Default for LbfgsbConfig

fn default() -> LbfgsbConfig

Returns the “default value” for a type.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V