pub struct LbfgsbConfig {
pub memory_size: usize,
pub obj_tol: f64,
pub step_size_tol: f64,
pub c1: f64,
pub c2: f64,
pub fd_epsilon: f64,
pub fd_min_step: f64,
pub initial_step: f64,
pub max_line_search_iters: usize,
pub boundary_tol: f64,
}
Configuration parameters for L-BFGS-B optimization.
This struct contains all parameters that control the behavior of the L-BFGS-B algorithm. All parameters have sensible defaults suitable for most optimization problems.
§Core Parameters
The most important parameters for typical usage are:
- memory_size: Controls memory usage vs convergence speed trade-off
- obj_tol and step_size_tol: Control convergence criteria
- c1 and c2: Control line search behavior
§Numerical Parameters
Parameters related to finite difference gradients and numerical stability:
- fd_epsilon and fd_min_step: Control gradient approximation accuracy
- boundary_tol: Controls handling of bound constraints
§Example
use cmaes_lbfgsb::lbfgsb_optimize::LbfgsbConfig;
// Basic configuration (often sufficient)
let config = LbfgsbConfig::default();
// High-precision configuration
let precise_config = LbfgsbConfig {
memory_size: 20,
obj_tol: 1e-12,
step_size_tol: 1e-12,
..Default::default()
};
// Configuration for noisy functions
let robust_config = LbfgsbConfig {
c1: 1e-3,
c2: 0.8,
fd_epsilon: 1e-6,
max_line_search_iters: 30,
..Default::default()
};
Fields§
§memory_size: usize
Memory size for L-BFGS (number of past gradient vectors to store).
Default: 5
Typical range: 3-20
Trade-offs:
- Larger values: Better approximation of Hessian, faster convergence, more memory
- Smaller values: Less memory usage, more robust to non-quadratic functions
Guidelines:
- Small problems (< 100 parameters): 5-10 is usually sufficient
- Large problems (> 1000 parameters): 10-20 can help convergence
- Noisy functions: Use smaller values (3-7) for more robustness
- Very smooth functions: Can benefit from larger values (15-20)
Memory usage: Each vector stored uses O(n) memory where n is problem dimension.
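As a rough sketch of that footprint (assuming, as is standard for L-BFGS but not confirmed by this crate's internals, that each stored entry is a pair of length-n f64 vectors: a parameter difference s and a gradient difference y; the helper name is illustrative):

```rust
/// Approximate memory (in bytes) used by the L-BFGS history:
/// `m` stored pairs, each holding two length-`n` f64 vectors.
fn lbfgs_history_bytes(m: usize, n: usize) -> usize {
    2 * m * n * std::mem::size_of::<f64>()
}

fn main() {
    // Default memory_size = 5 on a 1000-parameter problem: 80 KB of history.
    assert_eq!(lbfgs_history_bytes(5, 1000), 80_000);
    // memory_size = 20 quadruples that, still modest at this scale.
    assert_eq!(lbfgs_history_bytes(20, 1000), 320_000);
}
```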
§obj_tol: f64
Tolerance for relative function improvement (convergence criterion).
Default: 1e-8
Typical range: 1e-12 to 1e-4
The algorithm terminates when the relative change in objective value falls below this threshold:
|f_old - f_new| / max(|f_old|, |f_new|, 1.0) < obj_tol
Guidelines:
- High precision needed: Use 1e-12 to 1e-10
- Standard precision: Use 1e-8 to 1e-6
- Fast approximate solutions: Use 1e-4 to 1e-2
- Noisy functions: Use larger values to avoid premature termination
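The criterion above translates directly into code; a minimal sketch (the helper name obj_converged is illustrative, not part of the crate's API):

```rust
/// Relative-improvement convergence test mirroring the formula above:
/// |f_old - f_new| / max(|f_old|, |f_new|, 1.0) < obj_tol
fn obj_converged(f_old: f64, f_new: f64, obj_tol: f64) -> bool {
    let denom = f_old.abs().max(f_new.abs()).max(1.0);
    (f_old - f_new).abs() / denom < obj_tol
}

fn main() {
    // A change of 1e-10 on values near 1.0 is below the default 1e-8 tolerance.
    assert!(obj_converged(1.0, 1.0 + 1e-10, 1e-8));
    // A large improvement is clearly not convergence.
    assert!(!obj_converged(1.0, 0.5, 1e-8));
}
```

The max(..., 1.0) term in the denominator keeps the test meaningful when the objective passes through zero.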
§step_size_tol: f64
Tolerance for step size norm (convergence criterion).
Default: 1e-9
Typical range: 1e-12 to 1e-6
The algorithm terminates when ||step|| < step_size_tol, indicating that
parameter changes have become negligibly small.
Guidelines:
- Should typically be smaller than obj_tol
- For parameters with scale ~1: Use default value
- For very small parameters: Scale proportionally
- For very large parameters: May need to increase
§c1: f64
First Wolfe condition parameter (sufficient decrease, Armijo condition).
Default: 1e-4
Typical range: 1e-5 to 1e-2
Controls the required decrease in objective function for accepting a step.
The condition is: f(x + α*d) ≤ f(x) + c1*α*∇f(x)ᵀd
Trade-offs:
- Smaller values: More stringent decrease requirement, shorter steps, more stable
- Larger values: Less stringent requirement, longer steps, faster progress
Guidelines:
- Well-conditioned problems: Can use larger values (1e-3 to 1e-2)
- Ill-conditioned problems: Use smaller values (1e-5 to 1e-4)
- Noisy functions: Use smaller values for stability
Must satisfy: 0 < c1 < c2 < 1
§c2: f64
Second Wolfe condition parameter (curvature condition).
Default: 0.9
Typical range: 0.1 to 0.9
Controls the required change in gradient for accepting a step.
The condition is: |∇f(x + α*d)ᵀd| ≤ c2*|∇f(x)ᵀd|
Trade-offs:
- Smaller values: More stringent curvature requirement, shorter steps
- Larger values: Less stringent requirement, longer steps, fewer line search iterations
Guidelines:
- Newton-like methods: Use large values (0.9) to allow long steps
- Gradient descent-like: Use smaller values (0.1-0.5) for more careful steps
- Default 0.9: Good for L-BFGS as it allows the algorithm to take longer steps
Must satisfy: 0 < c1 < c2 < 1
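Together, c1 and c2 define the strong Wolfe conditions shown in the two formulas above. A sketch of the acceptance test, given precomputed directional derivatives (helper names are illustrative, not the crate's internals):

```rust
/// Strong Wolfe test for a candidate step `alpha` along direction `d`:
///   sufficient decrease: f_new <= f0 + c1 * alpha * g0_dot_d
///   curvature:           |g_new_dot_d| <= c2 * |g0_dot_d|
/// where g0_dot_d = grad f(x)^T d (negative for a descent direction)
/// and g_new_dot_d = grad f(x + alpha*d)^T d.
fn wolfe_ok(f0: f64, f_new: f64, g0_dot_d: f64, g_new_dot_d: f64,
            alpha: f64, c1: f64, c2: f64) -> bool {
    let armijo = f_new <= f0 + c1 * alpha * g0_dot_d;
    let curvature = g_new_dot_d.abs() <= c2 * g0_dot_d.abs();
    armijo && curvature
}

fn main() {
    // f(x) = x^2 at x = 1 with descent direction d = -1: grad f(1) = 2,
    // so g0_dot_d = -2. A step alpha = 0.5 lands at x = 0.5, where
    // f = 0.25 and grad f = 1, so g_new_dot_d = -1.
    assert!(wolfe_ok(1.0, 0.25, -2.0, -1.0, 0.5, 1e-4, 0.9));
}
```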
§fd_epsilon: f64
Base step size for finite difference gradient estimation.
Default: 1e-8
Typical range: 1e-12 to 1e-4
The actual step size used is max(fd_epsilon * |x_i|, fd_min_step) for each parameter.
This provides relative scaling for different parameter magnitudes.
Trade-offs:
- Smaller values: More accurate gradients, but risk of numerical cancellation
- Larger values: Less accurate gradients, but more robust to noise
Guidelines:
- Smooth functions: Can use smaller values (1e-10 to 1e-8)
- Noisy functions: Use larger values (1e-6 to 1e-4)
- Mixed scales: Ensure fd_min_step handles small parameters appropriately
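The per-coordinate step formula above is a one-liner; a sketch (fd_step is an illustrative name, not a crate function):

```rust
/// Per-coordinate finite-difference step, as described above:
/// max(fd_epsilon * |x_i|, fd_min_step).
fn fd_step(x_i: f64, fd_epsilon: f64, fd_min_step: f64) -> f64 {
    (fd_epsilon * x_i.abs()).max(fd_min_step)
}

fn main() {
    // Large parameter: the relative term dominates (1e-8 * 1000 = 1e-5).
    assert!((fd_step(1000.0, 1e-8, 1e-12) - 1e-5).abs() < 1e-15);
    // Parameter at zero: the floor fd_min_step takes over.
    assert_eq!(fd_step(0.0, 1e-8, 1e-12), 1e-12);
}
```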
§fd_min_step: f64
Minimum step size for finite difference gradient estimation.
Default: 1e-12
Typical range: 1e-15 to 1e-8
Ensures that finite difference steps don’t become too small for parameters near zero, which would lead to poor gradient estimates.
Guidelines:
- Should be much smaller than typical parameter values
- Consider the scale of your smallest meaningful parameter changes
- Too small: Risk numerical precision issues
- Too large: Poor gradient estimates for small parameters
§initial_step: f64
Initial step size for line search.
Default: 1.0
Typical range: 0.1 to 10.0
The line search starts with this step size and adjusts based on the Wolfe conditions. For L-BFGS, starting with 1.0 often works well as the algorithm approximates Newton steps.
Guidelines:
- Well-conditioned problems: 1.0 is usually optimal
- Ill-conditioned problems: May benefit from smaller initial steps (0.1-0.5)
- Functions with large gradients: Consider smaller values
- Functions with small gradients: Consider larger values
§max_line_search_iters: usize
Maximum number of line search iterations per optimization step.
Default: 20
Typical range: 10-50
Controls how much effort is spent finding a good step size. If the maximum is reached, the algorithm takes the best step found so far.
Trade-offs:
- Larger values: More accurate line search, potentially faster overall convergence
- Smaller values: Less time per iteration, may need more iterations overall
Guidelines:
- Smooth functions: 10-20 iterations usually sufficient
- Difficult functions: May need 30-50 iterations
- Time-critical applications: Use smaller values (5-10)
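The effect of the iteration budget can be sketched with a simple backtracking loop: if the sufficient-decrease test never passes within the budget, the best step seen so far is returned. This is only an illustration of the budget's role, not the crate's actual line search, which also enforces the curvature condition:

```rust
/// Backtracking line search capped at `max_iters` trials. `f` evaluates
/// the objective as a function of the step length alpha along the search
/// direction; `g0_dot_d` is the initial directional derivative.
fn backtrack(f: impl Fn(f64) -> f64, f0: f64, g0_dot_d: f64,
             initial_step: f64, c1: f64, max_iters: usize) -> f64 {
    let mut alpha = initial_step;
    let (mut best_a, mut best_f) = (alpha, f64::INFINITY);
    for _ in 0..max_iters {
        let fa = f(alpha);
        if fa <= f0 + c1 * alpha * g0_dot_d {
            return alpha; // sufficient decrease satisfied
        }
        if fa < best_f {
            best_a = alpha;
            best_f = fa;
        }
        alpha *= 0.5; // shrink and retry
    }
    best_a // budget exhausted: take the best step found so far
}

fn main() {
    // Minimize f(x) = x^2 from x = 1 along d = -1, so the objective as a
    // function of alpha is (1 - alpha)^2, f0 = 1, g0_dot_d = -2.
    // The full step alpha = 1 reaches the minimum and is accepted at once.
    assert_eq!(backtrack(|a| (1.0 - a).powi(2), 1.0, -2.0, 1.0, 1e-4, 20), 1.0);
}
```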
§boundary_tol: f64
Tolerance for gradient projection to zero at boundaries.
Default: 1e-14
Typical range: 1e-16 to 1e-10
When a parameter is at a bound and the gradient would push it further beyond the bound, the gradient component is projected to zero. This tolerance determines when a parameter is considered “at” a bound.
Guidelines:
- Should be much smaller than the expected precision of your solution
- Too small: Parameters may never be considered exactly at bounds
- Too large: May incorrectly project gradients for parameters near bounds
- Consider the scale of your parameter bounds when setting this
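A sketch of the projection rule described above for a single coordinate (illustrative only, not the crate's internal code): a gradient component is zeroed when the parameter sits within boundary_tol of a bound and a descent step along -g would leave the feasible interval.

```rust
/// Project one gradient component to zero when the parameter is at a
/// bound (within `boundary_tol`) and the descent direction -g points
/// outside the feasible interval [lower, upper].
fn project_gradient(x: f64, g: f64, lower: f64, upper: f64,
                    boundary_tol: f64) -> f64 {
    let at_lower = (x - lower).abs() <= boundary_tol;
    let at_upper = (upper - x).abs() <= boundary_tol;
    // A descent step moves x by -alpha * g: at the lower bound, g > 0
    // pushes x below it; at the upper bound, g < 0 pushes x above it.
    if (at_lower && g > 0.0) || (at_upper && g < 0.0) {
        0.0
    } else {
        g
    }
}

fn main() {
    // At the lower bound with an outward-pointing gradient: projected.
    assert_eq!(project_gradient(0.0, 1.0, 0.0, 1.0, 1e-14), 0.0);
    // In the interior: gradient passes through unchanged.
    assert_eq!(project_gradient(0.5, 1.0, 0.0, 1.0, 1e-14), 1.0);
}
```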
Trait Implementations§
impl Default for LbfgsbConfig
fn default() -> LbfgsbConfig
Auto Trait Implementations§
impl Freeze for LbfgsbConfig
impl RefUnwindSafe for LbfgsbConfig
impl Send for LbfgsbConfig
impl Sync for LbfgsbConfig
impl Unpin for LbfgsbConfig
impl UnwindSafe for LbfgsbConfig