Module optim

Optimization algorithms for gradient-based learning.

This module provides both stochastic (mini-batch) and batch (deterministic) optimizers, all built around the unified Optimizer trait.

§Available Optimizers

§Stochastic (Mini-Batch) Optimizers

  • SGD - Stochastic Gradient Descent with optional momentum
  • Adam - Adaptive Moment Estimation (adaptive learning rates)

§Batch (Deterministic) Optimizers

  • LBFGS - Limited-memory BFGS (memory-efficient quasi-Newton)
  • ConjugateGradient - Conjugate Gradient with three beta formulas
  • DampedNewton - Newton’s method with automatic damping for stability

§Convex Optimization (Phase 2)

  • FISTA - Fast Iterative Shrinkage-Thresholding (proximal gradient; see the soft-thresholding sketch after this list)
  • CoordinateDescent - Coordinate-wise optimization for high dimensions
  • ADMM - Alternating Direction Method of Multipliers (distributed ML)
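
The proximal-gradient methods above rely on a proximal operator for the non-smooth part of the objective; for the L1 norm this operator is soft-thresholding. The following is a minimal generic sketch of that operator, for illustration only; it is not the crate's prox API (see the prox module for the actual operators).

// Generic soft-thresholding sketch (not the crate's prox API):
// prox_{lambda * ||.||_1}(v)[i] = sign(v[i]) * max(|v[i]| - lambda, 0).
fn soft_threshold(v: &[f32], lambda: f32) -> Vec<f32> {
    v.iter()
        .map(|&x| x.signum() * (x.abs() - lambda).max(0.0))
        .collect()
}

let v = [3.0_f32, -0.5, 0.2, -2.0];
assert_eq!(soft_threshold(&v, 1.0), vec![2.0, 0.0, 0.0, -1.0]);

FISTA alternates a gradient step on the smooth term with this operator, plus a momentum-style extrapolation between iterates.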

§Constrained Optimization (Phase 3)

  • ProjectedGradientDescent - Projected Gradient Descent for constrained optimization
  • AugmentedLagrangian - Augmented Lagrangian method for constrained optimization
  • InteriorPoint - Interior Point (Barrier) method for inequality-constrained optimization

§Line Search Strategies

  • BacktrackingLineSearch - Backtracking line search with Armijo condition
  • WolfeLineSearch - Wolfe line search with Armijo and curvature conditions

§Utility Functions

  • safe_cholesky_solve - Solves Ax = b via Cholesky decomposition with automatic regularization

§Stochastic Optimization (Mini-Batch)

Stochastic optimizers update parameters incrementally using mini-batch gradients. Use the Optimizer::step method for parameter updates:

use aprender::optim::{Optimizer, SGD};
use aprender::primitives::Vector;

// Create optimizer with learning rate 0.01
let mut optimizer = SGD::new(0.01);

// Initialize parameters and gradients
let mut params = Vector::from_slice(&[1.0, 2.0, 3.0]);
let gradients = Vector::from_slice(&[0.1, 0.2, 0.3]);

// Update parameters
optimizer.step(&mut params, &gradients);

// Parameters are updated: params = params - lr * gradients
assert!((params[0] - 0.999).abs() < 1e-6);
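
Building on the same SGD::new and step calls shown above, the sketch below runs the update in a loop to minimize a one-dimensional quadratic; the learning rate, iteration count, and objective are illustrative assumptions rather than part of the crate's API.

use aprender::optim::{Optimizer, SGD};
use aprender::primitives::Vector;

// Minimize f(x) = (x - 4)^2 by repeatedly stepping with the analytic
// gradient f'(x) = 2 * (x - 4).
let mut optimizer = SGD::new(0.1);
let mut params = Vector::from_slice(&[0.0_f32]);

for _ in 0..100 {
    let gradients = Vector::from_slice(&[2.0 * (params[0] - 4.0)]);
    optimizer.step(&mut params, &gradients);
}

// Each step moves the parameter toward the minimizer x = 4.
assert!((params[0] - 4.0).abs() < 1e-3);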

§Batch Optimization (Full Dataset)

Batch optimizers minimize objective functions using full dataset access. They use the minimize method which returns detailed convergence information:

use aprender::optim::{LBFGS, ConvergenceStatus, Optimizer};
use aprender::primitives::Vector;

// Create L-BFGS optimizer: 100 max iterations, 1e-5 tolerance, 10 memory size
let mut optimizer = LBFGS::new(100, 1e-5, 10);

// Define objective and gradient functions
let objective = |x: &Vector<f32>| (x[0] - 5.0).powi(2) + (x[1] - 3.0).powi(2);
let gradient = |x: &Vector<f32>| {
    Vector::from_slice(&[2.0 * (x[0] - 5.0), 2.0 * (x[1] - 3.0)])
};

let x0 = Vector::from_slice(&[0.0, 0.0]);
let result = optimizer.minimize(objective, gradient, x0);

assert_eq!(result.status, ConvergenceStatus::Converged);
assert!((result.solution[0] - 5.0).abs() < 1e-4);
assert!((result.solution[1] - 3.0).abs() < 1e-4);
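
The same minimize API extends naturally to data-fitting objectives. Below is a hedged sketch that fits a line y = w*x + b by least squares; the dataset, iteration budget, and tolerances are illustrative assumptions.

use aprender::optim::{ConvergenceStatus, LBFGS, Optimizer};
use aprender::primitives::Vector;

// Small synthetic dataset generated from w = 2, b = 1.
let xs = [0.0_f32, 1.0, 2.0, 3.0];
let ys = [1.0_f32, 3.0, 5.0, 7.0];

// Sum of squared residuals and its gradient with respect to (w, b).
let objective = move |p: &Vector<f32>| {
    xs.iter()
        .zip(ys.iter())
        .map(|(&x, &y)| (p[0] * x + p[1] - y).powi(2))
        .sum::<f32>()
};
let gradient = move |p: &Vector<f32>| {
    let (mut gw, mut gb) = (0.0_f32, 0.0_f32);
    for (&x, &y) in xs.iter().zip(ys.iter()) {
        let r = p[0] * x + p[1] - y;
        gw += 2.0 * r * x;
        gb += 2.0 * r;
    }
    Vector::from_slice(&[gw, gb])
};

let mut optimizer = LBFGS::new(200, 1e-6, 10);
let result = optimizer.minimize(objective, gradient, Vector::from_slice(&[0.0_f32, 0.0]));

assert_eq!(result.status, ConvergenceStatus::Converged);
assert!((result.solution[0] - 2.0).abs() < 1e-3); // w ≈ 2
assert!((result.solution[1] - 1.0).abs() < 1e-3); // b ≈ 1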

Modules§

prox
Proximal operators for non-smooth regularization.

Structs§

ADMM
ADMM (Alternating Direction Method of Multipliers) for distributed and constrained optimization.
Adam
Adam (Adaptive Moment Estimation) optimizer.
AugmentedLagrangian
Augmented Lagrangian method for constrained optimization.
BacktrackingLineSearch
Backtracking line search with Armijo condition (see the generic sketch after this list).
ConjugateGradient
Nonlinear Conjugate Gradient (CG) optimizer.
CoordinateDescent
Coordinate Descent optimizer for high-dimensional problems.
DampedNewton
Damped Newton optimizer with finite-difference Hessian approximation.
FISTA
FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
InteriorPoint
Interior Point (Barrier) method for inequality-constrained optimization.
LBFGS
Limited-memory BFGS (L-BFGS) optimizer.
OptimizationResult
Result of an optimization procedure.
ProjectedGradientDescent
Projected Gradient Descent for constrained optimization.
SGD
Stochastic Gradient Descent (SGD) optimizer with optional momentum.
WolfeLineSearch
Wolfe line search with Armijo and curvature conditions.
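
The two line-search strategies above share the Armijo sufficient-decrease test. The following is a generic sketch of backtracking with that test, for illustration only; it is not the crate's BacktrackingLineSearch API.

// Shrink the step size t until phi(t) <= phi(0) + c * t * phi'(0),
// where phi(t) = f(x + t * d) is the objective along the search direction.
fn backtracking(phi: impl Fn(f64) -> f64, phi0: f64, dphi0: f64) -> f64 {
    let (c, rho) = (1e-4, 0.5); // illustrative Armijo constant and shrink factor
    let mut t = 1.0;
    while phi(t) > phi0 + c * t * dphi0 {
        t *= rho;
        if t < 1e-12 {
            break; // step became vanishingly small; give up
        }
    }
    t
}

// Example: phi(t) = (1 - t)^2 with phi(0) = 1 and phi'(0) = -2 accepts t = 1.
let t = backtracking(|t| (1.0 - t).powi(2), 1.0, -2.0);
assert!((t - 1.0).abs() < 1e-12);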

Enums§

CGBetaFormula
Beta computation formula for Conjugate Gradient.
ConvergenceStatus
Convergence status of an optimization procedure.

Traits§

LineSearch
Trait for line search strategies.
Optimizer
Unified trait for both stochastic and batch optimizers.

Functions§

safe_cholesky_solve
Safely solves a linear system Ax = b using Cholesky decomposition with automatic regularization.
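
As a rough illustration of the technique named here (not the crate's actual implementation), Cholesky solving with automatic regularization can be sketched as: attempt to factor A, and if A is not positive definite, retry on A + lambda * I with a growing ridge term lambda.

// Plain-Rust sketch; the helper names and the retry schedule are
// illustrative assumptions, not aprender's API.
fn cholesky(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = a.len();
    let mut l = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..=i {
            let s: f64 = (0..j).map(|k| l[i][k] * l[j][k]).sum();
            if i == j {
                let d = a[i][i] - s;
                if d <= 0.0 {
                    return None; // not positive definite
                }
                l[i][i] = d.sqrt();
            } else {
                l[i][j] = (a[i][j] - s) / l[j][j];
            }
        }
    }
    Some(l)
}

fn solve_regularized(a: &[Vec<f64>], b: &[f64]) -> Vec<f64> {
    let n = a.len();
    let mut lambda = 0.0;
    for _ in 0..20 {
        // Try to factor A + lambda * I.
        let mut reg = a.to_vec();
        for i in 0..n {
            reg[i][i] += lambda;
        }
        if let Some(l) = cholesky(&reg) {
            // Forward substitution: L y = b.
            let mut y = vec![0.0; n];
            for i in 0..n {
                let s: f64 = (0..i).map(|k| l[i][k] * y[k]).sum();
                y[i] = (b[i] - s) / l[i][i];
            }
            // Back substitution: L^T x = y.
            let mut x = vec![0.0; n];
            for i in (0..n).rev() {
                let s: f64 = (i + 1..n).map(|k| l[k][i] * x[k]).sum();
                x[i] = (y[i] - s) / l[i][i];
            }
            return x;
        }
        lambda = if lambda == 0.0 { 1e-8 } else { lambda * 10.0 };
    }
    panic!("matrix could not be regularized into positive definite form");
}

// Solve [[4, 2], [2, 3]] x = [10, 8]; the exact solution is x = (1.75, 1.5).
let a = vec![vec![4.0, 2.0], vec![2.0, 3.0]];
let x = solve_regularized(&a, &[10.0, 8.0]);
assert!((x[0] - 1.75).abs() < 1e-9 && (x[1] - 1.5).abs() < 1e-9);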