Optimization algorithms for gradient-based learning.
This module provides both stochastic (mini-batch) and batch (deterministic) optimizers
following the unified Optimizer trait architecture.
§Available Optimizers
§Stochastic (Mini-Batch) Optimizers
- SGD - Stochastic Gradient Descent with optional momentum
- Adam - Adaptive Moment Estimation (adaptive learning rates)
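As a minimal sketch, Adam follows the same Optimizer::step workflow as SGD (see the stochastic example below). The Adam::new(learning_rate) constructor shown here is an assumption modeled on SGD::new; check the Adam struct docs for the actual signature.

```rust
use aprender::optim::{Adam, Optimizer};
use aprender::primitives::Vector;

// Hypothetical constructor: Adam::new(learning_rate) is assumed here.
let mut optimizer = Adam::new(0.001);

let mut params = Vector::from_slice(&[1.0, 2.0, 3.0]);
let gradients = Vector::from_slice(&[0.1, 0.2, 0.3]);

// Adam rescales each update using running estimates of the gradient's
// first and second moments, giving per-parameter adaptive step sizes.
optimizer.step(&mut params, &gradients);
```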
§Batch (Deterministic) Optimizers
- LBFGS - Limited-memory BFGS (memory-efficient quasi-Newton)
- ConjugateGradient - Conjugate Gradient with three beta formulas
- DampedNewton - Newton's method with automatic damping for stability
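The three beta formulas are not named in this description; assuming they are the classical Fletcher-Reeves, Polak-Ribiere, and Hestenes-Stiefel variants (see CGBetaFormula for the authoritative list), they are, with g_k the gradient and d_k the search direction at iteration k:

```latex
\beta^{FR}_{k} = \frac{g_{k+1}^{\top} g_{k+1}}{g_{k}^{\top} g_{k}}, \qquad
\beta^{PR}_{k} = \frac{g_{k+1}^{\top} (g_{k+1} - g_{k})}{g_{k}^{\top} g_{k}}, \qquad
\beta^{HS}_{k} = \frac{g_{k+1}^{\top} (g_{k+1} - g_{k})}{d_{k}^{\top} (g_{k+1} - g_{k})}
```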
§Convex Optimization (Phase 2)
- FISTA - Fast Iterative Shrinkage-Thresholding (proximal gradient)
- CoordinateDescent - Coordinate-wise optimization for high dimensions
- ADMM - Alternating Direction Method of Multipliers (distributed ML)
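For reference, the canonical FISTA iteration for minimizing f(x) + h(x), with f smooth and h non-smooth, combines a proximal gradient step (step size α) with Nesterov-style momentum; the crate's implementation details may differ:

```latex
x_{k} = \operatorname{prox}_{\alpha h}\!\left(y_{k} - \alpha \nabla f(y_{k})\right), \qquad
t_{k+1} = \frac{1 + \sqrt{1 + 4 t_{k}^{2}}}{2}, \qquad
y_{k+1} = x_{k} + \frac{t_{k} - 1}{t_{k+1}}\left(x_{k} - x_{k-1}\right)
```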
§Constrained Optimization (Phase 3)
- ProjectedGradientDescent - Projection onto convex sets
- AugmentedLagrangian - Equality-constrained optimization
- InteriorPoint - Inequality-constrained optimization (log-barrier)
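For orientation, ProjectedGradientDescent alternates a gradient step with a Euclidean projection Π_C onto the feasible set C, while the log-barrier approach behind InteriorPoint replaces inequality constraints with a barrier term (these are the standard formulations; the crate's exact variants are assumptions):

```latex
x_{k+1} = \Pi_{C}\!\left(x_{k} - \alpha \nabla f(x_{k})\right), \qquad
\min_{x}\; f(x) - \mu \sum_{i} \log\!\left(-g_{i}(x)\right) \quad \text{for } g_{i}(x) \le 0
```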
§Line Search Strategies
- BacktrackingLineSearch - Simple Armijo condition (sufficient decrease)
- WolfeLineSearch - Armijo + curvature conditions (for quasi-Newton methods)
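Both strategies pick a step size α along a descent direction p using standard conditions: backtracking enforces only the Armijo (sufficient decrease) inequality, while the Wolfe conditions add a curvature requirement, with the usual constants 0 < c₁ < c₂ < 1:

```latex
f(x + \alpha p) \le f(x) + c_{1}\, \alpha\, \nabla f(x)^{\top} p \quad \text{(Armijo)}, \qquad
\nabla f(x + \alpha p)^{\top} p \ge c_{2}\, \nabla f(x)^{\top} p \quad \text{(curvature)}
```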
§Utility Functions
- safe_cholesky_solve - Cholesky solver with automatic regularization
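"Automatic regularization" for Cholesky solvers typically means shifting the diagonal until the factorization succeeds; the exact schedule used by safe_cholesky_solve is an assumption here:

```latex
(A + \lambda I)\, x = b, \qquad \lambda \text{ increased from } 0 \text{ until } A + \lambda I \succ 0
```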
§Stochastic Optimization (Mini-Batch)
Stochastic optimizers update parameters incrementally using mini-batch gradients.
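For reference, plain SGD applies the first update below; with momentum enabled, a common formulation (the crate's exact variant is an assumption) maintains a velocity buffer v:

```latex
\theta_{t+1} = \theta_{t} - \eta\, g_{t}, \qquad \text{or with momentum } \mu:\quad
v_{t+1} = \mu\, v_{t} + g_{t}, \quad \theta_{t+1} = \theta_{t} - \eta\, v_{t+1}
```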
Use the Optimizer::step method for parameter updates:
```rust
use aprender::optim::{Optimizer, SGD};
use aprender::primitives::Vector;

// Create optimizer with learning rate 0.01
let mut optimizer = SGD::new(0.01);

// Initialize parameters and gradients
let mut params = Vector::from_slice(&[1.0, 2.0, 3.0]);
let gradients = Vector::from_slice(&[0.1, 0.2, 0.3]);

// Update parameters
optimizer.step(&mut params, &gradients);

// Parameters are updated: params = params - lr * gradients
assert!((params[0] - 0.999).abs() < 1e-6);
```
§Batch Optimization (Full Dataset)
Batch optimizers minimize objective functions using full dataset access.
They use the minimize method, which returns detailed convergence information:
```rust
use aprender::optim::{LBFGS, ConvergenceStatus, Optimizer};
use aprender::primitives::Vector;

// Create L-BFGS optimizer: 100 max iterations, 1e-5 tolerance, 10 memory size
let mut optimizer = LBFGS::new(100, 1e-5, 10);

// Define objective and gradient for f(x) = (x0 - 5)^2 + (x1 - 3)^2
let objective = |x: &Vector<f32>| (x[0] - 5.0).powi(2) + (x[1] - 3.0).powi(2);
let gradient = |x: &Vector<f32>| {
    Vector::from_slice(&[2.0 * (x[0] - 5.0), 2.0 * (x[1] - 3.0)])
};

let x0 = Vector::from_slice(&[0.0, 0.0]);
let result = optimizer.minimize(objective, gradient, x0);

assert_eq!(result.status, ConvergenceStatus::Converged);
assert!((result.solution[0] - 5.0).abs() < 1e-4);
assert!((result.solution[1] - 3.0).abs() < 1e-4);
```
§See Also
- examples/batch_optimization.rs - Comprehensive examples
- Specification: docs/specifications/comprehensive-optimization-spec.md
Modules§
- prox - Proximal operators for non-smooth regularization.
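For context, the proximal operator of a function h with parameter λ is defined below; for the L1 norm it reduces to elementwise soft-thresholding, the workhorse of FISTA and Lasso-style solvers:

```latex
\operatorname{prox}_{\lambda h}(v) = \arg\min_{x}\left( h(x) + \tfrac{1}{2\lambda}\,\lVert x - v \rVert_{2}^{2} \right), \qquad
\operatorname{prox}_{\lambda \lVert\cdot\rVert_{1}}(v)_{i} = \operatorname{sign}(v_{i})\,\max\!\left(\lvert v_{i}\rvert - \lambda,\, 0\right)
```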
Structs§
- ADMM - ADMM (Alternating Direction Method of Multipliers) for distributed and constrained optimization.
- Adam - Adam (Adaptive Moment Estimation) optimizer.
- AugmentedLagrangian - Augmented Lagrangian method for constrained optimization.
- BacktrackingLineSearch - Backtracking line search with Armijo condition.
- ConjugateGradient - Nonlinear Conjugate Gradient (CG) optimizer.
- CoordinateDescent - Coordinate Descent optimizer for high-dimensional problems.
- DampedNewton - Damped Newton optimizer with finite-difference Hessian approximation.
- FISTA - FISTA (Fast Iterative Shrinkage-Thresholding Algorithm).
- InteriorPoint - Interior Point (Barrier) method for inequality-constrained optimization.
- LBFGS - Limited-memory BFGS (L-BFGS) optimizer.
- OptimizationResult - Result of an optimization procedure.
- ProjectedGradientDescent - Projected Gradient Descent for constrained optimization.
- SGD - Stochastic Gradient Descent (SGD) optimizer with optional momentum.
- WolfeLineSearch - Wolfe line search with Armijo and curvature conditions.
Enums§
- CGBetaFormula - Beta computation formula for Conjugate Gradient.
- ConvergenceStatus - Convergence status of an optimization procedure.
Traits§
- LineSearch - Trait for line search strategies.
- Optimizer
- Unified trait for both stochastic and batch optimizers.
Functions§
- safe_cholesky_solve - Safely solves a linear system Ax = b using Cholesky decomposition with automatic regularization.