apex_solver/optimizer/levenberg_marquardt.rs
1//! Levenberg-Marquardt algorithm implementation.
2//!
3//! The Levenberg-Marquardt (LM) method is a robust and widely-used algorithm for solving
4//! nonlinear least squares problems of the form:
5//!
6//! ```text
7//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
8//! ```
9//!
10//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
11//!
12//! # Algorithm Overview
13//!
14//! The Levenberg-Marquardt method solves the damped normal equations at each iteration:
15//!
16//! ```text
17//! (J^T·J + λI)·h = -J^T·r
18//! ```
19//!
20//! where:
21//! - `J` is the Jacobian matrix (m × n)
22//! - `r` is the residual vector (m × 1)
23//! - `h` is the step vector (n × 1)
24//! - `λ` is the adaptive damping parameter (scalar)
25//! - `I` is the identity matrix (or diagonal scaling matrix)
26//!
27//! ## Damping Parameter Strategy
28//!
29//! The damping parameter λ adapts based on step quality:
30//!
31//! - **λ → 0** (small damping): Behaves like Gauss-Newton with fast quadratic convergence
32//! - **λ → ∞** (large damping): Behaves like gradient descent with guaranteed descent direction
33//!
34//! This interpolation between Newton and gradient descent provides excellent robustness
35//! while maintaining fast convergence near the solution.
36//!
37//! ## Step Acceptance and Damping Update
38//!
39//! The algorithm evaluates each proposed step using the gain ratio:
40//!
41//! ```text
42//! ρ = (actual reduction) / (predicted reduction)
43//! = [f(xₖ) - f(xₖ + h)] / [f(xₖ) - L(h)]
44//! ```
45//!
46//! where `L(h) = f(xₖ) + h^T·g + ½h^T·H·h` is the local quadratic model.
47//!
48//! **Step acceptance:**
49//! - If `ρ > 0`: Accept step (cost decreased), decrease λ to trust the model more
50//! - If `ρ ≤ 0`: Reject step (cost increased), increase λ to be more conservative
51//!
52//! **Damping update** (Nielsen's formula):
53//! ```text
54//! λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)
55//! ```
56//!
57//! This provides smooth, data-driven adaptation of the damping parameter.
58//!
59//! ## Convergence Properties
60//!
61//! - **Global convergence**: Guaranteed to find a stationary point from any starting guess
62//! - **Local quadratic convergence**: Near the solution, behaves like Gauss-Newton
63//! - **Robust to poor initialization**: Adaptive damping prevents divergence
64//! - **Handles ill-conditioning**: Large λ stabilizes nearly singular Hessian
65//!
66//! ## When to Use
67//!
68//! Levenberg-Marquardt is the best general-purpose choice when:
69//! - Initial parameter guess may be far from the optimum
70//! - Problem conditioning is unknown
71//! - Robustness is prioritized over raw speed
72//! - You want reliable convergence across diverse problem types
73//!
74//! For problems with specific structure, consider:
75//! - [`GaussNewton`](crate::GaussNewton) if well-conditioned with good initialization
76//! - [`DogLeg`](crate::DogLeg) for explicit trust region control
77//!
78//! # Implementation Features
79//!
80//! - **Sparse matrix support**: Efficient handling of large-scale problems via `faer` sparse library
81//! - **Adaptive damping**: Nielsen's formula for smooth parameter adaptation
82//! - **Robust linear solvers**: Cholesky (fast) or QR (stable) factorization
83//! - **Jacobi scaling**: Optional diagonal preconditioning for mixed-scale problems
84//! - **Covariance computation**: Optional uncertainty quantification after convergence
85//! - **Manifold operations**: Native support for optimization on Lie groups (SE2, SE3, SO2, SO3)
86//! - **Comprehensive diagnostics**: Detailed summaries of convergence and performance
87//!
88//! # Mathematical Background
89//!
90//! The augmented Hessian `J^T·J + λI` combines two beneficial properties:
91//!
92//! 1. **Positive definiteness**: Always solvable even when `J^T·J` is singular
93//! 2. **Regularization**: Prevents taking steps in poorly-determined directions
94//!
95//! The trust region interpretation: λ controls an implicit spherical trust region where
96//! larger λ restricts step size, ensuring the linear model remains valid.
97//!
98//! # Examples
99//!
100//! ## Basic usage
101//!
102//! ```no_run
103//! use apex_solver::LevenbergMarquardt;
104//! use apex_solver::core::problem::Problem;
105//! use std::collections::HashMap;
106//!
107//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
108//! let mut problem = Problem::new();
109//! // ... add residual blocks (factors) to problem ...
110//!
111//! let initial_values = HashMap::new();
112//! // ... initialize parameters ...
113//!
114//! let mut solver = LevenbergMarquardt::new();
115//! let result = solver.optimize(&problem, &initial_values)?;
116//!
117//! # Ok(())
118//! # }
119//! ```
120//!
121//! ## Advanced configuration
122//!
123//! ```no_run
124//! use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
125//! use apex_solver::linalg::LinearSolverType;
126//!
127//! # fn main() {
128//! let config = LevenbergMarquardtConfig::new()
129//! .with_max_iterations(100)
130//! .with_cost_tolerance(1e-6)
131//! .with_damping(1e-3) // Initial damping
132//! .with_damping_bounds(1e-12, 1e12) // Min/max damping
133//! .with_jacobi_scaling(true); // Improve conditioning
134//!
135//! let mut solver = LevenbergMarquardt::with_config(config);
136//! # }
137//! ```
138//!
139//! # References
140//!
141//! - Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". *Quarterly of Applied Mathematics*.
142//! - Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". *Journal of the Society for Industrial and Applied Mathematics*.
143//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 3.
144//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 10.
145//! - Nielsen, H. B. (1999). "Damping Parameter in Marquardt's Method". Technical Report IMM-REP-1999-05.
146
147use crate::core::problem::{Problem, SymbolicStructure, VariableEnum};
148use crate::error;
149use crate::linalg::{LinearSolverType, SparseCholeskySolver, SparseLinearSolver, SparseQRSolver};
150use crate::manifold::ManifoldType;
151use crate::optimizer::{
152 ConvergenceInfo, OptObserverVec, OptimizationStatus, OptimizerError, Solver, SolverResult,
153 apply_negative_parameter_step, apply_parameter_step, compute_cost,
154};
155
156use faer::{
157 Mat,
158 sparse::{SparseColMat, Triplet},
159};
160use nalgebra::DVector;
161use std::collections::HashMap;
162use std::{
163 fmt,
164 fmt::{Display, Formatter},
165 time::{Duration, Instant},
166};
167use tracing::debug;
168
169/// Summary statistics for the Levenberg-Marquardt optimization process.
170#[derive(Debug, Clone)]
171pub struct LevenbergMarquardtSummary {
172 /// Initial cost value
173 pub initial_cost: f64,
174 /// Final cost value
175 pub final_cost: f64,
176 /// Ratio of actual to predicted reduction in cost
177 pub rho: f64,
178 /// Total number of iterations performed
179 pub iterations: usize,
180 /// Number of successful steps (cost decreased)
181 pub successful_steps: usize,
182 /// Number of unsuccessful steps (cost increased, damping increased)
183 pub unsuccessful_steps: usize,
184 /// Final damping parameter value
185 pub final_damping: f64,
186 /// Average cost reduction per iteration
187 pub average_cost_reduction: f64,
188 /// Maximum gradient norm encountered
189 pub max_gradient_norm: f64,
190 /// Final gradient norm
191 pub final_gradient_norm: f64,
192 /// Maximum parameter update norm
193 pub max_parameter_update_norm: f64,
194 /// Final parameter update norm
195 pub final_parameter_update_norm: f64,
196 /// Total time elapsed
197 pub total_time: Duration,
198 /// Average time per iteration
199 pub average_time_per_iteration: Duration,
200 /// Detailed per-iteration statistics history
201 pub iteration_history: Vec<IterationStats>,
202 /// Convergence status
203 pub convergence_status: OptimizationStatus,
204}
205
206impl Display for LevenbergMarquardtSummary {
207 fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
208 // Determine if converged
209 let converged = matches!(
210 self.convergence_status,
211 OptimizationStatus::Converged
212 | OptimizationStatus::CostToleranceReached
213 | OptimizationStatus::GradientToleranceReached
214 | OptimizationStatus::ParameterToleranceReached
215 );
216
217 writeln!(f, "Levenberg-Marquardt Final Result")?;
218
219 // Title with convergence status
220 if converged {
221 writeln!(f, "CONVERGED ({:?})", self.convergence_status)?;
222 } else {
223 writeln!(f, "DIVERGED ({:?})", self.convergence_status)?;
224 }
225
226 writeln!(f)?;
227 writeln!(f, "Cost:")?;
228 writeln!(f, " Initial: {:.6e}", self.initial_cost)?;
229 writeln!(f, " Final: {:.6e}", self.final_cost)?;
230 writeln!(
231 f,
232 " Reduction: {:.6e} ({:.2}%)",
233 self.initial_cost - self.final_cost,
234 100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
235 )?;
236 writeln!(f)?;
237 writeln!(f, "Iterations:")?;
238 writeln!(f, " Total: {}", self.iterations)?;
239 writeln!(
240 f,
241 " Successful steps: {} ({:.1}%)",
242 self.successful_steps,
243 100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
244 )?;
245 writeln!(
246 f,
247 " Unsuccessful steps: {} ({:.1}%)",
248 self.unsuccessful_steps,
249 100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
250 )?;
251 writeln!(f)?;
252 writeln!(f, "Gradient:")?;
253 writeln!(f, " Max norm: {:.2e}", self.max_gradient_norm)?;
254 writeln!(f, " Final norm: {:.2e}", self.final_gradient_norm)?;
255 writeln!(f)?;
256 writeln!(f, "Parameter Update:")?;
257 writeln!(f, " Max norm: {:.2e}", self.max_parameter_update_norm)?;
258 writeln!(f, " Final norm: {:.2e}", self.final_parameter_update_norm)?;
259 writeln!(f)?;
260 writeln!(f, "Performance:")?;
261 writeln!(
262 f,
263 " Total time: {:.2}ms",
264 self.total_time.as_secs_f64() * 1000.0
265 )?;
266 writeln!(
267 f,
268 " Average per iteration: {:.2}ms",
269 self.average_time_per_iteration.as_secs_f64() * 1000.0
270 )?;
271
272 Ok(())
273 }
274}
275
276/// Per-iteration statistics for detailed logging (Ceres-style output).
277///
278/// Captures all relevant metrics for each optimization iteration, enabling
279/// detailed analysis and debugging of the optimization process.
280#[derive(Debug, Clone)]
281pub struct IterationStats {
282 /// Iteration number (0-indexed)
283 pub iteration: usize,
284 /// Cost function value at this iteration
285 pub cost: f64,
286 /// Change in cost from previous iteration
287 pub cost_change: f64,
288 /// L2 norm of the gradient (||J^T·r||)
289 pub gradient_norm: f64,
290 /// L2 norm of the parameter update step (||Δx||)
291 pub step_norm: f64,
292 /// Trust region ratio (ρ = actual_reduction / predicted_reduction)
293 pub tr_ratio: f64,
294 /// Trust region radius (damping parameter λ)
295 pub tr_radius: f64,
296 /// Linear solver iterations (0 for direct solvers like Cholesky)
297 pub ls_iter: usize,
298 /// Time taken for this iteration in milliseconds
299 pub iter_time_ms: f64,
300 /// Total elapsed time since optimization started in milliseconds
301 pub total_time_ms: f64,
302 /// Whether the step was accepted (true) or rejected (false)
303 pub accepted: bool,
304}
305
306impl IterationStats {
307 /// Print table header in Ceres-style format
308 pub fn print_header() {
309 debug!(
310 "{:>4} {:>13} {:>13} {:>13} {:>13} {:>11} {:>11} {:>7} {:>11} {:>13} {:>6}",
311 "iter",
312 "cost",
313 "cost_change",
314 "|gradient|",
315 "|step|",
316 "tr_ratio",
317 "tr_radius",
318 "ls_iter",
319 "iter_time",
320 "total_time",
321 "status"
322 );
323 }
324
325 /// Print single iteration line in Ceres-style format with scientific notation
326 pub fn print_line(&self) {
327 let status = if self.iteration == 0 {
328 "-"
329 } else if self.accepted {
330 "✓"
331 } else {
332 "✗"
333 };
334
335 debug!(
336 "{:>4} {:>13.6e} {:>13.2e} {:>13.2e} {:>13.2e} {:>11.2e} {:>11.2e} {:>7} {:>9.2}ms {:>11.2}ms {:>6}",
337 self.iteration,
338 self.cost,
339 self.cost_change,
340 self.gradient_norm,
341 self.step_norm,
342 self.tr_ratio,
343 self.tr_radius,
344 self.ls_iter,
345 self.iter_time_ms,
346 self.total_time_ms,
347 status
348 );
349 }
350}
351
352/// Configuration parameters for the Levenberg-Marquardt optimizer.
353///
354/// Controls the adaptive damping strategy, convergence criteria, and numerical stability
355/// enhancements for the Levenberg-Marquardt algorithm.
356///
357/// # Builder Pattern
358///
359/// All configuration options can be set using the builder pattern:
360///
361/// ```
362/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
363///
364/// let config = LevenbergMarquardtConfig::new()
365/// .with_max_iterations(100)
366/// .with_damping(1e-3)
367/// .with_damping_bounds(1e-12, 1e12)
368/// .with_jacobi_scaling(true);
369/// ```
370///
371/// # Damping Parameter Behavior
372///
373/// The damping parameter λ controls the trade-off between Gauss-Newton and gradient descent:
374///
375/// - **Initial damping** (`damping`): Starting value (default: 1e-4)
376/// - **Damping bounds** (`damping_min`, `damping_max`): Valid range (default: 1e-12 to 1e12)
377/// - **Adaptation**: Automatically adjusted based on step quality using Nielsen's formula
378///
379/// # Convergence Criteria
380///
381/// The optimizer terminates when ANY of the following conditions is met:
382///
383/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance`
384/// - **Parameter tolerance**: `||step|| < parameter_tolerance`
385/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
386/// - **Maximum iterations**: `iteration >= max_iterations`
387/// - **Timeout**: `elapsed_time >= timeout`
388///
389/// # See Also
390///
391/// - [`LevenbergMarquardt`] - The solver that uses this configuration
392/// - [`GaussNewtonConfig`](crate::GaussNewtonConfig) - Undamped variant
393/// - [`DogLegConfig`](crate::DogLegConfig) - Trust region alternative
394#[derive(Clone)]
395pub struct LevenbergMarquardtConfig {
396 /// Type of linear solver for the linear systems
397 pub linear_solver_type: LinearSolverType,
398 /// Maximum number of iterations
399 pub max_iterations: usize,
400 /// Convergence tolerance for cost function
401 pub cost_tolerance: f64,
402 /// Convergence tolerance for parameter updates
403 pub parameter_tolerance: f64,
404 /// Convergence tolerance for gradient norm
405 pub gradient_tolerance: f64,
406 /// Timeout duration
407 pub timeout: Option<Duration>,
408 /// Initial damping parameter
409 pub damping: f64,
410 /// Minimum damping parameter
411 pub damping_min: f64,
412 /// Maximum damping parameter
413 pub damping_max: f64,
414 /// Damping increase factor (when step rejected)
415 pub damping_increase_factor: f64,
416 /// Damping decrease factor (when step accepted)
417 pub damping_decrease_factor: f64,
418 /// Damping nu parameter
419 pub damping_nu: f64,
420 /// Trust region radius
421 pub trust_region_radius: f64,
422 /// Minimum step quality for acceptance
423 pub min_step_quality: f64,
424 /// Good step quality threshold
425 pub good_step_quality: f64,
426 /// Minimum diagonal value for regularization
427 pub min_diagonal: f64,
428 /// Maximum diagonal value for regularization
429 pub max_diagonal: f64,
430 /// Minimum objective function cutoff (optional early termination)
431 ///
432 /// If set, optimization terminates when cost falls below this threshold.
433 /// Useful for early stopping when a "good enough" solution is acceptable.
434 ///
435 /// Default: None (disabled)
436 pub min_cost_threshold: Option<f64>,
437 /// Minimum trust region radius before termination
438 ///
439 /// When the trust region radius falls below this value, the optimizer
440 /// terminates as it indicates the search has converged or the problem
441 /// is ill-conditioned. Matches Ceres Solver's min_trust_region_radius.
442 ///
443 /// Default: 1e-32 (Ceres-compatible)
444 pub min_trust_region_radius: f64,
445 /// Maximum condition number for Jacobian matrix (optional check)
446 ///
447 /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
448 /// threshold and terminates with IllConditionedJacobian status.
449 /// Note: Computing condition number is expensive, so this is disabled by default.
450 ///
451 /// Default: None (disabled)
452 pub max_condition_number: Option<f64>,
453 /// Minimum relative cost decrease for step acceptance
454 ///
455 /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
456 /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
457 /// min_relative_decrease parameter.
458 ///
459 /// Default: 1e-3 (Ceres-compatible)
460 pub min_relative_decrease: f64,
461 /// Use Jacobi column scaling (preconditioning)
462 ///
463 /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
464 /// This can improve convergence for problems with mixed parameter scales
465 /// (e.g., positions in meters + angles in radians) but adds ~5-10% overhead.
466 ///
467 /// Default: false (to avoid performance overhead and faster convergence)
468 pub use_jacobi_scaling: bool,
469 /// Compute per-variable covariance matrices (uncertainty estimation)
470 ///
471 /// When enabled, computes covariance by inverting the Hessian matrix after
472 /// convergence. The full covariance matrix is extracted into per-variable
473 /// blocks stored in both Variable structs and SolverResult.
474 ///
475 /// Default: false (to avoid performance overhead)
476 pub compute_covariances: bool,
477 // Note: Visualization is now handled via the observer pattern.
478 // Use `solver.add_observer(RerunObserver::new(true)?)` to enable visualization.
479 // This provides cleaner separation of concerns and allows multiple observers.
480}
481
482impl Default for LevenbergMarquardtConfig {
483 fn default() -> Self {
484 Self {
485 linear_solver_type: LinearSolverType::default(),
486 // Ceres Solver default: 50 (changed from 100 for compatibility)
487 max_iterations: 50,
488 // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
489 cost_tolerance: 1e-6,
490 // Ceres Solver default: 1e-8 (unchanged)
491 parameter_tolerance: 1e-8,
492 // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
493 // Note: Typically should be 1e-4 * cost_tolerance per Ceres docs
494 gradient_tolerance: 1e-10,
495 timeout: None,
496 damping: 1e-4,
497 damping_min: 1e-12,
498 damping_max: 1e12,
499 damping_increase_factor: 10.0,
500 damping_decrease_factor: 0.3,
501 damping_nu: 2.0,
502 trust_region_radius: 1e4,
503 min_step_quality: 0.0,
504 good_step_quality: 0.75,
505 min_diagonal: 1e-6,
506 max_diagonal: 1e32,
507 // New Ceres-compatible parameters
508 min_cost_threshold: None,
509 min_trust_region_radius: 1e-32,
510 max_condition_number: None,
511 min_relative_decrease: 1e-3,
512 // Existing parameters
513 use_jacobi_scaling: false,
514 compute_covariances: false,
515 }
516 }
517}
518
519impl LevenbergMarquardtConfig {
520 /// Create a new Levenberg-Marquardt configuration with default values.
521 pub fn new() -> Self {
522 Self::default()
523 }
524
525 /// Set the linear solver type
526 pub fn with_linear_solver_type(mut self, linear_solver_type: LinearSolverType) -> Self {
527 self.linear_solver_type = linear_solver_type;
528 self
529 }
530
531 /// Set the maximum number of iterations
532 pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
533 self.max_iterations = max_iterations;
534 self
535 }
536
537 /// Set the cost tolerance
538 pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
539 self.cost_tolerance = cost_tolerance;
540 self
541 }
542
543 /// Set the parameter tolerance
544 pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
545 self.parameter_tolerance = parameter_tolerance;
546 self
547 }
548
549 /// Set the gradient tolerance
550 pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
551 self.gradient_tolerance = gradient_tolerance;
552 self
553 }
554
555 /// Set the timeout duration
556 pub fn with_timeout(mut self, timeout: Duration) -> Self {
557 self.timeout = Some(timeout);
558 self
559 }
560
561 /// Set the initial damping parameter.
562 pub fn with_damping(mut self, damping: f64) -> Self {
563 self.damping = damping;
564 self
565 }
566
567 /// Set the damping parameter bounds.
568 pub fn with_damping_bounds(mut self, min: f64, max: f64) -> Self {
569 self.damping_min = min;
570 self.damping_max = max;
571 self
572 }
573
574 /// Set the damping adjustment factors.
575 pub fn with_damping_factors(mut self, increase: f64, decrease: f64) -> Self {
576 self.damping_increase_factor = increase;
577 self.damping_decrease_factor = decrease;
578 self
579 }
580
581 /// Set the trust region parameters.
582 pub fn with_trust_region(mut self, radius: f64, min_quality: f64, good_quality: f64) -> Self {
583 self.trust_region_radius = radius;
584 self.min_step_quality = min_quality;
585 self.good_step_quality = good_quality;
586 self
587 }
588
589 /// Set minimum objective function cutoff for early termination.
590 ///
591 /// When set, optimization terminates with MinCostThresholdReached status
592 /// if the cost falls below this threshold. Useful for early stopping when
593 /// a "good enough" solution is acceptable.
594 pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
595 self.min_cost_threshold = Some(min_cost);
596 self
597 }
598
599 /// Set minimum trust region radius before termination.
600 ///
601 /// When the trust region radius falls below this value, optimization
602 /// terminates with TrustRegionRadiusTooSmall status.
603 /// Default: 1e-32 (Ceres-compatible)
604 pub fn with_min_trust_region_radius(mut self, min_radius: f64) -> Self {
605 self.min_trust_region_radius = min_radius;
606 self
607 }
608
609 /// Set maximum condition number for Jacobian matrix.
610 ///
611 /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
612 /// threshold and terminates with IllConditionedJacobian status.
613 /// Note: Computing condition number is expensive, disabled by default.
614 pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
615 self.max_condition_number = Some(max_cond);
616 self
617 }
618
619 /// Set minimum relative cost decrease for step acceptance.
620 ///
621 /// Steps with rho = (actual_reduction / predicted_reduction) below this
622 /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
623 pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
624 self.min_relative_decrease = min_decrease;
625 self
626 }
627
628 /// Enable or disable Jacobi column scaling (preconditioning).
629 ///
630 /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
631 /// Can improve convergence for mixed-scale problems but adds ~5-10% overhead.
632 pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
633 self.use_jacobi_scaling = use_jacobi_scaling;
634 self
635 }
636
637 /// Enable or disable covariance computation (uncertainty estimation).
638 ///
639 /// When enabled, computes the full covariance matrix by inverting the Hessian
640 /// after convergence, then extracts per-variable covariance blocks.
641 pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
642 self.compute_covariances = compute_covariances;
643 self
644 }
645
646 /// Enable real-time visualization (graphical debugging).
647 ///
648 /// When enabled, optimization progress is logged to a Rerun viewer with:
649 /// - Time series plots of cost, gradient norm, damping, step quality
650 /// - Sparse Hessian matrix visualization as heat map
651 /// - Gradient vector visualization
652 /// - Real-time manifold state updates (for SE2/SE3 problems)
653 ///
654 /// **Note:** Requires the `visualization` feature to be enabled in `Cargo.toml`.
655 /// Use `verbose` for terminal logging.
656 ///
657 /// # Arguments
658 ///
659 /// * `enable` - Whether to enable visualization
660 // Note: with_visualization() method has been removed.
661 // Use the observer pattern instead:
662 // let mut solver = LevenbergMarquardt::with_config(config);
663 // solver.add_observer(RerunObserver::new(true)?);
664 // This provides cleaner separation and allows multiple observers.
665 /// Print configuration parameters (verbose mode only)
666 pub fn print_configuration(&self) {
667 debug!(
668 "Configuration:\n Solver: Levenberg-Marquardt\n Linear solver: {:?}\n Convergence Criteria:\n Max iterations: {}\n Cost tolerance: {:.2e}\n Parameter tolerance: {:.2e}\n Gradient tolerance: {:.2e}\n Timeout: {:?}\n Damping Parameters:\n Initial damping: {:.2e}\n Damping range: [{:.2e}, {:.2e}]\n Increase factor: {:.2}\n Decrease factor: {:.2}\n Trust Region:\n Initial radius: {:.2e}\n Min step quality: {:.2}\n Good step quality: {:.2}\n Numerical Settings:\n Jacobi scaling: {}\n Compute covariances: {}",
669 self.linear_solver_type,
670 self.max_iterations,
671 self.cost_tolerance,
672 self.parameter_tolerance,
673 self.gradient_tolerance,
674 self.timeout,
675 self.damping,
676 self.damping_min,
677 self.damping_max,
678 self.damping_increase_factor,
679 self.damping_decrease_factor,
680 self.trust_region_radius,
681 self.min_step_quality,
682 self.good_step_quality,
683 if self.use_jacobi_scaling {
684 "enabled"
685 } else {
686 "disabled"
687 },
688 if self.compute_covariances {
689 "enabled"
690 } else {
691 "disabled"
692 }
693 );
694 }
695}
696
697/// State for optimization iteration
698struct LinearizerResult {
699 variables: HashMap<String, VariableEnum>,
700 variable_index_map: HashMap<String, usize>,
701 sorted_vars: Vec<String>,
702 symbolic_structure: SymbolicStructure,
703 current_cost: f64,
704 initial_cost: f64,
705}
706
707/// Result from step computation
708struct StepResult {
709 step: Mat<f64>,
710 gradient_norm: f64,
711 predicted_reduction: f64,
712}
713
714/// Result from step evaluation
715struct StepEvaluation {
716 accepted: bool,
717 cost_reduction: f64,
718 rho: f64,
719}
720
721/// Levenberg-Marquardt solver for nonlinear least squares optimization.
722///
723/// Implements the damped Gauss-Newton method with adaptive damping parameter λ that
724/// interpolates between Gauss-Newton and gradient descent based on step quality.
725///
726/// # Algorithm
727///
728/// At each iteration k:
729/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
730/// 2. Solve augmented system: `(J^T·J + λI)·h = -J^T·r`
731/// 3. Evaluate step quality: `ρ = (actual reduction) / (predicted reduction)`
732/// 4. If `ρ > 0`: Accept step and update `xₖ₊₁ = xₖ ⊕ h`, decrease λ
733/// 5. If `ρ ≤ 0`: Reject step (keep `xₖ₊₁ = xₖ`), increase λ
734/// 6. Check convergence criteria
735///
736/// The damping parameter λ is updated using Nielsen's smooth formula:
737/// `λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)` for accepted steps,
738/// or `λₖ₊₁ = λₖ · ν` (with increasing ν) for rejected steps.
739///
740/// # Examples
741///
742/// ```no_run
743/// use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
744/// use apex_solver::linalg::LinearSolverType;
745///
746/// # fn main() {
747/// let config = LevenbergMarquardtConfig::new()
748/// .with_max_iterations(100)
749/// .with_damping(1e-3)
750/// .with_damping_bounds(1e-12, 1e12)
751/// .with_jacobi_scaling(true);
752///
753/// let mut solver = LevenbergMarquardt::with_config(config);
754/// # }
755/// ```
756///
757/// # See Also
758///
759/// - [`LevenbergMarquardtConfig`] - Configuration options
760/// - [`GaussNewton`](crate::GaussNewton) - Undamped variant (faster but less robust)
761/// - [`DogLeg`](crate::DogLeg) - Alternative trust region method
762pub struct LevenbergMarquardt {
763 config: LevenbergMarquardtConfig,
764 jacobi_scaling: Option<SparseColMat<usize, f64>>,
765 observers: OptObserverVec,
766}
767
768impl Default for LevenbergMarquardt {
769 fn default() -> Self {
770 Self::new()
771 }
772}
773
774impl LevenbergMarquardt {
775 /// Create a new Levenberg-Marquardt solver with default configuration.
776 pub fn new() -> Self {
777 Self::with_config(LevenbergMarquardtConfig::default())
778 }
779
780 /// Create a new Levenberg-Marquardt solver with the given configuration.
781 pub fn with_config(config: LevenbergMarquardtConfig) -> Self {
782 Self {
783 config,
784 jacobi_scaling: None,
785 observers: OptObserverVec::new(),
786 }
787 }
788
789 /// Add an observer to monitor optimization progress.
790 ///
791 /// Observers are notified at each iteration with the current variable values.
792 /// This enables real-time visualization, logging, metrics collection, etc.
793 ///
794 /// # Arguments
795 ///
796 /// * `observer` - Any type implementing `OptObserver`
797 ///
798 /// # Examples
799 ///
800 /// ```no_run
801 /// use apex_solver::{LevenbergMarquardt, LevenbergMarquardtConfig};
802 /// # use apex_solver::core::problem::Problem;
803 /// # use std::collections::HashMap;
804 ///
805 /// # fn main() -> Result<(), Box<dyn std::error::Error>> {
806 /// let mut solver = LevenbergMarquardt::new();
807 ///
808 /// #[cfg(feature = "visualization")]
809 /// {
810 /// use apex_solver::observers::RerunObserver;
811 /// let rerun_observer = RerunObserver::new(true)?;
812 /// solver.add_observer(rerun_observer);
813 /// }
814 ///
815 /// // ... optimize ...
816 /// # Ok(())
817 /// # }
818 /// ```
819 pub fn add_observer(&mut self, observer: impl crate::optimizer::OptObserver + 'static) {
820 self.observers.add(observer);
821 }
822
823 /// Create the appropriate linear solver based on configuration
824 fn create_linear_solver(&self) -> Box<dyn SparseLinearSolver> {
825 match self.config.linear_solver_type {
826 LinearSolverType::SparseCholesky => Box::new(SparseCholeskySolver::new()),
827 LinearSolverType::SparseQR => Box::new(SparseQRSolver::new()),
828 }
829 }
830
831 /// Update damping parameter based on step quality using trust region approach
832 /// Reference: Introduction to Optimization and Data Fitting
833 /// Algorithm 6.18
834 fn update_damping(&mut self, rho: f64) -> bool {
835 if rho > 0.0 {
836 // Step accepted - decrease damping
837 let coff = 2.0 * rho - 1.0;
838 self.config.damping *= (1.0_f64 / 3.0).max(1.0 - coff * coff * coff);
839 self.config.damping = self.config.damping.max(self.config.damping_min);
840 self.config.damping_nu = 2.0;
841 true
842 } else {
843 // Step rejected - increase damping
844 self.config.damping *= self.config.damping_nu;
845 self.config.damping_nu *= 2.0;
846 self.config.damping = self.config.damping.min(self.config.damping_max);
847 false
848 }
849 }
850
851 /// Compute step quality ratio (actual vs predicted reduction)
852 /// Reference: Introduction to Optimization and Data Fitting
853 /// Reference: Damping parameter in marquardt's method
854 /// Formula 2.2
855 fn compute_step_quality(
856 &self,
857 current_cost: f64,
858 new_cost: f64,
859 predicted_reduction: f64,
860 ) -> f64 {
861 let actual_reduction = current_cost - new_cost;
862 if predicted_reduction.abs() < 1e-15 {
863 if actual_reduction > 0.0 { 1.0 } else { 0.0 }
864 } else {
865 actual_reduction / predicted_reduction
866 }
867 }
868
869 /// Compute predicted cost reduction from linear model
870 /// Standard LM formula: 0.5 * step^T * (damping * step - gradient)
871 fn compute_predicted_reduction(&self, step: &Mat<f64>, gradient: &Mat<f64>) -> f64 {
872 // Standard Levenberg-Marquardt predicted reduction formula
873 // predicted_reduction = -step^T * gradient - 0.5 * step^T * H * step
874 // = 0.5 * step^T * (damping * step - gradient)
875 let diff = self.config.damping * step - gradient;
876 (0.5 * step.transpose() * &diff)[(0, 0)]
877 }
878
879 /// Check convergence criteria
880 /// Check convergence using comprehensive termination criteria.
881 ///
882 /// Implements 9 termination criteria following Ceres Solver standards:
883 ///
884 /// 1. **Gradient Norm (First-Order Optimality)**: ||g||∞ ≤ gradient_tolerance
885 /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
886 /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
887 /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
888 /// 5. **Trust Region Radius**: radius < min_trust_region_radius
889 /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
890 /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
891 /// 8. **Maximum Iterations**: iteration >= max_iterations
892 /// 9. **Timeout**: elapsed >= timeout
893 ///
894 /// # Arguments
895 ///
896 /// * `iteration` - Current iteration number
897 /// * `current_cost` - Cost before applying the step
898 /// * `new_cost` - Cost after applying the step
899 /// * `parameter_norm` - L2 norm of current parameter vector ||x||
900 /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
901 /// * `gradient_norm` - Infinity norm of gradient ||g||∞
902 /// * `trust_region_radius` - Current trust region radius
903 /// * `elapsed` - Elapsed time since optimization start
904 /// * `step_accepted` - Whether the current step was accepted
905 ///
906 /// # Returns
907 ///
908 /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
909 #[allow(clippy::too_many_arguments)]
910 fn check_convergence(
911 &self,
912 iteration: usize,
913 current_cost: f64,
914 new_cost: f64,
915 parameter_norm: f64,
916 parameter_update_norm: f64,
917 gradient_norm: f64,
918 trust_region_radius: f64,
919 elapsed: Duration,
920 step_accepted: bool,
921 ) -> Option<OptimizationStatus> {
922 // CRITICAL SAFETY CHECKS (perform first, before convergence checks)
923
924 // CRITERION 7: Invalid Numerical Values (NaN/Inf)
925 // Always check for numerical instability first
926 if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
927 {
928 return Some(OptimizationStatus::InvalidNumericalValues);
929 }
930
931 // CRITERION 9: Timeout
932 // Check wall-clock time limit
933 if let Some(timeout) = self.config.timeout
934 && elapsed >= timeout
935 {
936 return Some(OptimizationStatus::Timeout);
937 }
938
939 // CRITERION 8: Maximum Iterations
940 // Check iteration count limit
941 if iteration >= self.config.max_iterations {
942 return Some(OptimizationStatus::MaxIterationsReached);
943 }
944 // CONVERGENCE CRITERIA (only check after successful steps)
945
946 // Only check convergence criteria after accepted steps
947 // (rejected steps don't indicate convergence)
948 if !step_accepted {
949 return None;
950 }
951
952 // CRITERION 1: Gradient Norm (First-Order Optimality)
953 // Check if gradient infinity norm is below threshold
954 // This indicates we're at a critical point (local minimum, saddle, or maximum)
955 if gradient_norm < self.config.gradient_tolerance {
956 return Some(OptimizationStatus::GradientToleranceReached);
957 }
958
959 // Only check parameter and cost criteria after first iteration
960 if iteration > 0 {
961 // CRITERION 2: Parameter Change Tolerance (xtol)
962 // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
963 // This is a relative measure that scales with parameter magnitude
964 let relative_step_tolerance = self.config.parameter_tolerance
965 * (parameter_norm + self.config.parameter_tolerance);
966
967 if parameter_update_norm <= relative_step_tolerance {
968 return Some(OptimizationStatus::ParameterToleranceReached);
969 }
970
971 // CRITERION 3: Function Value Change Tolerance (ftol)
972 // Ceres formula: |ΔF| < ε_cost * F
973 // Check relative cost change (not absolute)
974 let cost_change = (current_cost - new_cost).abs();
975 let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero
976
977 if relative_cost_change < self.config.cost_tolerance {
978 return Some(OptimizationStatus::CostToleranceReached);
979 }
980 }
981
982 // CRITERION 4: Objective Function Cutoff (optional early stopping)
983 // Useful for "good enough" solutions
984 if let Some(min_cost) = self.config.min_cost_threshold
985 && new_cost < min_cost
986 {
987 return Some(OptimizationStatus::MinCostThresholdReached);
988 }
989
990 // CRITERION 5: Trust Region Radius
991 // If trust region has collapsed, optimization has converged or problem is ill-conditioned
992 if trust_region_radius < self.config.min_trust_region_radius {
993 return Some(OptimizationStatus::TrustRegionRadiusTooSmall);
994 }
995
996 // CRITERION 6: Singular/Ill-Conditioned Jacobian
997 // Note: This is typically detected during the linear solve and handled there
998 // The max_condition_number check would be expensive to compute here
999 // If linear solve fails, it returns an error that's converted to NumericalFailure
1000
1001 // No termination criterion satisfied
1002 None
1003 }
1004
1005 /// Compute total parameter vector norm ||x||.
1006 ///
1007 /// Computes the L2 norm of all parameter vectors concatenated together.
1008 /// This is used in the relative parameter tolerance check.
1009 ///
1010 /// # Arguments
1011 ///
1012 /// * `variables` - Map of variable names to their current values
1013 ///
1014 /// # Returns
1015 ///
1016 /// The L2 norm of the concatenated parameter vector
1017 fn compute_parameter_norm(variables: &HashMap<String, VariableEnum>) -> f64 {
1018 variables
1019 .values()
1020 .map(|v| {
1021 let vec = v.to_vector();
1022 vec.norm_squared()
1023 })
1024 .sum::<f64>()
1025 .sqrt()
1026 }
1027
1028 /// Create Jacobi scaling matrix from Jacobian
1029 fn create_jacobi_scaling(
1030 &self,
1031 jacobian: &SparseColMat<usize, f64>,
1032 ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
1033 let cols = jacobian.ncols();
1034 let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
1035 .map(|c| {
1036 // Compute column norm: sqrt(sum(J_col^2))
1037 let col_norm_squared: f64 = jacobian
1038 .triplet_iter()
1039 .filter(|t| t.col == c)
1040 .map(|t| t.val * t.val)
1041 .sum();
1042 let col_norm = col_norm_squared.sqrt();
1043 // Scaling factor: 1.0 / (1.0 + col_norm)
1044 let scaling = 1.0 / (1.0 + col_norm);
1045 Triplet::new(c, c, scaling)
1046 })
1047 .collect();
1048
1049 SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec)
1050 .map_err(|e| OptimizerError::JacobiScalingCreation(e.to_string()).log_with_source(e))
1051 }
1052
1053 /// Initialize optimization state from problem and initial parameters
1054 fn initialize_optimization_state(
1055 &self,
1056 problem: &Problem,
1057 initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1058 ) -> Result<LinearizerResult, error::ApexSolverError> {
1059 // Initialize variables from initial values
1060 let variables = problem.initialize_variables(initial_params);
1061
1062 // Create column mapping for variables
1063 let mut variable_index_map = HashMap::new();
1064 let mut col_offset = 0;
1065 let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
1066 sorted_vars.sort();
1067
1068 for var_name in &sorted_vars {
1069 variable_index_map.insert(var_name.clone(), col_offset);
1070 col_offset += variables[var_name].get_size();
1071 }
1072
1073 // Build symbolic structure for sparse operations
1074 let symbolic_structure =
1075 problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;
1076
1077 // Initial cost evaluation (residual only, no Jacobian needed)
1078 let residual = problem.compute_residual_sparse(&variables)?;
1079 let current_cost = compute_cost(&residual);
1080 let initial_cost = current_cost;
1081
1082 Ok(LinearizerResult {
1083 variables,
1084 variable_index_map,
1085 sorted_vars,
1086 symbolic_structure,
1087 current_cost,
1088 initial_cost,
1089 })
1090 }
1091
1092 /// Process Jacobian by creating and applying Jacobi scaling if enabled
1093 fn process_jacobian(
1094 &mut self,
1095 jacobian: &SparseColMat<usize, f64>,
1096 iteration: usize,
1097 ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
1098 // Create Jacobi scaling on first iteration if enabled
1099 if iteration == 0 {
1100 let scaling = self.create_jacobi_scaling(jacobian)?;
1101 self.jacobi_scaling = Some(scaling);
1102 }
1103 let scaling = self
1104 .jacobi_scaling
1105 .as_ref()
1106 .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
1107 Ok(jacobian * scaling)
1108 }
1109
1110 /// Compute optimization step by solving the augmented system
1111 fn compute_levenberg_marquardt_step(
1112 &self,
1113 residuals: &Mat<f64>,
1114 scaled_jacobian: &SparseColMat<usize, f64>,
1115 linear_solver: &mut Box<dyn SparseLinearSolver>,
1116 ) -> Result<StepResult, OptimizerError> {
1117 // Solve augmented equation: (J_scaled^T * J_scaled + λI) * dx_scaled = -J_scaled^T * r
1118 let residuals_owned = residuals.as_ref().to_owned();
1119 let scaled_step = linear_solver
1120 .solve_augmented_equation(&residuals_owned, scaled_jacobian, self.config.damping)
1121 .map_err(|e| OptimizerError::LinearSolveFailed(e.to_string()).log_with_source(e))?;
1122
1123 // Get cached gradient and Hessian from the solver
1124 let gradient = linear_solver.get_gradient().ok_or_else(|| {
1125 OptimizerError::NumericalInstability("Gradient not available".into()).log()
1126 })?;
1127 let _hessian = linear_solver.get_hessian().ok_or_else(|| {
1128 OptimizerError::NumericalInstability("Hessian not available".into()).log()
1129 })?;
1130 let gradient_norm = gradient.norm_l2();
1131
1132 // Apply inverse Jacobi scaling to get final step (if enabled)
1133 let step = if self.config.use_jacobi_scaling {
1134 let scaling = self
1135 .jacobi_scaling
1136 .as_ref()
1137 .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
1138 &scaled_step * scaling
1139 } else {
1140 scaled_step
1141 };
1142
1143 // Compute predicted reduction using scaled values
1144 let predicted_reduction = self.compute_predicted_reduction(&step, gradient);
1145
1146 Ok(StepResult {
1147 step,
1148 gradient_norm,
1149 predicted_reduction,
1150 })
1151 }
1152
1153 /// Evaluate and apply step, handling acceptance/rejection based on step quality
1154 fn evaluate_and_apply_step(
1155 &mut self,
1156 step_result: &StepResult,
1157 state: &mut LinearizerResult,
1158 problem: &Problem,
1159 ) -> error::ApexSolverResult<StepEvaluation> {
1160 // Apply parameter updates using manifold operations
1161 let _step_norm = apply_parameter_step(
1162 &mut state.variables,
1163 step_result.step.as_ref(),
1164 &state.sorted_vars,
1165 );
1166
1167 // Compute new cost (residual only, no Jacobian needed for step evaluation)
1168 let new_residual = problem.compute_residual_sparse(&state.variables)?;
1169 let new_cost = compute_cost(&new_residual);
1170
1171 // Compute step quality
1172 let rho = self.compute_step_quality(
1173 state.current_cost,
1174 new_cost,
1175 step_result.predicted_reduction,
1176 );
1177
1178 // Update damping and decide whether to accept step
1179 let accepted = self.update_damping(rho);
1180
1181 let cost_reduction = if accepted {
1182 // Accept the step - parameters already updated
1183 let reduction = state.current_cost - new_cost;
1184 state.current_cost = new_cost;
1185 reduction
1186 } else {
1187 // Reject the step - revert parameter changes
1188 apply_negative_parameter_step(
1189 &mut state.variables,
1190 step_result.step.as_ref(),
1191 &state.sorted_vars,
1192 );
1193 0.0
1194 };
1195
1196 Ok(StepEvaluation {
1197 accepted,
1198 cost_reduction,
1199 rho,
1200 })
1201 }
1202
1203 /// Create optimization summary
1204 #[allow(clippy::too_many_arguments)]
1205 fn create_summary(
1206 &self,
1207 initial_cost: f64,
1208 final_cost: f64,
1209 rho: f64,
1210 iterations: usize,
1211 successful_steps: usize,
1212 unsuccessful_steps: usize,
1213 max_gradient_norm: f64,
1214 final_gradient_norm: f64,
1215 max_parameter_update_norm: f64,
1216 final_parameter_update_norm: f64,
1217 total_cost_reduction: f64,
1218 total_time: Duration,
1219 iteration_history: Vec<IterationStats>,
1220 convergence_status: OptimizationStatus,
1221 ) -> LevenbergMarquardtSummary {
1222 LevenbergMarquardtSummary {
1223 initial_cost,
1224 final_cost,
1225 rho,
1226 iterations,
1227 successful_steps,
1228 unsuccessful_steps,
1229 final_damping: self.config.damping,
1230 average_cost_reduction: if iterations > 0 {
1231 total_cost_reduction / iterations as f64
1232 } else {
1233 0.0
1234 },
1235 max_gradient_norm,
1236 final_gradient_norm,
1237 max_parameter_update_norm,
1238 final_parameter_update_norm,
1239 total_time,
1240 average_time_per_iteration: if iterations > 0 {
1241 total_time / iterations as u32
1242 } else {
1243 Duration::from_secs(0)
1244 },
1245 iteration_history,
1246 convergence_status,
1247 }
1248 }
1249
1250 pub fn optimize(
1251 &mut self,
1252 problem: &Problem,
1253 initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1254 ) -> Result<SolverResult<HashMap<String, VariableEnum>>, error::ApexSolverError> {
1255 let start_time = Instant::now();
1256 let mut iteration = 0;
1257 let mut cost_evaluations = 1; // Initial cost evaluation
1258 let mut jacobian_evaluations = 0;
1259 let mut successful_steps = 0;
1260 let mut unsuccessful_steps = 0;
1261
1262 // Initialize optimization state
1263 let mut state = self.initialize_optimization_state(problem, initial_params)?;
1264
1265 // Create linear solver
1266 let mut linear_solver = self.create_linear_solver();
1267
1268 // Initialize summary tracking variables
1269 let mut max_gradient_norm: f64 = 0.0;
1270 let mut max_parameter_update_norm: f64 = 0.0;
1271 let mut total_cost_reduction = 0.0;
1272 let mut final_gradient_norm;
1273 let mut final_parameter_update_norm;
1274
1275 // Initialize iteration statistics tracking
1276 let mut iteration_stats = Vec::with_capacity(self.config.max_iterations);
1277 let mut previous_cost = state.current_cost;
1278
1279 // Print configuration and header if debug level is enabled
1280 if tracing::enabled!(tracing::Level::DEBUG) {
1281 self.config.print_configuration();
1282 IterationStats::print_header();
1283 }
1284
1285 // Main optimization loop
1286 loop {
1287 let iter_start = Instant::now();
1288 // Evaluate residuals and Jacobian
1289 let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
1290 &state.variables,
1291 &state.variable_index_map,
1292 &state.symbolic_structure,
1293 )?;
1294 jacobian_evaluations += 1;
1295
1296 // Process Jacobian (apply scaling if enabled)
1297 let scaled_jacobian = if self.config.use_jacobi_scaling {
1298 self.process_jacobian(&jacobian, iteration)?
1299 } else {
1300 jacobian
1301 };
1302
1303 // Compute optimization step
1304 let step_result = self.compute_levenberg_marquardt_step(
1305 &residuals,
1306 &scaled_jacobian,
1307 &mut linear_solver,
1308 )?;
1309
1310 // Update tracking variables
1311 max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
1312 final_gradient_norm = step_result.gradient_norm;
1313 let step_norm = step_result.step.norm_l2();
1314 max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
1315 final_parameter_update_norm = step_norm;
1316
1317 // Evaluate and apply step (handles accept/reject)
1318 let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
1319 cost_evaluations += 1;
1320
1321 // Update counters based on acceptance
1322 if step_eval.accepted {
1323 successful_steps += 1;
1324 total_cost_reduction += step_eval.cost_reduction;
1325 } else {
1326 unsuccessful_steps += 1;
1327 }
1328
1329 // OPTIMIZATION: Only collect iteration statistics if debug level is enabled
1330 // This eliminates ~2-5ms overhead per iteration for non-debug optimization
1331 if tracing::enabled!(tracing::Level::DEBUG) {
1332 let iter_elapsed_ms = iter_start.elapsed().as_secs_f64() * 1000.0;
1333 let total_elapsed_ms = start_time.elapsed().as_secs_f64() * 1000.0;
1334
1335 let stats = IterationStats {
1336 iteration,
1337 cost: state.current_cost,
1338 cost_change: previous_cost - state.current_cost,
1339 gradient_norm: step_result.gradient_norm,
1340 step_norm,
1341 tr_ratio: step_eval.rho,
1342 tr_radius: self.config.damping,
1343 ls_iter: 0, // Direct solver (Cholesky) has no iterations
1344 iter_time_ms: iter_elapsed_ms,
1345 total_time_ms: total_elapsed_ms,
1346 accepted: step_eval.accepted,
1347 };
1348
1349 iteration_stats.push(stats.clone());
1350 stats.print_line();
1351 }
1352
1353 previous_cost = state.current_cost;
1354
1355 // Notify all observers with current state
1356 // First set metrics data, then notify observers
1357 self.observers.set_iteration_metrics(
1358 state.current_cost,
1359 step_result.gradient_norm,
1360 Some(self.config.damping),
1361 step_norm,
1362 Some(step_eval.rho),
1363 );
1364
1365 // Set matrix data if available and there are observers
1366 if !self.observers.is_empty()
1367 && let (Some(hessian), Some(gradient)) =
1368 (linear_solver.get_hessian(), linear_solver.get_gradient())
1369 {
1370 self.observers
1371 .set_matrix_data(Some(hessian.clone()), Some(gradient.clone()));
1372 }
1373
1374 // Notify observers with current variable values and iteration number
1375 self.observers.notify(&state.variables, iteration);
1376
1377 // Check convergence
1378 let elapsed = start_time.elapsed();
1379
1380 // Compute parameter norm for relative parameter tolerance check
1381 let parameter_norm = Self::compute_parameter_norm(&state.variables);
1382
1383 // Compute new cost for convergence check (state may already have new cost if step accepted)
1384 let new_cost = if step_eval.accepted {
1385 state.current_cost
1386 } else {
1387 // Use cost before step application
1388 state.current_cost
1389 };
1390
1391 // Cost before this step (need to add back reduction if step was accepted)
1392 let cost_before_step = if step_eval.accepted {
1393 state.current_cost + step_eval.cost_reduction
1394 } else {
1395 state.current_cost
1396 };
1397
1398 if let Some(status) = self.check_convergence(
1399 iteration,
1400 cost_before_step,
1401 new_cost,
1402 parameter_norm,
1403 step_norm,
1404 step_result.gradient_norm,
1405 self.config.trust_region_radius,
1406 elapsed,
1407 step_eval.accepted,
1408 ) {
1409 let summary = self.create_summary(
1410 state.initial_cost,
1411 state.current_cost,
1412 step_eval.rho,
1413 iteration + 1,
1414 successful_steps,
1415 unsuccessful_steps,
1416 max_gradient_norm,
1417 final_gradient_norm,
1418 max_parameter_update_norm,
1419 final_parameter_update_norm,
1420 total_cost_reduction,
1421 elapsed,
1422 iteration_stats.clone(),
1423 status.clone(),
1424 );
1425
1426 // Print summary only if debug level is enabled
1427 if tracing::enabled!(tracing::Level::DEBUG) {
1428 debug!("{}", summary);
1429 }
1430
1431 // Compute covariances if enabled
1432 let covariances = if self.config.compute_covariances {
1433 problem.compute_and_set_covariances(
1434 &mut linear_solver,
1435 &mut state.variables,
1436 &state.variable_index_map,
1437 )
1438 } else {
1439 None
1440 };
1441
1442 return Ok(SolverResult {
1443 status,
1444 iterations: iteration + 1,
1445 initial_cost: state.initial_cost,
1446 final_cost: state.current_cost,
1447 parameters: state.variables.into_iter().collect(),
1448 elapsed_time: elapsed,
1449 convergence_info: Some(ConvergenceInfo {
1450 final_gradient_norm,
1451 final_parameter_update_norm,
1452 cost_evaluations,
1453 jacobian_evaluations,
1454 }),
1455 covariances,
1456 });
1457 }
1458
1459 // Note: Max iterations and timeout checks are now handled inside check_convergence()
1460
1461 iteration += 1;
1462 }
1463 }
1464}
1465// Implement Solver trait
1466impl Solver for LevenbergMarquardt {
1467 type Config = LevenbergMarquardtConfig;
1468 type Error = error::ApexSolverError;
1469
1470 fn new() -> Self {
1471 Self::default()
1472 }
1473
1474 fn optimize(
1475 &mut self,
1476 problem: &Problem,
1477 initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1478 ) -> Result<SolverResult<HashMap<String, VariableEnum>>, Self::Error> {
1479 self.optimize(problem, initial_params)
1480 }
1481}
1482
1483#[cfg(test)]
1484mod tests {
1485 use super::*;
1486 use crate::factors::Factor;
1487 use nalgebra::{DMatrix, dvector};
1488 /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
1489 /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
1490 #[derive(Debug, Clone)]
1491 struct RosenbrockFactor1;
1492
1493 impl Factor for RosenbrockFactor1 {
1494 fn linearize(
1495 &self,
1496 params: &[DVector<f64>],
1497 compute_jacobian: bool,
1498 ) -> (DVector<f64>, Option<DMatrix<f64>>) {
1499 let x1 = params[0][0];
1500 let x2 = params[1][0];
1501
1502 // Residual: r1 = 10(x2 - x1²)
1503 let residual = dvector![10.0 * (x2 - x1 * x1)];
1504
1505 let jacobian = if compute_jacobian {
1506 // Jacobian: ∂r1/∂x1 = -20*x1, ∂r1/∂x2 = 10
1507 let mut jacobian = DMatrix::zeros(1, 2);
1508 jacobian[(0, 0)] = -20.0 * x1;
1509 jacobian[(0, 1)] = 10.0;
1510
1511 Some(jacobian)
1512 } else {
1513 None
1514 };
1515
1516 (residual, jacobian)
1517 }
1518
1519 fn get_dimension(&self) -> usize {
1520 1
1521 }
1522 }
1523
1524 /// Custom Rosenbrock Factor 2: r2 = 1 - x1
1525 /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
1526 #[derive(Debug, Clone)]
1527 struct RosenbrockFactor2;
1528
1529 impl Factor for RosenbrockFactor2 {
1530 fn linearize(
1531 &self,
1532 params: &[DVector<f64>],
1533 compute_jacobian: bool,
1534 ) -> (DVector<f64>, Option<DMatrix<f64>>) {
1535 let x1 = params[0][0];
1536
1537 // Residual: r2 = 1 - x1
1538 let residual = dvector![1.0 - x1];
1539
1540 let jacobian = if compute_jacobian {
1541 // Jacobian: ∂r2/∂x1 = -1
1542 Some(DMatrix::from_element(1, 1, -1.0))
1543 } else {
1544 None
1545 };
1546
1547 (residual, jacobian)
1548 }
1549
1550 fn get_dimension(&self) -> usize {
1551 1
1552 }
1553 }
1554
1555 #[test]
1556 fn test_rosenbrock_optimization() -> Result<(), Box<dyn std::error::Error>> {
1557 // Rosenbrock function test:
1558 // Minimize: r1² + r2² where
1559 // r1 = 10(x2 - x1²)
1560 // r2 = 1 - x1
1561 // Starting point: [-1.2, 1.0]
1562 // Expected minimum: [1.0, 1.0]
1563
1564 let mut problem = Problem::new();
1565 let mut initial_values = HashMap::new();
1566
1567 // Add variables using Rn manifold (Euclidean space)
1568 initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
1569 initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));
1570
1571 // Add custom factors (demonstrates extensibility!)
1572 problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
1573 problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);
1574
1575 // Configure Levenberg-Marquardt optimizer
1576 let config = LevenbergMarquardtConfig::new()
1577 .with_max_iterations(100)
1578 .with_cost_tolerance(1e-8)
1579 .with_parameter_tolerance(1e-8)
1580 .with_gradient_tolerance(1e-10);
1581
1582 let mut solver = LevenbergMarquardt::with_config(config);
1583 let result = solver.optimize(&problem, &initial_values)?;
1584
1585 // Extract final values
1586 let x1_final = result
1587 .parameters
1588 .get("x1")
1589 .ok_or("x1 not found")?
1590 .to_vector()[0];
1591 let x2_final = result
1592 .parameters
1593 .get("x2")
1594 .ok_or("x2 not found")?
1595 .to_vector()[0];
1596
1597 // Verify convergence to [1.0, 1.0]
1598 assert!(
1599 matches!(
1600 result.status,
1601 OptimizationStatus::Converged
1602 | OptimizationStatus::CostToleranceReached
1603 | OptimizationStatus::ParameterToleranceReached
1604 | OptimizationStatus::GradientToleranceReached
1605 ),
1606 "Optimization should converge"
1607 );
1608 assert!(
1609 (x1_final - 1.0).abs() < 1e-4,
1610 "x1 should converge to 1.0, got {}",
1611 x1_final
1612 );
1613 assert!(
1614 (x2_final - 1.0).abs() < 1e-4,
1615 "x2 should converge to 1.0, got {}",
1616 x2_final
1617 );
1618 assert!(
1619 result.final_cost < 1e-6,
1620 "Final cost should be near zero, got {}",
1621 result.final_cost
1622 );
1623 Ok(())
1624 }
1625}