apex_solver/optimizer/levenberg_marquardt.rs
//! Levenberg-Marquardt algorithm implementation.
//!
//! The Levenberg-Marquardt (LM) method is a robust and widely-used algorithm for solving
//! nonlinear least squares problems of the form:
//!
//! ```text
//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
//! ```
//!
//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
//!
//! # Algorithm Overview
//!
//! The Levenberg-Marquardt method solves the damped normal equations at each iteration:
//!
//! ```text
//! (J^T·J + λI)·h = -J^T·r
//! ```
//!
//! where:
//! - `J` is the Jacobian matrix (m × n)
//! - `r` is the residual vector (m × 1)
//! - `h` is the step vector (n × 1)
//! - `λ` is the adaptive damping parameter (scalar)
//! - `I` is the identity matrix (or diagonal scaling matrix)
//!
//! ## Damping Parameter Strategy
//!
//! The damping parameter λ adapts based on step quality:
//!
//! - **λ → 0** (small damping): Behaves like Gauss-Newton with fast quadratic convergence
//! - **λ → ∞** (large damping): Behaves like gradient descent with guaranteed descent direction
//!
//! This interpolation between Gauss-Newton and gradient descent provides excellent robustness
//! while maintaining fast convergence near the solution.
//!
//! ## Step Acceptance and Damping Update
//!
//! The algorithm evaluates each proposed step using the gain ratio:
//!
//! ```text
//! ρ = (actual reduction) / (predicted reduction)
//!   = [f(xₖ) - f(xₖ + h)] / [f(xₖ) - L(h)]
//! ```
//!
//! where `L(h) = f(xₖ) + h^T·g + ½h^T·H·h` is the local quadratic model, with
//! gradient `g = J^T·r` and Gauss-Newton Hessian approximation `H = J^T·J`.
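//!
//! Because `h` solves the damped normal equations, the predicted reduction has a
//! closed form that avoids evaluating the quadratic model explicitly:
//!
//! ```text
//! f(xₖ) - L(h) = ½·h^T·(λ·h - g)
//! ```
//!
//! Substituting `-g = (J^T·J + λI)·h` shows this equals `½·h^T·(J^T·J + 2λI)·h`,
//! which is strictly positive for any nonzero step and `λ > 0`, so the gain
//! ratio is always well-defined.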
//!
//! **Step acceptance:**
//! - If `ρ > 0`: Accept step (cost decreased), decrease λ to trust the model more
//! - If `ρ ≤ 0`: Reject step (cost increased), increase λ to be more conservative
//!
//! **Damping update** (Nielsen's formula):
//! ```text
//! λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)
//! ```
//!
//! This provides smooth, data-driven adaptation of the damping parameter.
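//!
//! A worked example: with `λₖ = 1e-3` and a high-quality step `ρ = 0.9`, the factor is
//! `max(1/3, 1 - 0.8³) = 0.488`, so `λₖ₊₁ ≈ 4.88e-4` and the next iterate moves closer
//! to Gauss-Newton. In Rust-like pseudocode (illustrative names, not this crate's API):
//!
//! ```text
//! fn nielsen_update(lambda: f64, rho: f64) -> f64 {
//!     let c = 2.0 * rho - 1.0;
//!     lambda * (1.0_f64 / 3.0).max(1.0 - c * c * c)
//! }
//! ```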
//!
//! ## Convergence Properties
//!
//! - **Global convergence**: Converges to a stationary point from essentially any starting guess (under standard smoothness assumptions)
//! - **Local quadratic convergence**: Near the solution, behaves like Gauss-Newton
//! - **Robust to poor initialization**: Adaptive damping prevents divergence
//! - **Handles ill-conditioning**: Large λ stabilizes nearly singular Hessian
//!
//! ## When to Use
//!
//! Levenberg-Marquardt is the best general-purpose choice when:
//! - Initial parameter guess may be far from the optimum
//! - Problem conditioning is unknown
//! - Robustness is prioritized over raw speed
//! - You want reliable convergence across diverse problem types
//!
//! For problems with specific structure, consider:
//! - [`GaussNewton`](crate::GaussNewton) if well-conditioned with good initialization
//! - [`DogLeg`](crate::DogLeg) for explicit trust region control
//!
//! # Implementation Features
//!
//! - **Sparse matrix support**: Efficient handling of large-scale problems via `faer` sparse library
//! - **Adaptive damping**: Nielsen's formula for smooth parameter adaptation
//! - **Robust linear solvers**: Cholesky (fast) or QR (stable) factorization
//! - **Jacobi scaling**: Optional diagonal preconditioning for mixed-scale problems
//! - **Covariance computation**: Optional uncertainty quantification after convergence
//! - **Manifold operations**: Native support for optimization on Lie groups (SE2, SE3, SO2, SO3)
//! - **Comprehensive diagnostics**: Detailed summaries of convergence and performance
//!
//! # Mathematical Background
//!
//! The augmented Hessian `J^T·J + λI` combines two beneficial properties:
//!
//! 1. **Positive definiteness**: Always solvable (for λ > 0) even when `J^T·J` is singular
//! 2. **Regularization**: Prevents taking steps in poorly-determined directions
//!
//! The trust region interpretation: λ controls an implicit spherical trust region where
//! larger λ restricts step size, ensuring the linear model remains valid.
//!
//! # Examples
//!
//! ## Basic usage
//!
//! ```no_run
//! use apex_solver::LevenbergMarquardt;
//! use apex_solver::core::problem::Problem;
//! use std::collections::HashMap;
//!
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let mut problem = Problem::new();
//! // ... add residual blocks (factors) to problem ...
//!
//! let initial_values = HashMap::new();
//! // ... initialize parameters ...
//!
//! let mut solver = LevenbergMarquardt::new();
//! let result = solver.optimize(&problem, &initial_values)?;
//!
//! # Ok(())
//! # }
//! ```
//!
//! ## Advanced configuration
//!
//! ```no_run
//! use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
//! use apex_solver::linalg::LinearSolverType;
//!
//! # fn main() {
//! let config = LevenbergMarquardtConfig::new()
//!     .with_max_iterations(100)
//!     .with_cost_tolerance(1e-6)
//!     .with_damping(1e-3)               // Initial damping
//!     .with_damping_bounds(1e-12, 1e12) // Min/max damping
//!     .with_jacobi_scaling(true);       // Improve conditioning
//!
//! let mut solver = LevenbergMarquardt::with_config(config);
//! # }
//! ```
//!
//! # References
//!
//! - Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". *Quarterly of Applied Mathematics*.
//! - Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". *Journal of the Society for Industrial and Applied Mathematics*.
//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 3.
//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 10.
//! - Nielsen, H. B. (1999). "Damping Parameter in Marquardt's Method". Technical Report IMM-REP-1999-05.

use crate::core::problem::{Problem, SymbolicStructure, VariableEnum};
use crate::error;
use crate::linalg::{
    LinearSolverType, SchurPreconditioner, SchurSolverAdapter, SchurVariant, SparseCholeskySolver,
    SparseLinearSolver, SparseQRSolver,
};
use crate::manifold::ManifoldType;
use crate::optimizer::{
    ConvergenceInfo, OptObserverVec, OptimizationStatus, OptimizerError, Solver, SolverResult,
    apply_negative_parameter_step, apply_parameter_step, compute_cost,
};

use faer::{
    Mat,
    sparse::{SparseColMat, Triplet},
};
use nalgebra::DVector;
use std::collections::HashMap;
use std::{
    fmt,
    fmt::{Display, Formatter},
    time::{Duration, Instant},
};
use tracing::debug;

/// Summary statistics for the Levenberg-Marquardt optimization process.
#[derive(Debug, Clone)]
pub struct LevenbergMarquardtSummary {
    /// Initial cost value
    pub initial_cost: f64,
    /// Final cost value
    pub final_cost: f64,
    /// Ratio of actual to predicted reduction in cost
    pub rho: f64,
    /// Total number of iterations performed
    pub iterations: usize,
    /// Number of successful steps (cost decreased)
    pub successful_steps: usize,
    /// Number of unsuccessful steps (cost increased, damping increased)
    pub unsuccessful_steps: usize,
    /// Final damping parameter value
    pub final_damping: f64,
    /// Average cost reduction per iteration
    pub average_cost_reduction: f64,
    /// Maximum gradient norm encountered
    pub max_gradient_norm: f64,
    /// Final gradient norm
    pub final_gradient_norm: f64,
    /// Maximum parameter update norm
    pub max_parameter_update_norm: f64,
    /// Final parameter update norm
    pub final_parameter_update_norm: f64,
    /// Total time elapsed
    pub total_time: Duration,
    /// Average time per iteration
    pub average_time_per_iteration: Duration,
    /// Detailed per-iteration statistics history
    pub iteration_history: Vec<IterationStats>,
    /// Convergence status
    pub convergence_status: OptimizationStatus,
}

impl Display for LevenbergMarquardtSummary {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // Determine if converged
        let converged = matches!(
            self.convergence_status,
            OptimizationStatus::Converged
                | OptimizationStatus::CostToleranceReached
                | OptimizationStatus::GradientToleranceReached
                | OptimizationStatus::ParameterToleranceReached
        );

        writeln!(f, "Levenberg-Marquardt Final Result")?;

        // Title with convergence status
        if converged {
            writeln!(f, "CONVERGED ({:?})", self.convergence_status)?;
        } else {
            writeln!(f, "NOT CONVERGED ({:?})", self.convergence_status)?;
        }

        writeln!(f)?;
        writeln!(f, "Cost:")?;
        writeln!(f, " Initial: {:.6e}", self.initial_cost)?;
        writeln!(f, " Final: {:.6e}", self.final_cost)?;
        writeln!(
            f,
            " Reduction: {:.6e} ({:.2}%)",
            self.initial_cost - self.final_cost,
            100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
        )?;
        writeln!(f)?;
        writeln!(f, "Iterations:")?;
        writeln!(f, " Total: {}", self.iterations)?;
        writeln!(
            f,
            " Successful steps: {} ({:.1}%)",
            self.successful_steps,
            100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(
            f,
            " Unsuccessful steps: {} ({:.1}%)",
            self.unsuccessful_steps,
            100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(f)?;
        writeln!(f, "Gradient:")?;
        writeln!(f, " Max norm: {:.2e}", self.max_gradient_norm)?;
        writeln!(f, " Final norm: {:.2e}", self.final_gradient_norm)?;
        writeln!(f)?;
        writeln!(f, "Parameter Update:")?;
        writeln!(f, " Max norm: {:.2e}", self.max_parameter_update_norm)?;
        writeln!(f, " Final norm: {:.2e}", self.final_parameter_update_norm)?;
        writeln!(f)?;
        writeln!(f, "Performance:")?;
        writeln!(
            f,
            " Total time: {:.2}ms",
            self.total_time.as_secs_f64() * 1000.0
        )?;
        writeln!(
            f,
            " Average per iteration: {:.2}ms",
            self.average_time_per_iteration.as_secs_f64() * 1000.0
        )?;

        Ok(())
    }
}

/// Per-iteration statistics for detailed logging (Ceres-style output).
///
/// Captures all relevant metrics for each optimization iteration, enabling
/// detailed analysis and debugging of the optimization process.
#[derive(Debug, Clone)]
pub struct IterationStats {
    /// Iteration number (0-indexed)
    pub iteration: usize,
    /// Cost function value at this iteration
    pub cost: f64,
    /// Change in cost from previous iteration
    pub cost_change: f64,
    /// L2 norm of the gradient (||J^T·r||)
    pub gradient_norm: f64,
    /// L2 norm of the parameter update step (||Δx||)
    pub step_norm: f64,
    /// Trust region ratio (ρ = actual_reduction / predicted_reduction)
    pub tr_ratio: f64,
    /// Trust region radius (damping parameter λ)
    pub tr_radius: f64,
    /// Linear solver iterations (0 for direct solvers like Cholesky)
    pub ls_iter: usize,
    /// Time taken for this iteration in milliseconds
    pub iter_time_ms: f64,
    /// Total elapsed time since optimization started in milliseconds
    pub total_time_ms: f64,
    /// Whether the step was accepted (true) or rejected (false)
    pub accepted: bool,
}

impl IterationStats {
    /// Print table header in Ceres-style format
    pub fn print_header() {
        debug!(
            "{:>4} {:>13} {:>13} {:>13} {:>13} {:>11} {:>11} {:>7} {:>11} {:>13} {:>6}",
            "iter",
            "cost",
            "cost_change",
            "|gradient|",
            "|step|",
            "tr_ratio",
            "tr_radius",
            "ls_iter",
            "iter_time",
            "total_time",
            "status"
        );
    }

    /// Print single iteration line in Ceres-style format with scientific notation
    pub fn print_line(&self) {
        let status = if self.iteration == 0 {
            "-"
        } else if self.accepted {
            "✓"
        } else {
            "✗"
        };

        debug!(
            "{:>4} {:>13.6e} {:>13.2e} {:>13.2e} {:>13.2e} {:>11.2e} {:>11.2e} {:>7} {:>9.2}ms {:>11.2}ms {:>6}",
            self.iteration,
            self.cost,
            self.cost_change,
            self.gradient_norm,
            self.step_norm,
            self.tr_ratio,
            self.tr_radius,
            self.ls_iter,
            self.iter_time_ms,
            self.total_time_ms,
            status
        );
    }
}

/// Configuration parameters for the Levenberg-Marquardt optimizer.
///
/// Controls the adaptive damping strategy, convergence criteria, and numerical stability
/// enhancements for the Levenberg-Marquardt algorithm.
///
/// # Builder Pattern
///
/// All configuration options can be set using the builder pattern:
///
/// ```
/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
///
/// let config = LevenbergMarquardtConfig::new()
///     .with_max_iterations(100)
///     .with_damping(1e-3)
///     .with_damping_bounds(1e-12, 1e12)
///     .with_jacobi_scaling(true);
/// ```
///
/// # Damping Parameter Behavior
///
/// The damping parameter λ controls the trade-off between Gauss-Newton and gradient descent:
///
/// - **Initial damping** (`damping`): Starting value (default: 1e-3)
/// - **Damping bounds** (`damping_min`, `damping_max`): Valid range (default: 1e-12 to 1e12)
/// - **Adaptation**: Automatically adjusted based on step quality using Nielsen's formula
///
/// # Convergence Criteria
///
/// The optimizer terminates when ANY of the following conditions is met:
///
/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance`
/// - **Parameter tolerance**: `||step|| < parameter_tolerance`
/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
/// - **Maximum iterations**: `iteration >= max_iterations`
/// - **Timeout**: `elapsed_time >= timeout`
///
/// # See Also
///
/// - [`LevenbergMarquardt`] - The solver that uses this configuration
/// - [`GaussNewtonConfig`](crate::GaussNewtonConfig) - Undamped variant
/// - [`DogLegConfig`](crate::DogLegConfig) - Trust region alternative
#[derive(Clone)]
pub struct LevenbergMarquardtConfig {
    /// Type of linear solver for the linear systems
    pub linear_solver_type: LinearSolverType,
    /// Maximum number of iterations
    pub max_iterations: usize,
    /// Convergence tolerance for cost function
    pub cost_tolerance: f64,
    /// Convergence tolerance for parameter updates
    pub parameter_tolerance: f64,
    /// Convergence tolerance for gradient norm
    pub gradient_tolerance: f64,
    /// Timeout duration
    pub timeout: Option<Duration>,
    /// Initial damping parameter
    pub damping: f64,
    /// Minimum damping parameter
    pub damping_min: f64,
    /// Maximum damping parameter
    pub damping_max: f64,
    /// Damping increase factor (when step rejected)
    pub damping_increase_factor: f64,
    /// Damping decrease factor (when step accepted)
    pub damping_decrease_factor: f64,
    /// Nielsen's ν parameter (growth factor applied on consecutive rejected steps)
    pub damping_nu: f64,
    /// Trust region radius
    pub trust_region_radius: f64,
    /// Minimum step quality for acceptance
    pub min_step_quality: f64,
    /// Good step quality threshold
    pub good_step_quality: f64,
    /// Minimum diagonal value for regularization
    pub min_diagonal: f64,
    /// Maximum diagonal value for regularization
    pub max_diagonal: f64,
    /// Minimum objective function cutoff (optional early termination)
    ///
    /// If set, optimization terminates when cost falls below this threshold.
    /// Useful for early stopping when a "good enough" solution is acceptable.
    ///
    /// Default: None (disabled)
    pub min_cost_threshold: Option<f64>,
    /// Minimum trust region radius before termination
    ///
    /// When the trust region radius falls below this value, the optimizer
    /// terminates as it indicates the search has converged or the problem
    /// is ill-conditioned. Matches Ceres Solver's min_trust_region_radius.
    ///
    /// Default: 1e-32 (Ceres-compatible)
    pub min_trust_region_radius: f64,
    /// Maximum condition number for Jacobian matrix (optional check)
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing condition number is expensive, so this is disabled by default.
    ///
    /// Default: None (disabled)
    pub max_condition_number: Option<f64>,
    /// Minimum relative cost decrease for step acceptance
    ///
    /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
    /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
    /// min_relative_decrease parameter.
    ///
    /// Default: 1e-3 (Ceres-compatible)
    pub min_relative_decrease: f64,
    /// Use Jacobi column scaling (preconditioning)
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// This can improve convergence for problems with mixed parameter scales
    /// (e.g., positions in meters + angles in radians) but adds ~5-10% overhead.
    ///
    /// Default: false (avoids the scaling overhead; enable for mixed-scale problems)
    pub use_jacobi_scaling: bool,
    /// Compute per-variable covariance matrices (uncertainty estimation)
    ///
    /// When enabled, computes covariance by inverting the Hessian matrix after
    /// convergence. The full covariance matrix is extracted into per-variable
    /// blocks stored in both Variable structs and SolverResult.
    ///
    /// Default: false (to avoid performance overhead)
    pub compute_covariances: bool,
    /// Schur complement solver variant (for bundle adjustment problems)
    ///
    /// When using LinearSolverType::SparseSchurComplement, this determines which
    /// variant of the Schur complement method to use:
    /// - Sparse: Direct sparse Cholesky factorization (most accurate, moderate speed)
    /// - Iterative: Preconditioned Conjugate Gradients (memory efficient, good for large problems)
    /// - PowerSeries: Power series approximation (fastest, less accurate)
    ///
    /// Default: Sparse
    pub schur_variant: SchurVariant,
    /// Schur complement preconditioner type
    ///
    /// Determines the preconditioning strategy for iterative Schur methods:
    /// - Diagonal: Simple diagonal preconditioner (fast, less effective)
    /// - BlockDiagonal: Block-diagonal preconditioner (balanced)
    /// - IncompleteCholesky: Incomplete Cholesky factorization (slower, more effective)
    ///
    /// Default: Diagonal
    pub schur_preconditioner: SchurPreconditioner,
    // Note: Visualization is now handled via the observer pattern.
    // Use `solver.add_observer(RerunObserver::new(true)?)` to enable visualization.
    // This provides cleaner separation of concerns and allows multiple observers.
}

impl Default for LevenbergMarquardtConfig {
    fn default() -> Self {
        Self {
            linear_solver_type: LinearSolverType::default(),
            // Ceres Solver default: 50 (changed from 100 for compatibility)
            max_iterations: 50,
            // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
            cost_tolerance: 1e-6,
            // Ceres Solver default: 1e-8 (unchanged)
            parameter_tolerance: 1e-8,
            // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
            // Note: Typically should be 1e-4 * cost_tolerance per Ceres docs
            gradient_tolerance: 1e-10,
            timeout: None,
            damping: 1e-3, // Increased from 1e-4 for better initial convergence on BA
            damping_min: 1e-12,
            damping_max: 1e12,
            damping_increase_factor: 10.0,
            damping_decrease_factor: 0.3,
            damping_nu: 2.0,
            trust_region_radius: 1e4,
            min_step_quality: 0.0,
            good_step_quality: 0.75,
            min_diagonal: 1e-6,
            max_diagonal: 1e32,
            // New Ceres-compatible parameters
            min_cost_threshold: None,
            min_trust_region_radius: 1e-32,
            max_condition_number: None,
            min_relative_decrease: 1e-3,
            // Existing parameters
            // Jacobi scaling disabled by default for Schur solvers (incompatible with block structure)
            // Enable manually for Cholesky/QR solvers on mixed-scale problems
            use_jacobi_scaling: false,
            compute_covariances: false,
            // Schur complement parameters
            schur_variant: SchurVariant::default(),
            schur_preconditioner: SchurPreconditioner::default(),
        }
    }
}

impl LevenbergMarquardtConfig {
    /// Create a new Levenberg-Marquardt configuration with default values.
    pub fn new() -> Self {
        Self::default()
    }

    /// Set the linear solver type
    pub fn with_linear_solver_type(mut self, linear_solver_type: LinearSolverType) -> Self {
        self.linear_solver_type = linear_solver_type;
        self
    }

    /// Set the maximum number of iterations
    pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
        self.max_iterations = max_iterations;
        self
    }

    /// Set the cost tolerance
    pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
        self.cost_tolerance = cost_tolerance;
        self
    }

    /// Set the parameter tolerance
    pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
        self.parameter_tolerance = parameter_tolerance;
        self
    }

    /// Set the gradient tolerance
    pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
        self.gradient_tolerance = gradient_tolerance;
        self
    }

    /// Set the timeout duration
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }

    /// Set the initial damping parameter.
    pub fn with_damping(mut self, damping: f64) -> Self {
        self.damping = damping;
        self
    }

    /// Set the damping parameter bounds.
    pub fn with_damping_bounds(mut self, min: f64, max: f64) -> Self {
        self.damping_min = min;
        self.damping_max = max;
        self
    }

    /// Set the damping adjustment factors.
    pub fn with_damping_factors(mut self, increase: f64, decrease: f64) -> Self {
        self.damping_increase_factor = increase;
        self.damping_decrease_factor = decrease;
        self
    }

    /// Set the trust region parameters.
    pub fn with_trust_region(mut self, radius: f64, min_quality: f64, good_quality: f64) -> Self {
        self.trust_region_radius = radius;
        self.min_step_quality = min_quality;
        self.good_step_quality = good_quality;
        self
    }

    /// Set minimum objective function cutoff for early termination.
    ///
    /// When set, optimization terminates with MinCostThresholdReached status
    /// if the cost falls below this threshold. Useful for early stopping when
    /// a "good enough" solution is acceptable.
    pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
        self.min_cost_threshold = Some(min_cost);
        self
    }

    /// Set minimum trust region radius before termination.
    ///
    /// When the trust region radius falls below this value, optimization
    /// terminates with TrustRegionRadiusTooSmall status.
    /// Default: 1e-32 (Ceres-compatible)
    pub fn with_min_trust_region_radius(mut self, min_radius: f64) -> Self {
        self.min_trust_region_radius = min_radius;
        self
    }

    /// Set maximum condition number for Jacobian matrix.
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing condition number is expensive, disabled by default.
    pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
        self.max_condition_number = Some(max_cond);
        self
    }

    /// Set minimum relative cost decrease for step acceptance.
    ///
    /// Steps with rho = (actual_reduction / predicted_reduction) below this
    /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
    pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
        self.min_relative_decrease = min_decrease;
        self
    }

    /// Enable or disable Jacobi column scaling (preconditioning).
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// Can improve convergence for mixed-scale problems but adds ~5-10% overhead.
    pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
        self.use_jacobi_scaling = use_jacobi_scaling;
        self
    }

    /// Enable or disable covariance computation (uncertainty estimation).
    ///
    /// When enabled, computes the full covariance matrix by inverting the Hessian
    /// after convergence, then extracts per-variable covariance blocks.
    pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
        self.compute_covariances = compute_covariances;
        self
    }

    /// Set Schur complement solver variant
    pub fn with_schur_variant(mut self, variant: SchurVariant) -> Self {
        self.schur_variant = variant;
        self
    }

    /// Set Schur complement preconditioner
    pub fn with_schur_preconditioner(mut self, preconditioner: SchurPreconditioner) -> Self {
        self.schur_preconditioner = preconditioner;
        self
    }

    /// Configuration optimized for bundle adjustment problems.
    ///
    /// This preset uses settings tuned for large-scale bundle adjustment:
    /// - **Schur complement solver** with iterative PCG (memory efficient)
    /// - **Schur-Jacobi preconditioner** (Ceres-style, best PCG convergence)
    /// - **Moderate initial damping** (1e-3) - not too aggressive
    /// - **20 max iterations** (tuned for early stopping; raise this if full convergence is required)
    /// - **Ceres-compatible tolerances** for accurate reconstruction
    ///
    /// This configuration matches Ceres Solver's recommended BA settings and
    /// should achieve similar convergence quality.
    ///
    /// # Example
    ///
    /// ```
    /// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
    ///
    /// let config = LevenbergMarquardtConfig::for_bundle_adjustment();
    /// ```
    pub fn for_bundle_adjustment() -> Self {
        Self::default()
            .with_linear_solver_type(LinearSolverType::SparseSchurComplement)
            .with_schur_variant(SchurVariant::Iterative)
            .with_schur_preconditioner(SchurPreconditioner::SchurJacobi)
            .with_damping(1e-3) // Moderate initial damping (Ceres default)
            .with_max_iterations(20) // Reduced for early stop when RMSE < 1px
            // Match Ceres tolerances for faster convergence
            .with_cost_tolerance(1e-6) // Ceres function_tolerance (was 1e-12)
            .with_parameter_tolerance(1e-8) // Ceres parameter_tolerance (was 1e-14)
            .with_gradient_tolerance(1e-10) // Relaxed (was 1e-16)
    }

    // Note: with_visualization() has been removed. Use the observer pattern instead:
    //     let mut solver = LevenbergMarquardt::with_config(config);
    //     solver.add_observer(RerunObserver::new(true)?);
    // This logs optimization progress to a Rerun viewer (time series of cost,
    // gradient norm, damping, and step quality; sparse Hessian heat maps; and
    // real-time manifold state updates for SE2/SE3 problems). It requires the
    // `visualization` feature in `Cargo.toml`, provides cleaner separation of
    // concerns, and allows multiple observers.

    /// Print configuration parameters (verbose mode only)
    pub fn print_configuration(&self) {
        debug!(
            "Configuration:\n Solver: Levenberg-Marquardt\n Linear solver: {:?}\n Convergence Criteria:\n Max iterations: {}\n Cost tolerance: {:.2e}\n Parameter tolerance: {:.2e}\n Gradient tolerance: {:.2e}\n Timeout: {:?}\n Damping Parameters:\n Initial damping: {:.2e}\n Damping range: [{:.2e}, {:.2e}]\n Increase factor: {:.2}\n Decrease factor: {:.2}\n Trust Region:\n Initial radius: {:.2e}\n Min step quality: {:.2}\n Good step quality: {:.2}\n Numerical Settings:\n Jacobi scaling: {}\n Compute covariances: {}",
            self.linear_solver_type,
            self.max_iterations,
            self.cost_tolerance,
            self.parameter_tolerance,
            self.gradient_tolerance,
            self.timeout,
            self.damping,
            self.damping_min,
            self.damping_max,
            self.damping_increase_factor,
            self.damping_decrease_factor,
            self.trust_region_radius,
            self.min_step_quality,
            self.good_step_quality,
            if self.use_jacobi_scaling {
                "enabled"
            } else {
                "disabled"
            },
            if self.compute_covariances {
                "enabled"
            } else {
                "disabled"
            }
        );
    }
}

/// State for optimization iteration
struct LinearizerResult {
    variables: HashMap<String, VariableEnum>,
    variable_index_map: HashMap<String, usize>,
    sorted_vars: Vec<String>,
    symbolic_structure: SymbolicStructure,
    current_cost: f64,
    initial_cost: f64,
}

/// Result from step computation
struct StepResult {
    step: Mat<f64>,
    gradient_norm: f64,
    predicted_reduction: f64,
}

/// Result from step evaluation
struct StepEvaluation {
    accepted: bool,
    cost_reduction: f64,
    rho: f64,
}

/// Levenberg-Marquardt solver for nonlinear least squares optimization.
///
/// Implements the damped Gauss-Newton method with adaptive damping parameter λ that
/// interpolates between Gauss-Newton and gradient descent based on step quality.
///
/// # Algorithm
///
/// At each iteration k:
/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
/// 2. Solve augmented system: `(J^T·J + λI)·h = -J^T·r`
/// 3. Evaluate step quality: `ρ = (actual reduction) / (predicted reduction)`
/// 4. If `ρ > 0`: Accept step and update `xₖ₊₁ = xₖ ⊕ h`, decrease λ
/// 5. If `ρ ≤ 0`: Reject step (keep `xₖ₊₁ = xₖ`), increase λ
/// 6. Check convergence criteria
///
/// The damping parameter λ is updated using Nielsen's smooth formula:
/// `λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)` for accepted steps,
/// or `λₖ₊₁ = λₖ · ν` (with increasing ν) for rejected steps.
///
/// # Examples
///
/// ```no_run
/// use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
/// use apex_solver::linalg::LinearSolverType;
///
/// # fn main() {
/// let config = LevenbergMarquardtConfig::new()
///     .with_max_iterations(100)
///     .with_damping(1e-3)
///     .with_damping_bounds(1e-12, 1e12)
///     .with_jacobi_scaling(true);
///
/// let mut solver = LevenbergMarquardt::with_config(config);
/// # }
/// ```
///
/// # See Also
///
/// - [`LevenbergMarquardtConfig`] - Configuration options
/// - [`GaussNewton`](crate::GaussNewton) - Undamped variant (faster but less robust)
/// - [`DogLeg`](crate::DogLeg) - Alternative trust region method
pub struct LevenbergMarquardt {
    config: LevenbergMarquardtConfig,
    jacobi_scaling: Option<SparseColMat<usize, f64>>,
    observers: OptObserverVec,
}

impl Default for LevenbergMarquardt {
    fn default() -> Self {
        Self::new()
    }
}

impl LevenbergMarquardt {
    /// Create a new Levenberg-Marquardt solver with default configuration.
    pub fn new() -> Self {
        Self::with_config(LevenbergMarquardtConfig::default())
    }

    /// Create a new Levenberg-Marquardt solver with the given configuration.
    pub fn with_config(config: LevenbergMarquardtConfig) -> Self {
        Self {
            config,
            jacobi_scaling: None,
            observers: OptObserverVec::new(),
        }
    }

    /// Add an observer to monitor optimization progress.
    ///
    /// Observers are notified at each iteration with the current variable values.
    /// This enables real-time visualization, logging, metrics collection, etc.
    ///
    /// # Arguments
    ///
    /// * `observer` - Any type implementing `OptObserver`
    ///
    /// # Examples
    ///
    /// ```no_run
    /// use apex_solver::{LevenbergMarquardt, LevenbergMarquardtConfig};
    /// # use apex_solver::core::problem::Problem;
    /// # use std::collections::HashMap;
    ///
    /// # fn main() -> Result<(), Box<dyn std::error::Error>> {
    /// let mut solver = LevenbergMarquardt::new();
    ///
    /// #[cfg(feature = "visualization")]
    /// {
    ///     use apex_solver::observers::RerunObserver;
    ///     let rerun_observer = RerunObserver::new(true)?;
    ///     solver.add_observer(rerun_observer);
    /// }
    ///
    /// // ... optimize ...
    /// # Ok(())
    /// # }
    /// ```
    pub fn add_observer(&mut self, observer: impl crate::optimizer::OptObserver + 'static) {
        self.observers.add(observer);
    }

    /// Update the damping parameter based on step quality, trust-region style.
    ///
    /// Reference: *Introduction to Optimization and Data Fitting*, Algorithm 6.18.
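    ///
    /// As a quick illustration, evaluating Nielsen's factor `max(1/3, 1 - (2ρ - 1)³)`
    /// at a few representative gain ratios gives:
    ///
    /// ```text
    /// ρ = 1.0  →  λ ← λ/3       (model matches reality; trust it more)
    /// ρ = 0.5  →  λ unchanged   (neutral step)
    /// ρ → 0⁺   →  λ ← 2λ        (barely accepted; grow damping)
    /// ρ ≤ 0    →  λ ← λ·ν       (rejected; ν doubles on each rejection)
    /// ```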
    fn update_damping(&mut self, rho: f64) -> bool {
        if rho > 0.0 {
            // Step accepted - rescale damping via Nielsen's factor
            // (shrinks λ for good steps, but may grow it up to 2x for marginal ones)
            let coff = 2.0 * rho - 1.0;
            self.config.damping *= (1.0_f64 / 3.0).max(1.0 - coff * coff * coff);
            self.config.damping = self.config.damping.max(self.config.damping_min);
            self.config.damping_nu = 2.0;
            true
        } else {
            // Step rejected - increase damping
            self.config.damping *= self.config.damping_nu;
            self.config.damping_nu *= 2.0;
            self.config.damping = self.config.damping.min(self.config.damping_max);
            false
        }
    }

    /// Compute the step quality ratio ρ (actual vs. predicted reduction).
    ///
    /// References: *Introduction to Optimization and Data Fitting*;
    /// "Damping Parameter in Marquardt's Method", Formula 2.2.
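    ///
    /// Interpreting ρ (illustrative, following the gain-ratio convention above):
    ///
    /// ```text
    /// ρ ≈ 1  →  the quadratic model predicts the reduction well
    /// ρ ≫ 1  →  actual reduction exceeds the prediction
    /// ρ ≤ 0  →  cost did not decrease; the step is rejected
    /// ```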
    fn compute_step_quality(
        &self,
        current_cost: f64,
        new_cost: f64,
        predicted_reduction: f64,
    ) -> f64 {
        let actual_reduction = current_cost - new_cost;
        if predicted_reduction.abs() < 1e-15 {
            if actual_reduction > 0.0 { 1.0 } else { 0.0 }
        } else {
            actual_reduction / predicted_reduction
        }
    }

    /// Compute the predicted cost reduction from the linearized model.
    ///
    /// Standard LM formula: `0.5 * stepᵀ * (damping * step - gradient)`.
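    ///
    /// Derivation (a sketch): with `(JᵀJ + λI)·h = -g` and `H = JᵀJ`, substituting
    /// `H·h = -g - λh` into the model reduction `-hᵀg - ½·hᵀHh` yields `½·hᵀ(λh - g)`.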
    fn compute_predicted_reduction(&self, step: &Mat<f64>, gradient: &Mat<f64>) -> f64 {
        // Standard Levenberg-Marquardt predicted reduction formula
        // predicted_reduction = -step^T * gradient - 0.5 * step^T * H * step
        //                     = 0.5 * step^T * (damping * step - gradient)
        let diff = self.config.damping * step - gradient;
        (0.5 * step.transpose() * &diff)[(0, 0)]
    }

    /// Check convergence using comprehensive termination criteria.
    ///
    /// Implements 9 termination criteria following Ceres Solver standards:
    ///
    /// 1. **Gradient Norm (First-Order Optimality)**: ||g||∞ ≤ gradient_tolerance
    /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
    /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
    /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
    /// 5. **Trust Region Radius**: radius < min_trust_region_radius
    /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
    /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
    /// 8. **Maximum Iterations**: iteration >= max_iterations
    /// 9. **Timeout**: elapsed >= timeout
    ///
    /// # Arguments
    ///
    /// * `iteration` - Current iteration number
    /// * `current_cost` - Cost before applying the step
    /// * `new_cost` - Cost after applying the step
    /// * `parameter_norm` - L2 norm of current parameter vector ||x||
    /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
    /// * `gradient_norm` - Infinity norm of gradient ||g||∞
    /// * `trust_region_radius` - Current trust region radius
    /// * `elapsed` - Elapsed time since optimization start
    /// * `step_accepted` - Whether the current step was accepted
    ///
    /// # Returns
    ///
    /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
    ///
    /// # Design Note
    /// This internal convergence checker requires multiple scalar metrics for comprehensive
    /// termination criteria. Grouping into a struct would reduce clarity without performance benefit.
    #[allow(clippy::too_many_arguments)]
    fn check_convergence(
        &self,
        iteration: usize,
        current_cost: f64,
        new_cost: f64,
        parameter_norm: f64,
        parameter_update_norm: f64,
        gradient_norm: f64,
        trust_region_radius: f64,
        elapsed: Duration,
        step_accepted: bool,
    ) -> Option<OptimizationStatus> {
        // CRITICAL SAFETY CHECKS (perform first, before convergence checks)

        // CRITERION 7: Invalid Numerical Values (NaN/Inf)
        // Always check for numerical instability first
        if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
        {
            return Some(OptimizationStatus::InvalidNumericalValues);
        }

        // CRITERION 9: Timeout
        // Check wall-clock time limit
        if let Some(timeout) = self.config.timeout
            && elapsed >= timeout
        {
            return Some(OptimizationStatus::Timeout);
        }

        // CRITERION 8: Maximum Iterations
        // Check iteration count limit
        if iteration >= self.config.max_iterations {
            return Some(OptimizationStatus::MaxIterationsReached);
        }

        // CONVERGENCE CRITERIA (only check after successful steps)

        // Only check convergence criteria after accepted steps
        // (rejected steps don't indicate convergence)
        if !step_accepted {
            return None;
        }

        // CRITERION 1: Gradient Norm (First-Order Optimality)
        // Check if gradient infinity norm is below threshold
        // This indicates we're at a critical point (local minimum, saddle, or maximum)
        if gradient_norm < self.config.gradient_tolerance {
            return Some(OptimizationStatus::GradientToleranceReached);
        }

        // Only check parameter and cost criteria after first iteration
        if iteration > 0 {
            // CRITERION 2: Parameter Change Tolerance (xtol)
            // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
            // This is a relative measure that scales with parameter magnitude
            let relative_step_tolerance = self.config.parameter_tolerance
                * (parameter_norm + self.config.parameter_tolerance);

            if parameter_update_norm <= relative_step_tolerance {
                return Some(OptimizationStatus::ParameterToleranceReached);
            }

            // CRITERION 3: Function Value Change Tolerance (ftol)
            // Ceres formula: |ΔF| < ε_cost * F
            // Check relative cost change (not absolute)
            let cost_change = (current_cost - new_cost).abs();
            let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero

            if relative_cost_change < self.config.cost_tolerance {
                return Some(OptimizationStatus::CostToleranceReached);
            }
        }

        // CRITERION 4: Objective Function Cutoff (optional early stopping)
        // Useful for "good enough" solutions
        if let Some(min_cost) = self.config.min_cost_threshold
            && new_cost < min_cost
        {
            return Some(OptimizationStatus::MinCostThresholdReached);
        }

        // CRITERION 5: Trust Region Radius
        // If trust region has collapsed, optimization has converged or problem is ill-conditioned
        if trust_region_radius < self.config.min_trust_region_radius {
            return Some(OptimizationStatus::TrustRegionRadiusTooSmall);
        }

        // CRITERION 6: Singular/Ill-Conditioned Jacobian
        // Note: This is typically detected during the linear solve and handled there
        // The max_condition_number check would be expensive to compute here
        // If linear solve fails, it returns an error that's converted to NumericalFailure

        // No termination criterion satisfied
        None
    }

    /// Compute total parameter vector norm ||x||.
    ///
    /// Computes the L2 norm of all parameter vectors concatenated together.
    /// This is used in the relative parameter tolerance check.
    ///
    /// # Arguments
    ///
    /// * `variables` - Map of variable names to their current values
    ///
    /// # Returns
    ///
    /// The L2 norm of the concatenated parameter vector
    fn compute_parameter_norm(variables: &HashMap<String, VariableEnum>) -> f64 {
        variables
            .values()
            .map(|v| {
                let vec = v.to_vector();
                vec.norm_squared()
            })
            .sum::<f64>()
            .sqrt()
    }

    /// Create Jacobi scaling matrix from Jacobian
    fn create_jacobi_scaling(
        &self,
        jacobian: &SparseColMat<usize, f64>,
    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
        let cols = jacobian.ncols();
        let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
            .map(|c| {
                // Compute column norm: sqrt(sum(J_col^2))
                let col_norm_squared: f64 = jacobian
                    .triplet_iter()
                    .filter(|t| t.col == c)
                    .map(|t| t.val * t.val)
                    .sum();
                let col_norm = col_norm_squared.sqrt();
                // Scaling factor: 1.0 / (1.0 + col_norm)
                let scaling = 1.0 / (1.0 + col_norm);
                Triplet::new(c, c, scaling)
            })
            .collect();

        SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec)
            .map_err(|e| OptimizerError::JacobiScalingCreation(e.to_string()).log_with_source(e))
    }

    /// Initialize optimization state from problem and initial parameters
    fn initialize_optimization_state(
        &self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<LinearizerResult, error::ApexSolverError> {
        // Initialize variables from initial values
        let variables = problem.initialize_variables(initial_params);

        // Create column mapping for variables
        let mut variable_index_map = HashMap::new();
        let mut col_offset = 0;
        let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
        sorted_vars.sort();

        for var_name in &sorted_vars {
            variable_index_map.insert(var_name.clone(), col_offset);
            col_offset += variables[var_name].get_size();
        }

        // Build symbolic structure for sparse operations
        let symbolic_structure =
            problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;

        // Initial cost evaluation (residual only, no Jacobian needed)
        let residual = problem.compute_residual_sparse(&variables)?;
        let current_cost = compute_cost(&residual);
        let initial_cost = current_cost;

        Ok(LinearizerResult {
            variables,
            variable_index_map,
            sorted_vars,
            symbolic_structure,
            current_cost,
            initial_cost,
        })
    }

    /// Process Jacobian by creating and applying Jacobi scaling if enabled
    fn process_jacobian(
        &mut self,
        jacobian: &SparseColMat<usize, f64>,
        iteration: usize,
    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
        // Create Jacobi scaling on first iteration if enabled
        if iteration == 0 {
            let scaling = self.create_jacobi_scaling(jacobian)?;
            self.jacobi_scaling = Some(scaling);
        }
        let scaling = self
            .jacobi_scaling
            .as_ref()
            .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
        Ok(jacobian * scaling)
    }

    /// Compute optimization step by solving the augmented system
    fn compute_levenberg_marquardt_step(
        &self,
        residuals: &Mat<f64>,
        scaled_jacobian: &SparseColMat<usize, f64>,
        linear_solver: &mut Box<dyn SparseLinearSolver>,
    ) -> Result<StepResult, OptimizerError> {
        // Solve augmented equation: (J_scaled^T * J_scaled + λI) * dx_scaled = -J_scaled^T * r
        let residuals_owned = residuals.as_ref().to_owned();
        let scaled_step = linear_solver
            .solve_augmented_equation(&residuals_owned, scaled_jacobian, self.config.damping)
            .map_err(|e| OptimizerError::LinearSolveFailed(e.to_string()).log_with_source(e))?;

        // Get cached gradient and Hessian from the solver
        let gradient = linear_solver.get_gradient().ok_or_else(|| {
            OptimizerError::NumericalInstability("Gradient not available".into()).log()
        })?;
        let _hessian = linear_solver.get_hessian().ok_or_else(|| {
            OptimizerError::NumericalInstability("Hessian not available".into()).log()
        })?;
        let gradient_norm = gradient.norm_l2();

        // Apply inverse Jacobi scaling to get final step (if enabled)
        let step = if self.config.use_jacobi_scaling {
            let scaling = self
                .jacobi_scaling
                .as_ref()
                .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
            &scaled_step * scaling
        } else {
            scaled_step
        };

        // Compute predicted reduction for the proposed step
        let predicted_reduction = self.compute_predicted_reduction(&step, gradient);

        Ok(StepResult {
            step,
            gradient_norm,
            predicted_reduction,
        })
    }

    /// Evaluate and apply step, handling acceptance/rejection based on step quality
    fn evaluate_and_apply_step(
        &mut self,
        step_result: &StepResult,
        state: &mut LinearizerResult,
        problem: &Problem,
    ) -> error::ApexSolverResult<StepEvaluation> {
        // Apply parameter updates using manifold operations
        let _step_norm = apply_parameter_step(
            &mut state.variables,
            step_result.step.as_ref(),
            &state.sorted_vars,
        );

        // Compute new cost (residual only, no Jacobian needed for step evaluation)
        let new_residual = problem.compute_residual_sparse(&state.variables)?;
        let new_cost = compute_cost(&new_residual);

        // Compute step quality
        let rho = self.compute_step_quality(
            state.current_cost,
            new_cost,
            step_result.predicted_reduction,
        );

        // Update damping and decide whether to accept step
        let accepted = self.update_damping(rho);

        let cost_reduction = if accepted {
            // Accept the step - parameters already updated
            let reduction = state.current_cost - new_cost;
            state.current_cost = new_cost;
            reduction
        } else {
            // Reject the step - revert parameter changes
            apply_negative_parameter_step(
                &mut state.variables,
                step_result.step.as_ref(),
                &state.sorted_vars,
            );
            0.0
        };

        Ok(StepEvaluation {
            accepted,
            cost_reduction,
            rho,
        })
    }

    /// Create optimization summary
    ///
    /// # Design Note
    /// This internal summary builder accepts individual scalar results from the optimization loop.
    /// A struct parameter would not improve readability for this private method.
    #[allow(clippy::too_many_arguments)]
    fn create_summary(
        &self,
        initial_cost: f64,
        final_cost: f64,
        rho: f64,
        iterations: usize,
        successful_steps: usize,
        unsuccessful_steps: usize,
        max_gradient_norm: f64,
        final_gradient_norm: f64,
        max_parameter_update_norm: f64,
        final_parameter_update_norm: f64,
        total_cost_reduction: f64,
        total_time: Duration,
        iteration_history: Vec<IterationStats>,
        convergence_status: OptimizationStatus,
    ) -> LevenbergMarquardtSummary {
        LevenbergMarquardtSummary {
            initial_cost,
            final_cost,
            rho,
            iterations,
            successful_steps,
            unsuccessful_steps,
            final_damping: self.config.damping,
            average_cost_reduction: if iterations > 0 {
                total_cost_reduction / iterations as f64
            } else {
                0.0
            },
            max_gradient_norm,
            final_gradient_norm,
            max_parameter_update_norm,
            final_parameter_update_norm,
            total_time,
            average_time_per_iteration: if iterations > 0 {
                total_time / iterations as u32
            } else {
                Duration::from_secs(0)
            },
            iteration_history,
            convergence_status,
        }
    }

    /// Run the Levenberg-Marquardt optimization loop on `problem`, starting
    /// from `initial_params`, until one of the termination criteria checked in
    /// `check_convergence` is satisfied.
    pub fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, error::ApexSolverError> {
        let start_time = Instant::now();
        let mut iteration = 0;
        let mut cost_evaluations = 1; // Initial cost evaluation
        let mut jacobian_evaluations = 0;
        let mut successful_steps = 0;
        let mut unsuccessful_steps = 0;

        // Initialize optimization state
        let mut state = self.initialize_optimization_state(problem, initial_params)?;

        // Create linear solver - must be after variable initialization for Schur solver
        let mut linear_solver: Box<dyn SparseLinearSolver> = match self.config.linear_solver_type {
            LinearSolverType::SparseCholesky => Box::new(SparseCholeskySolver::new()),
            LinearSolverType::SparseQR => Box::new(SparseQRSolver::new()),
            LinearSolverType::SparseSchurComplement => Box::new(
                SchurSolverAdapter::new_with_structure_and_config(
                    &state.variables,
                    &state.variable_index_map,
                    self.config.schur_variant,
                    self.config.schur_preconditioner,
                )
                .map_err(|e| {
                    OptimizerError::LinearSolveFailed(format!(
                        "Failed to initialize Schur solver: {}",
                        e
                    ))
                    .log()
                })?,
            ),
        };

        // Initialize summary tracking variables
        let mut max_gradient_norm: f64 = 0.0;
        let mut max_parameter_update_norm: f64 = 0.0;
        let mut total_cost_reduction = 0.0;
        let mut final_gradient_norm;
        let mut final_parameter_update_norm;

        // Initialize iteration statistics tracking
        let mut iteration_stats = Vec::with_capacity(self.config.max_iterations);
        let mut previous_cost = state.current_cost;

        // Print configuration and header if debug level is enabled
        if tracing::enabled!(tracing::Level::DEBUG) {
            self.config.print_configuration();
            IterationStats::print_header();
        }

        // Main optimization loop
        loop {
            let iter_start = Instant::now();

            // Evaluate residuals and Jacobian
            let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
                &state.variables,
                &state.variable_index_map,
                &state.symbolic_structure,
            )?;
            jacobian_evaluations += 1;

            // Process Jacobian (apply scaling if enabled)
            let scaled_jacobian = if self.config.use_jacobi_scaling {
                self.process_jacobian(&jacobian, iteration)?
            } else {
                jacobian
            };

            // Compute optimization step
            let step_result = self.compute_levenberg_marquardt_step(
                &residuals,
                &scaled_jacobian,
                &mut linear_solver,
            )?;

            // Update tracking variables
            max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
            final_gradient_norm = step_result.gradient_norm;
            let step_norm = step_result.step.norm_l2();
            max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
            final_parameter_update_norm = step_norm;

            // Evaluate and apply step (handles accept/reject)
            let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
            cost_evaluations += 1;

            // Update counters based on acceptance
            if step_eval.accepted {
                successful_steps += 1;
                total_cost_reduction += step_eval.cost_reduction;
            } else {
                unsuccessful_steps += 1;
            }

            // OPTIMIZATION: Only collect iteration statistics if debug level is enabled
            // This eliminates ~2-5ms overhead per iteration for non-debug optimization
            if tracing::enabled!(tracing::Level::DEBUG) {
                let iter_elapsed_ms = iter_start.elapsed().as_secs_f64() * 1000.0;
                let total_elapsed_ms = start_time.elapsed().as_secs_f64() * 1000.0;

                let stats = IterationStats {
                    iteration,
                    cost: state.current_cost,
                    cost_change: previous_cost - state.current_cost,
                    gradient_norm: step_result.gradient_norm,
                    step_norm,
                    tr_ratio: step_eval.rho,
                    tr_radius: self.config.damping,
                    ls_iter: 0, // Direct sparse solvers perform no inner iterations
                    iter_time_ms: iter_elapsed_ms,
                    total_time_ms: total_elapsed_ms,
                    accepted: step_eval.accepted,
                };

                iteration_stats.push(stats.clone());
                stats.print_line();
            }

            previous_cost = state.current_cost;

            // Notify all observers with current state
            // First set metrics data, then notify observers
            self.observers.set_iteration_metrics(
                state.current_cost,
                step_result.gradient_norm,
                Some(self.config.damping),
                step_norm,
                Some(step_eval.rho),
            );

            // Set matrix data if available and there are observers
            if !self.observers.is_empty()
                && let (Some(hessian), Some(gradient)) =
                    (linear_solver.get_hessian(), linear_solver.get_gradient())
            {
                self.observers
                    .set_matrix_data(Some(hessian.clone()), Some(gradient.clone()));
            }

            // Notify observers with current variable values and iteration number
            self.observers.notify(&state.variables, iteration);

            // Check convergence
            let elapsed = start_time.elapsed();

            // Compute parameter norm for relative parameter tolerance check
            let parameter_norm = Self::compute_parameter_norm(&state.variables);

            // state.current_cost already reflects the accepted step (and is left
            // unchanged after a rejection), so it is the post-step cost
            let new_cost = state.current_cost;

            // Cost before this step (add back the reduction if the step was accepted)
            let cost_before_step = if step_eval.accepted {
                state.current_cost + step_eval.cost_reduction
            } else {
                state.current_cost
            };

            if let Some(status) = self.check_convergence(
                iteration,
                cost_before_step,
                new_cost,
                parameter_norm,
                step_norm,
                step_result.gradient_norm,
                self.config.trust_region_radius,
                elapsed,
                step_eval.accepted,
            ) {
                let summary = self.create_summary(
                    state.initial_cost,
                    state.current_cost,
                    step_eval.rho,
                    iteration + 1,
                    successful_steps,
                    unsuccessful_steps,
                    max_gradient_norm,
                    final_gradient_norm,
                    max_parameter_update_norm,
                    final_parameter_update_norm,
                    total_cost_reduction,
                    elapsed,
                    iteration_stats.clone(),
                    status.clone(),
                );

                // Print summary only if debug level is enabled
                if tracing::enabled!(tracing::Level::DEBUG) {
                    debug!("{}", summary);
                }

                // Compute covariances if enabled
                let covariances = if self.config.compute_covariances {
                    problem.compute_and_set_covariances(
                        &mut linear_solver,
                        &mut state.variables,
                        &state.variable_index_map,
                    )
                } else {
                    None
                };

                // Convert to HashMap for result
                let final_parameters: HashMap<String, VariableEnum> =
                    state.variables.into_iter().collect();

                // Notify observers that optimization is complete
                self.observers
                    .notify_complete(&final_parameters, iteration + 1);

                return Ok(SolverResult {
                    status,
                    iterations: iteration + 1,
                    initial_cost: state.initial_cost,
                    final_cost: state.current_cost,
                    parameters: final_parameters,
                    elapsed_time: elapsed,
                    convergence_info: Some(ConvergenceInfo {
                        final_gradient_norm,
                        final_parameter_update_norm,
                        cost_evaluations,
                        jacobian_evaluations,
                    }),
                    covariances,
                });
            }

            // Note: Max iterations and timeout checks are handled inside check_convergence()

            iteration += 1;
        }
    }
}

// Implement Solver trait
impl Solver for LevenbergMarquardt {
    type Config = LevenbergMarquardtConfig;
    type Error = error::ApexSolverError;

    fn new() -> Self {
        Self::default()
    }

    fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, Self::Error> {
        self.optimize(problem, initial_params)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::factors::Factor;
    use nalgebra::{DMatrix, dvector};

    /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor1;

    impl Factor for RosenbrockFactor1 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];
            let x2 = params[1][0];

            // Residual: r1 = 10(x2 - x1²)
            let residual = dvector![10.0 * (x2 - x1 * x1)];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r1/∂x1 = -20*x1, ∂r1/∂x2 = 10
                let mut jacobian = DMatrix::zeros(1, 2);
                jacobian[(0, 0)] = -20.0 * x1;
                jacobian[(0, 1)] = 10.0;

                Some(jacobian)
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    /// Custom Rosenbrock Factor 2: r2 = 1 - x1
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor2;

    impl Factor for RosenbrockFactor2 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];

            // Residual: r2 = 1 - x1
            let residual = dvector![1.0 - x1];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r2/∂x1 = -1
                Some(DMatrix::from_element(1, 1, -1.0))
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }
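
    /// Sanity-check Nielsen's damping update. This test is a sketch: it exercises
    /// the private `update_damping` helper directly and only asserts directionally,
    /// since the exact post-update values depend on the configured damping bounds.
    #[test]
    fn test_nielsen_damping_update() {
        let mut solver =
            LevenbergMarquardt::with_config(LevenbergMarquardtConfig::new().with_damping(1e-3));
        let initial = solver.config.damping;

        // A perfect step (ρ = 1) is accepted and shrinks λ (factor 1/3 before clamping)
        assert!(solver.update_damping(1.0));
        assert!(solver.config.damping <= initial);

        // A failed step (ρ ≤ 0) is rejected and grows λ by ν (which then doubles)
        let before = solver.config.damping;
        assert!(!solver.update_damping(-1.0));
        assert!(solver.config.damping >= before);
    }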

    #[test]
    fn test_rosenbrock_optimization() -> Result<(), Box<dyn std::error::Error>> {
        // Rosenbrock function test:
        // Minimize: r1² + r2² where
        //   r1 = 10(x2 - x1²)
        //   r2 = 1 - x1
        // Starting point: [-1.2, 1.0]
        // Expected minimum: [1.0, 1.0]

        let mut problem = Problem::new();
        let mut initial_values = HashMap::new();

        // Add variables using Rn manifold (Euclidean space)
        initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
        initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));

        // Add custom factors (demonstrates extensibility!)
        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);

        // Configure Levenberg-Marquardt optimizer
        let config = LevenbergMarquardtConfig::new()
            .with_max_iterations(100)
            .with_cost_tolerance(1e-8)
            .with_parameter_tolerance(1e-8)
            .with_gradient_tolerance(1e-10);

        let mut solver = LevenbergMarquardt::with_config(config);
        let result = solver.optimize(&problem, &initial_values)?;

        // Extract final values
        let x1_final = result
            .parameters
            .get("x1")
            .ok_or("x1 not found")?
            .to_vector()[0];
        let x2_final = result
            .parameters
            .get("x2")
            .ok_or("x2 not found")?
            .to_vector()[0];

        // Verify convergence to [1.0, 1.0]
        assert!(
            matches!(
                result.status,
                OptimizationStatus::Converged
                    | OptimizationStatus::CostToleranceReached
                    | OptimizationStatus::ParameterToleranceReached
                    | OptimizationStatus::GradientToleranceReached
            ),
            "Optimization should converge"
        );
        assert!(
            (x1_final - 1.0).abs() < 1e-4,
            "x1 should converge to 1.0, got {}",
            x1_final
        );
        assert!(
            (x2_final - 1.0).abs() < 1e-4,
            "x2 should converge to 1.0, got {}",
            x2_final
        );
        assert!(
            result.final_cost < 1e-6,
            "Final cost should be near zero, got {}",
            result.final_cost
        );
        Ok(())
    }
}