// apex_solver/optimizer/levenberg_marquardt.rs

//! Levenberg-Marquardt algorithm implementation.
//!
//! The Levenberg-Marquardt (LM) method is a robust and widely used algorithm for solving
//! nonlinear least squares problems of the form:
//!
//! ```text
//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
//! ```
//!
//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
//!
//! # Algorithm Overview
//!
//! The Levenberg-Marquardt method solves the damped normal equations at each iteration:
//!
//! ```text
//! (J^T·J + λI)·h = -J^T·r
//! ```
//!
//! where:
//! - `J` is the Jacobian matrix (m × n)
//! - `r` is the residual vector (m × 1)
//! - `h` is the step vector (n × 1)
//! - `λ` is the adaptive damping parameter (scalar)
//! - `I` is the identity matrix (or diagonal scaling matrix)
//!
//! ## Damping Parameter Strategy
//!
//! The damping parameter λ adapts based on step quality:
//!
//! - **λ → 0** (small damping): Behaves like Gauss-Newton with fast quadratic convergence
//! - **λ → ∞** (large damping): Behaves like gradient descent with a guaranteed descent direction
//!
//! This interpolation between Gauss-Newton and gradient descent provides excellent robustness
//! while maintaining fast convergence near the solution.
//!
//! ## Step Acceptance and Damping Update
//!
//! The algorithm evaluates each proposed step using the gain ratio:
//!
//! ```text
//! ρ = (actual reduction) / (predicted reduction)
//!   = [f(xₖ) - f(xₖ + h)] / [f(xₖ) - L(h)]
//! ```
//!
//! where `L(h) = f(xₖ) + h^T·g + ½h^T·H·h` is the local quadratic model.
//!
//! **Step acceptance:**
//! - If `ρ > 0`: Accept step (cost decreased), decrease λ to trust the model more
//! - If `ρ ≤ 0`: Reject step (cost increased), increase λ to be more conservative
//!
//! **Damping update** (Nielsen's formula):
//! ```text
//! λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)
//! ```
//!
//! This provides smooth, data-driven adaptation of the damping parameter.
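//!
//! As a standalone sketch of this update rule (illustrative only, not this crate's API):
//!
//! ```
//! fn nielsen_update(lambda: f64, rho: f64) -> f64 {
//!     let c = 2.0 * rho - 1.0;
//!     lambda * (1.0_f64 / 3.0).max(1.0 - c * c * c)
//! }
//! // A perfect step (ρ = 1) shrinks the damping by the maximum factor of 1/3;
//! // a marginal step (ρ ≈ 1/2) leaves it essentially unchanged.
//! assert!((nielsen_update(1e-3, 1.0) - 1e-3 / 3.0).abs() < 1e-15);
//! ```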
//!
//! ## Convergence Properties
//!
//! - **Global convergence**: Converges to a stationary point from essentially any starting guess (under standard smoothness assumptions)
//! - **Local quadratic convergence**: Near the solution, behaves like Gauss-Newton
//! - **Robust to poor initialization**: Adaptive damping prevents divergence
//! - **Handles ill-conditioning**: Large λ stabilizes a nearly singular Hessian
//!
//! ## When to Use
//!
//! Levenberg-Marquardt is the best general-purpose choice when:
//! - The initial parameter guess may be far from the optimum
//! - Problem conditioning is unknown
//! - Robustness is prioritized over raw speed
//! - You want reliable convergence across diverse problem types
//!
//! For problems with specific structure, consider:
//! - [`GaussNewton`](crate::GaussNewton) if well-conditioned with good initialization
//! - [`DogLeg`](crate::DogLeg) for explicit trust region control
//!
//! # Implementation Features
//!
//! - **Sparse matrix support**: Efficient handling of large-scale problems via the `faer` sparse library
//! - **Adaptive damping**: Nielsen's formula for smooth parameter adaptation
//! - **Robust linear solvers**: Cholesky (fast) or QR (stable) factorization
//! - **Jacobi scaling**: Optional diagonal preconditioning for mixed-scale problems
//! - **Covariance computation**: Optional uncertainty quantification after convergence
//! - **Manifold operations**: Native support for optimization on Lie groups (SE2, SE3, SO2, SO3)
//! - **Comprehensive diagnostics**: Detailed summaries of convergence and performance
//!
//! # Mathematical Background
//!
//! The augmented Hessian `J^T·J + λI` combines two beneficial properties:
//!
//! 1. **Positive definiteness**: Always solvable, even when `J^T·J` is singular
//! 2. **Regularization**: Prevents taking steps in poorly-determined directions
//!
//! The trust region interpretation: λ controls an implicit spherical trust region in which
//! larger λ restricts the step size, ensuring the linear model remains valid.
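//!
//! A one-dimensional sketch of this effect (illustrative values only): for r(x) = x² − 2
//! with J = 2x, increasing λ shrinks the damped step toward zero:
//!
//! ```
//! let x = 1.0_f64;
//! let (r, j) = (x * x - 2.0, 2.0 * x);
//! let step = |lambda: f64| -(j * r) / (j * j + lambda);
//! // Larger damping yields a smaller, more conservative step.
//! assert!(step(10.0).abs() < step(0.1).abs());
//! ```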
//!
//! # Examples
//!
//! ## Basic usage
//!
//! ```no_run
//! use apex_solver::LevenbergMarquardt;
//! use apex_solver::core::problem::Problem;
//! use std::collections::HashMap;
//!
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let mut problem = Problem::new();
//! // ... add residual blocks (factors) to problem ...
//!
//! let initial_values = HashMap::new();
//! // ... initialize parameters ...
//!
//! let mut solver = LevenbergMarquardt::new();
//! let result = solver.optimize(&problem, &initial_values)?;
//!
//! println!("Status: {:?}", result.status);
//! println!("Final cost: {:.6e}", result.final_cost);
//! println!("Iterations: {}", result.iterations);
//! # Ok(())
//! # }
//! ```
//!
//! ## Advanced configuration
//!
//! ```no_run
//! use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
//! use apex_solver::linalg::LinearSolverType;
//!
//! # fn main() {
//! let config = LevenbergMarquardtConfig::new()
//!     .with_max_iterations(100)
//!     .with_cost_tolerance(1e-6)
//!     .with_damping(1e-3)  // Initial damping
//!     .with_damping_bounds(1e-12, 1e12)  // Min/max damping
//!     .with_jacobi_scaling(true)  // Improve conditioning
//!     .with_verbose(true);
//!
//! let mut solver = LevenbergMarquardt::with_config(config);
//! # }
//! ```
//!
//! # References
//!
//! - Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". *Quarterly of Applied Mathematics*.
//! - Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". *Journal of the Society for Industrial and Applied Mathematics*.
//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 3.
//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 10.
//! - Nielsen, H. B. (1999). "Damping Parameter in Marquardt's Method". Technical Report IMM-REP-1999-05.

use crate::core::problem::{Problem, SymbolicStructure, VariableEnum};
use crate::error;
use crate::linalg::{LinearSolverType, SparseCholeskySolver, SparseLinearSolver, SparseQRSolver};
use crate::manifold::ManifoldType;
use crate::optimizer::{
    ConvergenceInfo, OptimizationStatus, OptimizationVisualizer, Solver, SolverResult,
    apply_negative_parameter_step, apply_parameter_step, compute_cost,
};
use faer::{
    Mat,
    sparse::{SparseColMat, Triplet},
};
use nalgebra::DVector;
use std::collections::HashMap;
use std::{
    fmt,
    fmt::{Display, Formatter},
    time::{Duration, Instant},
};

/// Summary statistics for the Levenberg-Marquardt optimization process.
#[derive(Debug, Clone)]
pub struct LevenbergMarquardtSummary {
    /// Initial cost value
    pub initial_cost: f64,
    /// Final cost value
    pub final_cost: f64,
    /// Total number of iterations performed
    pub iterations: usize,
    /// Number of successful steps (cost decreased)
    pub successful_steps: usize,
    /// Number of unsuccessful steps (cost increased, damping increased)
    pub unsuccessful_steps: usize,
    /// Final damping parameter value
    pub final_damping: f64,
    /// Average cost reduction per iteration
    pub average_cost_reduction: f64,
    /// Maximum gradient norm encountered
    pub max_gradient_norm: f64,
    /// Final gradient norm
    pub final_gradient_norm: f64,
    /// Maximum parameter update norm
    pub max_parameter_update_norm: f64,
    /// Final parameter update norm
    pub final_parameter_update_norm: f64,
    /// Total time elapsed
    pub total_time: Duration,
    /// Average time per iteration
    pub average_time_per_iteration: Duration,
}

impl Display for LevenbergMarquardtSummary {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        writeln!(f, "Levenberg-Marquardt Optimization Summary")?;
        writeln!(f, "Initial cost:              {:.6e}", self.initial_cost)?;
        writeln!(f, "Final cost:                {:.6e}", self.final_cost)?;
        writeln!(
            f,
            "Cost reduction:            {:.6e} ({:.2}%)",
            self.initial_cost - self.final_cost,
            100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
        )?;
        writeln!(f, "Total iterations:          {}", self.iterations)?;
        writeln!(
            f,
            "Successful steps:          {} ({:.1}%)",
            self.successful_steps,
            100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(
            f,
            "Unsuccessful steps:        {} ({:.1}%)",
            self.unsuccessful_steps,
            100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(f, "Final damping parameter:   {:.6e}", self.final_damping)?;
        writeln!(
            f,
            "Average cost reduction:    {:.6e}",
            self.average_cost_reduction
        )?;
        writeln!(
            f,
            "Max gradient norm:         {:.6e}",
            self.max_gradient_norm
        )?;
        writeln!(
            f,
            "Final gradient norm:       {:.6e}",
            self.final_gradient_norm
        )?;
        writeln!(
            f,
            "Max parameter update norm: {:.6e}",
            self.max_parameter_update_norm
        )?;
        writeln!(
            f,
            "Final param update norm:   {:.6e}",
            self.final_parameter_update_norm
        )?;
        writeln!(f, "Total time:                {:?}", self.total_time)?;
        writeln!(
            f,
            "Average time per iteration: {:?}",
            self.average_time_per_iteration
        )?;
        Ok(())
    }
}

/// Configuration parameters for the Levenberg-Marquardt optimizer.
///
/// Controls the adaptive damping strategy, convergence criteria, and numerical stability
/// enhancements for the Levenberg-Marquardt algorithm.
///
/// # Builder Pattern
///
/// All configuration options can be set using the builder pattern:
///
/// ```
/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
///
/// let config = LevenbergMarquardtConfig::new()
///     .with_max_iterations(100)
///     .with_damping(1e-3)
///     .with_damping_bounds(1e-12, 1e12)
///     .with_jacobi_scaling(true)
///     .with_verbose(true);
/// ```
///
/// # Damping Parameter Behavior
///
/// The damping parameter λ controls the trade-off between Gauss-Newton and gradient descent:
///
/// - **Initial damping** (`damping`): Starting value (default: 1e-4)
/// - **Damping bounds** (`damping_min`, `damping_max`): Valid range (default: 1e-12 to 1e12)
/// - **Adaptation**: Automatically adjusted based on step quality using Nielsen's formula
///
/// # Convergence Criteria
///
/// The optimizer terminates when ANY of the following conditions is met:
///
/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance`
/// - **Parameter tolerance**: `||step|| < parameter_tolerance`
/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
/// - **Maximum iterations**: `iteration >= max_iterations`
/// - **Timeout**: `elapsed_time >= timeout`
///
/// # See Also
///
/// - [`LevenbergMarquardt`] - The solver that uses this configuration
/// - [`GaussNewtonConfig`](crate::GaussNewtonConfig) - Undamped variant
/// - [`DogLegConfig`](crate::DogLegConfig) - Trust region alternative
#[derive(Clone)]
pub struct LevenbergMarquardtConfig {
    /// Type of linear solver for the linear systems
    pub linear_solver_type: LinearSolverType,
    /// Maximum number of iterations
    pub max_iterations: usize,
    /// Convergence tolerance for cost function
    pub cost_tolerance: f64,
    /// Convergence tolerance for parameter updates
    pub parameter_tolerance: f64,
    /// Convergence tolerance for gradient norm
    pub gradient_tolerance: f64,
    /// Timeout duration
    pub timeout: Option<Duration>,
    /// Enable detailed logging
    pub verbose: bool,
    /// Initial damping parameter
    pub damping: f64,
    /// Minimum damping parameter
    pub damping_min: f64,
    /// Maximum damping parameter
    pub damping_max: f64,
    /// Damping increase factor (when step rejected)
    pub damping_increase_factor: f64,
    /// Damping decrease factor (when step accepted)
    pub damping_decrease_factor: f64,
    /// Damping nu parameter
    pub damping_nu: f64,
    /// Trust region radius
    pub trust_region_radius: f64,
    /// Minimum step quality for acceptance
    pub min_step_quality: f64,
    /// Good step quality threshold
    pub good_step_quality: f64,
    /// Minimum diagonal value for regularization
    pub min_diagonal: f64,
    /// Maximum diagonal value for regularization
    pub max_diagonal: f64,
    /// Minimum objective function cutoff (optional early termination)
    ///
    /// If set, optimization terminates when cost falls below this threshold.
    /// Useful for early stopping when a "good enough" solution is acceptable.
    ///
    /// Default: None (disabled)
    pub min_cost_threshold: Option<f64>,
    /// Minimum trust region radius before termination
    ///
    /// When the trust region radius falls below this value, the optimizer
    /// terminates, as this indicates the search has converged or the problem
    /// is ill-conditioned. Matches Ceres Solver's min_trust_region_radius.
    ///
    /// Default: 1e-32 (Ceres-compatible)
    pub min_trust_region_radius: f64,
    /// Maximum condition number for Jacobian matrix (optional check)
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing the condition number is expensive, so this is disabled by default.
    ///
    /// Default: None (disabled)
    pub max_condition_number: Option<f64>,
    /// Minimum relative cost decrease for step acceptance
    ///
    /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
    /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
    /// min_relative_decrease parameter.
    ///
    /// Default: 1e-3 (Ceres-compatible)
    pub min_relative_decrease: f64,
    /// Use Jacobi column scaling (preconditioning)
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// This can improve convergence for problems with mixed parameter scales
    /// (e.g., positions in meters + angles in radians) but adds ~5-10% overhead.
    ///
    /// Default: false (avoids the overhead; enable for mixed-scale problems)
    pub use_jacobi_scaling: bool,
    /// Compute per-variable covariance matrices (uncertainty estimation)
    ///
    /// When enabled, computes covariance by inverting the Hessian matrix after
    /// convergence. The full covariance matrix is extracted into per-variable
    /// blocks stored in both Variable structs and SolverResult.
    ///
    /// Default: false (to avoid performance overhead)
    pub compute_covariances: bool,
    /// Enable real-time Rerun visualization (graphical debugging)
    ///
    /// When enabled, logs optimization progress to the Rerun viewer, including:
    /// - Time series plots (cost, gradient norm, damping, step quality)
    /// - Sparse Hessian heat map visualization
    /// - Gradient vector visualization
    /// - Manifold state updates (for SE2/SE3 problems)
    ///
    /// This is automatically disabled in release builds (zero overhead).
    /// Use `verbose` for terminal output; this is for graphical visualization.
    ///
    /// Default: false
    pub enable_visualization: bool,
}

impl Default for LevenbergMarquardtConfig {
    fn default() -> Self {
        Self {
            linear_solver_type: LinearSolverType::default(),
            // Ceres Solver default: 50 (changed from 100 for compatibility)
            max_iterations: 50,
            // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
            cost_tolerance: 1e-6,
            // Ceres Solver default: 1e-8 (unchanged)
            parameter_tolerance: 1e-8,
            // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
            // Note: Typically should be 1e-4 * cost_tolerance per Ceres docs
            gradient_tolerance: 1e-10,
            timeout: None,
            verbose: false,
            damping: 1e-4,
            damping_min: 1e-12,
            damping_max: 1e12,
            damping_increase_factor: 10.0,
            damping_decrease_factor: 0.3,
            damping_nu: 2.0,
            trust_region_radius: 1e4,
            min_step_quality: 0.0,
            good_step_quality: 0.75,
            min_diagonal: 1e-6,
            max_diagonal: 1e32,
            // New Ceres-compatible parameters
            min_cost_threshold: None,
            min_trust_region_radius: 1e-32,
            max_condition_number: None,
            min_relative_decrease: 1e-3,
            // Existing parameters
            use_jacobi_scaling: false,
            compute_covariances: false,
            enable_visualization: false,
        }
    }
}

impl LevenbergMarquardtConfig {
    /// Create a new Levenberg-Marquardt configuration with default values.
    pub fn new() -> Self {
        Self::default()
    }

    /// Set the linear solver type
    pub fn with_linear_solver_type(mut self, linear_solver_type: LinearSolverType) -> Self {
        self.linear_solver_type = linear_solver_type;
        self
    }

    /// Set the maximum number of iterations
    pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
        self.max_iterations = max_iterations;
        self
    }

    /// Set the cost tolerance
    pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
        self.cost_tolerance = cost_tolerance;
        self
    }

    /// Set the parameter tolerance
    pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
        self.parameter_tolerance = parameter_tolerance;
        self
    }

    /// Set the gradient tolerance
    pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
        self.gradient_tolerance = gradient_tolerance;
        self
    }

    /// Set the timeout duration
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }

    /// Enable or disable verbose logging
    pub fn with_verbose(mut self, verbose: bool) -> Self {
        self.verbose = verbose;
        self
    }

    /// Set the initial damping parameter.
    pub fn with_damping(mut self, damping: f64) -> Self {
        self.damping = damping;
        self
    }

    /// Set the damping parameter bounds.
    pub fn with_damping_bounds(mut self, min: f64, max: f64) -> Self {
        self.damping_min = min;
        self.damping_max = max;
        self
    }

    /// Set the damping adjustment factors.
    pub fn with_damping_factors(mut self, increase: f64, decrease: f64) -> Self {
        self.damping_increase_factor = increase;
        self.damping_decrease_factor = decrease;
        self
    }

    /// Set the trust region parameters.
    pub fn with_trust_region(mut self, radius: f64, min_quality: f64, good_quality: f64) -> Self {
        self.trust_region_radius = radius;
        self.min_step_quality = min_quality;
        self.good_step_quality = good_quality;
        self
    }

    /// Set minimum objective function cutoff for early termination.
    ///
    /// When set, optimization terminates with MinCostThresholdReached status
    /// if the cost falls below this threshold. Useful for early stopping when
    /// a "good enough" solution is acceptable.
    pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
        self.min_cost_threshold = Some(min_cost);
        self
    }

    /// Set minimum trust region radius before termination.
    ///
    /// When the trust region radius falls below this value, optimization
    /// terminates with TrustRegionRadiusTooSmall status.
    /// Default: 1e-32 (Ceres-compatible)
    pub fn with_min_trust_region_radius(mut self, min_radius: f64) -> Self {
        self.min_trust_region_radius = min_radius;
        self
    }

    /// Set maximum condition number for Jacobian matrix.
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing the condition number is expensive, disabled by default.
    pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
        self.max_condition_number = Some(max_cond);
        self
    }

    /// Set minimum relative cost decrease for step acceptance.
    ///
    /// Steps with rho = (actual_reduction / predicted_reduction) below this
    /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
    pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
        self.min_relative_decrease = min_decrease;
        self
    }

    /// Enable or disable Jacobi column scaling (preconditioning).
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// Can improve convergence for mixed-scale problems but adds ~5-10% overhead.
    pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
        self.use_jacobi_scaling = use_jacobi_scaling;
        self
    }

    /// Enable or disable covariance computation (uncertainty estimation).
    ///
    /// When enabled, computes the full covariance matrix by inverting the Hessian
    /// after convergence, then extracts per-variable covariance blocks.
    pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
        self.compute_covariances = compute_covariances;
        self
    }

    /// Enable real-time visualization (graphical debugging).
    ///
    /// When enabled, optimization progress is logged to a Rerun viewer with:
    /// - Time series plots of cost, gradient norm, damping, step quality
    /// - Sparse Hessian matrix visualization as heat map
    /// - Gradient vector visualization
    /// - Real-time manifold state updates (for SE2/SE3 problems)
    ///
    /// Note: Has zero overhead when disabled. Use `verbose` for terminal logging.
    ///
    /// # Arguments
    ///
    /// * `enable` - Whether to enable visualization
    pub fn with_visualization(mut self, enable: bool) -> Self {
        self.enable_visualization = enable;
        self
    }
}

/// State for optimization iteration
struct LinearizerResult {
    variables: HashMap<String, VariableEnum>,
    variable_index_map: HashMap<String, usize>,
    sorted_vars: Vec<String>,
    symbolic_structure: SymbolicStructure,
    current_cost: f64,
    initial_cost: f64,
}

/// Result from step computation
struct StepResult {
    step: Mat<f64>,
    gradient_norm: f64,
    predicted_reduction: f64,
}

/// Result from step evaluation
struct StepEvaluation {
    accepted: bool,
    cost_reduction: f64,
    rho: f64,
}

/// Levenberg-Marquardt solver for nonlinear least squares optimization.
///
/// Implements the damped Gauss-Newton method with an adaptive damping parameter λ that
/// interpolates between Gauss-Newton and gradient descent based on step quality.
///
/// # Algorithm
///
/// At each iteration k:
/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
/// 2. Solve augmented system: `(J^T·J + λI)·h = -J^T·r`
/// 3. Evaluate step quality: `ρ = (actual reduction) / (predicted reduction)`
/// 4. If `ρ > 0`: Accept step and update `xₖ₊₁ = xₖ ⊕ h`, decrease λ
/// 5. If `ρ ≤ 0`: Reject step (keep `xₖ₊₁ = xₖ`), increase λ
/// 6. Check convergence criteria
///
/// The damping parameter λ is updated using Nielsen's smooth formula:
/// `λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)` for accepted steps,
/// or `λₖ₊₁ = λₖ · ν` (with increasing ν) for rejected steps.
///
/// # Examples
///
/// ```no_run
/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardt;
/// use apex_solver::core::problem::Problem;
/// use std::collections::HashMap;
///
/// # fn main() -> Result<(), Box<dyn std::error::Error>> {
/// let mut problem = Problem::new();
/// // ... add factors to problem ...
///
/// let initial_values = HashMap::new();
/// // ... initialize parameters ...
///
/// let mut solver = LevenbergMarquardt::new();
/// let result = solver.optimize(&problem, &initial_values)?;
///
/// println!("Status: {:?}", result.status);
/// println!("Final cost: {}", result.final_cost);
/// println!("Iterations: {}", result.iterations);
/// # Ok(())
/// # }
/// ```
///
/// # See Also
///
/// - [`LevenbergMarquardtConfig`] - Configuration options
/// - [`GaussNewton`](crate::GaussNewton) - Undamped variant (faster but less robust)
/// - [`DogLeg`](crate::DogLeg) - Alternative trust region method
pub struct LevenbergMarquardt {
    config: LevenbergMarquardtConfig,
    jacobi_scaling: Option<SparseColMat<usize, f64>>,
    visualizer: Option<OptimizationVisualizer>,
}

impl Default for LevenbergMarquardt {
    fn default() -> Self {
        Self::new()
    }
}

impl LevenbergMarquardt {
    /// Create a new Levenberg-Marquardt solver with default configuration.
    pub fn new() -> Self {
        Self::with_config(LevenbergMarquardtConfig::default())
    }

    /// Create a new Levenberg-Marquardt solver with the given configuration.
    pub fn with_config(config: LevenbergMarquardtConfig) -> Self {
        // Create visualizer if enabled (zero overhead when disabled)
        let visualizer = if config.enable_visualization {
            match OptimizationVisualizer::new(true) {
                Ok(vis) => Some(vis),
                Err(e) => {
                    eprintln!("[WARNING] Failed to create visualizer: {}", e);
                    None
                }
            }
        } else {
            None
        };

        Self {
            config,
            jacobi_scaling: None,
            visualizer,
        }
    }

    /// Create the appropriate linear solver based on configuration
    fn create_linear_solver(&self) -> Box<dyn SparseLinearSolver> {
        match self.config.linear_solver_type {
            LinearSolverType::SparseCholesky => Box::new(SparseCholeskySolver::new()),
            LinearSolverType::SparseQR => Box::new(SparseQRSolver::new()),
        }
    }

    /// Update damping parameter based on step quality using the trust region approach
    /// Reference: Introduction to Optimization and Data Fitting, Algorithm 6.18
    fn update_damping(&mut self, rho: f64) -> bool {
        if rho > 0.0 {
            // Step accepted - decrease damping
            let coeff = 2.0 * rho - 1.0;
            self.config.damping *= (1.0_f64 / 3.0).max(1.0 - coeff * coeff * coeff);
            self.config.damping = self.config.damping.max(self.config.damping_min);
            self.config.damping_nu = 2.0;
            true
        } else {
            // Step rejected - increase damping
            self.config.damping *= self.config.damping_nu;
            self.config.damping_nu *= 2.0;
            self.config.damping = self.config.damping.min(self.config.damping_max);
            false
        }
    }

    /// Compute step quality ratio (actual vs. predicted reduction)
    /// Reference: Introduction to Optimization and Data Fitting
    /// Reference: "Damping Parameter in Marquardt's Method", Formula 2.2
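    ///
    /// Numeric illustration with made-up costs (not solver output):
    ///
    /// ```
    /// // rho near 1 means the quadratic model predicted the reduction well.
    /// let rho = (10.0 - 8.0) / (10.0 - 7.9); // actual / predicted
    /// assert!(rho > 0.9 && rho < 1.0);
    /// ```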
    fn compute_step_quality(
        &self,
        current_cost: f64,
        new_cost: f64,
        predicted_reduction: f64,
    ) -> f64 {
        let actual_reduction = current_cost - new_cost;
        if predicted_reduction.abs() < 1e-15 {
            if actual_reduction > 0.0 { 1.0 } else { 0.0 }
        } else {
            actual_reduction / predicted_reduction
        }
    }

    /// Compute predicted cost reduction from the linear model
    /// Standard LM formula: 0.5 * step^T * (damping * step - gradient)
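    ///
    /// Scalar sketch of why the two forms agree (made-up numbers): with h solving
    /// (a + λ)·h = -g, where a stands in for JᵀJ:
    ///
    /// ```
    /// let (a, lam, g) = (4.0_f64, 0.5, -2.0);
    /// let h = -g / (a + lam);
    /// let full = -h * g - 0.5 * a * h * h;   // -hᵀ·g - ½·hᵀ·(JᵀJ)·h
    /// let compact = 0.5 * h * (lam * h - g); // ½·hᵀ·(λ·h - g)
    /// assert!((full - compact).abs() < 1e-12);
    /// ```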
    fn compute_predicted_reduction(&self, step: &Mat<f64>, gradient: &Mat<f64>) -> f64 {
        // Standard Levenberg-Marquardt predicted reduction formula
        // predicted_reduction = -step^T * gradient - 0.5 * step^T * H * step
        //                     = 0.5 * step^T * (damping * step - gradient)
        let diff = self.config.damping * step - gradient;
        (0.5 * step.transpose() * &diff)[(0, 0)]
    }

    /// Check convergence using comprehensive termination criteria.
    ///
    /// Implements 9 termination criteria following Ceres Solver standards:
    ///
    /// 1. **Gradient Norm (First-Order Optimality)**: ||g||∞ ≤ gradient_tolerance
    /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
    /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
    /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
    /// 5. **Trust Region Radius**: radius < min_trust_region_radius
    /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
    /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
    /// 8. **Maximum Iterations**: iteration >= max_iterations
    /// 9. **Timeout**: elapsed >= timeout
    ///
    /// # Arguments
    ///
    /// * `iteration` - Current iteration number
    /// * `current_cost` - Cost before applying the step
    /// * `new_cost` - Cost after applying the step
    /// * `parameter_norm` - L2 norm of current parameter vector ||x||
    /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
    /// * `gradient_norm` - Infinity norm of gradient ||g||∞
    /// * `trust_region_radius` - Current trust region radius
    /// * `elapsed` - Elapsed time since optimization start
    /// * `step_accepted` - Whether the current step was accepted
    ///
    /// # Returns
    ///
    /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
    #[allow(clippy::too_many_arguments)]
795    fn check_convergence(
796        &self,
797        iteration: usize,
798        current_cost: f64,
799        new_cost: f64,
800        parameter_norm: f64,
801        parameter_update_norm: f64,
802        gradient_norm: f64,
803        trust_region_radius: f64,
804        elapsed: Duration,
805        step_accepted: bool,
806    ) -> Option<OptimizationStatus> {
807        // ========================================================================
808        // CRITICAL SAFETY CHECKS (perform first, before convergence checks)
809        // ========================================================================
810
811        // CRITERION 7: Invalid Numerical Values (NaN/Inf)
812        // Always check for numerical instability first
813        if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
814        {
815            return Some(OptimizationStatus::InvalidNumericalValues);
816        }
817
818        // CRITERION 9: Timeout
819        // Check wall-clock time limit
820        if let Some(timeout) = self.config.timeout
821            && elapsed >= timeout
822        {
823            return Some(OptimizationStatus::Timeout);
824        }
825
826        // CRITERION 8: Maximum Iterations
827        // Check iteration count limit
828        if iteration >= self.config.max_iterations {
829            return Some(OptimizationStatus::MaxIterationsReached);
830        }
831
832        // ========================================================================
833        // CONVERGENCE CRITERIA (only check after successful steps)
834        // ========================================================================
835
836        // Only check convergence criteria after accepted steps
837        // (rejected steps don't indicate convergence)
838        if !step_accepted {
839            return None;
840        }
841
842        // CRITERION 1: Gradient Norm (First-Order Optimality)
843        // Check if gradient infinity norm is below threshold
844        // This indicates we're at a critical point (local minimum, saddle, or maximum)
845        if gradient_norm < self.config.gradient_tolerance {
846            return Some(OptimizationStatus::GradientToleranceReached);
847        }
848
849        // Only check parameter and cost criteria after first iteration
850        if iteration > 0 {
851            // CRITERION 2: Parameter Change Tolerance (xtol)
852            // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
853            // This is a relative measure that scales with parameter magnitude
854            let relative_step_tolerance = self.config.parameter_tolerance
855                * (parameter_norm + self.config.parameter_tolerance);
856
857            if parameter_update_norm <= relative_step_tolerance {
858                return Some(OptimizationStatus::ParameterToleranceReached);
859            }
860
861            // CRITERION 3: Function Value Change Tolerance (ftol)
862            // Ceres formula: |ΔF| < ε_cost * F
863            // Check relative cost change (not absolute)
864            let cost_change = (current_cost - new_cost).abs();
865            let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero
866
867            if relative_cost_change < self.config.cost_tolerance {
868                return Some(OptimizationStatus::CostToleranceReached);
869            }
870        }
871
872        // CRITERION 4: Objective Function Cutoff (optional early stopping)
873        // Useful for "good enough" solutions
874        if let Some(min_cost) = self.config.min_cost_threshold
875            && new_cost < min_cost
876        {
877            return Some(OptimizationStatus::MinCostThresholdReached);
878        }
879
880        // CRITERION 5: Trust Region Radius
881        // If trust region has collapsed, optimization has converged or problem is ill-conditioned
882        if trust_region_radius < self.config.min_trust_region_radius {
883            return Some(OptimizationStatus::TrustRegionRadiusTooSmall);
884        }
885
886        // CRITERION 6: Singular/Ill-Conditioned Jacobian
887        // Note: This is typically detected during the linear solve and handled there
888        // The max_condition_number check would be expensive to compute here
889        // If linear solve fails, it returns an error that's converted to NumericalFailure
890
891        // No termination criterion satisfied
892        None
893    }
894
895    /// Compute total parameter vector norm ||x||.
896    ///
897    /// Computes the L2 norm of all parameter vectors concatenated together.
898    /// This is used in the relative parameter tolerance check.
899    ///
900    /// # Arguments
901    ///
902    /// * `variables` - Map of variable names to their current values
903    ///
904    /// # Returns
905    ///
906    /// The L2 norm of the concatenated parameter vector
907    fn compute_parameter_norm(variables: &HashMap<String, VariableEnum>) -> f64 {
908        variables
909            .values()
910            .map(|v| {
911                let vec = v.to_vector();
912                vec.norm_squared()
913            })
914            .sum::<f64>()
915            .sqrt()
916    }
917
918    /// Create Jacobi scaling matrix from Jacobian
919    fn create_jacobi_scaling(
920        &self,
921        jacobian: &SparseColMat<usize, f64>,
922    ) -> SparseColMat<usize, f64> {
923        let cols = jacobian.ncols();
924        let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
925            .map(|c| {
926                // Compute column norm: sqrt(sum(J_col^2))
927                let col_norm_squared: f64 = jacobian
928                    .triplet_iter()
929                    .filter(|t| t.col == c)
930                    .map(|t| t.val * t.val)
931                    .sum();
932                let col_norm = col_norm_squared.sqrt();
933                // Scaling factor: 1.0 / (1.0 + col_norm)
934                let scaling = 1.0 / (1.0 + col_norm);
935                Triplet::new(c, c, scaling)
936            })
937            .collect();
938
939        SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec)
940            .expect("Failed to create Jacobi scaling matrix")
941    }
942
943    /// Initialize optimization state from problem and initial parameters
944    fn initialize_optimization_state(
945        &self,
946        problem: &Problem,
947        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
948    ) -> Result<LinearizerResult, error::ApexError> {
949        // Initialize variables from initial values
950        let variables = problem.initialize_variables(initial_params);
951
952        // Create column mapping for variables
953        let mut variable_index_map = HashMap::new();
954        let mut col_offset = 0;
955        let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
956        sorted_vars.sort();
957
958        for var_name in &sorted_vars {
959            variable_index_map.insert(var_name.clone(), col_offset);
960            col_offset += variables[var_name].get_size();
961        }
962
963        // Build symbolic structure for sparse operations
964        let symbolic_structure =
965            problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;
966
967        // Initial cost evaluation (residual only, no Jacobian needed)
968        let residual = problem.compute_residual_sparse(&variables)?;
969        let current_cost = compute_cost(&residual);
970        let initial_cost = current_cost;
971
972        if self.config.verbose {
973            println!(
974                "Starting Levenberg-Marquardt optimization with {} max iterations",
975                self.config.max_iterations
976            );
977            println!(
978                "Initial cost: {:.6e}, initial damping: {:.6e}",
979                current_cost, self.config.damping
980            );
981        }
982
983        Ok(LinearizerResult {
984            variables,
985            variable_index_map,
986            sorted_vars,
987            symbolic_structure,
988            current_cost,
989            initial_cost,
990        })
991    }
992
993    /// Process Jacobian by creating and applying Jacobi scaling if enabled
994    fn process_jacobian(
995        &mut self,
996        jacobian: &SparseColMat<usize, f64>,
997        iteration: usize,
998    ) -> SparseColMat<usize, f64> {
999        // Create Jacobi scaling on first iteration if enabled
1000        if iteration == 0 {
1001            let scaling = self.create_jacobi_scaling(jacobian);
1002
1003            if self.config.verbose {
1004                println!("Jacobi scaling computed for {} columns", scaling.ncols());
1005            }
1006
1007            self.jacobi_scaling = Some(scaling);
1008        }
1009        // jacobi_scaling is guaranteed to be set by the iteration-0 branch above
1010        jacobian * self.jacobi_scaling.as_ref().unwrap()
1010    }
1011
1012    /// Compute optimization step by solving the augmented system
1013    fn compute_levenberg_marquardt_step(
1014        &self,
1015        residuals: &Mat<f64>,
1016        scaled_jacobian: &SparseColMat<usize, f64>,
1017        linear_solver: &mut Box<dyn SparseLinearSolver>,
1018    ) -> Option<StepResult> {
1019        // Solve augmented equation: (J_scaled^T * J_scaled + λI) * dx_scaled = -J_scaled^T * r
1020        let residuals_owned = residuals.as_ref().to_owned();
1021        let scaled_step = linear_solver
1022            .solve_augmented_equation(&residuals_owned, scaled_jacobian, self.config.damping)
1023            .ok()?;
1024
1025        // Get cached gradient and Hessian from the solver
1026        let gradient = linear_solver.get_gradient()?;
1027        let hessian = linear_solver.get_hessian()?;
1028        let gradient_norm = gradient.norm_l2();
1029
1030        if self.config.verbose {
1031            println!("Gradient (J^T*r) norm: {:.12e}", gradient_norm);
1032            println!("Hessian shape: ({}, {})", hessian.nrows(), hessian.ncols());
1033            println!("Damping parameter: {:.12e}", self.config.damping);
1034        }
1035
1036        // Compute predicted reduction in the scaled coordinates, where the
1037        // damped system (J_s^T·J_s + λI)·h_s = -J_s^T·r was actually solved;
1038        // mixing the unscaled step with the scaled gradient would be inconsistent
1039        let predicted_reduction = self.compute_predicted_reduction(&scaled_step, gradient);
1040
1041        // Apply inverse Jacobi scaling to get final step (if enabled)
1042        let step = if self.config.use_jacobi_scaling {
1043            &scaled_step * self.jacobi_scaling.as_ref().unwrap()
1044        } else {
1045            scaled_step
1046        };
1047
1048        if self.config.verbose {
1049            println!("Final step norm: {:.12e}", step.norm_l2());
1050        }
1049
1050        if self.config.verbose {
1051            println!("Predicted reduction: {:.12e}", predicted_reduction);
1052        }
1053
1054        Some(StepResult {
1055            step,
1056            gradient_norm,
1057            predicted_reduction,
1058        })
1059    }
1060
1061    /// Evaluate and apply step, handling acceptance/rejection based on step quality
1062    fn evaluate_and_apply_step(
1063        &mut self,
1064        step_result: &StepResult,
1065        state: &mut LinearizerResult,
1066        problem: &Problem,
1067    ) -> error::ApexResult<StepEvaluation> {
1068        // Apply parameter updates using manifold operations
1069        let _step_norm = apply_parameter_step(
1070            &mut state.variables,
1071            step_result.step.as_ref(),
1072            &state.sorted_vars,
1073        );
1074
1075        // Compute new cost (residual only, no Jacobian needed for step evaluation)
1076        let new_residual = problem.compute_residual_sparse(&state.variables)?;
1077        let new_cost = compute_cost(&new_residual);
1078
1079        // Compute step quality
1080        let rho = self.compute_step_quality(
1081            state.current_cost,
1082            new_cost,
1083            step_result.predicted_reduction,
1084        );
1085
1086        if self.config.verbose {
1087            println!("RHO (Gain Factor) CALCULATION DETAILS");
1088            println!("Old cost: {:.12e}", state.current_cost);
1089            println!("New cost: {:.12e}", new_cost);
1090            let actual_reduction = state.current_cost - new_cost;
1091            println!("Actual cost reduction: {:.12e}", actual_reduction);
1092            println!(
1093                "Predicted cost reduction: {:.12e}",
1094                step_result.predicted_reduction
1095            );
1096            println!("Rho (actual/predicted): {:.12e}", rho);
1097        }
1098
1099        // Update damping and decide whether to accept step
1100        let accepted = self.update_damping(rho);
1101
1102        let cost_reduction = if accepted {
1103            // Accept the step - parameters already updated
1104            let reduction = state.current_cost - new_cost;
1105            state.current_cost = new_cost;
1106            reduction
1107        } else {
1108            // Reject the step - revert parameter changes
1109            apply_negative_parameter_step(
1110                &mut state.variables,
1111                step_result.step.as_ref(),
1112                &state.sorted_vars,
1113            );
1114            0.0
1115        };
1116
1117        Ok(StepEvaluation {
1118            accepted,
1119            cost_reduction,
1120            rho,
1121        })
1122    }
1123
1124    /// Create optimization summary
1125    #[allow(clippy::too_many_arguments)]
1126    fn create_summary(
1127        &self,
1128        initial_cost: f64,
1129        final_cost: f64,
1130        iterations: usize,
1131        successful_steps: usize,
1132        unsuccessful_steps: usize,
1133        max_gradient_norm: f64,
1134        final_gradient_norm: f64,
1135        max_parameter_update_norm: f64,
1136        final_parameter_update_norm: f64,
1137        total_cost_reduction: f64,
1138        total_time: Duration,
1139    ) -> LevenbergMarquardtSummary {
1140        LevenbergMarquardtSummary {
1141            initial_cost,
1142            final_cost,
1143            iterations,
1144            successful_steps,
1145            unsuccessful_steps,
1146            final_damping: self.config.damping,
1147            average_cost_reduction: if iterations > 0 {
1148                total_cost_reduction / iterations as f64
1149            } else {
1150                0.0
1151            },
1152            max_gradient_norm,
1153            final_gradient_norm,
1154            max_parameter_update_norm,
1155            final_parameter_update_norm,
1156            total_time,
1157            average_time_per_iteration: if iterations > 0 {
1158                total_time / iterations as u32
1159            } else {
1160                Duration::from_secs(0)
1161            },
1162        }
1163    }
1164
1165    pub fn optimize(
1166        &mut self,
1167        problem: &Problem,
1168        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1169    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, error::ApexError> {
1170        let start_time = Instant::now();
1171        let mut iteration = 0;
1172        let mut cost_evaluations = 1; // Initial cost evaluation
1173        let mut jacobian_evaluations = 0;
1174        let mut successful_steps = 0;
1175        let mut unsuccessful_steps = 0;
1176
1177        // Initialize optimization state
1178        let mut state = self.initialize_optimization_state(problem, initial_params)?;
1179
1180        // Create linear solver
1181        let mut linear_solver = self.create_linear_solver();
1182
1183        // Initialize summary tracking variables
1184        let mut max_gradient_norm: f64 = 0.0;
1185        let mut max_parameter_update_norm: f64 = 0.0;
1186        let mut total_cost_reduction = 0.0;
1187        let mut final_gradient_norm;
1188        let mut final_parameter_update_norm;
1189
1190        // Main optimization loop
1191        loop {
1192            // Evaluate residuals and Jacobian
1193            let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
1194                &state.variables,
1195                &state.variable_index_map,
1196                &state.symbolic_structure,
1197            )?;
1198            jacobian_evaluations += 1;
1199
1200            if self.config.verbose {
1201                println!("APEX-SOLVER DEBUG ITERATION {}", iteration);
1202                println!("Current cost: {:.12e}", state.current_cost);
1203                println!(
1204                    "Residuals shape: ({}, {})",
1205                    residuals.nrows(),
1206                    residuals.ncols()
1207                );
1208                println!("Residuals norm: {:.12e}", residuals.norm_l2());
1209                println!(
1210                    "Jacobian shape: ({}, {})",
1211                    jacobian.nrows(),
1212                    jacobian.ncols()
1213                );
1214            }
1215
1216            // Process Jacobian (apply scaling if enabled)
1217            let scaled_jacobian = if self.config.use_jacobi_scaling {
1218                self.process_jacobian(&jacobian, iteration)
1219            } else {
1220                jacobian
1221            };
1222
1223            if self.config.verbose {
1224                println!(
1225                    "Scaled Jacobian shape: ({}, {})",
1226                    scaled_jacobian.nrows(),
1227                    scaled_jacobian.ncols()
1228                );
1229            }
1230
1231            // Compute optimization step
1232            let step_result = match self.compute_levenberg_marquardt_step(
1233                &residuals,
1234                &scaled_jacobian,
1235                &mut linear_solver,
1236            ) {
1237                Some(result) => result,
1238                None => {
1239                    return Err(error::ApexError::Solver(
1240                        "Linear solver failed to solve the augmented system or provide cached gradient/Hessian".to_string(),
1241                    ));
1242                }
1243            };
1244
1245            // Update tracking variables
1246            max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
1247            final_gradient_norm = step_result.gradient_norm;
1248            let step_norm = step_result.step.norm_l2();
1249            max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
1250            final_parameter_update_norm = step_norm;
1251
1252            // Evaluate and apply step (handles accept/reject)
1253            let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
1254            cost_evaluations += 1;
1255
1256            // Update counters based on acceptance
1257            if step_eval.accepted {
1258                successful_steps += 1;
1259                total_cost_reduction += step_eval.cost_reduction;
1260            } else {
1261                unsuccessful_steps += 1;
1262            }
1263
1264            // Rerun visualization
1265            if let Some(ref vis) = self.visualizer {
1266                if let Err(e) = vis.log_scalars(
1267                    iteration,
1268                    state.current_cost,
1269                    step_result.gradient_norm,
1270                    self.config.damping,
1271                    step_norm,
1272                    Some(step_eval.rho),
1273                ) {
1274                    eprintln!("[WARNING] Failed to log scalars: {}", e);
1275                }
1276
1277                // Log expensive visualizations (Hessian, gradient, manifolds)
1278                if let Err(e) = vis.log_hessian(linear_solver.get_hessian(), iteration) {
1279                    eprintln!("[WARNING] Failed to log Hessian: {}", e);
1280                }
1281                if let Err(e) = vis.log_gradient(linear_solver.get_gradient(), iteration) {
1282                    eprintln!("[WARNING] Failed to log gradient: {}", e);
1283                }
1284                if let Err(e) = vis.log_manifolds(&state.variables, iteration) {
1285                    eprintln!("[WARNING] Failed to log manifolds: {}", e);
1286                }
1287            }
1288
1289            // Check convergence
1290            let elapsed = start_time.elapsed();
1291
1292            // Compute parameter norm for relative parameter tolerance check
1293            let parameter_norm = Self::compute_parameter_norm(&state.variables);
1294
1295            // state.current_cost already reflects this step's outcome:
1296            // updated if the step was accepted, unchanged if it was rejected
1297            let new_cost = state.current_cost;
1302
1303            // Cost before this step (need to add back reduction if step was accepted)
1304            let cost_before_step = if step_eval.accepted {
1305                state.current_cost + step_eval.cost_reduction
1306            } else {
1307                state.current_cost
1308            };
1309
1310            if let Some(status) = self.check_convergence(
1311                iteration,
1312                cost_before_step,
1313                new_cost,
1314                parameter_norm,
1315                step_norm,
1316                step_result.gradient_norm,
1317                self.config.trust_region_radius,
1318                elapsed,
1319                step_eval.accepted,
1320            ) {
1321                let summary = self.create_summary(
1322                    state.initial_cost,
1323                    state.current_cost,
1324                    iteration + 1,
1325                    successful_steps,
1326                    unsuccessful_steps,
1327                    max_gradient_norm,
1328                    final_gradient_norm,
1329                    max_parameter_update_norm,
1330                    final_parameter_update_norm,
1331                    total_cost_reduction,
1332                    elapsed,
1333                );
1334
1335                if self.config.verbose {
1336                    println!("{}", summary);
1337                }
1338
1339                // Log convergence to Rerun
1340                if let Some(ref vis) = self.visualizer {
1341                    let _ = vis.log_convergence(&format!("Converged: {}", status));
1342                }
1343
1344                // Compute covariances if enabled
1345                let covariances = if self.config.compute_covariances {
1346                    problem.compute_and_set_covariances(
1347                        &mut linear_solver,
1348                        &mut state.variables,
1349                        &state.variable_index_map,
1350                    )
1351                } else {
1352                    None
1353                };
1354
1355                return Ok(SolverResult {
1356                    status,
1357                    iterations: iteration + 1,
1358                    initial_cost: state.initial_cost,
1359                    final_cost: state.current_cost,
1360                    parameters: state.variables.into_iter().collect(),
1361                    elapsed_time: elapsed,
1362                    convergence_info: Some(ConvergenceInfo {
1363                        final_gradient_norm,
1364                        final_parameter_update_norm,
1365                        cost_evaluations,
1366                        jacobian_evaluations,
1367                    }),
1368                    covariances,
1369                });
1370            }
1371
1372            // Note: max-iteration and timeout checks are handled inside check_convergence()
1373
1374            iteration += 1;
1375        }
1376    }
1377}
1378// Implement Solver trait
1379impl Solver for LevenbergMarquardt {
1380    type Config = LevenbergMarquardtConfig;
1381    type Error = error::ApexError;
1382
1383    fn new() -> Self {
1384        Self::default()
1385    }
1386
1387    fn optimize(
1388        &mut self,
1389        problem: &Problem,
1390        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1391    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, Self::Error> {
1392        self.optimize(problem, initial_params)
1393    }
1394}
1395
1396#[cfg(test)]
1397mod tests {
1398    use super::*;
1399    use crate::factors::Factor;
1400    use nalgebra::{DMatrix, dvector};
1401    /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
1402    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
1403    #[derive(Debug, Clone)]
1404    struct RosenbrockFactor1;
1405
1406    impl Factor for RosenbrockFactor1 {
1407        fn linearize(
1408            &self,
1409            params: &[DVector<f64>],
1410            compute_jacobian: bool,
1411        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
1412            let x1 = params[0][0];
1413            let x2 = params[1][0];
1414
1415            // Residual: r1 = 10(x2 - x1²)
1416            let residual = dvector![10.0 * (x2 - x1 * x1)];
1417
1418            let jacobian = if compute_jacobian {
1419                // Jacobian: ∂r1/∂x1 = -20*x1, ∂r1/∂x2 = 10
1420                let mut jacobian = DMatrix::zeros(1, 2);
1421                jacobian[(0, 0)] = -20.0 * x1;
1422                jacobian[(0, 1)] = 10.0;
1423
1424                Some(jacobian)
1425            } else {
1426                None
1427            };
1428
1429            (residual, jacobian)
1430        }
1431
1432        fn get_dimension(&self) -> usize {
1433            1
1434        }
1435    }
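The `create_jacobi_scaling` factor `1/(1 + ||col||)` keeps every scaled column norm strictly below 1 while remaining finite for all-zero columns. A small standalone sanity check of that scalar property (the `jacobi_scale` helper is hypothetical, a sketch of the formula only, not part of the solver API):

```rust
    /// Standalone reimplementation of the per-column scaling factor used by
    /// `create_jacobi_scaling`; hypothetical helper for illustration only.
    fn jacobi_scale(col_norm: f64) -> f64 {
        1.0 / (1.0 + col_norm)
    }

    #[test]
    fn test_jacobi_scaling_bounds() {
        // A zero column is left unscaled (factor 1), avoiding division by zero
        assert!((jacobi_scale(0.0) - 1.0).abs() < 1e-15);
        // For any column norm n, the scaled norm n/(1 + n) stays in [0, 1)
        for &n in &[1e-9, 1.0, 1e9] {
            let scaled = n * jacobi_scale(n);
            assert!((0.0..1.0).contains(&scaled));
        }
    }
```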
1436
1437    /// Custom Rosenbrock Factor 2: r2 = 1 - x1
1438    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
1439    #[derive(Debug, Clone)]
1440    struct RosenbrockFactor2;
1441
1442    impl Factor for RosenbrockFactor2 {
1443        fn linearize(
1444            &self,
1445            params: &[DVector<f64>],
1446            compute_jacobian: bool,
1447        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
1448            let x1 = params[0][0];
1449
1450            // Residual: r2 = 1 - x1
1451            let residual = dvector![1.0 - x1];
1452
1453            let jacobian = if compute_jacobian {
1454                // Jacobian: ∂r2/∂x1 = -1
1455                Some(DMatrix::from_element(1, 1, -1.0))
1456            } else {
1457                None
1458            };
1459
1460            (residual, jacobian)
1461        }
1462
1463        fn get_dimension(&self) -> usize {
1464            1
1465        }
1466    }
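Nielsen's damping update from the module docs, λ ← λ·max(1/3, 1 − (2ρ − 1)³), can be sanity-checked in isolation. A sketch using a hypothetical local `nielsen_factor` helper that mirrors the documented formula (the real update lives in `update_damping` and is not exercised here):

```rust
    /// Nielsen's multiplicative damping factor max(1/3, 1 - (2ρ - 1)³);
    /// hypothetical local helper mirroring the formula in the module docs.
    fn nielsen_factor(rho: f64) -> f64 {
        let t = 2.0 * rho - 1.0;
        (1.0_f64 / 3.0).max(1.0 - t * t * t)
    }

    #[test]
    fn test_nielsen_damping_factor() {
        // Perfect model agreement (ρ = 1) shrinks damping by the floor value 1/3
        assert!((nielsen_factor(1.0) - 1.0 / 3.0).abs() < 1e-12);
        // ρ = 0.5 leaves damping unchanged: 1 - 0³ = 1
        assert!((nielsen_factor(0.5) - 1.0).abs() < 1e-12);
        // No actual reduction (ρ = 0) doubles damping: 1 - (-1)³ = 2
        assert!((nielsen_factor(0.0) - 2.0).abs() < 1e-12);
        // The factor never drops below 1/3, so damping cannot collapse abruptly
        for i in -20..=20 {
            assert!(nielsen_factor(i as f64 * 0.25) >= 1.0 / 3.0);
        }
    }
```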
1467
1468    #[test]
1469    fn test_rosenbrock_optimization() {
1470        // Rosenbrock function test:
1471        // Minimize: r1² + r2² where
1472        //   r1 = 10(x2 - x1²)
1473        //   r2 = 1 - x1
1474        // Starting point: [-1.2, 1.0]
1475        // Expected minimum: [1.0, 1.0]
1476
1477        let mut problem = Problem::new();
1478        let mut initial_values = HashMap::new();
1479
1480        // Add variables using Rn manifold (Euclidean space)
1481        initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
1482        initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));
1483
1484        // Add custom factors (demonstrates extensibility!)
1485        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
1486        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);
1487
1488        // Configure Levenberg-Marquardt optimizer
1489        let config = LevenbergMarquardtConfig::new()
1490            .with_max_iterations(100)
1491            .with_cost_tolerance(1e-8)
1492            .with_parameter_tolerance(1e-8)
1493            .with_gradient_tolerance(1e-10);
1494
1495        let mut solver = LevenbergMarquardt::with_config(config);
1496        let result = solver.optimize(&problem, &initial_values).unwrap();
1497
1498        // Extract final values
1499        let x1_final = result.parameters.get("x1").unwrap().to_vector()[0];
1500        let x2_final = result.parameters.get("x2").unwrap().to_vector()[0];
1501
1502        println!("Rosenbrock optimization result:");
1503        println!("  Status: {:?}", result.status);
1504        println!("  Initial cost: {:.6e}", result.initial_cost);
1505        println!("  Final cost: {:.6e}", result.final_cost);
1506        println!("  Iterations: {}", result.iterations);
1507        println!("  x1: {:.6} (expected 1.0)", x1_final);
1508        println!("  x2: {:.6} (expected 1.0)", x2_final);
1509
1510        // Verify convergence to [1.0, 1.0]
1511        assert!(
1512            matches!(
1513                result.status,
1514                OptimizationStatus::Converged
1515                    | OptimizationStatus::CostToleranceReached
1516                    | OptimizationStatus::ParameterToleranceReached
1517                    | OptimizationStatus::GradientToleranceReached
1518            ),
1519            "Optimization should converge"
1520        );
1521        assert!(
1522            (x1_final - 1.0).abs() < 1e-4,
1523            "x1 should converge to 1.0, got {}",
1524            x1_final
1525        );
1526        assert!(
1527            (x2_final - 1.0).abs() < 1e-4,
1528            "x2 should converge to 1.0, got {}",
1529            x2_final
1530        );
1531        assert!(
1532            result.final_cost < 1e-6,
1533            "Final cost should be near zero, got {}",
1534            result.final_cost
1535        );
1536    }
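The simplification inside `compute_predicted_reduction` holds only because the step solves the damped normal equations. That identity can be checked numerically on a tiny hand-rolled 2×2 system (a standalone sketch via Cramer's rule; `damped_step_reductions` and its inputs are illustrative, not solver API):

```rust
    /// Solve (H + λI)·h = -g for a symmetric 2x2 H via Cramer's rule, then
    /// return (full model reduction -h^T·g - ½h^T·H·h, simplified ½h^T·(λh - g)).
    /// Hypothetical helper for illustration only.
    fn damped_step_reductions(
        h11: f64, h12: f64, h22: f64, // symmetric Hessian entries
        g1: f64, g2: f64,             // gradient
        lambda: f64,                  // damping
    ) -> (f64, f64) {
        let (a11, a12, a22) = (h11 + lambda, h12, h22 + lambda);
        let det = a11 * a22 - a12 * a12;
        let (s1, s2) = ((-g1 * a22 + g2 * a12) / det, (-g2 * a11 + g1 * a12) / det);
        let sths = h11 * s1 * s1 + 2.0 * h12 * s1 * s2 + h22 * s2 * s2;
        let full = -(s1 * g1 + s2 * g2) - 0.5 * sths;
        let simplified = 0.5 * (s1 * (lambda * s1 - g1) + s2 * (lambda * s2 - g2));
        (full, simplified)
    }

    #[test]
    fn test_predicted_reduction_identity() {
        let (full, simplified) = damped_step_reductions(4.0, 1.0, 3.0, 1.0, -2.0, 0.5);
        // The two forms of the predicted reduction agree...
        assert!((full - simplified).abs() < 1e-12);
        // ...and the damped step predicts a strict cost decrease for λ > 0
        assert!(full > 0.0);
    }
```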
1537}