apex_solver/optimizer/
levenberg_marquardt.rs

//! Levenberg-Marquardt algorithm implementation.
//!
//! The Levenberg-Marquardt (LM) method is a robust and widely-used algorithm for solving
//! nonlinear least squares problems of the form:
//!
//! ```text
//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
//! ```
//!
//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
//!
//! # Algorithm Overview
//!
//! The Levenberg-Marquardt method solves the damped normal equations at each iteration:
//!
//! ```text
//! (J^T·J + λI)·h = -J^T·r
//! ```
//!
//! where:
//! - `J` is the Jacobian matrix (m × n)
//! - `r` is the residual vector (m × 1)
//! - `h` is the step vector (n × 1)
//! - `λ` is the adaptive damping parameter (scalar)
//! - `I` is the identity matrix (or a diagonal scaling matrix)
//!
//! ## Damping Parameter Strategy
//!
//! The damping parameter λ adapts based on step quality:
//!
//! - **λ → 0** (small damping): Behaves like Gauss-Newton, with fast (near-quadratic) convergence
//! - **λ → ∞** (large damping): Behaves like gradient descent, with a guaranteed descent direction
//!
//! This interpolation between Gauss-Newton and gradient descent provides excellent robustness
//! while maintaining fast convergence near the solution.
//!
//! ## Step Acceptance and Damping Update
//!
//! The algorithm evaluates each proposed step using the gain ratio:
//!
//! ```text
//! ρ = (actual reduction) / (predicted reduction)
//!   = [f(xₖ) - f(xₖ + h)] / [f(xₖ) - L(h)]
//! ```
//!
//! where `L(h) = f(xₖ) + h^T·g + ½h^T·H·h` is the local quadratic model.
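//!
//! For example (illustrative numbers): if the current cost is `f(xₖ) = 10.0`, the
//! quadratic model predicts `L(h) = 8.0`, and the trial point evaluates to
//! `f(xₖ + h) = 8.4`, then:
//!
//! ```text
//! ρ = (10.0 - 8.4) / (10.0 - 8.0) = 1.6 / 2.0 = 0.8
//! ```
//!
//! The step achieved 80% of the predicted reduction, so it is accepted and the
//! damping is decreased.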
//!
//! **Step acceptance:**
//! - If `ρ > 0`: Accept step (cost decreased), decrease λ to trust the model more
//! - If `ρ ≤ 0`: Reject step (cost increased), increase λ to be more conservative
//!
//! **Damping update** (Nielsen's formula, applied to accepted steps; rejected steps
//! instead multiply λ by a growing factor ν):
//! ```text
//! λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)
//! ```
//!
//! This provides smooth, data-driven adaptation of the damping parameter.
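//!
//! A few illustrative values of the update factor `max(1/3, 1 - (2ρ - 1)³)`:
//!
//! ```text
//! ρ = 1.0  →  factor ≈ 0.333  (excellent step: shrink λ sharply)
//! ρ = 0.9  →  factor ≈ 0.488  (good step: shrink λ)
//! ρ = 0.5  →  factor = 1.0    (borderline step: λ unchanged)
//! ρ = 0.1  →  factor ≈ 1.512  (poor but accepted step: grow λ)
//! ```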
//!
//! ## Convergence Properties
//!
//! - **Global convergence**: Converges to a stationary point from any starting guess
//!   (under standard smoothness assumptions)
//! - **Local convergence**: Near the solution, behaves like Gauss-Newton, with quadratic
//!   convergence for zero-residual problems
//! - **Robust to poor initialization**: Adaptive damping prevents divergence
//! - **Handles ill-conditioning**: Large λ stabilizes a nearly singular Hessian
//!
//! ## When to Use
//!
//! Levenberg-Marquardt is the best general-purpose choice when:
//! - Initial parameter guess may be far from the optimum
//! - Problem conditioning is unknown
//! - Robustness is prioritized over raw speed
//! - You want reliable convergence across diverse problem types
//!
//! For problems with specific structure, consider:
//! - [`GaussNewton`](crate::GaussNewton) if well-conditioned with good initialization
//! - [`DogLeg`](crate::DogLeg) for explicit trust region control
//!
//! # Implementation Features
//!
//! - **Sparse matrix support**: Efficient handling of large-scale problems via the `faer` sparse library
//! - **Adaptive damping**: Nielsen's formula for smooth parameter adaptation
//! - **Robust linear solvers**: Cholesky (fast) or QR (stable) factorization
//! - **Jacobi scaling**: Optional diagonal preconditioning for mixed-scale problems
//! - **Covariance computation**: Optional uncertainty quantification after convergence
//! - **Manifold operations**: Native support for optimization on Lie groups (SE2, SE3, SO2, SO3)
//! - **Comprehensive diagnostics**: Detailed summaries of convergence and performance
//!
//! # Mathematical Background
//!
//! The augmented Hessian `J^T·J + λI` combines two beneficial properties:
//!
//! 1. **Positive definiteness**: For λ > 0 the system is always solvable, even when `J^T·J` is singular
//! 2. **Regularization**: Prevents taking steps in poorly-determined directions
//!
//! The trust region interpretation: λ controls an implicit spherical trust region, where
//! larger λ restricts the step size, ensuring the linear model remains valid.
//!
//! # Examples
//!
//! ## Basic usage
//!
//! ```no_run
//! use apex_solver::LevenbergMarquardt;
//! use apex_solver::core::problem::Problem;
//! use std::collections::HashMap;
//!
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let mut problem = Problem::new();
//! // ... add residual blocks (factors) to problem ...
//!
//! let initial_values = HashMap::new();
//! // ... initialize parameters ...
//!
//! let mut solver = LevenbergMarquardt::new();
//! let result = solver.optimize(&problem, &initial_values)?;
//!
//! # Ok(())
//! # }
//! ```
//!
//! ## Advanced configuration
//!
//! ```no_run
//! use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
//! use apex_solver::linalg::LinearSolverType;
//!
//! # fn main() {
//! let config = LevenbergMarquardtConfig::new()
//!     .with_max_iterations(100)
//!     .with_cost_tolerance(1e-6)
//!     .with_damping(1e-3)  // Initial damping
//!     .with_damping_bounds(1e-12, 1e12)  // Min/max damping
//!     .with_jacobi_scaling(true);  // Improve conditioning
//!
//! let mut solver = LevenbergMarquardt::with_config(config);
//! # }
//! ```
//!
//! # References
//!
//! - Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". *Quarterly of Applied Mathematics*.
//! - Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". *Journal of the Society for Industrial and Applied Mathematics*.
//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 3.
//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 10.
//! - Nielsen, H. B. (1999). "Damping Parameter in Marquardt's Method". Technical Report IMM-REP-1999-05.

use crate::core::problem::{Problem, SymbolicStructure, VariableEnum};
use crate::error;
use crate::linalg::{LinearSolverType, SparseCholeskySolver, SparseLinearSolver, SparseQRSolver};
use crate::manifold::ManifoldType;
use crate::optimizer::{
    ConvergenceInfo, OptObserverVec, OptimizationStatus, OptimizerError, Solver, SolverResult,
    apply_negative_parameter_step, apply_parameter_step, compute_cost,
};

use faer::{
    Mat,
    sparse::{SparseColMat, Triplet},
};
use nalgebra::DVector;
use std::collections::HashMap;
use std::{
    fmt,
    fmt::{Display, Formatter},
    time::{Duration, Instant},
};
use tracing::debug;

/// Summary statistics for the Levenberg-Marquardt optimization process.
#[derive(Debug, Clone)]
pub struct LevenbergMarquardtSummary {
    /// Initial cost value
    pub initial_cost: f64,
    /// Final cost value
    pub final_cost: f64,
    /// Ratio of actual to predicted reduction in cost
    pub rho: f64,
    /// Total number of iterations performed
    pub iterations: usize,
    /// Number of successful steps (cost decreased)
    pub successful_steps: usize,
    /// Number of unsuccessful steps (cost increased, damping increased)
    pub unsuccessful_steps: usize,
    /// Final damping parameter value
    pub final_damping: f64,
    /// Average cost reduction per iteration
    pub average_cost_reduction: f64,
    /// Maximum gradient norm encountered
    pub max_gradient_norm: f64,
    /// Final gradient norm
    pub final_gradient_norm: f64,
    /// Maximum parameter update norm
    pub max_parameter_update_norm: f64,
    /// Final parameter update norm
    pub final_parameter_update_norm: f64,
    /// Total time elapsed
    pub total_time: Duration,
    /// Average time per iteration
    pub average_time_per_iteration: Duration,
    /// Detailed per-iteration statistics history
    pub iteration_history: Vec<IterationStats>,
    /// Convergence status
    pub convergence_status: OptimizationStatus,
}

impl Display for LevenbergMarquardtSummary {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // Determine whether the run converged
        let converged = matches!(
            self.convergence_status,
            OptimizationStatus::Converged
                | OptimizationStatus::CostToleranceReached
                | OptimizationStatus::GradientToleranceReached
                | OptimizationStatus::ParameterToleranceReached
        );

        writeln!(f, "Levenberg-Marquardt Final Result")?;

        // Title with convergence status
        if converged {
            writeln!(f, "CONVERGED ({:?})", self.convergence_status)?;
        } else {
            writeln!(f, "DIVERGED ({:?})", self.convergence_status)?;
        }

        writeln!(f)?;
        writeln!(f, "Cost:")?;
        writeln!(f, "  Initial:   {:.6e}", self.initial_cost)?;
        writeln!(f, "  Final:     {:.6e}", self.final_cost)?;
        writeln!(
            f,
            "  Reduction: {:.6e} ({:.2}%)",
            self.initial_cost - self.final_cost,
            100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
        )?;
        writeln!(f)?;
        writeln!(f, "Iterations:")?;
        writeln!(f, "  Total:              {}", self.iterations)?;
        writeln!(
            f,
            "  Successful steps:   {} ({:.1}%)",
            self.successful_steps,
            100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(
            f,
            "  Unsuccessful steps: {} ({:.1}%)",
            self.unsuccessful_steps,
            100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(f)?;
        writeln!(f, "Gradient:")?;
        writeln!(f, "  Max norm:   {:.2e}", self.max_gradient_norm)?;
        writeln!(f, "  Final norm: {:.2e}", self.final_gradient_norm)?;
        writeln!(f)?;
        writeln!(f, "Parameter Update:")?;
        writeln!(f, "  Max norm:   {:.2e}", self.max_parameter_update_norm)?;
        writeln!(f, "  Final norm: {:.2e}", self.final_parameter_update_norm)?;
        writeln!(f)?;
        writeln!(f, "Performance:")?;
        writeln!(
            f,
            "  Total time:             {:.2}ms",
            self.total_time.as_secs_f64() * 1000.0
        )?;
        writeln!(
            f,
            "  Average per iteration:  {:.2}ms",
            self.average_time_per_iteration.as_secs_f64() * 1000.0
        )?;

        Ok(())
    }
}

/// Per-iteration statistics for detailed logging (Ceres-style output).
///
/// Captures all relevant metrics for each optimization iteration, enabling
/// detailed analysis and debugging of the optimization process.
#[derive(Debug, Clone)]
pub struct IterationStats {
    /// Iteration number (0-indexed)
    pub iteration: usize,
    /// Cost function value at this iteration
    pub cost: f64,
    /// Change in cost from previous iteration
    pub cost_change: f64,
    /// L2 norm of the gradient (||J^T·r||)
    pub gradient_norm: f64,
    /// L2 norm of the parameter update step (||Δx||)
    pub step_norm: f64,
    /// Trust region ratio (ρ = actual_reduction / predicted_reduction)
    pub tr_ratio: f64,
    /// Trust region radius (damping parameter λ)
    pub tr_radius: f64,
    /// Linear solver iterations (0 for direct solvers like Cholesky)
    pub ls_iter: usize,
    /// Time taken for this iteration in milliseconds
    pub iter_time_ms: f64,
    /// Total elapsed time since optimization started in milliseconds
    pub total_time_ms: f64,
    /// Whether the step was accepted (true) or rejected (false)
    pub accepted: bool,
}

impl IterationStats {
    /// Print table header in Ceres-style format
    pub fn print_header() {
        debug!(
            "{:>4}  {:>13}  {:>13}  {:>13}  {:>13}  {:>11}  {:>11}  {:>7}  {:>11}  {:>13}  {:>6}",
            "iter",
            "cost",
            "cost_change",
            "|gradient|",
            "|step|",
            "tr_ratio",
            "tr_radius",
            "ls_iter",
            "iter_time",
            "total_time",
            "status"
        );
    }

    /// Print single iteration line in Ceres-style format with scientific notation
    pub fn print_line(&self) {
        let status = if self.iteration == 0 {
            "-"
        } else if self.accepted {
            "✓"
        } else {
            "✗"
        };

        debug!(
            "{:>4}  {:>13.6e}  {:>13.2e}  {:>13.2e}  {:>13.2e}  {:>11.2e}  {:>11.2e}  {:>7}  {:>9.2}ms  {:>11.2}ms  {:>6}",
            self.iteration,
            self.cost,
            self.cost_change,
            self.gradient_norm,
            self.step_norm,
            self.tr_ratio,
            self.tr_radius,
            self.ls_iter,
            self.iter_time_ms,
            self.total_time_ms,
            status
        );
    }
}

/// Configuration parameters for the Levenberg-Marquardt optimizer.
///
/// Controls the adaptive damping strategy, convergence criteria, and numerical stability
/// enhancements for the Levenberg-Marquardt algorithm.
///
/// # Builder Pattern
///
/// All configuration options can be set using the builder pattern:
///
/// ```
/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
///
/// let config = LevenbergMarquardtConfig::new()
///     .with_max_iterations(100)
///     .with_damping(1e-3)
///     .with_damping_bounds(1e-12, 1e12)
///     .with_jacobi_scaling(true);
/// ```
///
/// # Damping Parameter Behavior
///
/// The damping parameter λ controls the trade-off between Gauss-Newton and gradient descent:
///
/// - **Initial damping** (`damping`): Starting value (default: 1e-4)
/// - **Damping bounds** (`damping_min`, `damping_max`): Valid range (default: 1e-12 to 1e12)
/// - **Adaptation**: Automatically adjusted based on step quality using Nielsen's formula
///
/// # Convergence Criteria
///
/// The optimizer terminates when ANY of the following conditions is met:
///
/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance`
/// - **Parameter tolerance**: `||step|| < parameter_tolerance`
/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
/// - **Maximum iterations**: `iteration >= max_iterations`
/// - **Timeout**: `elapsed_time >= timeout`
///
/// # See Also
///
/// - [`LevenbergMarquardt`] - The solver that uses this configuration
/// - [`GaussNewtonConfig`](crate::GaussNewtonConfig) - Undamped variant
/// - [`DogLegConfig`](crate::DogLegConfig) - Trust region alternative
#[derive(Clone)]
pub struct LevenbergMarquardtConfig {
    /// Type of linear solver for the linear systems
    pub linear_solver_type: LinearSolverType,
    /// Maximum number of iterations
    pub max_iterations: usize,
    /// Convergence tolerance for cost function
    pub cost_tolerance: f64,
    /// Convergence tolerance for parameter updates
    pub parameter_tolerance: f64,
    /// Convergence tolerance for gradient norm
    pub gradient_tolerance: f64,
    /// Timeout duration
    pub timeout: Option<Duration>,
    /// Initial damping parameter
    pub damping: f64,
    /// Minimum damping parameter
    pub damping_min: f64,
    /// Maximum damping parameter
    pub damping_max: f64,
    /// Damping increase factor (when step rejected)
    pub damping_increase_factor: f64,
    /// Damping decrease factor (when step accepted)
    pub damping_decrease_factor: f64,
    /// Damping nu parameter
    pub damping_nu: f64,
    /// Trust region radius
    pub trust_region_radius: f64,
    /// Minimum step quality for acceptance
    pub min_step_quality: f64,
    /// Good step quality threshold
    pub good_step_quality: f64,
    /// Minimum diagonal value for regularization
    pub min_diagonal: f64,
    /// Maximum diagonal value for regularization
    pub max_diagonal: f64,
    /// Minimum objective function cutoff (optional early termination)
    ///
    /// If set, optimization terminates when cost falls below this threshold.
    /// Useful for early stopping when a "good enough" solution is acceptable.
    ///
    /// Default: None (disabled)
    pub min_cost_threshold: Option<f64>,
    /// Minimum trust region radius before termination
    ///
    /// When the trust region radius falls below this value, the optimizer
    /// terminates, as this indicates the search has converged or the problem
    /// is ill-conditioned. Matches Ceres Solver's min_trust_region_radius.
    ///
    /// Default: 1e-32 (Ceres-compatible)
    pub min_trust_region_radius: f64,
    /// Maximum condition number for Jacobian matrix (optional check)
    ///
    /// If set, the optimizer checks if condition_number(J^T·J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing the condition number is expensive, so this is disabled by default.
    ///
    /// Default: None (disabled)
    pub max_condition_number: Option<f64>,
    /// Minimum relative cost decrease for step acceptance
    ///
    /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
    /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
    /// min_relative_decrease parameter.
    ///
    /// Default: 1e-3 (Ceres-compatible)
    pub min_relative_decrease: f64,
    /// Use Jacobi column scaling (preconditioning)
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// This can improve convergence for problems with mixed parameter scales
    /// (e.g., positions in meters + angles in radians) but adds ~5-10% overhead.
    ///
    /// Default: false (avoids the overhead on problems that are already well scaled)
    pub use_jacobi_scaling: bool,
    /// Compute per-variable covariance matrices (uncertainty estimation)
    ///
    /// When enabled, computes covariance by inverting the Hessian matrix after
    /// convergence. The full covariance matrix is extracted into per-variable
    /// blocks stored in both Variable structs and SolverResult.
    ///
    /// Default: false (to avoid performance overhead)
    pub compute_covariances: bool,
    // Note: Visualization is now handled via the observer pattern.
    // Use `solver.add_observer(RerunObserver::new(true)?)` to enable visualization.
    // This provides cleaner separation of concerns and allows multiple observers.
}

impl Default for LevenbergMarquardtConfig {
    fn default() -> Self {
        Self {
            linear_solver_type: LinearSolverType::default(),
            // Ceres Solver default: 50 (changed from 100 for compatibility)
            max_iterations: 50,
            // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
            cost_tolerance: 1e-6,
            // Ceres Solver default: 1e-8 (unchanged)
            parameter_tolerance: 1e-8,
            // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
            // Note: Typically should be 1e-4 * cost_tolerance per Ceres docs
            gradient_tolerance: 1e-10,
            timeout: None,
            damping: 1e-4,
            damping_min: 1e-12,
            damping_max: 1e12,
            damping_increase_factor: 10.0,
            damping_decrease_factor: 0.3,
            damping_nu: 2.0,
            trust_region_radius: 1e4,
            min_step_quality: 0.0,
            good_step_quality: 0.75,
            min_diagonal: 1e-6,
            max_diagonal: 1e32,
            // New Ceres-compatible parameters
            min_cost_threshold: None,
            min_trust_region_radius: 1e-32,
            max_condition_number: None,
            min_relative_decrease: 1e-3,
            // Existing parameters
            use_jacobi_scaling: false,
            compute_covariances: false,
        }
    }
}

impl LevenbergMarquardtConfig {
    /// Create a new Levenberg-Marquardt configuration with default values.
    pub fn new() -> Self {
        Self::default()
    }

    /// Set the linear solver type.
    pub fn with_linear_solver_type(mut self, linear_solver_type: LinearSolverType) -> Self {
        self.linear_solver_type = linear_solver_type;
        self
    }

    /// Set the maximum number of iterations.
    pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
        self.max_iterations = max_iterations;
        self
    }

    /// Set the cost tolerance.
    pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
        self.cost_tolerance = cost_tolerance;
        self
    }

    /// Set the parameter tolerance.
    pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
        self.parameter_tolerance = parameter_tolerance;
        self
    }

    /// Set the gradient tolerance.
    pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
        self.gradient_tolerance = gradient_tolerance;
        self
    }

    /// Set the timeout duration.
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }

    /// Set the initial damping parameter.
    pub fn with_damping(mut self, damping: f64) -> Self {
        self.damping = damping;
        self
    }

    /// Set the damping parameter bounds.
    pub fn with_damping_bounds(mut self, min: f64, max: f64) -> Self {
        self.damping_min = min;
        self.damping_max = max;
        self
    }

    /// Set the damping adjustment factors.
    pub fn with_damping_factors(mut self, increase: f64, decrease: f64) -> Self {
        self.damping_increase_factor = increase;
        self.damping_decrease_factor = decrease;
        self
    }

    /// Set the trust region parameters.
    pub fn with_trust_region(mut self, radius: f64, min_quality: f64, good_quality: f64) -> Self {
        self.trust_region_radius = radius;
        self.min_step_quality = min_quality;
        self.good_step_quality = good_quality;
        self
    }

    /// Set minimum objective function cutoff for early termination.
    ///
    /// When set, optimization terminates with MinCostThresholdReached status
    /// if the cost falls below this threshold. Useful for early stopping when
    /// a "good enough" solution is acceptable.
    pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
        self.min_cost_threshold = Some(min_cost);
        self
    }

    /// Set minimum trust region radius before termination.
    ///
    /// When the trust region radius falls below this value, optimization
    /// terminates with TrustRegionRadiusTooSmall status.
    /// Default: 1e-32 (Ceres-compatible)
    pub fn with_min_trust_region_radius(mut self, min_radius: f64) -> Self {
        self.min_trust_region_radius = min_radius;
        self
    }

    /// Set maximum condition number for Jacobian matrix.
    ///
    /// If set, the optimizer checks if condition_number(J^T·J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing the condition number is expensive, so this is disabled by default.
    pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
        self.max_condition_number = Some(max_cond);
        self
    }

    /// Set minimum relative cost decrease for step acceptance.
    ///
    /// Steps with rho = (actual_reduction / predicted_reduction) below this
    /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
    pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
        self.min_relative_decrease = min_decrease;
        self
    }

    /// Enable or disable Jacobi column scaling (preconditioning).
    ///
    /// When enabled, normalizes Jacobian columns by their L2 norm before solving.
    /// Can improve convergence for mixed-scale problems but adds ~5-10% overhead.
    pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
        self.use_jacobi_scaling = use_jacobi_scaling;
        self
    }

    /// Enable or disable covariance computation (uncertainty estimation).
    ///
    /// When enabled, computes the full covariance matrix by inverting the Hessian
    /// after convergence, then extracts per-variable covariance blocks.
    pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
        self.compute_covariances = compute_covariances;
        self
    }

    // Note: The former `with_visualization()` method has been removed.
    // Real-time visualization is now handled via the observer pattern, which
    // provides cleaner separation of concerns and allows multiple observers:
    //   let mut solver = LevenbergMarquardt::with_config(config);
    //   solver.add_observer(RerunObserver::new(true)?);
    // The Rerun observer logs time series of cost, gradient norm, damping, and
    // step quality, plus sparse Hessian and gradient visualizations and live
    // manifold state updates (for SE2/SE3 problems). It requires the
    // `visualization` feature to be enabled in `Cargo.toml`.

    /// Print configuration parameters (verbose mode only).
    pub fn print_configuration(&self) {
        debug!(
            "Configuration:\n  Solver:        Levenberg-Marquardt\n  Linear solver: {:?}\n  Convergence Criteria:\n  Max iterations:      {}\n  Cost tolerance:      {:.2e}\n  Parameter tolerance: {:.2e}\n  Gradient tolerance:  {:.2e}\n  Timeout:             {:?}\n  Damping Parameters:\n  Initial damping:     {:.2e}\n  Damping range:       [{:.2e}, {:.2e}]\n  Increase factor:     {:.2}\n  Decrease factor:     {:.2}\n  Trust Region:\n  Initial radius:      {:.2e}\n  Min step quality:    {:.2}\n  Good step quality:   {:.2}\n  Numerical Settings:\n  Jacobi scaling:      {}\n  Compute covariances: {}",
            self.linear_solver_type,
            self.max_iterations,
            self.cost_tolerance,
            self.parameter_tolerance,
            self.gradient_tolerance,
            self.timeout,
            self.damping,
            self.damping_min,
            self.damping_max,
            self.damping_increase_factor,
            self.damping_decrease_factor,
            self.trust_region_radius,
            self.min_step_quality,
            self.good_step_quality,
            if self.use_jacobi_scaling {
                "enabled"
            } else {
                "disabled"
            },
            if self.compute_covariances {
                "enabled"
            } else {
                "disabled"
            }
        );
    }
}

/// State for an optimization iteration
struct LinearizerResult {
    variables: HashMap<String, VariableEnum>,
    variable_index_map: HashMap<String, usize>,
    sorted_vars: Vec<String>,
    symbolic_structure: SymbolicStructure,
    current_cost: f64,
    initial_cost: f64,
}

/// Result from step computation
struct StepResult {
    step: Mat<f64>,
    gradient_norm: f64,
    predicted_reduction: f64,
}

/// Result from step evaluation
struct StepEvaluation {
    accepted: bool,
    cost_reduction: f64,
    rho: f64,
}

/// Levenberg-Marquardt solver for nonlinear least squares optimization.
///
/// Implements the damped Gauss-Newton method with an adaptive damping parameter λ that
/// interpolates between Gauss-Newton and gradient descent based on step quality.
///
/// # Algorithm
///
/// At each iteration k:
/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
/// 2. Solve the augmented system: `(J^T·J + λI)·h = -J^T·r`
/// 3. Evaluate step quality: `ρ = (actual reduction) / (predicted reduction)`
/// 4. If `ρ > 0`: Accept step, update `xₖ₊₁ = xₖ ⊕ h`, decrease λ
/// 5. If `ρ ≤ 0`: Reject step (keep `xₖ₊₁ = xₖ`), increase λ
/// 6. Check convergence criteria
///
/// The damping parameter λ is updated using Nielsen's smooth formula:
/// `λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)` for accepted steps,
/// or `λₖ₊₁ = λₖ · ν` (with increasing ν) for rejected steps.
///
/// # Examples
///
/// ```no_run
/// use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
/// use apex_solver::linalg::LinearSolverType;
///
/// # fn main() {
/// let config = LevenbergMarquardtConfig::new()
///     .with_max_iterations(100)
///     .with_damping(1e-3)
///     .with_damping_bounds(1e-12, 1e12)
///     .with_jacobi_scaling(true);
///
/// let mut solver = LevenbergMarquardt::with_config(config);
/// # }
/// ```
///
/// # See Also
///
/// - [`LevenbergMarquardtConfig`] - Configuration options
/// - [`GaussNewton`](crate::GaussNewton) - Undamped variant (faster but less robust)
/// - [`DogLeg`](crate::DogLeg) - Alternative trust region method
pub struct LevenbergMarquardt {
    config: LevenbergMarquardtConfig,
    jacobi_scaling: Option<SparseColMat<usize, f64>>,
    observers: OptObserverVec,
}

impl Default for LevenbergMarquardt {
    fn default() -> Self {
        Self::new()
    }
}

impl LevenbergMarquardt {
    /// Create a new Levenberg-Marquardt solver with default configuration.
    pub fn new() -> Self {
        Self::with_config(LevenbergMarquardtConfig::default())
    }

    /// Create a new Levenberg-Marquardt solver with the given configuration.
    pub fn with_config(config: LevenbergMarquardtConfig) -> Self {
        Self {
            config,
            jacobi_scaling: None,
            observers: OptObserverVec::new(),
        }
    }

    /// Add an observer to monitor optimization progress.
    ///
    /// Observers are notified at each iteration with the current variable values.
    /// This enables real-time visualization, logging, metrics collection, etc.
    ///
    /// # Arguments
    ///
    /// * `observer` - Any type implementing `OptObserver`
    ///
    /// # Examples
    ///
    /// ```no_run
    /// use apex_solver::{LevenbergMarquardt, LevenbergMarquardtConfig};
    /// # use apex_solver::core::problem::Problem;
    /// # use std::collections::HashMap;
    ///
    /// # fn main() -> Result<(), Box<dyn std::error::Error>> {
    /// let mut solver = LevenbergMarquardt::new();
    ///
    /// #[cfg(feature = "visualization")]
    /// {
    ///     use apex_solver::observers::RerunObserver;
    ///     let rerun_observer = RerunObserver::new(true)?;
    ///     solver.add_observer(rerun_observer);
    /// }
    ///
    /// // ... optimize ...
    /// # Ok(())
    /// # }
    /// ```
    pub fn add_observer(&mut self, observer: impl crate::optimizer::OptObserver + 'static) {
        self.observers.add(observer);
    }

    /// Create the appropriate linear solver based on configuration.
    fn create_linear_solver(&self) -> Box<dyn SparseLinearSolver> {
        match self.config.linear_solver_type {
            LinearSolverType::SparseCholesky => Box::new(SparseCholeskySolver::new()),
            LinearSolverType::SparseQR => Box::new(SparseQRSolver::new()),
        }
    }

    /// Update the damping parameter based on step quality using a trust-region approach.
    /// Reference: "Introduction to Optimization and Data Fitting", Algorithm 6.18.
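    ///
    /// For accepted steps this applies Nielsen's factor to λ, for example:
    ///
    /// ```text
    /// ρ = 0.5  →  λ ← λ · max(1/3, 1 - 0³)    = λ · 1      (keep λ)
    /// ρ = 0.9  →  λ ← λ · max(1/3, 1 - 0.8³)  = λ · 0.488  (shrink λ)
    /// ρ = 1.0  →  λ ← λ · max(1/3, 1 - 1³)    = λ · 1/3    (shrink λ most)
    /// ```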
    fn update_damping(&mut self, rho: f64) -> bool {
        if rho > 0.0 {
            // Step accepted - decrease damping
            let coeff = 2.0 * rho - 1.0;
            self.config.damping *= (1.0_f64 / 3.0).max(1.0 - coeff * coeff * coeff);
            self.config.damping = self.config.damping.max(self.config.damping_min);
            self.config.damping_nu = 2.0;
            true
        } else {
            // Step rejected - increase damping
            self.config.damping *= self.config.damping_nu;
            self.config.damping_nu *= 2.0;
            self.config.damping = self.config.damping.min(self.config.damping_max);
            false
        }
    }

    /// Compute the step quality ratio ρ (actual vs. predicted cost reduction).
    /// References: "Introduction to Optimization and Data Fitting";
    /// "Damping Parameter in Marquardt's Method", Formula 2.2.
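    ///
    /// For example:
    ///
    /// ```text
    /// current_cost = 10.0, new_cost = 8.2, predicted_reduction = 2.0
    /// ρ = (10.0 - 8.2) / 2.0 = 0.9
    /// ```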
    fn compute_step_quality(
        &self,
        current_cost: f64,
        new_cost: f64,
        predicted_reduction: f64,
    ) -> f64 {
        let actual_reduction = current_cost - new_cost;
        if predicted_reduction.abs() < 1e-15 {
            if actual_reduction > 0.0 { 1.0 } else { 0.0 }
        } else {
            actual_reduction / predicted_reduction
        }
    }

    /// Compute the predicted cost reduction from the local quadratic model.
    /// Standard LM formula: 0.5 * step^T * (damping * step - gradient)
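    ///
    /// A short derivation (assuming `step` solves the damped normal equations
    /// `(JᵀJ + λI)·h = -g`):
    ///
    /// ```text
    /// predicted = -hᵀ·g - ½·hᵀ·(JᵀJ)·h      (decrease of the undamped model)
    ///           = -hᵀ·g + ½·hᵀ·(g + λh)     (since JᵀJ·h = -g - λh)
    ///           = ½·hᵀ·(λh - g)
    /// ```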
    fn compute_predicted_reduction(&self, step: &Mat<f64>, gradient: &Mat<f64>) -> f64 {
        // Standard Levenberg-Marquardt predicted reduction formula
        // predicted_reduction = -step^T * gradient - 0.5 * step^T * (J^T·J) * step
        //                     = 0.5 * step^T * (damping * step - gradient)
        let diff = self.config.damping * step - gradient;
        (0.5 * step.transpose() * &diff)[(0, 0)]
    }

    /// Check convergence using comprehensive termination criteria.
    ///
    /// Implements 9 termination criteria following Ceres Solver standards:
    ///
    /// 1. **Gradient Norm (First-Order Optimality)**: ||g||∞ ≤ gradient_tolerance
    /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
    /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
    /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
    /// 5. **Trust Region Radius**: radius < min_trust_region_radius
    /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
    /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
    /// 8. **Maximum Iterations**: iteration >= max_iterations
    /// 9. **Timeout**: elapsed >= timeout
    ///
    /// # Arguments
    ///
    /// * `iteration` - Current iteration number
    /// * `current_cost` - Cost before applying the step
    /// * `new_cost` - Cost after applying the step
    /// * `parameter_norm` - L2 norm of current parameter vector ||x||
    /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
    /// * `gradient_norm` - Infinity norm of gradient ||g||∞
    /// * `trust_region_radius` - Current trust region radius
    /// * `elapsed` - Elapsed time since optimization start
    /// * `step_accepted` - Whether the current step was accepted
    ///
    /// # Returns
    ///
    /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
    #[allow(clippy::too_many_arguments)]
    fn check_convergence(
        &self,
        iteration: usize,
        current_cost: f64,
        new_cost: f64,
        parameter_norm: f64,
        parameter_update_norm: f64,
        gradient_norm: f64,
        trust_region_radius: f64,
        elapsed: Duration,
        step_accepted: bool,
    ) -> Option<OptimizationStatus> {
        // CRITICAL SAFETY CHECKS (perform first, before convergence checks)

        // CRITERION 7: Invalid Numerical Values (NaN/Inf)
        // Always check for numerical instability first
        if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
        {
            return Some(OptimizationStatus::InvalidNumericalValues);
        }

        // CRITERION 9: Timeout
        // Check wall-clock time limit
        if let Some(timeout) = self.config.timeout
            && elapsed >= timeout
        {
            return Some(OptimizationStatus::Timeout);
        }

        // CRITERION 8: Maximum Iterations
        // Check iteration count limit
        if iteration >= self.config.max_iterations {
            return Some(OptimizationStatus::MaxIterationsReached);
        }

        // CONVERGENCE CRITERIA (only check after successful steps)

        // Only check convergence criteria after accepted steps
        // (rejected steps don't indicate convergence)
        if !step_accepted {
            return None;
        }

        // CRITERION 1: Gradient Norm (First-Order Optimality)
        // Check if the gradient infinity norm is below threshold.
        // This indicates we're at a critical point (local minimum, saddle, or maximum)
        if gradient_norm < self.config.gradient_tolerance {
            return Some(OptimizationStatus::GradientToleranceReached);
        }

        // Only check parameter and cost criteria after the first iteration
        if iteration > 0 {
            // CRITERION 2: Parameter Change Tolerance (xtol)
            // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
            // This is a relative measure that scales with parameter magnitude
            let relative_step_tolerance = self.config.parameter_tolerance
                * (parameter_norm + self.config.parameter_tolerance);

            if parameter_update_norm <= relative_step_tolerance {
                return Some(OptimizationStatus::ParameterToleranceReached);
            }

            // CRITERION 3: Function Value Change Tolerance (ftol)
            // Ceres formula: |ΔF| < ε_cost * F
            // Check relative cost change (not absolute)
            let cost_change = (current_cost - new_cost).abs();
            let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero

            if relative_cost_change < self.config.cost_tolerance {
                return Some(OptimizationStatus::CostToleranceReached);
            }
        }

        // CRITERION 4: Objective Function Cutoff (optional early stopping)
        // Useful for "good enough" solutions
        if let Some(min_cost) = self.config.min_cost_threshold
            && new_cost < min_cost
        {
            return Some(OptimizationStatus::MinCostThresholdReached);
        }

        // CRITERION 5: Trust Region Radius
        // If the trust region has collapsed, optimization has converged or the problem is ill-conditioned
        if trust_region_radius < self.config.min_trust_region_radius {
            return Some(OptimizationStatus::TrustRegionRadiusTooSmall);
        }

        // CRITERION 6: Singular/Ill-Conditioned Jacobian
        // Note: this is typically detected during the linear solve and handled there;
        // the max_condition_number check would be expensive to compute here.
        // If the linear solve fails, it returns an error that is converted to NumericalFailure.

        // No termination criterion satisfied
        None
    }

    /// Compute total parameter vector norm ||x||.
    ///
    /// Computes the L2 norm of all parameter vectors concatenated together.
    /// This is used in the relative parameter tolerance check.
    ///
    /// # Arguments
    ///
    /// * `variables` - Map of variable names to their current values
    ///
    /// # Returns
    ///
    /// The L2 norm of the concatenated parameter vector
    fn compute_parameter_norm(variables: &HashMap<String, VariableEnum>) -> f64 {
        variables
            .values()
            .map(|v| {
                let vec = v.to_vector();
                vec.norm_squared()
            })
            .sum::<f64>()
            .sqrt()
    }

    /// Create the diagonal Jacobi scaling matrix from the Jacobian's column norms.
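    ///
    /// Each diagonal entry is `1 / (1 + ||J_col||)`, for example:
    ///
    /// ```text
    /// ||J_col|| = 0.0   →  scale = 1.0    (zero column stays bounded)
    /// ||J_col|| = 3.0   →  scale = 0.25
    /// ||J_col|| = 99.0  →  scale = 0.01
    /// ```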
    fn create_jacobi_scaling(
        &self,
        jacobian: &SparseColMat<usize, f64>,
    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
        let cols = jacobian.ncols();
        let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
            .map(|c| {
                // Compute column norm: sqrt(sum(J_col^2))
                let col_norm_squared: f64 = jacobian
                    .triplet_iter()
                    .filter(|t| t.col == c)
                    .map(|t| t.val * t.val)
                    .sum();
                let col_norm = col_norm_squared.sqrt();
                // Scaling factor: 1.0 / (1.0 + col_norm)
                let scaling = 1.0 / (1.0 + col_norm);
                Triplet::new(c, c, scaling)
            })
            .collect();

        SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec)
            .map_err(|e| OptimizerError::JacobiScalingCreation(e.to_string()).log_with_source(e))
    }

    /// Initialize optimization state from problem and initial parameters
    fn initialize_optimization_state(
        &self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<LinearizerResult, error::ApexSolverError> {
        // Initialize variables from initial values
        let variables = problem.initialize_variables(initial_params);

        // Create column mapping for variables
        let mut variable_index_map = HashMap::new();
        let mut col_offset = 0;
        let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
        sorted_vars.sort();

        for var_name in &sorted_vars {
            variable_index_map.insert(var_name.clone(), col_offset);
            col_offset += variables[var_name].get_size();
        }

        // Build symbolic structure for sparse operations
        let symbolic_structure =
            problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;

        // Initial cost evaluation (residual only, no Jacobian needed)
        let residual = problem.compute_residual_sparse(&variables)?;
        let current_cost = compute_cost(&residual);
        let initial_cost = current_cost;

        Ok(LinearizerResult {
            variables,
            variable_index_map,
            sorted_vars,
            symbolic_structure,
            current_cost,
            initial_cost,
        })
    }

    /// Process Jacobian by creating and applying Jacobi scaling if enabled
    fn process_jacobian(
        &mut self,
        jacobian: &SparseColMat<usize, f64>,
        iteration: usize,
    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
        // Create Jacobi scaling on first iteration if enabled
        if iteration == 0 {
            let scaling = self.create_jacobi_scaling(jacobian)?;
            self.jacobi_scaling = Some(scaling);
        }
        let scaling = self
            .jacobi_scaling
            .as_ref()
            .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
        Ok(jacobian * scaling)
    }

    /// Compute optimization step by solving the augmented system
    fn compute_levenberg_marquardt_step(
        &self,
        residuals: &Mat<f64>,
        scaled_jacobian: &SparseColMat<usize, f64>,
        linear_solver: &mut Box<dyn SparseLinearSolver>,
    ) -> Result<StepResult, OptimizerError> {
        // Solve augmented equation: (J_scaled^T * J_scaled + λI) * dx_scaled = -J_scaled^T * r
        let residuals_owned = residuals.as_ref().to_owned();
        let scaled_step = linear_solver
            .solve_augmented_equation(&residuals_owned, scaled_jacobian, self.config.damping)
            .map_err(|e| OptimizerError::LinearSolveFailed(e.to_string()).log_with_source(e))?;

        // Get cached gradient and Hessian from the solver
        let gradient = linear_solver.get_gradient().ok_or_else(|| {
            OptimizerError::NumericalInstability("Gradient not available".into()).log()
        })?;
        let _hessian = linear_solver.get_hessian().ok_or_else(|| {
            OptimizerError::NumericalInstability("Hessian not available".into()).log()
        })?;
        let gradient_norm = gradient.norm_l2();

        // Apply inverse Jacobi scaling to get final step (if enabled)
        let step = if self.config.use_jacobi_scaling {
            let scaling = self
                .jacobi_scaling
                .as_ref()
                .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
            &scaled_step * scaling
        } else {
            scaled_step
        };

        // Compute predicted reduction using scaled values
        let predicted_reduction = self.compute_predicted_reduction(&step, gradient);

        Ok(StepResult {
            step,
            gradient_norm,
            predicted_reduction,
        })
    }

    /// Evaluate and apply step, handling acceptance/rejection based on step quality
    fn evaluate_and_apply_step(
        &mut self,
        step_result: &StepResult,
        state: &mut LinearizerResult,
        problem: &Problem,
    ) -> error::ApexSolverResult<StepEvaluation> {
        // Apply parameter updates using manifold operations
        let _step_norm = apply_parameter_step(
            &mut state.variables,
            step_result.step.as_ref(),
            &state.sorted_vars,
        );

        // Compute new cost (residual only, no Jacobian needed for step evaluation)
        let new_residual = problem.compute_residual_sparse(&state.variables)?;
        let new_cost = compute_cost(&new_residual);

        // Compute step quality
        let rho = self.compute_step_quality(
            state.current_cost,
            new_cost,
            step_result.predicted_reduction,
        );

        // Update damping and decide whether to accept step
        let accepted = self.update_damping(rho);

        let cost_reduction = if accepted {
            // Accept the step - parameters already updated
            let reduction = state.current_cost - new_cost;
            state.current_cost = new_cost;
            reduction
        } else {
            // Reject the step - revert parameter changes
            apply_negative_parameter_step(
                &mut state.variables,
                step_result.step.as_ref(),
                &state.sorted_vars,
            );
            0.0
        };

        Ok(StepEvaluation {
            accepted,
            cost_reduction,
            rho,
        })
    }

    /// Create optimization summary
    #[allow(clippy::too_many_arguments)]
    fn create_summary(
        &self,
        initial_cost: f64,
        final_cost: f64,
        rho: f64,
        iterations: usize,
        successful_steps: usize,
        unsuccessful_steps: usize,
        max_gradient_norm: f64,
        final_gradient_norm: f64,
        max_parameter_update_norm: f64,
        final_parameter_update_norm: f64,
        total_cost_reduction: f64,
        total_time: Duration,
        iteration_history: Vec<IterationStats>,
        convergence_status: OptimizationStatus,
    ) -> LevenbergMarquardtSummary {
        LevenbergMarquardtSummary {
            initial_cost,
            final_cost,
            rho,
            iterations,
            successful_steps,
            unsuccessful_steps,
            final_damping: self.config.damping,
            average_cost_reduction: if iterations > 0 {
                total_cost_reduction / iterations as f64
            } else {
                0.0
            },
            max_gradient_norm,
            final_gradient_norm,
            max_parameter_update_norm,
            final_parameter_update_norm,
            total_time,
            average_time_per_iteration: if iterations > 0 {
                total_time / iterations as u32
            } else {
                Duration::from_secs(0)
            },
            iteration_history,
            convergence_status,
        }
    }

    pub fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, error::ApexSolverError> {
        let start_time = Instant::now();
        let mut iteration = 0;
        let mut cost_evaluations = 1; // Initial cost evaluation
        let mut jacobian_evaluations = 0;
        let mut successful_steps = 0;
        let mut unsuccessful_steps = 0;

        // Initialize optimization state
        let mut state = self.initialize_optimization_state(problem, initial_params)?;

        // Create linear solver
        let mut linear_solver = self.create_linear_solver();

        // Initialize summary tracking variables
        let mut max_gradient_norm: f64 = 0.0;
        let mut max_parameter_update_norm: f64 = 0.0;
        let mut total_cost_reduction = 0.0;
        let mut final_gradient_norm;
        let mut final_parameter_update_norm;

        // Initialize iteration statistics tracking
        let mut iteration_stats = Vec::with_capacity(self.config.max_iterations);
        let mut previous_cost = state.current_cost;

        // Print configuration and header if debug level is enabled
        if tracing::enabled!(tracing::Level::DEBUG) {
            self.config.print_configuration();
            IterationStats::print_header();
        }

        // Main optimization loop
        loop {
            let iter_start = Instant::now();
            // Evaluate residuals and Jacobian
            let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
                &state.variables,
                &state.variable_index_map,
                &state.symbolic_structure,
            )?;
            jacobian_evaluations += 1;

            // Process Jacobian (apply scaling if enabled)
            let scaled_jacobian = if self.config.use_jacobi_scaling {
                self.process_jacobian(&jacobian, iteration)?
            } else {
                jacobian
            };

            // Compute optimization step
            let step_result = self.compute_levenberg_marquardt_step(
                &residuals,
                &scaled_jacobian,
                &mut linear_solver,
            )?;

            // Update tracking variables
            max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
            final_gradient_norm = step_result.gradient_norm;
            let step_norm = step_result.step.norm_l2();
            max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
            final_parameter_update_norm = step_norm;

            // Evaluate and apply step (handles accept/reject)
            let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
            cost_evaluations += 1;

            // Update counters based on acceptance
            if step_eval.accepted {
                successful_steps += 1;
                total_cost_reduction += step_eval.cost_reduction;
            } else {
                unsuccessful_steps += 1;
            }

            // OPTIMIZATION: Only collect iteration statistics if debug level is enabled
            // This eliminates ~2-5ms overhead per iteration for non-debug optimization
            if tracing::enabled!(tracing::Level::DEBUG) {
                let iter_elapsed_ms = iter_start.elapsed().as_secs_f64() * 1000.0;
                let total_elapsed_ms = start_time.elapsed().as_secs_f64() * 1000.0;

                let stats = IterationStats {
                    iteration,
                    cost: state.current_cost,
                    cost_change: previous_cost - state.current_cost,
                    gradient_norm: step_result.gradient_norm,
                    step_norm,
                    tr_ratio: step_eval.rho,
                    tr_radius: self.config.damping,
                    ls_iter: 0, // Direct solver (Cholesky) has no iterations
                    iter_time_ms: iter_elapsed_ms,
                    total_time_ms: total_elapsed_ms,
                    accepted: step_eval.accepted,
                };

                iteration_stats.push(stats.clone());
                stats.print_line();
            }

            previous_cost = state.current_cost;

            // Notify all observers with current state
            // First set metrics data, then notify observers
            self.observers.set_iteration_metrics(
                state.current_cost,
                step_result.gradient_norm,
                Some(self.config.damping),
                step_norm,
                Some(step_eval.rho),
            );

            // Set matrix data if available and there are observers
            if !self.observers.is_empty()
                && let (Some(hessian), Some(gradient)) =
                    (linear_solver.get_hessian(), linear_solver.get_gradient())
            {
                self.observers
                    .set_matrix_data(Some(hessian.clone()), Some(gradient.clone()));
            }

            // Notify observers with current variable values and iteration number
            self.observers.notify(&state.variables, iteration);

            // Check convergence
            let elapsed = start_time.elapsed();

            // Compute parameter norm for relative parameter tolerance check
            let parameter_norm = Self::compute_parameter_norm(&state.variables);

            // Cost after this step: state.current_cost already holds the new cost if the
            // step was accepted, and the unchanged cost if it was rejected
            let new_cost = state.current_cost;

            // Cost before this step (add back the reduction if the step was accepted)
            let cost_before_step = if step_eval.accepted {
                state.current_cost + step_eval.cost_reduction
            } else {
                state.current_cost
            };

            if let Some(status) = self.check_convergence(
                iteration,
                cost_before_step,
                new_cost,
                parameter_norm,
                step_norm,
                step_result.gradient_norm,
                self.config.trust_region_radius,
                elapsed,
                step_eval.accepted,
            ) {
                let summary = self.create_summary(
                    state.initial_cost,
                    state.current_cost,
                    step_eval.rho,
                    iteration + 1,
                    successful_steps,
                    unsuccessful_steps,
                    max_gradient_norm,
                    final_gradient_norm,
                    max_parameter_update_norm,
                    final_parameter_update_norm,
                    total_cost_reduction,
                    elapsed,
                    iteration_stats.clone(),
                    status.clone(),
                );

                // Print summary only if debug level is enabled
                if tracing::enabled!(tracing::Level::DEBUG) {
                    debug!("{}", summary);
                }

                // Compute covariances if enabled
                let covariances = if self.config.compute_covariances {
                    problem.compute_and_set_covariances(
                        &mut linear_solver,
                        &mut state.variables,
                        &state.variable_index_map,
                    )
                } else {
                    None
                };

                return Ok(SolverResult {
                    status,
                    iterations: iteration + 1,
                    initial_cost: state.initial_cost,
                    final_cost: state.current_cost,
                    parameters: state.variables.into_iter().collect(),
                    elapsed_time: elapsed,
                    convergence_info: Some(ConvergenceInfo {
                        final_gradient_norm,
                        final_parameter_update_norm,
                        cost_evaluations,
                        jacobian_evaluations,
                    }),
                    covariances,
                });
            }

            // Note: Max iterations and timeout checks are handled inside check_convergence()

            iteration += 1;
        }
    }
}
// Implement Solver trait
impl Solver for LevenbergMarquardt {
    type Config = LevenbergMarquardtConfig;
    type Error = error::ApexSolverError;

    fn new() -> Self {
        Self::default()
    }

    fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, Self::Error> {
        self.optimize(problem, initial_params)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::factors::Factor;
    use nalgebra::{DMatrix, dvector};

    /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor1;

    impl Factor for RosenbrockFactor1 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];
            let x2 = params[1][0];

            // Residual: r1 = 10(x2 - x1²)
            let residual = dvector![10.0 * (x2 - x1 * x1)];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r1/∂x1 = -20*x1, ∂r1/∂x2 = 10
                let mut jacobian = DMatrix::zeros(1, 2);
                jacobian[(0, 0)] = -20.0 * x1;
                jacobian[(0, 1)] = 10.0;

                Some(jacobian)
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    /// Custom Rosenbrock Factor 2: r2 = 1 - x1
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor2;

    impl Factor for RosenbrockFactor2 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];

            // Residual: r2 = 1 - x1
            let residual = dvector![1.0 - x1];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r2/∂x1 = -1
                Some(DMatrix::from_element(1, 1, -1.0))
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    #[test]
    fn test_rosenbrock_optimization() -> Result<(), Box<dyn std::error::Error>> {
        // Rosenbrock function test:
        // Minimize: r1² + r2² where
        //   r1 = 10(x2 - x1²)
        //   r2 = 1 - x1
        // Starting point: [-1.2, 1.0]
        // Expected minimum: [1.0, 1.0]

        let mut problem = Problem::new();
        let mut initial_values = HashMap::new();

        // Add variables using Rn manifold (Euclidean space)
        initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
        initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));

        // Add custom factors (demonstrates extensibility!)
        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);

        // Configure Levenberg-Marquardt optimizer
        let config = LevenbergMarquardtConfig::new()
            .with_max_iterations(100)
            .with_cost_tolerance(1e-8)
            .with_parameter_tolerance(1e-8)
            .with_gradient_tolerance(1e-10);

        let mut solver = LevenbergMarquardt::with_config(config);
        let result = solver.optimize(&problem, &initial_values)?;

        // Extract final values
        let x1_final = result
            .parameters
            .get("x1")
            .ok_or("x1 not found")?
            .to_vector()[0];
        let x2_final = result
            .parameters
            .get("x2")
            .ok_or("x2 not found")?
            .to_vector()[0];

        // Verify convergence to [1.0, 1.0]
        assert!(
            matches!(
                result.status,
                OptimizationStatus::Converged
                    | OptimizationStatus::CostToleranceReached
                    | OptimizationStatus::ParameterToleranceReached
                    | OptimizationStatus::GradientToleranceReached
            ),
            "Optimization should converge"
        );
        assert!(
            (x1_final - 1.0).abs() < 1e-4,
            "x1 should converge to 1.0, got {}",
            x1_final
        );
        assert!(
            (x2_final - 1.0).abs() < 1e-4,
            "x2 should converge to 1.0, got {}",
            x2_final
        );
        assert!(
            result.final_cost < 1e-6,
            "Final cost should be near zero, got {}",
            result.final_cost
        );
        Ok(())
    }
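
    // A standalone sketch of Nielsen's damping-update factor used by `update_damping`,
    // checked against hand-computed values. The local `nielsen_factor` helper and its
    // sample ρ values are illustrative, not part of the solver's API.
    #[test]
    fn test_nielsen_damping_factor() {
        // factor(ρ) = max(1/3, 1 - (2ρ - 1)³), applied to λ after an accepted step
        fn nielsen_factor(rho: f64) -> f64 {
            let coeff = 2.0 * rho - 1.0;
            (1.0_f64 / 3.0).max(1.0 - coeff * coeff * coeff)
        }

        // ρ = 0.5: actual reduction is half the prediction, λ is left unchanged
        assert!((nielsen_factor(0.5) - 1.0).abs() < 1e-12);
        // ρ = 1.0: perfect model agreement, λ shrinks by the maximum factor 1/3
        assert!((nielsen_factor(1.0) - 1.0 / 3.0).abs() < 1e-12);
        // ρ = 0.9: good agreement, factor = 1 - 0.8³ = 0.488
        assert!((nielsen_factor(0.9) - 0.488).abs() < 1e-12);
        // The factor is clamped below at 1/3 for any ρ
        assert!(nielsen_factor(10.0) >= 1.0 / 3.0);
    }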
}