apex_solver/optimizer/levenberg_marquardt.rs

1//! Levenberg-Marquardt algorithm implementation.
2//!
3//! The Levenberg-Marquardt (LM) method is a robust and widely-used algorithm for solving
4//! nonlinear least squares problems of the form:
5//!
6//! ```text
7//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
8//! ```
9//!
10//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
11//!
12//! # Algorithm Overview
13//!
14//! The Levenberg-Marquardt method solves the damped normal equations at each iteration:
15//!
16//! ```text
17//! (J^T·J + λI)·h = -J^T·r
18//! ```
19//!
20//! where:
21//! - `J` is the Jacobian matrix (m × n)
22//! - `r` is the residual vector (m × 1)
23//! - `h` is the step vector (n × 1)
24//! - `λ` is the adaptive damping parameter (scalar)
25//! - `I` is the identity matrix (or diagonal scaling matrix)
26//!
27//! ## Damping Parameter Strategy
28//!
29//! The damping parameter λ adapts based on step quality:
30//!
31//! - **λ → 0** (small damping): Behaves like Gauss-Newton with fast quadratic convergence
32//! - **λ → ∞** (large damping): Behaves like gradient descent with guaranteed descent direction
33//!
34//! This interpolation between Newton and gradient descent provides excellent robustness
35//! while maintaining fast convergence near the solution.
36//!
37//! ## Step Acceptance and Damping Update
38//!
39//! The algorithm evaluates each proposed step using the gain ratio:
40//!
41//! ```text
42//! ρ = (actual reduction) / (predicted reduction)
43//!   = [f(xₖ) - f(xₖ + h)] / [f(xₖ) - L(h)]
44//! ```
45//!
46//! where `L(h) = f(xₖ) + h^T·g + ½h^T·H·h` is the local quadratic model.
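//!
//! Since `g = J^T·r` and `H ≈ J^T·J`, the predicted reduction expands to:
//!
//! ```text
//! f(xₖ) - L(h) = -(h^T·g + ½h^T·H·h)
//! ```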
47//!
48//! **Step acceptance:**
49//! - If `ρ > 0`: Accept step (cost decreased), decrease λ to trust the model more
50//! - If `ρ ≤ 0`: Reject step (cost increased), increase λ to be more conservative
51//!
52//! **Damping update** (Nielsen's formula):
53//! ```text
54//! λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)
55//! ```
56//!
57//! This provides smooth, data-driven adaptation of the damping parameter.
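//!
//! As a minimal illustrative sketch (standalone code, not this crate's internal API, and with
//! clamping to the configured damping bounds omitted), the update logic can be written as:
//!
//! ```no_run
//! /// Hypothetical helper mirroring Nielsen's damping update.
//! fn nielsen_update(lambda: &mut f64, nu: &mut f64, rho: f64) -> bool {
//!     if rho > 0.0 {
//!         // Accepted step: shrink damping smoothly and reset the growth factor.
//!         *lambda *= f64::max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0).powi(3));
//!         *nu = 2.0;
//!         true
//!     } else {
//!         // Rejected step: grow damping geometrically.
//!         *lambda *= *nu;
//!         *nu *= 2.0;
//!         false
//!     }
//! }
//!
//! let (mut lambda, mut nu) = (1e-3, 2.0);
//! let accepted = nielsen_update(&mut lambda, &mut nu, 0.8);
//! assert!(accepted && lambda < 1e-3);
//! ```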
58//!
59//! ## Convergence Properties
60//!
//! - **Global convergence**: Converges to a stationary point from essentially any starting guess (under standard assumptions)
62//! - **Local quadratic convergence**: Near the solution, behaves like Gauss-Newton
63//! - **Robust to poor initialization**: Adaptive damping prevents divergence
64//! - **Handles ill-conditioning**: Large λ stabilizes nearly singular Hessian
65//!
66//! ## When to Use
67//!
68//! Levenberg-Marquardt is the best general-purpose choice when:
69//! - Initial parameter guess may be far from the optimum
70//! - Problem conditioning is unknown
71//! - Robustness is prioritized over raw speed
72//! - You want reliable convergence across diverse problem types
73//!
74//! For problems with specific structure, consider:
75//! - [`GaussNewton`](crate::GaussNewton) if well-conditioned with good initialization
76//! - [`DogLeg`](crate::DogLeg) for explicit trust region control
77//!
78//! # Implementation Features
79//!
80//! - **Sparse matrix support**: Efficient handling of large-scale problems via `faer` sparse library
81//! - **Adaptive damping**: Nielsen's formula for smooth parameter adaptation
82//! - **Robust linear solvers**: Cholesky (fast) or QR (stable) factorization
83//! - **Jacobi scaling**: Optional diagonal preconditioning for mixed-scale problems
84//! - **Covariance computation**: Optional uncertainty quantification after convergence
85//! - **Manifold operations**: Native support for optimization on Lie groups (SE2, SE3, SO2, SO3)
86//! - **Comprehensive diagnostics**: Detailed summaries of convergence and performance
87//!
88//! # Mathematical Background
89//!
90//! The augmented Hessian `J^T·J + λI` combines two beneficial properties:
91//!
92//! 1. **Positive definiteness**: Always solvable even when `J^T·J` is singular
93//! 2. **Regularization**: Prevents taking steps in poorly-determined directions
94//!
95//! The trust region interpretation: λ controls an implicit spherical trust region where
96//! larger λ restricts step size, ensuring the linear model remains valid.
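//!
//! For intuition, here is a small self-contained dense sketch of the damped normal equations
//! using `nalgebra` with toy numbers (the solver itself works with sparse matrices via `faer`):
//!
//! ```no_run
//! use nalgebra::{DMatrix, DVector};
//!
//! // Toy Jacobian J (2 residuals x 2 parameters) and residual vector r.
//! let j = DMatrix::<f64>::from_row_slice(2, 2, &[1.0, 2.0, 3.0, 4.0]);
//! let r = DVector::<f64>::from_column_slice(&[0.5, -1.0]);
//! let lambda = 1e-3;
//!
//! // Damped normal equations: (J^T·J + λI)·h = -J^T·r
//! let a = &j.transpose() * &j + DMatrix::<f64>::identity(2, 2) * lambda;
//! let g = &j.transpose() * &r;
//! let h = a.cholesky().expect("SPD for λ > 0").solve(&(-g));
//! println!("step = {h}");
//! ```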
97//!
98//! # Examples
99//!
100//! ## Basic usage
101//!
102//! ```no_run
103//! use apex_solver::LevenbergMarquardt;
104//! use apex_solver::core::problem::Problem;
105//! use std::collections::HashMap;
106//!
107//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
108//! let mut problem = Problem::new();
109//! // ... add residual blocks (factors) to problem ...
110//!
111//! let initial_values = HashMap::new();
112//! // ... initialize parameters ...
113//!
114//! let mut solver = LevenbergMarquardt::new();
115//! let result = solver.optimize(&problem, &initial_values)?;
116//!
117//! # Ok(())
118//! # }
119//! ```
120//!
121//! ## Advanced configuration
122//!
123//! ```no_run
124//! use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
125//! use apex_solver::linalg::LinearSolverType;
126//!
127//! # fn main() {
128//! let config = LevenbergMarquardtConfig::new()
129//!     .with_max_iterations(100)
130//!     .with_cost_tolerance(1e-6)
131//!     .with_damping(1e-3)  // Initial damping
132//!     .with_damping_bounds(1e-12, 1e12)  // Min/max damping
133//!     .with_jacobi_scaling(true);  // Improve conditioning
134//!
135//! let mut solver = LevenbergMarquardt::with_config(config);
136//! # }
137//! ```
138//!
139//! # References
140//!
141//! - Levenberg, K. (1944). "A Method for the Solution of Certain Non-Linear Problems in Least Squares". *Quarterly of Applied Mathematics*.
142//! - Marquardt, D. W. (1963). "An Algorithm for Least-Squares Estimation of Nonlinear Parameters". *Journal of the Society for Industrial and Applied Mathematics*.
143//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 3.
144//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 10.
145//! - Nielsen, H. B. (1999). "Damping Parameter in Marquardt's Method". Technical Report IMM-REP-1999-05.
146
147use crate::core::problem::{Problem, SymbolicStructure, VariableEnum};
148use crate::error;
149use crate::linalg::{
150    LinearSolverType, SchurPreconditioner, SchurSolverAdapter, SchurVariant, SparseCholeskySolver,
151    SparseLinearSolver, SparseQRSolver,
152};
153use crate::manifold::ManifoldType;
154use crate::optimizer::{
155    ConvergenceInfo, OptObserverVec, OptimizationStatus, OptimizerError, Solver, SolverResult,
156    apply_negative_parameter_step, apply_parameter_step, compute_cost,
157};
158
159use faer::{
160    Mat,
161    sparse::{SparseColMat, Triplet},
162};
163use nalgebra::DVector;
164use std::collections::HashMap;
165use std::{
166    fmt,
167    fmt::{Display, Formatter},
168    time::{Duration, Instant},
169};
170use tracing::debug;
171
172/// Summary statistics for the Levenberg-Marquardt optimization process.
173#[derive(Debug, Clone)]
174pub struct LevenbergMarquardtSummary {
175    /// Initial cost value
176    pub initial_cost: f64,
177    /// Final cost value
178    pub final_cost: f64,
179    /// Ratio of actual to predicted reduction in cost
180    pub rho: f64,
181    /// Total number of iterations performed
182    pub iterations: usize,
183    /// Number of successful steps (cost decreased)
184    pub successful_steps: usize,
185    /// Number of unsuccessful steps (cost increased, damping increased)
186    pub unsuccessful_steps: usize,
187    /// Final damping parameter value
188    pub final_damping: f64,
189    /// Average cost reduction per iteration
190    pub average_cost_reduction: f64,
191    /// Maximum gradient norm encountered
192    pub max_gradient_norm: f64,
193    /// Final gradient norm
194    pub final_gradient_norm: f64,
195    /// Maximum parameter update norm
196    pub max_parameter_update_norm: f64,
197    /// Final parameter update norm
198    pub final_parameter_update_norm: f64,
199    /// Total time elapsed
200    pub total_time: Duration,
201    /// Average time per iteration
202    pub average_time_per_iteration: Duration,
203    /// Detailed per-iteration statistics history
204    pub iteration_history: Vec<IterationStats>,
205    /// Convergence status
206    pub convergence_status: OptimizationStatus,
207}
208
209impl Display for LevenbergMarquardtSummary {
210    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
211        // Determine if converged
212        let converged = matches!(
213            self.convergence_status,
214            OptimizationStatus::Converged
215                | OptimizationStatus::CostToleranceReached
216                | OptimizationStatus::GradientToleranceReached
217                | OptimizationStatus::ParameterToleranceReached
218        );
219
220        writeln!(f, "Levenberg-Marquardt Final Result")?;
221
222        // Title with convergence status
223        if converged {
224            writeln!(f, "CONVERGED ({:?})", self.convergence_status)?;
225        } else {
226            writeln!(f, "DIVERGED ({:?})", self.convergence_status)?;
227        }
228
229        writeln!(f)?;
230        writeln!(f, "Cost:")?;
231        writeln!(f, "  Initial:   {:.6e}", self.initial_cost)?;
232        writeln!(f, "  Final:     {:.6e}", self.final_cost)?;
233        writeln!(
234            f,
235            "  Reduction: {:.6e} ({:.2}%)",
236            self.initial_cost - self.final_cost,
237            100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
238        )?;
239        writeln!(f)?;
240        writeln!(f, "Iterations:")?;
241        writeln!(f, "  Total:              {}", self.iterations)?;
242        writeln!(
243            f,
244            "  Successful steps:   {} ({:.1}%)",
245            self.successful_steps,
246            100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
247        )?;
248        writeln!(
249            f,
250            "  Unsuccessful steps: {} ({:.1}%)",
251            self.unsuccessful_steps,
252            100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
253        )?;
254        writeln!(f)?;
255        writeln!(f, "Gradient:")?;
256        writeln!(f, "  Max norm:   {:.2e}", self.max_gradient_norm)?;
257        writeln!(f, "  Final norm: {:.2e}", self.final_gradient_norm)?;
258        writeln!(f)?;
259        writeln!(f, "Parameter Update:")?;
260        writeln!(f, "  Max norm:   {:.2e}", self.max_parameter_update_norm)?;
261        writeln!(f, "  Final norm: {:.2e}", self.final_parameter_update_norm)?;
262        writeln!(f)?;
263        writeln!(f, "Performance:")?;
264        writeln!(
265            f,
266            "  Total time:             {:.2}ms",
267            self.total_time.as_secs_f64() * 1000.0
268        )?;
269        writeln!(
270            f,
271            "  Average per iteration:  {:.2}ms",
272            self.average_time_per_iteration.as_secs_f64() * 1000.0
273        )?;
274
275        Ok(())
276    }
277}
278
279/// Per-iteration statistics for detailed logging (Ceres-style output).
280///
281/// Captures all relevant metrics for each optimization iteration, enabling
282/// detailed analysis and debugging of the optimization process.
283#[derive(Debug, Clone)]
284pub struct IterationStats {
285    /// Iteration number (0-indexed)
286    pub iteration: usize,
287    /// Cost function value at this iteration
288    pub cost: f64,
289    /// Change in cost from previous iteration
290    pub cost_change: f64,
291    /// L2 norm of the gradient (||J^T·r||)
292    pub gradient_norm: f64,
293    /// L2 norm of the parameter update step (||Δx||)
294    pub step_norm: f64,
295    /// Trust region ratio (ρ = actual_reduction / predicted_reduction)
296    pub tr_ratio: f64,
297    /// Trust region radius (damping parameter λ)
298    pub tr_radius: f64,
299    /// Linear solver iterations (0 for direct solvers like Cholesky)
300    pub ls_iter: usize,
301    /// Time taken for this iteration in milliseconds
302    pub iter_time_ms: f64,
303    /// Total elapsed time since optimization started in milliseconds
304    pub total_time_ms: f64,
305    /// Whether the step was accepted (true) or rejected (false)
306    pub accepted: bool,
307}
308
309impl IterationStats {
310    /// Print table header in Ceres-style format
311    pub fn print_header() {
312        debug!(
313            "{:>4}  {:>13}  {:>13}  {:>13}  {:>13}  {:>11}  {:>11}  {:>7}  {:>11}  {:>13}  {:>6}",
314            "iter",
315            "cost",
316            "cost_change",
317            "|gradient|",
318            "|step|",
319            "tr_ratio",
320            "tr_radius",
321            "ls_iter",
322            "iter_time",
323            "total_time",
324            "status"
325        );
326    }
327
328    /// Print single iteration line in Ceres-style format with scientific notation
329    pub fn print_line(&self) {
330        let status = if self.iteration == 0 {
331            "-"
332        } else if self.accepted {
333            "✓"
334        } else {
335            "✗"
336        };
337
338        debug!(
339            "{:>4}  {:>13.6e}  {:>13.2e}  {:>13.2e}  {:>13.2e}  {:>11.2e}  {:>11.2e}  {:>7}  {:>9.2}ms  {:>11.2}ms  {:>6}",
340            self.iteration,
341            self.cost,
342            self.cost_change,
343            self.gradient_norm,
344            self.step_norm,
345            self.tr_ratio,
346            self.tr_radius,
347            self.ls_iter,
348            self.iter_time_ms,
349            self.total_time_ms,
350            status
351        );
352    }
353}
354
355/// Configuration parameters for the Levenberg-Marquardt optimizer.
356///
357/// Controls the adaptive damping strategy, convergence criteria, and numerical stability
358/// enhancements for the Levenberg-Marquardt algorithm.
359///
360/// # Builder Pattern
361///
362/// All configuration options can be set using the builder pattern:
363///
364/// ```
365/// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
366///
367/// let config = LevenbergMarquardtConfig::new()
368///     .with_max_iterations(100)
369///     .with_damping(1e-3)
370///     .with_damping_bounds(1e-12, 1e12)
371///     .with_jacobi_scaling(true);
372/// ```
373///
374/// # Damping Parameter Behavior
375///
376/// The damping parameter λ controls the trade-off between Gauss-Newton and gradient descent:
377///
/// - **Initial damping** (`damping`): Starting value (default: 1e-3)
379/// - **Damping bounds** (`damping_min`, `damping_max`): Valid range (default: 1e-12 to 1e12)
380/// - **Adaptation**: Automatically adjusted based on step quality using Nielsen's formula
381///
382/// # Convergence Criteria
383///
384/// The optimizer terminates when ANY of the following conditions is met:
385///
/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance · cost_{k-1}` (relative decrease)
/// - **Parameter tolerance**: `||step|| ≤ parameter_tolerance · (||x|| + parameter_tolerance)`
388/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
389/// - **Maximum iterations**: `iteration >= max_iterations`
390/// - **Timeout**: `elapsed_time >= timeout`
391///
392/// # See Also
393///
394/// - [`LevenbergMarquardt`] - The solver that uses this configuration
395/// - [`GaussNewtonConfig`](crate::GaussNewtonConfig) - Undamped variant
396/// - [`DogLegConfig`](crate::DogLegConfig) - Trust region alternative
397#[derive(Clone)]
398pub struct LevenbergMarquardtConfig {
399    /// Type of linear solver for the linear systems
400    pub linear_solver_type: LinearSolverType,
401    /// Maximum number of iterations
402    pub max_iterations: usize,
403    /// Convergence tolerance for cost function
404    pub cost_tolerance: f64,
405    /// Convergence tolerance for parameter updates
406    pub parameter_tolerance: f64,
407    /// Convergence tolerance for gradient norm
408    pub gradient_tolerance: f64,
409    /// Timeout duration
410    pub timeout: Option<Duration>,
411    /// Initial damping parameter
412    pub damping: f64,
413    /// Minimum damping parameter
414    pub damping_min: f64,
415    /// Maximum damping parameter
416    pub damping_max: f64,
417    /// Damping increase factor (when step rejected)
418    pub damping_increase_factor: f64,
419    /// Damping decrease factor (when step accepted)
420    pub damping_decrease_factor: f64,
    /// Damping growth factor ν: doubled after each rejected step, reset to 2 after an accepted step
422    pub damping_nu: f64,
423    /// Trust region radius
424    pub trust_region_radius: f64,
425    /// Minimum step quality for acceptance
426    pub min_step_quality: f64,
427    /// Good step quality threshold
428    pub good_step_quality: f64,
429    /// Minimum diagonal value for regularization
430    pub min_diagonal: f64,
431    /// Maximum diagonal value for regularization
432    pub max_diagonal: f64,
433    /// Minimum objective function cutoff (optional early termination)
434    ///
435    /// If set, optimization terminates when cost falls below this threshold.
436    /// Useful for early stopping when a "good enough" solution is acceptable.
437    ///
438    /// Default: None (disabled)
439    pub min_cost_threshold: Option<f64>,
440    /// Minimum trust region radius before termination
441    ///
442    /// When the trust region radius falls below this value, the optimizer
443    /// terminates as it indicates the search has converged or the problem
444    /// is ill-conditioned. Matches Ceres Solver's min_trust_region_radius.
445    ///
446    /// Default: 1e-32 (Ceres-compatible)
447    pub min_trust_region_radius: f64,
448    /// Maximum condition number for Jacobian matrix (optional check)
449    ///
450    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
451    /// threshold and terminates with IllConditionedJacobian status.
452    /// Note: Computing condition number is expensive, so this is disabled by default.
453    ///
454    /// Default: None (disabled)
455    pub max_condition_number: Option<f64>,
456    /// Minimum relative cost decrease for step acceptance
457    ///
458    /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
459    /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
460    /// min_relative_decrease parameter.
461    ///
462    /// Default: 1e-3 (Ceres-compatible)
463    pub min_relative_decrease: f64,
464    /// Use Jacobi column scaling (preconditioning)
465    ///
    /// When enabled, scales each Jacobian column by `1 / (1 + column L2 norm)` before
    /// solving. This can improve convergence for problems with mixed parameter scales
    /// (e.g., positions in meters + angles in radians) but adds ~5-10% overhead.
    ///
    /// Default: false (avoids the extra per-iteration overhead)
471    pub use_jacobi_scaling: bool,
472    /// Compute per-variable covariance matrices (uncertainty estimation)
473    ///
474    /// When enabled, computes covariance by inverting the Hessian matrix after
475    /// convergence. The full covariance matrix is extracted into per-variable
476    /// blocks stored in both Variable structs and SolverResult.
477    ///
478    /// Default: false (to avoid performance overhead)
479    pub compute_covariances: bool,
480    /// Schur complement solver variant (for bundle adjustment problems)
481    ///
482    /// When using LinearSolverType::SparseSchurComplement, this determines which
483    /// variant of the Schur complement method to use:
484    /// - Sparse: Direct sparse Cholesky factorization (most accurate, moderate speed)
485    /// - Iterative: Preconditioned Conjugate Gradients (memory efficient, good for large problems)
486    /// - PowerSeries: Power series approximation (fastest, less accurate)
487    ///
488    /// Default: Sparse
489    pub schur_variant: SchurVariant,
490    /// Schur complement preconditioner type
491    ///
492    /// Determines the preconditioning strategy for iterative Schur methods:
493    /// - Diagonal: Simple diagonal preconditioner (fast, less effective)
494    /// - BlockDiagonal: Block-diagonal preconditioner (balanced)
495    /// - IncompleteCholesky: Incomplete Cholesky factorization (slower, more effective)
496    ///
497    /// Default: Diagonal
498    pub schur_preconditioner: SchurPreconditioner,
499    // Note: Visualization is now handled via the observer pattern.
500    // Use `solver.add_observer(RerunObserver::new(true)?)` to enable visualization.
501    // This provides cleaner separation of concerns and allows multiple observers.
502}
503
504impl Default for LevenbergMarquardtConfig {
505    fn default() -> Self {
506        Self {
507            linear_solver_type: LinearSolverType::default(),
508            // Ceres Solver default: 50 (changed from 100 for compatibility)
509            max_iterations: 50,
510            // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
511            cost_tolerance: 1e-6,
512            // Ceres Solver default: 1e-8 (unchanged)
513            parameter_tolerance: 1e-8,
514            // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
515            // Note: Typically should be 1e-4 * cost_tolerance per Ceres docs
516            gradient_tolerance: 1e-10,
517            timeout: None,
518            damping: 1e-3, // Increased from 1e-4 for better initial convergence on BA
519            damping_min: 1e-12,
520            damping_max: 1e12,
521            damping_increase_factor: 10.0,
522            damping_decrease_factor: 0.3,
523            damping_nu: 2.0,
524            trust_region_radius: 1e4,
525            min_step_quality: 0.0,
526            good_step_quality: 0.75,
527            min_diagonal: 1e-6,
528            max_diagonal: 1e32,
529            // New Ceres-compatible parameters
530            min_cost_threshold: None,
531            min_trust_region_radius: 1e-32,
532            max_condition_number: None,
533            min_relative_decrease: 1e-3,
534            // Existing parameters
535            // Jacobi scaling disabled by default for Schur solvers (incompatible with block structure)
536            // Enable manually for Cholesky/QR solvers on mixed-scale problems
537            use_jacobi_scaling: false,
538            compute_covariances: false,
539            // Schur complement parameters
540            schur_variant: SchurVariant::default(),
541            schur_preconditioner: SchurPreconditioner::default(),
542        }
543    }
544}
545
546impl LevenbergMarquardtConfig {
547    /// Create a new Levenberg-Marquardt configuration with default values.
548    pub fn new() -> Self {
549        Self::default()
550    }
551
552    /// Set the linear solver type
553    pub fn with_linear_solver_type(mut self, linear_solver_type: LinearSolverType) -> Self {
554        self.linear_solver_type = linear_solver_type;
555        self
556    }
557
558    /// Set the maximum number of iterations
559    pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
560        self.max_iterations = max_iterations;
561        self
562    }
563
564    /// Set the cost tolerance
565    pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
566        self.cost_tolerance = cost_tolerance;
567        self
568    }
569
570    /// Set the parameter tolerance
571    pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
572        self.parameter_tolerance = parameter_tolerance;
573        self
574    }
575
576    /// Set the gradient tolerance
577    pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
578        self.gradient_tolerance = gradient_tolerance;
579        self
580    }
581
582    /// Set the timeout duration
583    pub fn with_timeout(mut self, timeout: Duration) -> Self {
584        self.timeout = Some(timeout);
585        self
586    }
587
588    /// Set the initial damping parameter.
589    pub fn with_damping(mut self, damping: f64) -> Self {
590        self.damping = damping;
591        self
592    }
593
594    /// Set the damping parameter bounds.
595    pub fn with_damping_bounds(mut self, min: f64, max: f64) -> Self {
596        self.damping_min = min;
597        self.damping_max = max;
598        self
599    }
600
601    /// Set the damping adjustment factors.
602    pub fn with_damping_factors(mut self, increase: f64, decrease: f64) -> Self {
603        self.damping_increase_factor = increase;
604        self.damping_decrease_factor = decrease;
605        self
606    }
607
608    /// Set the trust region parameters.
609    pub fn with_trust_region(mut self, radius: f64, min_quality: f64, good_quality: f64) -> Self {
610        self.trust_region_radius = radius;
611        self.min_step_quality = min_quality;
612        self.good_step_quality = good_quality;
613        self
614    }
615
616    /// Set minimum objective function cutoff for early termination.
617    ///
618    /// When set, optimization terminates with MinCostThresholdReached status
619    /// if the cost falls below this threshold. Useful for early stopping when
620    /// a "good enough" solution is acceptable.
621    pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
622        self.min_cost_threshold = Some(min_cost);
623        self
624    }
625
626    /// Set minimum trust region radius before termination.
627    ///
628    /// When the trust region radius falls below this value, optimization
629    /// terminates with TrustRegionRadiusTooSmall status.
630    /// Default: 1e-32 (Ceres-compatible)
631    pub fn with_min_trust_region_radius(mut self, min_radius: f64) -> Self {
632        self.min_trust_region_radius = min_radius;
633        self
634    }
635
636    /// Set maximum condition number for Jacobian matrix.
637    ///
638    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
639    /// threshold and terminates with IllConditionedJacobian status.
640    /// Note: Computing condition number is expensive, disabled by default.
641    pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
642        self.max_condition_number = Some(max_cond);
643        self
644    }
645
646    /// Set minimum relative cost decrease for step acceptance.
647    ///
648    /// Steps with rho = (actual_reduction / predicted_reduction) below this
649    /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
650    pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
651        self.min_relative_decrease = min_decrease;
652        self
653    }
654
655    /// Enable or disable Jacobi column scaling (preconditioning).
656    ///
    /// When enabled, scales Jacobian columns by `1 / (1 + column L2 norm)` before solving.
    /// Can improve convergence for mixed-scale problems but adds ~5-10% overhead.
659    pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
660        self.use_jacobi_scaling = use_jacobi_scaling;
661        self
662    }
663
664    /// Enable or disable covariance computation (uncertainty estimation).
665    ///
666    /// When enabled, computes the full covariance matrix by inverting the Hessian
667    /// after convergence, then extracts per-variable covariance blocks.
668    pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
669        self.compute_covariances = compute_covariances;
670        self
671    }
672
673    /// Set Schur complement solver variant
674    pub fn with_schur_variant(mut self, variant: SchurVariant) -> Self {
675        self.schur_variant = variant;
676        self
677    }
678
679    /// Set Schur complement preconditioner
680    pub fn with_schur_preconditioner(mut self, preconditioner: SchurPreconditioner) -> Self {
681        self.schur_preconditioner = preconditioner;
682        self
683    }
684
685    /// Configuration optimized for bundle adjustment problems.
686    ///
    /// This preset uses settings tuned for large-scale bundle adjustment:
    /// - **Schur complement solver** with iterative PCG (memory efficient)
    /// - **Schur-Jacobi preconditioner** (Ceres-style, best PCG convergence)
    /// - **Moderate initial damping** (1e-3) - not too aggressive
    /// - **20 max iterations** - allows early stopping once the reprojection error is good enough
    /// - **Ceres-compatible tolerances** (cost 1e-6, parameter 1e-8, gradient 1e-10)
    ///
    /// This configuration follows Ceres Solver's recommended BA settings and
    /// should achieve similar convergence quality.
696    ///
697    /// # Example
698    ///
699    /// ```
700    /// use apex_solver::optimizer::levenberg_marquardt::LevenbergMarquardtConfig;
701    ///
702    /// let config = LevenbergMarquardtConfig::for_bundle_adjustment();
703    /// ```
704    pub fn for_bundle_adjustment() -> Self {
705        Self::default()
706            .with_linear_solver_type(LinearSolverType::SparseSchurComplement)
707            .with_schur_variant(SchurVariant::Iterative)
708            .with_schur_preconditioner(SchurPreconditioner::SchurJacobi)
709            .with_damping(1e-3) // Moderate initial damping (Ceres default)
710            .with_max_iterations(20) // Reduced for early stop when RMSE < 1px
711            // Match Ceres tolerances for faster convergence
712            .with_cost_tolerance(1e-6) // Ceres function_tolerance (was 1e-12)
713            .with_parameter_tolerance(1e-8) // Ceres parameter_tolerance (was 1e-14)
714            .with_gradient_tolerance(1e-10) // Relaxed (was 1e-16)
715    }
716
    // Note: with_visualization() has been removed. Real-time visualization (time-series plots
    // of cost, gradient norm, damping, and step quality; sparse Hessian heat maps; gradient
    // vectors; and live manifold state updates for SE2/SE3 problems) is now provided through
    // the observer pattern:
    //   let mut solver = LevenbergMarquardt::with_config(config);
    //   solver.add_observer(RerunObserver::new(true)?);
    // This gives cleaner separation of concerns and allows multiple observers. It requires
    // the `visualization` feature in `Cargo.toml`; use `verbose` for terminal logging.

    /// Print configuration parameters (verbose mode only).
737    pub fn print_configuration(&self) {
738        debug!(
739            "Configuration:\n  Solver:        Levenberg-Marquardt\n  Linear solver: {:?}\n  Convergence Criteria:\n  Max iterations:      {}\n  Cost tolerance:      {:.2e}\n  Parameter tolerance: {:.2e}\n  Gradient tolerance:  {:.2e}\n  Timeout:             {:?}\n  Damping Parameters:\n  Initial damping:     {:.2e}\n  Damping range:       [{:.2e}, {:.2e}]\n  Increase factor:     {:.2}\n  Decrease factor:     {:.2}\n  Trust Region:\n  Initial radius:      {:.2e}\n  Min step quality:    {:.2}\n  Good step quality:   {:.2}\n  Numerical Settings:\n  Jacobi scaling:      {}\n  Compute covariances: {}",
740            self.linear_solver_type,
741            self.max_iterations,
742            self.cost_tolerance,
743            self.parameter_tolerance,
744            self.gradient_tolerance,
745            self.timeout,
746            self.damping,
747            self.damping_min,
748            self.damping_max,
749            self.damping_increase_factor,
750            self.damping_decrease_factor,
751            self.trust_region_radius,
752            self.min_step_quality,
753            self.good_step_quality,
754            if self.use_jacobi_scaling {
755                "enabled"
756            } else {
757                "disabled"
758            },
759            if self.compute_covariances {
760                "enabled"
761            } else {
762                "disabled"
763            }
764        );
765    }
766}
767
768/// State for optimization iteration
769struct LinearizerResult {
770    variables: HashMap<String, VariableEnum>,
771    variable_index_map: HashMap<String, usize>,
772    sorted_vars: Vec<String>,
773    symbolic_structure: SymbolicStructure,
774    current_cost: f64,
775    initial_cost: f64,
776}
777
778/// Result from step computation
779struct StepResult {
780    step: Mat<f64>,
781    gradient_norm: f64,
782    predicted_reduction: f64,
783}
784
785/// Result from step evaluation
786struct StepEvaluation {
787    accepted: bool,
788    cost_reduction: f64,
789    rho: f64,
790}
791
792/// Levenberg-Marquardt solver for nonlinear least squares optimization.
793///
794/// Implements the damped Gauss-Newton method with adaptive damping parameter λ that
795/// interpolates between Gauss-Newton and gradient descent based on step quality.
796///
797/// # Algorithm
798///
799/// At each iteration k:
800/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
801/// 2. Solve augmented system: `(J^T·J + λI)·h = -J^T·r`
802/// 3. Evaluate step quality: `ρ = (actual reduction) / (predicted reduction)`
803/// 4. If `ρ > 0`: Accept step and update `xₖ₊₁ = xₖ ⊕ h`, decrease λ
804/// 5. If `ρ ≤ 0`: Reject step (keep `xₖ₊₁ = xₖ`), increase λ
805/// 6. Check convergence criteria
806///
807/// The damping parameter λ is updated using Nielsen's smooth formula:
808/// `λₖ₊₁ = λₖ · max(1/3, 1 - (2ρ - 1)³)` for accepted steps,
809/// or `λₖ₊₁ = λₖ · ν` (with increasing ν) for rejected steps.
810///
811/// # Examples
812///
813/// ```no_run
814/// use apex_solver::optimizer::levenberg_marquardt::{LevenbergMarquardtConfig, LevenbergMarquardt};
815/// use apex_solver::linalg::LinearSolverType;
816///
817/// # fn main() {
818/// let config = LevenbergMarquardtConfig::new()
819///     .with_max_iterations(100)
820///     .with_damping(1e-3)
821///     .with_damping_bounds(1e-12, 1e12)
822///     .with_jacobi_scaling(true);
823///
824/// let mut solver = LevenbergMarquardt::with_config(config);
825/// # }
826/// ```
827///
828/// # See Also
829///
830/// - [`LevenbergMarquardtConfig`] - Configuration options
831/// - [`GaussNewton`](crate::GaussNewton) - Undamped variant (faster but less robust)
832/// - [`DogLeg`](crate::DogLeg) - Alternative trust region method
833pub struct LevenbergMarquardt {
834    config: LevenbergMarquardtConfig,
835    jacobi_scaling: Option<SparseColMat<usize, f64>>,
836    observers: OptObserverVec,
837}
838
839impl Default for LevenbergMarquardt {
840    fn default() -> Self {
841        Self::new()
842    }
843}
844
845impl LevenbergMarquardt {
846    /// Create a new Levenberg-Marquardt solver with default configuration.
847    pub fn new() -> Self {
848        Self::with_config(LevenbergMarquardtConfig::default())
849    }
850
851    /// Create a new Levenberg-Marquardt solver with the given configuration.
852    pub fn with_config(config: LevenbergMarquardtConfig) -> Self {
853        Self {
854            config,
855            jacobi_scaling: None,
856            observers: OptObserverVec::new(),
857        }
858    }
859
860    /// Add an observer to monitor optimization progress.
861    ///
862    /// Observers are notified at each iteration with the current variable values.
863    /// This enables real-time visualization, logging, metrics collection, etc.
864    ///
865    /// # Arguments
866    ///
867    /// * `observer` - Any type implementing `OptObserver`
868    ///
869    /// # Examples
870    ///
871    /// ```no_run
872    /// use apex_solver::{LevenbergMarquardt, LevenbergMarquardtConfig};
873    /// # use apex_solver::core::problem::Problem;
874    /// # use std::collections::HashMap;
875    ///
876    /// # fn main() -> Result<(), Box<dyn std::error::Error>> {
877    /// let mut solver = LevenbergMarquardt::new();
878    ///
879    /// #[cfg(feature = "visualization")]
880    /// {
881    ///     use apex_solver::observers::RerunObserver;
882    ///     let rerun_observer = RerunObserver::new(true)?;
883    ///     solver.add_observer(rerun_observer);
884    /// }
885    ///
886    /// // ... optimize ...
887    /// # Ok(())
888    /// # }
889    /// ```
890    pub fn add_observer(&mut self, observer: impl crate::optimizer::OptObserver + 'static) {
891        self.observers.add(observer);
892    }
893
894    /// Update damping parameter based on step quality using trust region approach
895    /// Reference: Introduction to Optimization and Data Fitting
896    /// Algorithm 6.18
897    fn update_damping(&mut self, rho: f64) -> bool {
898        if rho > 0.0 {
899            // Step accepted - decrease damping
900            let coff = 2.0 * rho - 1.0;
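            // Nielsen's formula: λ ← λ · max(1/3, 1 - (2ρ - 1)³)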
901            self.config.damping *= (1.0_f64 / 3.0).max(1.0 - coff * coff * coff);
902            self.config.damping = self.config.damping.max(self.config.damping_min);
903            self.config.damping_nu = 2.0;
904            true
905        } else {
906            // Step rejected - increase damping
907            self.config.damping *= self.config.damping_nu;
908            self.config.damping_nu *= 2.0;
909            self.config.damping = self.config.damping.min(self.config.damping_max);
910            false
911        }
912    }
913
914    /// Compute step quality ratio (actual vs predicted reduction)
915    /// Reference: Introduction to Optimization and Data Fitting
916    /// Reference: Damping parameter in marquardt's method
917    /// Formula 2.2
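    ///
    /// ```text
    /// ρ = (F(x) - F(x + h)) / predicted_reduction
    /// ```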
918    fn compute_step_quality(
919        &self,
920        current_cost: f64,
921        new_cost: f64,
922        predicted_reduction: f64,
923    ) -> f64 {
924        let actual_reduction = current_cost - new_cost;
925        if predicted_reduction.abs() < 1e-15 {
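            // Degenerate model (near-zero predicted reduction): treat any actual
            // decrease as a full-quality step, otherwise as a failed step.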
926            if actual_reduction > 0.0 { 1.0 } else { 0.0 }
927        } else {
928            actual_reduction / predicted_reduction
929        }
930    }
931
932    /// Compute predicted cost reduction from linear model
933    /// Standard LM formula: 0.5 * step^T * (damping * step - gradient)
934    fn compute_predicted_reduction(&self, step: &Mat<f64>, gradient: &Mat<f64>) -> f64 {
935        // Standard Levenberg-Marquardt predicted reduction formula
936        // predicted_reduction = -step^T * gradient - 0.5 * step^T * H * step
937        //                     = 0.5 * step^T * (damping * step - gradient)
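        // Derivation: the step solves (H + λ·I)·step = -gradient, so H·step = -(gradient + λ·step),
        // and -step^T·gradient - 0.5·step^T·H·step reduces to 0.5·step^T·(λ·step - gradient).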
938        let diff = self.config.damping * step - gradient;
939        (0.5 * step.transpose() * &diff)[(0, 0)]
940    }
941
    /// Check convergence using comprehensive termination criteria.
944    ///
945    /// Implements 9 termination criteria following Ceres Solver standards:
946    ///
    /// 1. **Gradient Norm (First-Order Optimality)**: ||g|| ≤ gradient_tolerance
948    /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
949    /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
950    /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
951    /// 5. **Trust Region Radius**: radius < min_trust_region_radius
952    /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
953    /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
954    /// 8. **Maximum Iterations**: iteration >= max_iterations
955    /// 9. **Timeout**: elapsed >= timeout
956    ///
957    /// # Arguments
958    ///
959    /// * `iteration` - Current iteration number
960    /// * `current_cost` - Cost before applying the step
961    /// * `new_cost` - Cost after applying the step
962    /// * `parameter_norm` - L2 norm of current parameter vector ||x||
963    /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
    /// * `gradient_norm` - Norm of the gradient ||J^T·r|| (the L2 norm as computed by the caller)
965    /// * `trust_region_radius` - Current trust region radius
966    /// * `elapsed` - Elapsed time since optimization start
967    /// * `step_accepted` - Whether the current step was accepted
968    ///
969    /// # Returns
970    ///
971    /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
972    ///
973    /// # Design Note
974    /// This internal convergence checker requires multiple scalar metrics for comprehensive
975    /// termination criteria. Grouping into a struct would reduce clarity without performance benefit.
976    #[allow(clippy::too_many_arguments)]
977    fn check_convergence(
978        &self,
979        iteration: usize,
980        current_cost: f64,
981        new_cost: f64,
982        parameter_norm: f64,
983        parameter_update_norm: f64,
984        gradient_norm: f64,
985        trust_region_radius: f64,
986        elapsed: Duration,
987        step_accepted: bool,
988    ) -> Option<OptimizationStatus> {
989        // CRITICAL SAFETY CHECKS (perform first, before convergence checks)
990
991        // CRITERION 7: Invalid Numerical Values (NaN/Inf)
992        // Always check for numerical instability first
993        if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
994        {
995            return Some(OptimizationStatus::InvalidNumericalValues);
996        }
997
998        // CRITERION 9: Timeout
999        // Check wall-clock time limit
1000        if let Some(timeout) = self.config.timeout
1001            && elapsed >= timeout
1002        {
1003            return Some(OptimizationStatus::Timeout);
1004        }
1005
1006        // CRITERION 8: Maximum Iterations
1007        // Check iteration count limit
1008        if iteration >= self.config.max_iterations {
1009            return Some(OptimizationStatus::MaxIterationsReached);
1010        }
1011        // CONVERGENCE CRITERIA (only check after successful steps)
1012
1013        // Only check convergence criteria after accepted steps
1014        // (rejected steps don't indicate convergence)
1015        if !step_accepted {
1016            return None;
1017        }
1018
1019        // CRITERION 1: Gradient Norm (First-Order Optimality)
1020        // Check if gradient infinity norm is below threshold
1021        // This indicates we're at a critical point (local minimum, saddle, or maximum)
1022        if gradient_norm < self.config.gradient_tolerance {
1023            return Some(OptimizationStatus::GradientToleranceReached);
1024        }
1025
1026        // Only check parameter and cost criteria after first iteration
1027        if iteration > 0 {
1028            // CRITERION 2: Parameter Change Tolerance (xtol)
1029            // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
1030            // This is a relative measure that scales with parameter magnitude
1031            let relative_step_tolerance = self.config.parameter_tolerance
1032                * (parameter_norm + self.config.parameter_tolerance);
1033
1034            if parameter_update_norm <= relative_step_tolerance {
1035                return Some(OptimizationStatus::ParameterToleranceReached);
1036            }
1037
1038            // CRITERION 3: Function Value Change Tolerance (ftol)
1039            // Ceres formula: |ΔF| < ε_cost * F
1040            // Check relative cost change (not absolute)
1041            let cost_change = (current_cost - new_cost).abs();
1042            let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero
1043
1044            if relative_cost_change < self.config.cost_tolerance {
1045                return Some(OptimizationStatus::CostToleranceReached);
1046            }
1047        }
1048
1049        // CRITERION 4: Objective Function Cutoff (optional early stopping)
1050        // Useful for "good enough" solutions
1051        if let Some(min_cost) = self.config.min_cost_threshold
1052            && new_cost < min_cost
1053        {
1054            return Some(OptimizationStatus::MinCostThresholdReached);
1055        }
1056
1057        // CRITERION 5: Trust Region Radius
1058        // If trust region has collapsed, optimization has converged or problem is ill-conditioned
1059        if trust_region_radius < self.config.min_trust_region_radius {
1060            return Some(OptimizationStatus::TrustRegionRadiusTooSmall);
1061        }
1062
1063        // CRITERION 6: Singular/Ill-Conditioned Jacobian
1064        // Note: This is typically detected during the linear solve and handled there
1065        // The max_condition_number check would be expensive to compute here
1066        // If linear solve fails, it returns an error that's converted to NumericalFailure
1067
1068        // No termination criterion satisfied
1069        None
1070    }
1071
1072    /// Compute total parameter vector norm ||x||.
1073    ///
1074    /// Computes the L2 norm of all parameter vectors concatenated together.
1075    /// This is used in the relative parameter tolerance check.
1076    ///
1077    /// # Arguments
1078    ///
1079    /// * `variables` - Map of variable names to their current values
1080    ///
1081    /// # Returns
1082    ///
1083    /// The L2 norm of the concatenated parameter vector
1084    fn compute_parameter_norm(variables: &HashMap<String, VariableEnum>) -> f64 {
1085        variables
1086            .values()
1087            .map(|v| {
1088                let vec = v.to_vector();
1089                vec.norm_squared()
1090            })
1091            .sum::<f64>()
1092            .sqrt()
1093    }
1094
1095    /// Create Jacobi scaling matrix from Jacobian
1096    fn create_jacobi_scaling(
1097        &self,
1098        jacobian: &SparseColMat<usize, f64>,
1099    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
1100        let cols = jacobian.ncols();
1101        let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
1102            .map(|c| {
1103                // Compute column norm: sqrt(sum(J_col^2))
1104                let col_norm_squared: f64 = jacobian
1105                    .triplet_iter()
1106                    .filter(|t| t.col == c)
1107                    .map(|t| t.val * t.val)
1108                    .sum();
1109                let col_norm = col_norm_squared.sqrt();
1110                // Scaling factor: 1.0 / (1.0 + col_norm)
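                // The `1.0 +` term keeps the factor finite for all-zero columns and
                // bounded in (0, 1] otherwise.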
1111                let scaling = 1.0 / (1.0 + col_norm);
1112                Triplet::new(c, c, scaling)
1113            })
1114            .collect();
1115
1116        SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec)
1117            .map_err(|e| OptimizerError::JacobiScalingCreation(e.to_string()).log_with_source(e))
1118    }
1119
1120    /// Initialize optimization state from problem and initial parameters
1121    fn initialize_optimization_state(
1122        &self,
1123        problem: &Problem,
1124        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
1125    ) -> Result<LinearizerResult, error::ApexSolverError> {
1126        // Initialize variables from initial values
1127        let variables = problem.initialize_variables(initial_params);
1128
1129        // Create column mapping for variables
1130        let mut variable_index_map = HashMap::new();
1131        let mut col_offset = 0;
1132        let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
1133        sorted_vars.sort();
1134
1135        for var_name in &sorted_vars {
1136            variable_index_map.insert(var_name.clone(), col_offset);
1137            col_offset += variables[var_name].get_size();
1138        }
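        // Example layout (hypothetical names/sizes): {"pose_a": 0, "pose_b": 6, "landmark_0": 12}
        // when each pose occupies 6 tangent-space columns and a landmark occupies 3.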
1139
1140        // Build symbolic structure for sparse operations
1141        let symbolic_structure =
1142            problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;
1143
1144        // Initial cost evaluation (residual only, no Jacobian needed)
1145        let residual = problem.compute_residual_sparse(&variables)?;
1146        let current_cost = compute_cost(&residual);
1147        let initial_cost = current_cost;
1148
1149        Ok(LinearizerResult {
1150            variables,
1151            variable_index_map,
1152            sorted_vars,
1153            symbolic_structure,
1154            current_cost,
1155            initial_cost,
1156        })
1157    }
1158
1159    /// Process Jacobian by creating and applying Jacobi scaling if enabled
1160    fn process_jacobian(
1161        &mut self,
1162        jacobian: &SparseColMat<usize, f64>,
1163        iteration: usize,
1164    ) -> Result<SparseColMat<usize, f64>, OptimizerError> {
1165        // Create Jacobi scaling on first iteration if enabled
1166        if iteration == 0 {
1167            let scaling = self.create_jacobi_scaling(jacobian)?;
1168            self.jacobi_scaling = Some(scaling);
1169        }
1170        let scaling = self
1171            .jacobi_scaling
1172            .as_ref()
1173            .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
1174        Ok(jacobian * scaling)
1175    }
1176
1177    /// Compute optimization step by solving the augmented system
1178    fn compute_levenberg_marquardt_step(
1179        &self,
1180        residuals: &Mat<f64>,
1181        scaled_jacobian: &SparseColMat<usize, f64>,
1182        linear_solver: &mut Box<dyn SparseLinearSolver>,
1183    ) -> Result<StepResult, OptimizerError> {
1184        // Solve augmented equation: (J_scaled^T * J_scaled + λI) * dx_scaled = -J_scaled^T * r
1185        let residuals_owned = residuals.as_ref().to_owned();
1186        let scaled_step = linear_solver
1187            .solve_augmented_equation(&residuals_owned, scaled_jacobian, self.config.damping)
1188            .map_err(|e| OptimizerError::LinearSolveFailed(e.to_string()).log_with_source(e))?;
1189
1190        // Get cached gradient and Hessian from the solver
1191        let gradient = linear_solver.get_gradient().ok_or_else(|| {
1192            OptimizerError::NumericalInstability("Gradient not available".into()).log()
1193        })?;
1194        let _hessian = linear_solver.get_hessian().ok_or_else(|| {
1195            OptimizerError::NumericalInstability("Hessian not available".into()).log()
1196        })?;
1197        let gradient_norm = gradient.norm_l2();
1198
1199        // Apply inverse Jacobi scaling to get final step (if enabled)
1200        let step = if self.config.use_jacobi_scaling {
1201            let scaling = self
1202                .jacobi_scaling
1203                .as_ref()
1204                .ok_or_else(|| OptimizerError::JacobiScalingNotInitialized.log())?;
1205            &scaled_step * scaling
1206        } else {
1207            scaled_step
1208        };
1209
1210        // Compute predicted reduction using scaled values
1211        let predicted_reduction = self.compute_predicted_reduction(&step, gradient);
1212
1213        Ok(StepResult {
1214            step,
1215            gradient_norm,
1216            predicted_reduction,
1217        })
1218    }
1219
1220    /// Evaluate and apply step, handling acceptance/rejection based on step quality
1221    fn evaluate_and_apply_step(
1222        &mut self,
1223        step_result: &StepResult,
1224        state: &mut LinearizerResult,
1225        problem: &Problem,
1226    ) -> error::ApexSolverResult<StepEvaluation> {
1227        // Apply parameter updates using manifold operations
1228        let _step_norm = apply_parameter_step(
1229            &mut state.variables,
1230            step_result.step.as_ref(),
1231            &state.sorted_vars,
1232        );
1233
1234        // Compute new cost (residual only, no Jacobian needed for step evaluation)
1235        let new_residual = problem.compute_residual_sparse(&state.variables)?;
1236        let new_cost = compute_cost(&new_residual);
1237
1238        // Compute step quality
1239        let rho = self.compute_step_quality(
1240            state.current_cost,
1241            new_cost,
1242            step_result.predicted_reduction,
1243        );
1244
1245        // Update damping and decide whether to accept step
1246        let accepted = self.update_damping(rho);
1247
1248        let cost_reduction = if accepted {
1249            // Accept the step - parameters already updated
1250            let reduction = state.current_cost - new_cost;
1251            state.current_cost = new_cost;
1252            reduction
1253        } else {
1254            // Reject the step - revert parameter changes
1255            apply_negative_parameter_step(
1256                &mut state.variables,
1257                step_result.step.as_ref(),
1258                &state.sorted_vars,
1259            );
1260            0.0
1261        };
1262
1263        Ok(StepEvaluation {
1264            accepted,
1265            cost_reduction,
1266            rho,
1267        })
1268    }
1269
1270    /// Create optimization summary
1271    ///
1272    /// # Design Note
1273    /// This internal summary builder accepts individual scalar results from the optimization loop.
1274    /// A struct parameter would not improve readability for this private method.
1275    #[allow(clippy::too_many_arguments)]
1276    fn create_summary(
1277        &self,
1278        initial_cost: f64,
1279        final_cost: f64,
1280        rho: f64,
1281        iterations: usize,
1282        successful_steps: usize,
1283        unsuccessful_steps: usize,
1284        max_gradient_norm: f64,
1285        final_gradient_norm: f64,
1286        max_parameter_update_norm: f64,
1287        final_parameter_update_norm: f64,
1288        total_cost_reduction: f64,
1289        total_time: Duration,
1290        iteration_history: Vec<IterationStats>,
1291        convergence_status: OptimizationStatus,
1292    ) -> LevenbergMarquardtSummary {
1293        LevenbergMarquardtSummary {
1294            initial_cost,
1295            final_cost,
1296            rho,
1297            iterations,
1298            successful_steps,
1299            unsuccessful_steps,
1300            final_damping: self.config.damping,
1301            average_cost_reduction: if iterations > 0 {
1302                total_cost_reduction / iterations as f64
1303            } else {
1304                0.0
1305            },
1306            max_gradient_norm,
1307            final_gradient_norm,
1308            max_parameter_update_norm,
1309            final_parameter_update_norm,
1310            total_time,
1311            average_time_per_iteration: if iterations > 0 {
1312                total_time / iterations as u32
1313            } else {
1314                Duration::from_secs(0)
1315            },
1316            iteration_history,
1317            convergence_status,
1318        }
1319    }
1320
    pub fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, error::ApexSolverError> {
        let start_time = Instant::now();
        let mut iteration = 0;
        let mut cost_evaluations = 1; // Initial cost evaluation
        let mut jacobian_evaluations = 0;
        let mut successful_steps = 0;
        let mut unsuccessful_steps = 0;

        // Initialize optimization state
        let mut state = self.initialize_optimization_state(problem, initial_params)?;

        // Create linear solver - must be after variable initialization for Schur solver
        let mut linear_solver: Box<dyn SparseLinearSolver> = match self.config.linear_solver_type {
            LinearSolverType::SparseCholesky => Box::new(SparseCholeskySolver::new()),
            LinearSolverType::SparseQR => Box::new(SparseQRSolver::new()),
            LinearSolverType::SparseSchurComplement => Box::new(
                SchurSolverAdapter::new_with_structure_and_config(
                    &state.variables,
                    &state.variable_index_map,
                    self.config.schur_variant,
                    self.config.schur_preconditioner,
                )
                .map_err(|e| {
                    OptimizerError::LinearSolveFailed(format!(
                        "Failed to initialize Schur solver: {}",
                        e
                    ))
                    .log()
                })?,
            ),
        };

        // Initialize summary tracking variables
        let mut max_gradient_norm: f64 = 0.0;
        let mut max_parameter_update_norm: f64 = 0.0;
        let mut total_cost_reduction = 0.0;
        let mut final_gradient_norm;
        let mut final_parameter_update_norm;

        // Initialize iteration statistics tracking
        let mut iteration_stats = Vec::with_capacity(self.config.max_iterations);
        let mut previous_cost = state.current_cost;

        // Print configuration and header if debug level is enabled
        if tracing::enabled!(tracing::Level::DEBUG) {
            self.config.print_configuration();
            IterationStats::print_header();
        }

        // Main optimization loop
        loop {
            let iter_start = Instant::now();
            // Evaluate residuals and Jacobian
            let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
                &state.variables,
                &state.variable_index_map,
                &state.symbolic_structure,
            )?;
            jacobian_evaluations += 1;

            // Process Jacobian (apply scaling if enabled)
            let scaled_jacobian = if self.config.use_jacobi_scaling {
                self.process_jacobian(&jacobian, iteration)?
            } else {
                jacobian
            };

            // Compute optimization step
            let step_result = self.compute_levenberg_marquardt_step(
                &residuals,
                &scaled_jacobian,
                &mut linear_solver,
            )?;

            // Update tracking variables
            max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
            final_gradient_norm = step_result.gradient_norm;
            let step_norm = step_result.step.norm_l2();
            max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
            final_parameter_update_norm = step_norm;

            // Evaluate and apply step (handles accept/reject)
            let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
            cost_evaluations += 1;

            // Update counters based on acceptance
            if step_eval.accepted {
                successful_steps += 1;
                total_cost_reduction += step_eval.cost_reduction;
            } else {
                unsuccessful_steps += 1;
            }

            // Only collect per-iteration statistics when DEBUG logging is enabled;
            // this avoids roughly 2-5 ms of bookkeeping overhead per iteration otherwise.
            if tracing::enabled!(tracing::Level::DEBUG) {
                let iter_elapsed_ms = iter_start.elapsed().as_secs_f64() * 1000.0;
                let total_elapsed_ms = start_time.elapsed().as_secs_f64() * 1000.0;

                let stats = IterationStats {
                    iteration,
                    cost: state.current_cost,
                    cost_change: previous_cost - state.current_cost,
                    gradient_norm: step_result.gradient_norm,
                    step_norm,
                    tr_ratio: step_eval.rho,
                    tr_radius: self.config.damping,
                    ls_iter: 0, // Inner linear-solver iterations are not tracked (direct factorizations have none)
                    iter_time_ms: iter_elapsed_ms,
                    total_time_ms: total_elapsed_ms,
                    accepted: step_eval.accepted,
                };

                iteration_stats.push(stats.clone());
                stats.print_line();
            }

            previous_cost = state.current_cost;

            // Notify all observers with current state
            // First set metrics data, then notify observers
            self.observers.set_iteration_metrics(
                state.current_cost,
                step_result.gradient_norm,
                Some(self.config.damping),
                step_norm,
                Some(step_eval.rho),
            );

            // Set matrix data if available and there are observers
            if !self.observers.is_empty()
                && let (Some(hessian), Some(gradient)) =
                    (linear_solver.get_hessian(), linear_solver.get_gradient())
            {
                self.observers
                    .set_matrix_data(Some(hessian.clone()), Some(gradient.clone()));
            }

            // Notify observers with current variable values and iteration number
            self.observers.notify(&state.variables, iteration);

            // Check convergence
            let elapsed = start_time.elapsed();

            // Compute parameter norm for relative parameter tolerance check
            let parameter_norm = Self::compute_parameter_norm(&state.variables);

            // Cost used for the convergence check: `state.current_cost` already reflects an
            // accepted step and is unchanged when the step was rejected, so no branching is needed.
            let new_cost = state.current_cost;

            // Cost before this step (need to add back reduction if step was accepted)
            let cost_before_step = if step_eval.accepted {
                state.current_cost + step_eval.cost_reduction
            } else {
                state.current_cost
            };

            if let Some(status) = self.check_convergence(
                iteration,
                cost_before_step,
                new_cost,
                parameter_norm,
                step_norm,
                step_result.gradient_norm,
                self.config.trust_region_radius,
                elapsed,
                step_eval.accepted,
            ) {
                let summary = self.create_summary(
                    state.initial_cost,
                    state.current_cost,
                    step_eval.rho,
                    iteration + 1,
                    successful_steps,
                    unsuccessful_steps,
                    max_gradient_norm,
                    final_gradient_norm,
                    max_parameter_update_norm,
                    final_parameter_update_norm,
                    total_cost_reduction,
                    elapsed,
                    iteration_stats.clone(),
                    status.clone(),
                );

                // Print summary only if debug level is enabled
                if tracing::enabled!(tracing::Level::DEBUG) {
                    debug!("{}", summary);
                }

                // Compute covariances if enabled
                let covariances = if self.config.compute_covariances {
                    problem.compute_and_set_covariances(
                        &mut linear_solver,
                        &mut state.variables,
                        &state.variable_index_map,
                    )
                } else {
                    None
                };

                // Convert to HashMap for result
                let final_parameters: HashMap<String, VariableEnum> =
                    state.variables.into_iter().collect();

                // Notify observers that optimization is complete
                self.observers
                    .notify_complete(&final_parameters, iteration + 1);

                return Ok(SolverResult {
                    status,
                    iterations: iteration + 1,
                    initial_cost: state.initial_cost,
                    final_cost: state.current_cost,
                    parameters: final_parameters,
                    elapsed_time: elapsed,
                    convergence_info: Some(ConvergenceInfo {
                        final_gradient_norm,
                        final_parameter_update_norm,
                        cost_evaluations,
                        jacobian_evaluations,
                    }),
                    covariances,
                });
            }

            // Note: max-iteration and timeout checks are handled inside check_convergence()

            iteration += 1;
        }
    }
}
// Implement Solver trait
impl Solver for LevenbergMarquardt {
    type Config = LevenbergMarquardtConfig;
    type Error = error::ApexSolverError;

    fn new() -> Self {
        Self::default()
    }

    fn optimize(
        &mut self,
        problem: &Problem,
        initial_params: &HashMap<String, (ManifoldType, DVector<f64>)>,
    ) -> Result<SolverResult<HashMap<String, VariableEnum>>, Self::Error> {
        self.optimize(problem, initial_params)
    }
}
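
// A minimal sketch (not part of the crate's API) of how the `Solver` trait above lets callers
// treat optimizers generically; `run_with` is a hypothetical helper shown only for illustration
// and uses nothing beyond `Solver::new` and `Solver::optimize`.
//
// fn run_with<S: Solver>(
//     problem: &Problem,
//     initial: &HashMap<String, (ManifoldType, DVector<f64>)>,
// ) -> Result<SolverResult<HashMap<String, VariableEnum>>, S::Error> {
//     let mut solver = S::new();
//     solver.optimize(problem, initial)
// }
//
//   e.g. run_with::<LevenbergMarquardt>(&problem, &initial_values)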

#[cfg(test)]
mod tests {
    use super::*;
    use crate::factors::Factor;
    use nalgebra::{DMatrix, dvector};
    /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor1;

    impl Factor for RosenbrockFactor1 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];
            let x2 = params[1][0];

            // Residual: r1 = 10(x2 - x1²)
            let residual = dvector![10.0 * (x2 - x1 * x1)];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r1/∂x1 = -20*x1, ∂r1/∂x2 = 10
                let mut jacobian = DMatrix::zeros(1, 2);
                jacobian[(0, 0)] = -20.0 * x1;
                jacobian[(0, 1)] = 10.0;

                Some(jacobian)
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    /// Custom Rosenbrock Factor 2: r2 = 1 - x1
    /// Demonstrates extensibility - custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor2;

    impl Factor for RosenbrockFactor2 {
        fn linearize(
            &self,
            params: &[DVector<f64>],
            compute_jacobian: bool,
        ) -> (DVector<f64>, Option<DMatrix<f64>>) {
            let x1 = params[0][0];

            // Residual: r2 = 1 - x1
            let residual = dvector![1.0 - x1];

            let jacobian = if compute_jacobian {
                // Jacobian: ∂r2/∂x1 = -1
                Some(DMatrix::from_element(1, 1, -1.0))
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    #[test]
    fn test_rosenbrock_optimization() -> Result<(), Box<dyn std::error::Error>> {
        // Rosenbrock function test:
        // Minimize: r1² + r2² where
        //   r1 = 10(x2 - x1²)
        //   r2 = 1 - x1
        // Starting point: [-1.2, 1.0]
        // Expected minimum: [1.0, 1.0]

        let mut problem = Problem::new();
        let mut initial_values = HashMap::new();

        // Add variables using Rn manifold (Euclidean space)
        initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
        initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));

        // Add custom factors (demonstrates extensibility!)
        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);

        // Configure Levenberg-Marquardt optimizer
        let config = LevenbergMarquardtConfig::new()
            .with_max_iterations(100)
            .with_cost_tolerance(1e-8)
            .with_parameter_tolerance(1e-8)
            .with_gradient_tolerance(1e-10);

        let mut solver = LevenbergMarquardt::with_config(config);
        let result = solver.optimize(&problem, &initial_values)?;

        // Extract final values
        let x1_final = result
            .parameters
            .get("x1")
            .ok_or("x1 not found")?
            .to_vector()[0];
        let x2_final = result
            .parameters
            .get("x2")
            .ok_or("x2 not found")?
            .to_vector()[0];

        // Verify convergence to [1.0, 1.0]
        assert!(
            matches!(
                result.status,
                OptimizationStatus::Converged
                    | OptimizationStatus::CostToleranceReached
                    | OptimizationStatus::ParameterToleranceReached
                    | OptimizationStatus::GradientToleranceReached
            ),
            "Optimization should converge"
        );
        assert!(
            (x1_final - 1.0).abs() < 1e-4,
            "x1 should converge to 1.0, got {}",
            x1_final
        );
        assert!(
            (x2_final - 1.0).abs() < 1e-4,
            "x2 should converge to 1.0, got {}",
            x2_final
        );
        assert!(
            result.final_cost < 1e-6,
            "Final cost should be near zero, got {}",
            result.final_cost
        );
        Ok(())
    }
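
    /// A second, minimal sketch of a test: the same Rosenbrock problem solved through a
    /// default-configured solver. It reuses the factors and starting point from
    /// `test_rosenbrock_optimization`; only the construction path differs, and the assertion is
    /// deliberately loose because the default tolerances are assumed rather than specified here.
    #[test]
    fn test_rosenbrock_with_default_config() -> Result<(), Box<dyn std::error::Error>> {
        let mut problem = Problem::new();
        let mut initial_values = HashMap::new();
        initial_values.insert("x1".to_string(), (ManifoldType::RN, dvector![-1.2]));
        initial_values.insert("x2".to_string(), (ManifoldType::RN, dvector![1.0]));

        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);

        // `LevenbergMarquardt` implements `Default` (see `Solver::new` above), so the default
        // configuration can be used directly.
        let mut solver = LevenbergMarquardt::default();
        let result = solver.optimize(&problem, &initial_values)?;

        // With unspecified default tolerances, only assert that the cost did not increase.
        assert!(
            result.final_cost <= result.initial_cost,
            "cost should not increase: initial {}, final {}",
            result.initial_cost,
            result.final_cost
        );
        Ok(())
    }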
}