apex_solver/optimizer/
dog_leg.rs

//! Dog Leg trust region optimization algorithm implementation.
//!
//! The Dog Leg method is a robust trust region algorithm for solving nonlinear least squares problems:
//!
//! ```text
//! min f(x) = ½||r(x)||² = ½Σᵢ rᵢ(x)²
//! ```
//!
//! where `r: ℝⁿ → ℝᵐ` is the residual vector function.
//!
//! # Algorithm Overview
//!
//! Powell's Dog Leg method constructs a piecewise linear path within a spherical trust region
//! of radius Δ, connecting three key points:
//!
//! 1. **Origin** (current position)
//! 2. **Cauchy Point** p_c = -α·g (optimal steepest descent step)
//! 3. **Gauss-Newton Point** h_gn (full Gauss-Newton step solving J^T·J·h = -J^T·r)
//!
//! The "dog leg" path travels from the origin to the Cauchy point, then from the Cauchy point
//! toward the Gauss-Newton point, stopping at the trust region boundary.
//!
//! ## Step Selection Strategy
//!
//! Given trust region radius Δ, the algorithm selects a step h based on three cases:
//!
//! **Case 1: GN step inside trust region** (`||h_gn|| ≤ Δ`)
//! ```text
//! h = h_gn  (full Gauss-Newton step)
//! ```
//!
//! **Case 2: Cauchy point at or beyond the boundary** (`||p_c|| ≥ Δ`)
//! ```text
//! h = (Δ / ||g||) · (-g)  (scaled steepest descent to boundary)
//! ```
//!
//! **Case 3: Dog leg interpolation** (`||p_c|| < Δ < ||h_gn||`)
//! ```text
//! h(β) = p_c + β·(h_gn - p_c),  where β ∈ [0,1] satisfies ||h(β)|| = Δ
//! ```
//!
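//! As a minimal sketch (illustrative only; `dog_leg_step` and the plain-slice
//! representation are not part of this crate's API), the three cases look like:
//!
//! ```rust
//! /// Select the dog leg step from the Gauss-Newton step `h_gn`, the Cauchy
//! /// point `p_c`, the gradient `g`, and the trust region radius `delta`.
//! fn dog_leg_step(h_gn: &[f64], p_c: &[f64], g: &[f64], delta: f64) -> Vec<f64> {
//!     let norm = |v: &[f64]| v.iter().map(|x| x * x).sum::<f64>().sqrt();
//!     if norm(h_gn) <= delta {
//!         h_gn.to_vec() // Case 1: full Gauss-Newton step
//!     } else if norm(p_c) >= delta {
//!         // Case 2: steepest descent, scaled to the trust region boundary
//!         let s = delta / norm(g);
//!         g.iter().map(|gi| -gi * s).collect()
//!     } else {
//!         // Case 3: walk from p_c toward h_gn until ||h|| = delta
//!         let v: Vec<f64> = h_gn.iter().zip(p_c).map(|(a, b)| a - b).collect();
//!         let a: f64 = v.iter().map(|x| x * x).sum();
//!         let b: f64 = p_c.iter().zip(&v).map(|(p, vi)| p * vi).sum();
//!         let c: f64 = p_c.iter().map(|x| x * x).sum::<f64>() - delta * delta;
//!         let beta = (-b + (b * b - a * c).sqrt()) / a;
//!         p_c.iter().zip(&v).map(|(p, vi)| p + beta * vi).collect()
//!     }
//! }
//! ```
//!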
//! ## Cauchy Point Computation
//!
//! The Cauchy point is the optimal step along the steepest descent direction -g:
//!
//! ```text
//! α = (g^T·g) / (g^T·H·g)  where H = J^T·J
//! p_c = -α·g
//! ```
//!
//! This minimizes the quadratic model along the gradient direction.
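//!
//! A hedged sketch of this computation (`cauchy_alpha` is illustrative, not this
//! crate's API; `hg` stands for the precomputed product H·g):
//!
//! ```rust
//! /// α = (gᵀ·g) / (gᵀ·H·g), given g and hg = H·g as plain slices.
//! fn cauchy_alpha(g: &[f64], hg: &[f64]) -> f64 {
//!     let num: f64 = g.iter().map(|x| x * x).sum();
//!     let den: f64 = g.iter().zip(hg).map(|(a, b)| a * b).sum();
//!     num / den
//! }
//! ```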
//!
//! ## Trust Region Management
//!
//! The trust region radius Δ adapts based on the gain ratio:
//!
//! ```text
//! ρ = (actual reduction) / (predicted reduction)
//! ```
//!
//! **Good step** (`ρ > 0.75`): Increase radius `Δ ← max(Δ, 3·||h||)`
//! **Poor step** (`ρ < 0.25`): Decrease radius `Δ ← Δ/2`
//! **Moderate step**: Keep radius unchanged
//!
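//! A compact sketch of the update rule (illustrative; `update_radius` is not part
//! of this crate's API, and the real implementation also clamps Δ to configured bounds):
//!
//! ```rust
//! /// Adapt the trust region radius from the gain ratio ρ and the step norm.
//! fn update_radius(delta: f64, rho: f64, step_norm: f64) -> f64 {
//!     if rho > 0.75 {
//!         delta.max(3.0 * step_norm) // good step: grow
//!     } else if rho < 0.25 {
//!         delta / 2.0 // poor step: shrink
//!     } else {
//!         delta // moderate step: keep
//!     }
//! }
//! ```
//!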
//! # Advanced Features (Ceres Solver Enhancements)
//!
//! This implementation includes several improvements from Google's Ceres Solver:
//!
//! ## 1. Adaptive μ Regularization
//!
//! Instead of solving `J^T·J·h = -J^T·r` directly, we solve:
//!
//! ```text
//! (J^T·J + μI)·h = -J^T·r
//! ```
//!
//! where μ adapts to handle ill-conditioned Hessians:
//! - **Increases** (×10) when linear solve fails
//! - **Decreases** (÷5) when steps are accepted
//! - **Bounded** between `min_mu` (1e-8) and `max_mu` (1.0)
//!
//! Default `initial_mu = 1e-4` provides good numerical stability.
//!
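//! A sketch of the adaptation (illustrative; the actual solver tracks step
//! acceptance separately, and `update_mu` is not part of this crate's API):
//!
//! ```rust
//! /// Grow μ by 10 when the linear solve fails, shrink it by 5 otherwise,
//! /// and keep it within [1e-8, 1.0].
//! fn update_mu(mu: f64, solve_failed: bool) -> f64 {
//!     let next = if solve_failed { mu * 10.0 } else { mu / 5.0 };
//!     next.clamp(1e-8, 1.0)
//! }
//! ```
//!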
//! ## 2. Numerically Robust Beta Computation
//!
//! The dog leg interpolation parameter β solves `||p_c + β·v||² = Δ²` with `v = h_gn - p_c`,
//! i.e. the quadratic `a·β² + 2b·β + c = 0` where `a = vᵀ·v`, `b = p_cᵀ·v`, and
//! `c = p_cᵀ·p_c - Δ²`.
//!
//! Two formulas are used to avoid catastrophic cancellation:
//! ```text
//! If b ≤ 0:  β = (-b + √(b²-ac)) / a    (standard formula)
//! If b > 0:  β = -c / (b + √(b²-ac))    (alternative, avoids cancellation)
//! ```
//!
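//! The branch can be sketched as (illustrative; `robust_beta` is not this crate's API):
//!
//! ```rust
//! /// Positive root of a·β² + 2b·β + c = 0 (with c < 0), avoiding cancellation.
//! fn robust_beta(a: f64, b: f64, c: f64) -> f64 {
//!     let d = (b * b - a * c).sqrt();
//!     if b <= 0.0 { (-b + d) / a } else { -c / (b + d) }
//! }
//! ```
//!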
//! ## 3. Step Reuse Mechanism
//!
//! When a step is rejected, the solver caches the Gauss-Newton step, Cauchy point,
//! and gradient. On the next iteration (with a smaller Δ), it reuses these cached
//! values instead of recomputing them. This avoids expensive linear solves when the
//! trust region shrinks.
//!
//! **Safety limits:**
//! - Maximum 5 consecutive reuses before forcing fresh computation
//! - Cache invalidated when steps are accepted (parameters have moved)
//!
//! ## 4. Jacobi Scaling (Diagonal Preconditioning)
//!
//! Optionally applies column scaling to J before forming J^T·J:
//! ```text
//! S_ii = 1 / (1 + ||J_i||)  where J_i is column i
//! ```
//!
//! This creates an **elliptical trust region** instead of a spherical one, improving
//! convergence for problems with mixed parameter scales.
//!
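//! A sketch of the scale factors (illustrative; `jacobi_scales` is not this crate's API):
//!
//! ```rust
//! /// S_ii = 1 / (1 + ||J_i||), computed from per-column norms of J.
//! fn jacobi_scales(col_norms: &[f64]) -> Vec<f64> {
//!     col_norms.iter().map(|n| 1.0 / (1.0 + n)).collect()
//! }
//! ```
//!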
//! # Mathematical Background
//!
//! ## Why Dog Leg Works
//!
//! The dog leg path is a cheap approximation to the optimal trust region step
//! (which would require solving a constrained optimization problem). It:
//!
//! 1. Exploits the fact that optimal steps often lie near the 2D subspace
//!    spanned by the gradient and Gauss-Newton directions
//! 2. Provides global convergence guarantees (always finds a descent direction)
//! 3. Achieves local quadratic convergence (like Gauss-Newton near the solution)
//! 4. Requires only one linear solve per iteration (same as Gauss-Newton)
//!
//! ## Convergence Properties
//!
//! - **Global convergence**: Guaranteed descent at each iteration
//! - **Local quadratic convergence**: Reduces to Gauss-Newton near the solution
//! - **Robustness**: Handles ill-conditioning via trust region + μ regularization
//! - **Efficiency**: Comparable cost to Gauss-Newton with better reliability
//!
//! # When to Use
//!
//! Dog Leg is an excellent choice when:
//! - You want explicit control over step size (via the trust region radius)
//! - The problem may be ill-conditioned
//! - You need guaranteed descent at each iteration
//! - The initial guess may be poor but you want reliable convergence
//!
//! Compared to alternatives:
//! - **vs Gauss-Newton**: More robust, with similar computational cost
//! - **vs Levenberg-Marquardt**: Explicit trust region vs implicit damping
//! - Both Dog Leg and LM are excellent general-purpose choices
//!
//! # Examples
//!
//! ## Basic usage
//!
//! ```no_run
//! use apex_solver::optimizer::DogLeg;
//! use apex_solver::core::problem::Problem;
//! use std::collections::HashMap;
//!
//! # fn main() -> Result<(), Box<dyn std::error::Error>> {
//! let mut problem = Problem::new();
//! // ... add residual blocks (factors) to problem ...
//!
//! let initial_values = HashMap::new();
//! // ... initialize parameters ...
//!
//! let mut solver = DogLeg::new();
//! let result = solver.optimize(&problem, &initial_values)?;
//! # Ok(())
//! # }
//! ```
//!
//! ## Advanced configuration with Ceres enhancements
//!
//! ```no_run
//! use apex_solver::optimizer::dog_leg::{DogLegConfig, DogLeg};
//! use apex_solver::linalg::LinearSolverType;
//!
//! # fn main() {
//! let config = DogLegConfig::new()
//!     .with_max_iterations(100)
//!     .with_trust_region_radius(1e4)  // Large initial radius
//!     .with_trust_region_bounds(1e-3, 1e6)  // Min/max radius
//!     .with_mu_params(1e-4, 1e-8, 1.0, 10.0)  // Conservative regularization
//!     .with_jacobi_scaling(true)  // Enable elliptical trust regions
//!     .with_step_reuse(true);  // Enable Ceres-style caching
//!
//! let mut solver = DogLeg::with_config(config);
//! # }
//! ```
//!
//! # References
//!
//! - Powell, M. J. D. (1970). "A Hybrid Method for Nonlinear Equations". *Numerical Methods for Nonlinear Algebraic Equations*. Gordon and Breach.
//! - Nocedal, J. & Wright, S. (2006). *Numerical Optimization* (2nd ed.). Springer. Chapter 4 (Trust Region Methods).
//! - Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). *Methods for Non-Linear Least Squares Problems* (2nd ed.). Chapter 6.
//! - Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). *Trust-Region Methods*. SIAM.
//! - Ceres Solver: <http://ceres-solver.org/> - Google's C++ nonlinear least squares library

use crate::{core::problem, error, linalg, manifold, optimizer};

// Note: Visualization support via observer pattern will be added in a future update
// For now, use LevenbergMarquardt optimizer for visualization support

use faer::sparse;
use std::{collections, fmt, time};
use tracing::debug;

/// Summary statistics for the Dog Leg optimization process.
#[derive(Debug, Clone)]
pub struct DogLegSummary {
    /// Initial cost value
    pub initial_cost: f64,
    /// Final cost value
    pub final_cost: f64,
    /// Total number of iterations performed
    pub iterations: usize,
    /// Number of successful steps (cost decreased)
    pub successful_steps: usize,
    /// Number of unsuccessful steps (cost increased, step rejected)
    pub unsuccessful_steps: usize,
    /// Final trust region radius
    pub final_trust_region_radius: f64,
    /// Average cost reduction per iteration
    pub average_cost_reduction: f64,
    /// Maximum gradient norm encountered
    pub max_gradient_norm: f64,
    /// Final gradient norm
    pub final_gradient_norm: f64,
    /// Maximum parameter update norm
    pub max_parameter_update_norm: f64,
    /// Final parameter update norm
    pub final_parameter_update_norm: f64,
    /// Total time elapsed
    pub total_time: time::Duration,
    /// Average time per iteration
    pub average_time_per_iteration: time::Duration,
    /// Detailed per-iteration statistics history
    pub iteration_history: Vec<IterationStats>,
    /// Convergence status
    pub convergence_status: optimizer::OptimizationStatus,
}

/// Per-iteration statistics for detailed logging (Ceres-style output).
///
/// Captures all relevant metrics for each optimization iteration, enabling
/// detailed analysis and debugging of the optimization process.
#[derive(Debug, Clone)]
pub struct IterationStats {
    /// Iteration number (0-indexed)
    pub iteration: usize,
    /// Cost function value at this iteration
    pub cost: f64,
    /// Change in cost from previous iteration
    pub cost_change: f64,
    /// L2 norm of the gradient (||J^T·r||)
    pub gradient_norm: f64,
    /// L2 norm of the parameter update step (||Δx||)
    pub step_norm: f64,
    /// Trust region ratio (ρ = actual_reduction / predicted_reduction)
    pub tr_ratio: f64,
    /// Trust region radius (Δ)
    pub tr_radius: f64,
    /// Linear solver iterations (0 for direct solvers like Cholesky)
    pub ls_iter: usize,
    /// Time taken for this iteration in milliseconds
    pub iter_time_ms: f64,
    /// Total elapsed time since optimization started in milliseconds
    pub total_time_ms: f64,
    /// Whether the step was accepted (true) or rejected (false)
    pub accepted: bool,
}

impl IterationStats {
    /// Print table header in Ceres-style format
    pub fn print_header() {
        debug!(
            "{:>4}  {:>13}  {:>13}  {:>13}  {:>13}  {:>11}  {:>11}  {:>7}  {:>11}  {:>13}  {:>6}",
            "iter",
            "cost",
            "cost_change",
            "|gradient|",
            "|step|",
            "tr_ratio",
            "tr_radius",
            "ls_iter",
            "iter_time",
            "total_time",
            "status"
        );
    }

    /// Print single iteration line in Ceres-style format with scientific notation
    pub fn print_line(&self) {
        let status = if self.iteration == 0 {
            "-"
        } else if self.accepted {
            "✓"
        } else {
            "✗"
        };

        debug!(
            "{:>4}  {:>13.6e}  {:>13.2e}  {:>13.2e}  {:>13.2e}  {:>11.2e}  {:>11.2e}  {:>7}  {:>9.2}ms  {:>11.2}ms  {:>6}",
            self.iteration,
            self.cost,
            self.cost_change,
            self.gradient_norm,
            self.step_norm,
            self.tr_ratio,
            self.tr_radius,
            self.ls_iter,
            self.iter_time_ms,
            self.total_time_ms,
            status
        );
    }
}

impl fmt::Display for DogLegSummary {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Determine if converged
        let converged = matches!(
            self.convergence_status,
            optimizer::OptimizationStatus::Converged
                | optimizer::OptimizationStatus::CostToleranceReached
                | optimizer::OptimizationStatus::GradientToleranceReached
                | optimizer::OptimizationStatus::ParameterToleranceReached
        );

        writeln!(f, "Dog-Leg Final Result")?;

        // Title with convergence status
        if converged {
            writeln!(f, "CONVERGED ({:?})", self.convergence_status)?;
        } else {
            writeln!(f, "DIVERGED ({:?})", self.convergence_status)?;
        }

        writeln!(f)?;
        writeln!(f, "Cost:")?;
        writeln!(f, "  Initial:   {:.6e}", self.initial_cost)?;
        writeln!(f, "  Final:     {:.6e}", self.final_cost)?;
        writeln!(
            f,
            "  Reduction: {:.6e} ({:.2}%)",
            self.initial_cost - self.final_cost,
            100.0 * (self.initial_cost - self.final_cost) / self.initial_cost.max(1e-12)
        )?;
        writeln!(f)?;
        writeln!(f, "Iterations:")?;
        writeln!(f, "  Total:              {}", self.iterations)?;
        writeln!(
            f,
            "  Successful steps:   {} ({:.1}%)",
            self.successful_steps,
            100.0 * self.successful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(
            f,
            "  Unsuccessful steps: {} ({:.1}%)",
            self.unsuccessful_steps,
            100.0 * self.unsuccessful_steps as f64 / self.iterations.max(1) as f64
        )?;
        writeln!(f)?;
        writeln!(f, "Trust Region:")?;
        writeln!(f, "  Final radius: {:.6e}", self.final_trust_region_radius)?;
        writeln!(f)?;
        writeln!(f, "Gradient:")?;
        writeln!(f, "  Max norm:   {:.2e}", self.max_gradient_norm)?;
        writeln!(f, "  Final norm: {:.2e}", self.final_gradient_norm)?;
        writeln!(f)?;
        writeln!(f, "Parameter Update:")?;
        writeln!(f, "  Max norm:   {:.2e}", self.max_parameter_update_norm)?;
        writeln!(f, "  Final norm: {:.2e}", self.final_parameter_update_norm)?;
        writeln!(f)?;
        writeln!(f, "Performance:")?;
        writeln!(
            f,
            "  Total time:             {:.2}ms",
            self.total_time.as_secs_f64() * 1000.0
        )?;
        writeln!(
            f,
            "  Average per iteration:  {:.2}ms",
            self.average_time_per_iteration.as_secs_f64() * 1000.0
        )?;

        Ok(())
    }
}

/// Configuration parameters for the Dog Leg trust region optimizer.
///
/// Controls trust region management, convergence criteria, adaptive regularization,
/// and Ceres Solver enhancements for the Dog Leg algorithm.
///
/// # Builder Pattern
///
/// All configuration options can be set using the builder pattern:
///
/// ```
/// use apex_solver::optimizer::dog_leg::DogLegConfig;
///
/// let config = DogLegConfig::new()
///     .with_max_iterations(100)
///     .with_trust_region_radius(1e4)
///     .with_mu_params(1e-4, 1e-8, 1.0, 10.0)
///     .with_jacobi_scaling(true)
///     .with_step_reuse(true);
/// ```
///
/// # Trust Region Behavior
///
/// The trust region radius Δ controls the maximum allowed step size:
///
/// - **Initial radius** (`trust_region_radius`): Starting value (default: 1e4)
/// - **Bounds** (`trust_region_min`, `trust_region_max`): Valid range (default: 1e-12 to 1e12)
/// - **Adaptation**: Increases for good steps, decreases for poor steps
///
/// # Adaptive μ Regularization (Ceres Enhancement)
///
/// Controls the regularization parameter in `(J^T·J + μI)·h = -J^T·r`:
///
/// - `initial_mu`: Starting value (default: 1e-4 for numerical stability)
/// - `min_mu`, `max_mu`: Bounds (default: 1e-8 to 1.0)
/// - `mu_increase_factor`: Multiplier when solve fails (default: 10.0)
///
/// # Convergence Criteria
///
/// The optimizer terminates when ANY of the following conditions is met:
///
/// - **Cost tolerance**: `|cost_k - cost_{k-1}| < cost_tolerance`
/// - **Parameter tolerance**: `||step|| < parameter_tolerance`
/// - **Gradient tolerance**: `||J^T·r|| < gradient_tolerance`
/// - **Maximum iterations**: `iteration >= max_iterations`
/// - **Timeout**: `elapsed_time >= timeout`
///
/// # See Also
///
/// - [`DogLeg`] - The solver that uses this configuration
/// - [`LevenbergMarquardtConfig`](crate::optimizer::LevenbergMarquardtConfig) - Alternative damping approach
/// - [`GaussNewtonConfig`](crate::optimizer::GaussNewtonConfig) - Undamped variant
#[derive(Clone)]
pub struct DogLegConfig {
    /// Type of linear solver for the linear systems
    pub linear_solver_type: linalg::LinearSolverType,
    /// Maximum number of iterations
    pub max_iterations: usize,
    /// Convergence tolerance for cost function
    pub cost_tolerance: f64,
    /// Convergence tolerance for parameter updates
    pub parameter_tolerance: f64,
    /// Convergence tolerance for gradient norm
    pub gradient_tolerance: f64,
    /// Timeout duration
    pub timeout: Option<time::Duration>,
    /// Initial trust region radius
    pub trust_region_radius: f64,
    /// Minimum trust region radius
    pub trust_region_min: f64,
    /// Maximum trust region radius
    pub trust_region_max: f64,
    /// Trust region increase factor (for good steps, rho > 0.75)
    pub trust_region_increase_factor: f64,
    /// Trust region decrease factor (for poor steps, rho < 0.25)
    pub trust_region_decrease_factor: f64,
    /// Minimum step quality for acceptance (typically 0.0)
    pub min_step_quality: f64,
    /// Good step quality threshold (typically 0.75)
    pub good_step_quality: f64,
    /// Poor step quality threshold (typically 0.25)
    pub poor_step_quality: f64,
    /// Use Jacobi column scaling (preconditioning)
    pub use_jacobi_scaling: bool,

    // Ceres-style adaptive mu regularization parameters
    /// Initial mu regularization parameter for Gauss-Newton step
    pub initial_mu: f64,
    /// Minimum mu regularization parameter
    pub min_mu: f64,
    /// Maximum mu regularization parameter
    pub max_mu: f64,
    /// Factor to increase mu when linear solver fails
    pub mu_increase_factor: f64,

    // Ceres-style step reuse optimization
    /// Enable step reuse after rejection (Ceres-style efficiency optimization)
    pub enable_step_reuse: bool,

    /// Minimum objective function cutoff (optional early termination)
    ///
    /// If set, optimization terminates when cost falls below this threshold.
    /// Useful for early stopping when a "good enough" solution is acceptable.
    ///
    /// Default: None (disabled)
    pub min_cost_threshold: Option<f64>,

    /// Maximum condition number for Jacobian matrix (optional check)
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing condition number is expensive, so this is disabled by default.
    ///
    /// Default: None (disabled)
    pub max_condition_number: Option<f64>,

    /// Minimum relative cost decrease for step acceptance
    ///
    /// Used in computing step quality (rho = actual_reduction / predicted_reduction).
    /// Steps with rho < min_relative_decrease are rejected. Matches Ceres Solver's
    /// min_relative_decrease parameter.
    ///
    /// Default: 1e-3 (Ceres-compatible)
    pub min_relative_decrease: f64,

    /// Compute per-variable covariance matrices (uncertainty estimation)
    ///
    /// When enabled, computes covariance by inverting the Hessian matrix after
    /// convergence. The full covariance matrix is extracted into per-variable
    /// blocks stored in both Variable structs and optimizer::SolverResult.
    ///
    /// Default: false (to avoid performance overhead)
    pub compute_covariances: bool,

    /// Enable real-time visualization (graphical debugging).
    ///
    /// When enabled, optimization progress is logged to a Rerun viewer.
    /// **Note:** Requires the `visualization` feature to be enabled in `Cargo.toml`.
    ///
    /// Default: false
    #[cfg(feature = "visualization")]
    pub enable_visualization: bool,
}

impl Default for DogLegConfig {
    fn default() -> Self {
        Self {
            linear_solver_type: linalg::LinearSolverType::default(),
            // Ceres Solver default: 50 (changed from 100 for compatibility)
            max_iterations: 50,
            // Ceres Solver default: 1e-6 (changed from 1e-8 for compatibility)
            cost_tolerance: 1e-6,
            // Ceres Solver default: 1e-8 (unchanged)
            parameter_tolerance: 1e-8,
            // Ceres Solver default: 1e-10 (changed from 1e-8 for compatibility)
            gradient_tolerance: 1e-10,
            timeout: None,
            // Ceres-style: larger initial radius for better global convergence
            trust_region_radius: 1e4,
            trust_region_min: 1e-12,
            trust_region_max: 1e12,
            // Ceres uses adaptive increase (max(radius, 3*step_norm)),
            // but we keep factor for simpler config
            trust_region_increase_factor: 3.0,
            trust_region_decrease_factor: 0.5,
            min_step_quality: 0.0,
            good_step_quality: 0.75,
            poor_step_quality: 0.25,
            // Ceres-style: Enable diagonal scaling by default for elliptical trust region
            use_jacobi_scaling: true,

            // Ceres-style adaptive mu regularization defaults
            // Start with more conservative regularization to avoid singular Hessian
            initial_mu: 1e-4,
            min_mu: 1e-8,
            max_mu: 1.0,
            mu_increase_factor: 10.0,

            // Ceres-style step reuse optimization
            enable_step_reuse: true,

            // New Ceres-compatible termination parameters
            min_cost_threshold: None,
            max_condition_number: None,
            min_relative_decrease: 1e-3,

            compute_covariances: false,
            #[cfg(feature = "visualization")]
            enable_visualization: false,
        }
    }
}

impl DogLegConfig {
    /// Create a new Dog Leg configuration with default values.
    pub fn new() -> Self {
        Self::default()
    }

    /// Set the linear solver type
    pub fn with_linear_solver_type(mut self, linear_solver_type: linalg::LinearSolverType) -> Self {
        self.linear_solver_type = linear_solver_type;
        self
    }

    /// Set the maximum number of iterations
    pub fn with_max_iterations(mut self, max_iterations: usize) -> Self {
        self.max_iterations = max_iterations;
        self
    }

    /// Set the cost tolerance
    pub fn with_cost_tolerance(mut self, cost_tolerance: f64) -> Self {
        self.cost_tolerance = cost_tolerance;
        self
    }

    /// Set the parameter tolerance
    pub fn with_parameter_tolerance(mut self, parameter_tolerance: f64) -> Self {
        self.parameter_tolerance = parameter_tolerance;
        self
    }

    /// Set the gradient tolerance
    pub fn with_gradient_tolerance(mut self, gradient_tolerance: f64) -> Self {
        self.gradient_tolerance = gradient_tolerance;
        self
    }

    /// Set the timeout duration
    pub fn with_timeout(mut self, timeout: time::Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }

    /// Set the initial trust region radius
    pub fn with_trust_region_radius(mut self, radius: f64) -> Self {
        self.trust_region_radius = radius;
        self
    }

    /// Set the trust region radius bounds
    pub fn with_trust_region_bounds(mut self, min: f64, max: f64) -> Self {
        self.trust_region_min = min;
        self.trust_region_max = max;
        self
    }

    /// Set the trust region adjustment factors
    pub fn with_trust_region_factors(mut self, increase: f64, decrease: f64) -> Self {
        self.trust_region_increase_factor = increase;
        self.trust_region_decrease_factor = decrease;
        self
    }

    /// Set the trust region quality thresholds
    pub fn with_step_quality_thresholds(
        mut self,
        min_quality: f64,
        poor_quality: f64,
        good_quality: f64,
    ) -> Self {
        self.min_step_quality = min_quality;
        self.poor_step_quality = poor_quality;
        self.good_step_quality = good_quality;
        self
    }

    /// Enable or disable Jacobi column scaling (preconditioning)
    pub fn with_jacobi_scaling(mut self, use_jacobi_scaling: bool) -> Self {
        self.use_jacobi_scaling = use_jacobi_scaling;
        self
    }

    /// Set adaptive mu regularization parameters (Ceres-style)
    pub fn with_mu_params(
        mut self,
        initial_mu: f64,
        min_mu: f64,
        max_mu: f64,
        increase_factor: f64,
    ) -> Self {
        self.initial_mu = initial_mu;
        self.min_mu = min_mu;
        self.max_mu = max_mu;
        self.mu_increase_factor = increase_factor;
        self
    }

    /// Enable or disable step reuse optimization (Ceres-style)
    pub fn with_step_reuse(mut self, enable_step_reuse: bool) -> Self {
        self.enable_step_reuse = enable_step_reuse;
        self
    }

    /// Set minimum objective function cutoff for early termination.
    ///
    /// When set, optimization terminates with MinCostThresholdReached status
    /// if the cost falls below this threshold. Useful for early stopping when
    /// a "good enough" solution is acceptable.
    pub fn with_min_cost_threshold(mut self, min_cost: f64) -> Self {
        self.min_cost_threshold = Some(min_cost);
        self
    }

    /// Set maximum condition number for Jacobian matrix.
    ///
    /// If set, the optimizer checks if condition_number(J^T*J) exceeds this
    /// threshold and terminates with IllConditionedJacobian status.
    /// Note: Computing condition number is expensive, disabled by default.
    pub fn with_max_condition_number(mut self, max_cond: f64) -> Self {
        self.max_condition_number = Some(max_cond);
        self
    }

    /// Set minimum relative cost decrease for step acceptance.
    ///
    /// Steps with rho = (actual_reduction / predicted_reduction) below this
    /// threshold are rejected. Default: 1e-3 (Ceres-compatible)
    pub fn with_min_relative_decrease(mut self, min_decrease: f64) -> Self {
        self.min_relative_decrease = min_decrease;
        self
    }

    /// Enable or disable covariance computation (uncertainty estimation).
    ///
    /// When enabled, computes the full covariance matrix by inverting the Hessian
    /// after convergence, then extracts per-variable covariance blocks.
    pub fn with_compute_covariances(mut self, compute_covariances: bool) -> Self {
        self.compute_covariances = compute_covariances;
        self
    }

    /// Enable real-time visualization.
    ///
    /// **Note:** Requires the `visualization` feature to be enabled in `Cargo.toml`.
    ///
    /// # Arguments
    ///
    /// * `enable` - Whether to enable visualization
    #[cfg(feature = "visualization")]
    pub fn with_visualization(mut self, enable: bool) -> Self {
        self.enable_visualization = enable;
        self
    }

    /// Print configuration parameters (debug level logging)
    pub fn print_configuration(&self) {
        debug!(
            "Configuration:\n  Solver:        Dog-Leg\n  Linear solver: {:?}\n  Loss function: N/A\n\nConvergence Criteria:\n  Max iterations:      {}\n  Cost tolerance:      {:.2e}\n  Parameter tolerance: {:.2e}\n  Gradient tolerance:  {:.2e}\n  Timeout:             {:?}\n\nTrust Region:\n  Initial radius:      {:.2e}\n  Radius range:        [{:.2e}, {:.2e}]\n  Min step quality:    {:.2}\n  Good step quality:   {:.2}\n  Poor step quality:   {:.2}\n\nRegularization:\n  Initial mu:          {:.2e}\n  Mu range:            [{:.2e}, {:.2e}]\n  Mu increase factor:  {:.2}\n\nNumerical Settings:\n  Jacobi scaling:      {}\n  Step reuse:          {}\n  Compute covariances: {}",
            self.linear_solver_type,
            self.max_iterations,
            self.cost_tolerance,
            self.parameter_tolerance,
            self.gradient_tolerance,
            self.timeout,
            self.trust_region_radius,
            self.trust_region_min,
            self.trust_region_max,
            self.min_step_quality,
            self.good_step_quality,
            self.poor_step_quality,
            self.initial_mu,
            self.min_mu,
            self.max_mu,
            self.mu_increase_factor,
            if self.use_jacobi_scaling {
                "enabled"
            } else {
                "disabled"
            },
            if self.enable_step_reuse {
                "enabled"
            } else {
                "disabled"
            },
            if self.compute_covariances {
                "enabled"
            } else {
                "disabled"
            }
        );
    }
}

/// State for optimization iteration
struct OptimizationState {
    variables: collections::HashMap<String, problem::VariableEnum>,
    variable_index_map: collections::HashMap<String, usize>,
    sorted_vars: Vec<String>,
    symbolic_structure: problem::SymbolicStructure,
    current_cost: f64,
    initial_cost: f64,
}

/// Result from step computation
struct StepResult {
    step: faer::Mat<f64>,
    gradient_norm: f64,
    predicted_reduction: f64,
}

/// Type of step taken
#[derive(Debug, Clone, Copy)]
enum StepType {
    /// Full Gauss-Newton step
    GaussNewton,
    /// Scaled steepest descent (Cauchy point)
    SteepestDescent,
    /// Dog leg interpolation
    DogLeg,
}

impl fmt::Display for StepType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StepType::GaussNewton => write!(f, "GN"),
            StepType::SteepestDescent => write!(f, "SD"),
            StepType::DogLeg => write!(f, "DL"),
        }
    }
}

/// Result from step evaluation
struct StepEvaluation {
    accepted: bool,
    cost_reduction: f64,
    rho: f64,
}

/// Dog Leg trust region solver for nonlinear least squares optimization.
///
/// Implements Powell's Dog Leg algorithm with Ceres Solver enhancements including
/// adaptive μ regularization, numerically robust beta computation, and step reuse caching.
///
/// # Algorithm
///
/// At each iteration k:
/// 1. Compute residual `r(xₖ)` and Jacobian `J(xₖ)`
/// 2. Solve for Gauss-Newton step: `(J^T·J + μI)·h_gn = -J^T·r`
/// 3. Compute steepest descent direction: `-g` where `g = J^T·r`
/// 4. Compute Cauchy point: `p_c = -α·g` (optimal along steepest descent)
/// 5. Construct dog leg step based on trust region radius Δ:
///    - If `||h_gn|| ≤ Δ`: Take full GN step
///    - Else if `||p_c|| ≥ Δ`: Take scaled SD to boundary
///    - Else: Interpolate `h = p_c + β·(h_gn - p_c)` where `||h|| = Δ`
/// 6. Evaluate gain ratio: `ρ = (actual reduction) / (predicted reduction)`
/// 7. Update trust region radius based on ρ
/// 8. Accept/reject step and update parameters
///
838/// # Ceres Solver Enhancements
839///
840/// This implementation includes four major improvements from Google's Ceres Solver:
841///
842/// **1. Adaptive μ Regularization:** Dynamically adjusts regularization parameter
843/// to handle ill-conditioned Hessians (increases on failure, decreases on success).
844///
845/// **2. Numerically Robust Beta:** Uses two formulas for computing dog leg
846/// interpolation parameter β to avoid catastrophic cancellation.
847///
848/// **3. Step Reuse Mechanism:** Caches GN step, Cauchy point, and gradient when
849/// steps are rejected. Limited to 5 consecutive reuses to prevent staleness.
850///
851/// **4. Jacobi Scaling:** Optional diagonal preconditioning creates elliptical
852/// trust regions for better handling of mixed-scale problems.
853///
854/// # Examples
855///
856/// ```no_run
857/// use apex_solver::optimizer::DogLeg;
858/// use apex_solver::core::problem::Problem;
859/// use std::collections::HashMap;
860///
861/// # fn main() -> Result<(), Box<dyn std::error::Error>> {
862/// let mut problem = Problem::new();
863/// // ... add factors to problem ...
864///
865/// let initial_values = HashMap::new();
866/// // ... initialize parameters ...
867///
868/// let mut solver = DogLeg::new();
869/// let result = solver.optimize(&problem, &initial_values)?;
870/// # Ok(())
871/// # }
872/// ```
873///
874/// # See Also
875///
876/// - [`DogLegConfig`] - Configuration options
877/// - [`LevenbergMarquardt`](crate::optimizer::LevenbergMarquardt) - Alternative adaptive damping
878/// - [`GaussNewton`](crate::optimizer::GaussNewton) - Undamped variant
879pub struct DogLeg {
880    config: DogLegConfig,
881    jacobi_scaling: Option<sparse::SparseColMat<usize, f64>>,
882    observers: optimizer::OptObserverVec,
883
884    // Adaptive mu regularization (Ceres-style)
885    mu: f64,
886    min_mu: f64,
887    max_mu: f64,
888    mu_increase_factor: f64,
889
890    // Step reuse mechanism (Ceres-style efficiency optimization)
891    reuse_step_on_rejection: bool,
892    cached_gn_step: Option<faer::Mat<f64>>,
893    cached_cauchy_point: Option<faer::Mat<f64>>,
894    cached_gradient: Option<faer::Mat<f64>>,
895    cached_alpha: Option<f64>,
896    cache_reuse_count: usize, // Track consecutive reuses to prevent staleness
897}
898
899impl Default for DogLeg {
900    fn default() -> Self {
901        Self::new()
902    }
903}
904
905impl DogLeg {
906    /// Create a new Dog Leg solver with default configuration.
907    pub fn new() -> Self {
908        Self::with_config(DogLegConfig::default())
909    }
910
911    /// Create a new Dog Leg solver with the given configuration.
912    pub fn with_config(config: DogLegConfig) -> Self {
913        Self {
914            // Initialize adaptive mu from config
915            mu: config.initial_mu,
916            min_mu: config.min_mu,
917            max_mu: config.max_mu,
918            mu_increase_factor: config.mu_increase_factor,
919
920            // Initialize step reuse mechanism (disabled initially, enabled after first rejection)
921            reuse_step_on_rejection: false,
922            cached_gn_step: None,
923            cached_cauchy_point: None,
924            cached_gradient: None,
925            cached_alpha: None,
926            cache_reuse_count: 0,
927
928            config,
929            jacobi_scaling: None,
930            observers: optimizer::OptObserverVec::new(),
931        }
932    }
933
934    /// Add an observer to the solver.
935    ///
936    /// Observers are notified at each iteration with the current variable values.
937    /// This enables real-time visualization, logging, metrics collection, etc.
938    ///
939    /// # Examples
940    ///
941    /// ```no_run
942    /// use apex_solver::optimizer::DogLeg;
943    /// # use apex_solver::optimizer::OptObserver;
944    /// # use std::collections::HashMap;
945    /// # use apex_solver::core::problem::VariableEnum;
946    ///
947    /// # struct MyObserver;
948    /// # impl OptObserver for MyObserver {
949    /// #     fn on_step(&self, _: &HashMap<String, VariableEnum>, _: usize) {}
950    /// # }
951    /// let mut solver = DogLeg::new();
952    /// solver.add_observer(MyObserver);
953    /// ```
954    pub fn add_observer(&mut self, observer: impl optimizer::OptObserver + 'static) {
955        self.observers.add(observer);
956    }
957
958    /// Create the appropriate linear solver based on configuration
959    fn create_linear_solver(&self) -> Box<dyn linalg::SparseLinearSolver> {
960        match self.config.linear_solver_type {
961            linalg::LinearSolverType::SparseCholesky => {
962                Box::new(linalg::SparseCholeskySolver::new())
963            }
964            linalg::LinearSolverType::SparseQR => Box::new(linalg::SparseQRSolver::new()),
965        }
966    }
967
    /// Compute the Cauchy point and optimal step length along the steepest descent direction.
971    ///
972    /// Returns (alpha, cauchy_point) where:
973    /// - alpha: optimal step length α = ||g||² / (g^T H g)
974    /// - cauchy_point: p_c = -α * g (the Cauchy point)
975    ///
976    /// This is the optimal point along the steepest descent direction within
977    /// the quadratic approximation of the objective function.
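    ///
    /// # Worked example
    ///
    /// A hypothetical 1-D illustration with gradient g = [2] and Hessian H = [4]:
    ///
    /// ```text
    /// α   = (g^T·g) / (g^T·H·g) = 4 / 16 = 0.25
    /// p_c = -α·g = [-0.5]
    /// ```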
978    fn compute_cauchy_point_and_alpha(
979        &self,
980        gradient: &faer::Mat<f64>,
981        hessian: &sparse::SparseColMat<usize, f64>,
982    ) -> (f64, faer::Mat<f64>) {
983        // Optimal step size along steepest descent: α = (g^T*g) / (g^T*H*g)
984        let g_norm_sq_mat = gradient.transpose() * gradient;
985        let g_norm_sq = g_norm_sq_mat[(0, 0)];
986
987        let h_g = hessian * gradient;
988        let g_h_g_mat = gradient.transpose() * &h_g;
989        let g_h_g = g_h_g_mat[(0, 0)];
990
991        // Avoid division by zero
992        let alpha = if g_h_g.abs() > 1e-15 {
993            g_norm_sq / g_h_g
994        } else {
995            1.0
996        };
997
998        // Compute Cauchy point: p_c = -α * gradient
999        let mut cauchy_point = faer::Mat::zeros(gradient.nrows(), 1);
1000        for i in 0..gradient.nrows() {
1001            cauchy_point[(i, 0)] = -alpha * gradient[(i, 0)];
1002        }
1003
1004        (alpha, cauchy_point)
1005    }
1006
1007    /// Compute the dog leg step using Powell's Dog Leg method
1008    ///
1009    /// The dog leg path consists of two segments:
1010    /// 1. From origin to Cauchy point (optimal along steepest descent)
1011    /// 2. From Cauchy point to Gauss-Newton step
1012    ///
    /// # Arguments
    ///
    /// * `steepest_descent_dir` - `-gradient` (steepest descent direction, not scaled)
    /// * `cauchy_point` - `p_c = -α·gradient`, the optimal steepest descent step
    /// * `h_gn` - Gauss-Newton step
    /// * `delta` - Trust region radius
    ///
    /// # Returns
    ///
    /// `(step, step_type)` - the selected step and which of the three cases produced it
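    ///
    /// # Worked example (Case 3)
    ///
    /// A hypothetical illustration of the robust β formula with p_c = (1, 0),
    /// h_gn = (3, 0), Δ = 2 (so ||p_c|| < Δ < ||h_gn||), giving a = ||v||² = 4,
    /// b = p_c^T·v = 2, c = ||p_c||² - Δ² = -3:
    ///
    /// ```rust
    /// let (a, b, c) = (4.0_f64, 2.0, -3.0);
    /// let d = (b * b - a * c).sqrt(); // discriminant root: sqrt(4 + 12) = 4
    /// // b > 0: use the cancellation-free form -c / (b + d)
    /// let beta = if b <= 0.0 { (-b + d) / a } else { -c / (b + d) };
    /// assert_eq!(beta, 0.5); // h = p_c + 0.5·v = (2, 0) lands exactly on ||h|| = Δ
    /// ```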
1020    fn compute_dog_leg_step(
1021        &self,
1022        steepest_descent_dir: &faer::Mat<f64>,
1023        cauchy_point: &faer::Mat<f64>,
1024        h_gn: &faer::Mat<f64>,
1025        delta: f64,
1026    ) -> (faer::Mat<f64>, StepType) {
1027        let gn_norm = h_gn.norm_l2();
1028        let cauchy_norm = cauchy_point.norm_l2();
1029        let sd_norm = steepest_descent_dir.norm_l2();
1030
1031        // Case 1: Full Gauss-Newton step fits in trust region
1032        if gn_norm <= delta {
1033            return (h_gn.clone(), StepType::GaussNewton);
1034        }
1035
1036        // Case 2: Even Cauchy point is outside trust region
1037        // Scale steepest descent direction to boundary: (delta / ||δ_sd||) * δ_sd
1038        if cauchy_norm >= delta {
1039            let mut scaled_sd = faer::Mat::zeros(steepest_descent_dir.nrows(), 1);
1040            let scale = delta / sd_norm;
1041            for i in 0..steepest_descent_dir.nrows() {
1042                scaled_sd[(i, 0)] = steepest_descent_dir[(i, 0)] * scale;
1043            }
1044            return (scaled_sd, StepType::SteepestDescent);
1045        }
1046
1047        // Case 3: Dog leg interpolation between Cauchy point and GN step
1048        // Use Ceres-style numerically robust formula
1049        //
1050        // Following Ceres solver implementation for numerical stability:
1051        // Compute intersection of trust region boundary with line from Cauchy point to GN step
1052        //
1053        // Let v = δ_gn - p_c
1054        // Solve: ||p_c + β*v||² = delta²
1055        // This gives: a*β² + 2*b*β + c = 0
1056        // where:
1057        //   a = v^T·v = ||v||²
1058        //   b = p_c^T·v
1059        //   c = p_c^T·p_c - delta² = ||p_c||² - delta²
1060
1061        let mut v = faer::Mat::zeros(cauchy_point.nrows(), 1);
1062        for i in 0..cauchy_point.nrows() {
1063            v[(i, 0)] = h_gn[(i, 0)] - cauchy_point[(i, 0)];
1064        }
1065
1066        // Compute coefficients
1067        let v_squared_norm = v.transpose() * &v;
1068        let a = v_squared_norm[(0, 0)];
1069
1070        let pc_dot_v = cauchy_point.transpose() * &v;
1071        let b = pc_dot_v[(0, 0)];
1072
1073        let c = cauchy_norm * cauchy_norm - delta * delta;
1074
1075        // Ceres-style numerically robust beta computation
1076        // Uses two different formulas based on sign of b to avoid catastrophic cancellation
1077        let d_squared = b * b - a * c;
1078
1079        let beta = if d_squared < 0.0 {
1080            // Should not happen geometrically, but handle gracefully
1081            1.0
1082        } else if a.abs() < 1e-15 {
1083            // Degenerate case: v is nearly zero
1084            1.0
1085        } else {
1086            let d = d_squared.sqrt();
1087
1088            // Ceres formula: choose formula based on sign of b to avoid cancellation
1089            // If b <= 0: beta = (-b + d) / a  (standard formula, no cancellation)
1090            // If b > 0:  beta = -c / (b + d)  (alternative formula, avoids cancellation)
1091            if b <= 0.0 { (-b + d) / a } else { -c / (b + d) }
1092        };
1093
1094        // Clamp beta to [0, 1] for safety
1095        let beta = beta.clamp(0.0, 1.0);
1096
1097        // Compute dog leg step: p_dl = p_c + β*(δ_gn - p_c)
1098        let mut dog_leg = faer::Mat::zeros(cauchy_point.nrows(), 1);
1099        for i in 0..cauchy_point.nrows() {
1100            dog_leg[(i, 0)] = cauchy_point[(i, 0)] + beta * v[(i, 0)];
1101        }
1102
1103        (dog_leg, StepType::DogLeg)
1104    }
1105
1106    /// Update trust region radius based on step quality (Ceres-style)
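    ///
    /// With hypothetical config values good_step_quality = 0.75,
    /// poor_step_quality = 0.25, and trust_region_decrease_factor = 0.5:
    ///
    /// ```text
    /// ρ = 0.9, Δ = 1.0, ||h|| = 0.5  →  Δ' = min(max(1.0, 3·0.5), Δ_max) = 1.5   (accept)
    /// ρ = 0.1, Δ = 1.0               →  Δ' = max(1.0 · 0.5, Δ_min) = 0.5          (reject)
    /// ρ = 0.5                        →  Δ unchanged                               (accept)
    /// ```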
1107    fn update_trust_region(&mut self, rho: f64, step_norm: f64) -> bool {
1108        if rho > self.config.good_step_quality {
1109            // Good step, increase trust region (Ceres-style: max(radius, 3*step_norm))
1110            let new_radius = self.config.trust_region_radius.max(3.0 * step_norm);
1111            self.config.trust_region_radius = new_radius.min(self.config.trust_region_max);
1112
1113            // Decrease mu on successful step (Ceres-style adaptive regularization)
1114            self.mu = (self.mu / (0.5 * self.mu_increase_factor)).max(self.min_mu);
1115
1116            // Clear reuse flag and invalidate cache on acceptance (parameters have moved)
1117            self.reuse_step_on_rejection = false;
1118            self.cached_gn_step = None;
1119            self.cached_cauchy_point = None;
1120            self.cached_gradient = None;
1121            self.cached_alpha = None;
1122            self.cache_reuse_count = 0;
1123
1124            true
1125        } else if rho < self.config.poor_step_quality {
1126            // Poor step, decrease trust region
1127            self.config.trust_region_radius = (self.config.trust_region_radius
1128                * self.config.trust_region_decrease_factor)
1129                .max(self.config.trust_region_min);
1130
1131            // Enable step reuse flag for next iteration (Ceres-style)
1132            self.reuse_step_on_rejection = self.config.enable_step_reuse;
1133
1134            false
1135        } else {
1136            // Moderate step, keep trust region unchanged
1137            // Clear reuse flag and invalidate cache on acceptance (parameters have moved)
1138            self.reuse_step_on_rejection = false;
1139            self.cached_gn_step = None;
1140            self.cached_cauchy_point = None;
1141            self.cached_gradient = None;
1142            self.cached_alpha = None;
1143            self.cache_reuse_count = 0;
1144
1145            true
1146        }
1147    }
1148
1149    /// Compute step quality ratio (actual vs predicted reduction)
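    ///
    /// For example, if the cost drops from 10.0 to 8.0 while the quadratic model
    /// predicted a reduction of 2.5:
    ///
    /// ```text
    /// ρ = (10.0 - 8.0) / 2.5 = 0.8
    /// ```
    ///
    /// ρ ≈ 1 means the model predicts well; ρ ≤ 0 means the step increased the cost.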
1150    fn compute_step_quality(
1151        &self,
1152        current_cost: f64,
1153        new_cost: f64,
1154        predicted_reduction: f64,
1155    ) -> f64 {
1156        let actual_reduction = current_cost - new_cost;
1157        if predicted_reduction.abs() < 1e-15 {
1158            if actual_reduction > 0.0 { 1.0 } else { 0.0 }
1159        } else {
1160            actual_reduction / predicted_reduction
1161        }
1162    }
1163
    /// Compute predicted cost reduction from the quadratic model
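    ///
    /// This follows from the quadratic model m(h) = F(x) + g^T·h + ½·h^T·H·h:
    ///
    /// ```text
    /// predicted reduction = F(x) - m(h) = -(g^T·h + ½·h^T·H·h)
    /// ```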
1165    fn compute_predicted_reduction(
1166        &self,
1167        step: &faer::Mat<f64>,
1168        gradient: &faer::Mat<f64>,
1169        hessian: &sparse::SparseColMat<usize, f64>,
1170    ) -> f64 {
1171        // Dog Leg predicted reduction: -step^T * gradient - 0.5 * step^T * H * step
1172        let linear_term = step.transpose() * gradient;
1173        let hessian_step = hessian * step;
1174        let quadratic_term = step.transpose() * &hessian_step;
1175
1176        -linear_term[(0, 0)] - 0.5 * quadratic_term[(0, 0)]
1177    }
1178
    /// Check convergence using comprehensive termination criteria.
1181    ///
1182    /// Implements 9 termination criteria following Ceres Solver standards for Dog Leg:
1183    ///
1184    /// 1. **Gradient Norm (First-Order Optimality)**: ||g||∞ ≤ gradient_tolerance
1185    /// 2. **Parameter Change Tolerance**: ||h|| ≤ parameter_tolerance * (||x|| + parameter_tolerance)
1186    /// 3. **Function Value Change Tolerance**: |ΔF| < cost_tolerance * F
1187    /// 4. **Objective Function Cutoff**: F_new < min_cost_threshold (optional)
1188    /// 5. **Trust Region Radius**: radius < trust_region_min
1189    /// 6. **Singular/Ill-Conditioned Jacobian**: Detected during linear solve
1190    /// 7. **Invalid Numerical Values**: NaN or Inf in cost or parameters
1191    /// 8. **Maximum Iterations**: iteration >= max_iterations
1192    /// 9. **Timeout**: elapsed >= timeout
1193    ///
1194    /// # Arguments
1195    ///
1196    /// * `iteration` - Current iteration number
1197    /// * `current_cost` - Cost before applying the step
1198    /// * `new_cost` - Cost after applying the step
1199    /// * `parameter_norm` - L2 norm of current parameter vector ||x||
1200    /// * `parameter_update_norm` - L2 norm of parameter update step ||h||
1201    /// * `gradient_norm` - Infinity norm of gradient ||g||∞
1202    /// * `trust_region_radius` - Current trust region radius
1203    /// * `elapsed` - Elapsed time since optimization start
1204    /// * `step_accepted` - Whether the current step was accepted
1205    ///
1206    /// # Returns
1207    ///
1208    /// `Some(OptimizationStatus)` if any termination criterion is satisfied, `None` otherwise.
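    ///
    /// As an illustration of criterion 2, with parameter_tolerance ε = 1e-8 and
    /// ||x|| = 100, the step-size threshold evaluates to:
    ///
    /// ```text
    /// ||h|| ≤ ε·(||x|| + ε) = 1e-8·(100 + 1e-8) ≈ 1e-6
    /// ```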
1209    #[allow(clippy::too_many_arguments)]
1210    fn check_convergence(
1211        &self,
1212        iteration: usize,
1213        current_cost: f64,
1214        new_cost: f64,
1215        parameter_norm: f64,
1216        parameter_update_norm: f64,
1217        gradient_norm: f64,
1218        trust_region_radius: f64,
1219        elapsed: time::Duration,
1220        step_accepted: bool,
1221    ) -> Option<optimizer::OptimizationStatus> {
1222        // ========================================================================
1223        // CRITICAL SAFETY CHECKS (perform first, before convergence checks)
1224        // ========================================================================
1225
1226        // CRITERION 7: Invalid Numerical Values (NaN/Inf)
1227        // Always check for numerical instability first
1228        if !new_cost.is_finite() || !parameter_update_norm.is_finite() || !gradient_norm.is_finite()
1229        {
1230            return Some(optimizer::OptimizationStatus::InvalidNumericalValues);
1231        }
1232
1233        // CRITERION 9: Timeout
1234        // Check wall-clock time limit
1235        if let Some(timeout) = self.config.timeout
1236            && elapsed >= timeout
1237        {
1238            return Some(optimizer::OptimizationStatus::Timeout);
1239        }
1240
1241        // CRITERION 8: Maximum Iterations
1242        // Check iteration count limit
1243        if iteration >= self.config.max_iterations {
1244            return Some(optimizer::OptimizationStatus::MaxIterationsReached);
1245        }
1246
1247        // ========================================================================
1248        // CONVERGENCE CRITERIA (only check after successful steps)
1249        // ========================================================================
1250
1251        // Only check convergence criteria after accepted steps
1252        // (rejected steps don't indicate convergence)
1253        if !step_accepted {
1254            return None;
1255        }
1256
1257        // CRITERION 1: Gradient Norm (First-Order Optimality)
1258        // Check if gradient infinity norm is below threshold
1259        // This indicates we're at a critical point (local minimum, saddle, or maximum)
1260        if gradient_norm < self.config.gradient_tolerance {
1261            return Some(optimizer::OptimizationStatus::GradientToleranceReached);
1262        }
1263
1264        // Only check parameter and cost criteria after first iteration
1265        if iteration > 0 {
1266            // CRITERION 2: Parameter Change Tolerance (xtol)
1267            // Ceres formula: ||h|| ≤ ε_param * (||x|| + ε_param)
1268            // This is a relative measure that scales with parameter magnitude
1269            let relative_step_tolerance = self.config.parameter_tolerance
1270                * (parameter_norm + self.config.parameter_tolerance);
1271
1272            if parameter_update_norm <= relative_step_tolerance {
1273                return Some(optimizer::OptimizationStatus::ParameterToleranceReached);
1274            }
1275
1276            // CRITERION 3: Function Value Change Tolerance (ftol)
1277            // Ceres formula: |ΔF| < ε_cost * F
1278            // Check relative cost change (not absolute)
1279            let cost_change = (current_cost - new_cost).abs();
1280            let relative_cost_change = cost_change / current_cost.max(1e-10); // Avoid division by zero
1281
1282            if relative_cost_change < self.config.cost_tolerance {
1283                return Some(optimizer::OptimizationStatus::CostToleranceReached);
1284            }
1285        }
1286
1287        // CRITERION 4: Objective Function Cutoff (optional early stopping)
1288        // Useful for "good enough" solutions
1289        if let Some(min_cost) = self.config.min_cost_threshold
1290            && new_cost < min_cost
1291        {
1292            return Some(optimizer::OptimizationStatus::MinCostThresholdReached);
1293        }
1294
1295        // CRITERION 5: Trust Region Radius
1296        // If trust region has collapsed, optimization has converged or problem is ill-conditioned
1297        if trust_region_radius < self.config.trust_region_min {
1298            return Some(optimizer::OptimizationStatus::TrustRegionRadiusTooSmall);
1299        }
1300
1301        // CRITERION 6: Singular/Ill-Conditioned Jacobian
1302        // Note: This is typically detected during the linear solve and handled there
1303        // The max_condition_number check would be expensive to compute here
1304        // If linear solve fails, it returns an error that's converted to NumericalFailure
1305
1306        // No termination criterion satisfied
1307        None
1308    }
1309
1310    /// Compute total parameter vector norm ||x||.
1311    ///
1312    /// Computes the L2 norm of all parameter vectors concatenated together.
1313    /// This is used in the relative parameter tolerance check.
1314    ///
1315    /// # Arguments
1316    ///
1317    /// * `variables` - Map of variable names to their current values
1318    ///
1319    /// # Returns
1320    ///
1321    /// The L2 norm of the concatenated parameter vector
1322    fn compute_parameter_norm(
1323        variables: &collections::HashMap<String, problem::VariableEnum>,
1324    ) -> f64 {
1325        variables
1326            .values()
1327            .map(|v| {
1328                let vec = v.to_vector();
1329                vec.norm_squared()
1330            })
1331            .sum::<f64>()
1332            .sqrt()
1333    }
1334
1335    /// Create Jacobi scaling matrix from Jacobian
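    ///
    /// Each diagonal entry is D_jj = 1 / (1 + ||J_col_j||), so a column with norm 9
    /// is scaled by 0.1; the `1 +` term keeps zero columns finite:
    ///
    /// ```text
    /// D        = diag(1 / (1 + ||J_:,1||), ..., 1 / (1 + ||J_:,n||))
    /// J_scaled = J · D
    /// ```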
1336    fn create_jacobi_scaling(
1337        &self,
1338        jacobian: &sparse::SparseColMat<usize, f64>,
1339    ) -> Result<sparse::SparseColMat<usize, f64>, optimizer::OptimizerError> {
1340        use faer::sparse::Triplet;
1341
1342        let cols = jacobian.ncols();
1343        let jacobi_scaling_vec: Vec<Triplet<usize, usize, f64>> = (0..cols)
1344            .map(|c| {
1345                let col_norm_squared: f64 = jacobian
1346                    .triplet_iter()
1347                    .filter(|t| t.col == c)
1348                    .map(|t| t.val * t.val)
1349                    .sum();
1350                let col_norm = col_norm_squared.sqrt();
1351                let scaling = 1.0 / (1.0 + col_norm);
1352                Triplet::new(c, c, scaling)
1353            })
1354            .collect();
1355
1356        sparse::SparseColMat::try_new_from_triplets(cols, cols, &jacobi_scaling_vec).map_err(|e| {
1357            optimizer::OptimizerError::JacobiScalingCreation(e.to_string()).log_with_source(e)
1358        })
1359    }
1360
1361    /// Initialize optimization state
1362    fn initialize_optimization_state(
1363        &self,
1364        problem: &problem::Problem,
1365        initial_params: &collections::HashMap<
1366            String,
1367            (manifold::ManifoldType, nalgebra::DVector<f64>),
1368        >,
1369    ) -> Result<OptimizationState, error::ApexSolverError> {
1370        let variables = problem.initialize_variables(initial_params);
1371
1372        let mut variable_index_map = collections::HashMap::new();
1373        let mut col_offset = 0;
1374        let mut sorted_vars: Vec<String> = variables.keys().cloned().collect();
1375        sorted_vars.sort();
1376
1377        for var_name in &sorted_vars {
1378            variable_index_map.insert(var_name.clone(), col_offset);
1379            col_offset += variables[var_name].get_size();
1380        }
1381
1382        let symbolic_structure =
1383            problem.build_symbolic_structure(&variables, &variable_index_map, col_offset)?;
1384
1385        // Initial cost evaluation (residual only, no Jacobian needed)
1386        let residual = problem.compute_residual_sparse(&variables)?;
1387
1388        let residual_norm = residual.norm_l2();
1389        let current_cost = residual_norm * residual_norm;
1390        let initial_cost = current_cost;
1391
1392        Ok(OptimizationState {
1393            variables,
1394            variable_index_map,
1395            sorted_vars,
1396            symbolic_structure,
1397            current_cost,
1398            initial_cost,
1399        })
1400    }
1401
1402    /// Process Jacobian by creating and applying Jacobi scaling if enabled
1403    fn process_jacobian(
1404        &mut self,
1405        jacobian: &sparse::SparseColMat<usize, f64>,
1406        iteration: usize,
1407    ) -> Result<sparse::SparseColMat<usize, f64>, optimizer::OptimizerError> {
1408        // Create Jacobi scaling on first iteration if enabled
1409        if iteration == 0 {
1410            let scaling = self.create_jacobi_scaling(jacobian)?;
1411            self.jacobi_scaling = Some(scaling);
1412        }
1413        let scaling = self
1414            .jacobi_scaling
1415            .as_ref()
1416            .ok_or_else(|| optimizer::OptimizerError::JacobiScalingNotInitialized.log())?;
1417        Ok(jacobian * scaling)
1418    }
1419
    /// Compute the dog leg optimization step, reusing cached step components after a rejection when enabled
1421    fn compute_optimization_step(
1422        &mut self,
1423        residuals: &faer::Mat<f64>,
1424        scaled_jacobian: &sparse::SparseColMat<usize, f64>,
1425        linear_solver: &mut Box<dyn linalg::SparseLinearSolver>,
1426    ) -> Option<StepResult> {
1427        // Check if we can reuse cached step (Ceres-style optimization)
1428        // Safety limit: prevent excessive reuse that could lead to stale gradient/Hessian
1429        const MAX_CACHE_REUSE: usize = 5;
1430
1431        if self.reuse_step_on_rejection
1432            && self.config.enable_step_reuse
1433            && self.cache_reuse_count < MAX_CACHE_REUSE
1434            && let (Some(cached_gn), Some(cached_cauchy), Some(cached_grad), Some(_cached_a)) = (
1435                &self.cached_gn_step,
1436                &self.cached_cauchy_point,
1437                &self.cached_gradient,
1438                &self.cached_alpha,
1439            )
1440        {
1441            // Increment reuse counter
1442            self.cache_reuse_count += 1;
1443
1444            let gradient_norm = cached_grad.norm_l2();
1445            let mut steepest_descent = faer::Mat::zeros(cached_grad.nrows(), 1);
1446            for i in 0..cached_grad.nrows() {
1447                steepest_descent[(i, 0)] = -cached_grad[(i, 0)];
1448            }
1449
1450            let (scaled_step, _step_type) = self.compute_dog_leg_step(
1451                &steepest_descent,
1452                cached_cauchy,
1453                cached_gn,
1454                self.config.trust_region_radius,
1455            );
1456
1457            let step = if self.config.use_jacobi_scaling {
1458                let scaling = self.jacobi_scaling.as_ref()?;
1459                scaling * &scaled_step
1460            } else {
1461                scaled_step.clone()
1462            };
1463
1464            let hessian = linear_solver.get_hessian()?;
1465            let predicted_reduction =
1466                self.compute_predicted_reduction(&scaled_step, cached_grad, hessian);
1467
1468            return Some(StepResult {
1469                step,
1470                gradient_norm,
1471                predicted_reduction,
1472            });
1473        }
1474
1475        // Not reusing, compute fresh step
1476        // 1. Solve for Gauss-Newton step with adaptive mu regularization (Ceres-style)
1477        let residuals_owned = residuals.as_ref().to_owned();
1478        let mut scaled_gn_step = None;
1479        let mut mu_attempts = 0;
1480
1481        // Try to solve with current mu, increasing if necessary
1482        while mu_attempts < 10 && self.mu <= self.max_mu {
1483            let damping = self.mu;
1484
1485            if let Ok(step) =
1486                linear_solver.solve_augmented_equation(&residuals_owned, scaled_jacobian, damping)
1487            {
1488                scaled_gn_step = Some(step);
1489                break;
1490            }
1491
1492            // Increase mu (Ceres-style)
1493            self.mu = (self.mu * self.mu_increase_factor).min(self.max_mu);
1494            mu_attempts += 1;
1495        }
1496
1497        let scaled_gn_step = scaled_gn_step?;
1498
1499        // 2. Get gradient and Hessian (cached by solve_augmented_equation)
1500        let gradient = linear_solver.get_gradient()?;
1501        let hessian = linear_solver.get_hessian()?;
1502        let gradient_norm = gradient.norm_l2();
1503
1504        // 3. Compute steepest descent direction: δ_sd = -gradient
1505        let mut steepest_descent = faer::Mat::zeros(gradient.nrows(), 1);
1506        for i in 0..gradient.nrows() {
1507            steepest_descent[(i, 0)] = -gradient[(i, 0)];
1508        }
1509
1510        // 4. Compute Cauchy point and optimal step length
1511        let (alpha, cauchy_point) = self.compute_cauchy_point_and_alpha(gradient, hessian);
1512
1513        // 5. Compute dog leg step based on trust region radius
1514        let (scaled_step, _step_type) = self.compute_dog_leg_step(
1515            &steepest_descent,
1516            &cauchy_point,
1517            &scaled_gn_step,
1518            self.config.trust_region_radius,
1519        );
1520
1521        // 6. Apply inverse Jacobi scaling if enabled
1522        let step = if self.config.use_jacobi_scaling {
1523            let scaling = self.jacobi_scaling.as_ref()?;
1524            scaling * &scaled_step
1525        } else {
1526            scaled_step.clone()
1527        };
1528
1529        // 7. Compute predicted reduction
1530        let predicted_reduction = self.compute_predicted_reduction(&scaled_step, gradient, hessian);
1531
1532        // 8. Cache step components for potential reuse (Ceres-style)
1533        self.cached_gn_step = Some(scaled_gn_step.clone());
1534        self.cached_cauchy_point = Some(cauchy_point.clone());
1535        self.cached_gradient = Some(gradient.clone());
1536        self.cached_alpha = Some(alpha);
1537
1538        Some(StepResult {
1539            step,
1540            gradient_norm,
1541            predicted_reduction,
1542        })
1543    }
1544
    /// Evaluate step quality and either commit the parameter update or revert it on rejection
1546    fn evaluate_and_apply_step(
1547        &mut self,
1548        step_result: &StepResult,
1549        state: &mut OptimizationState,
1550        problem: &problem::Problem,
1551    ) -> error::ApexSolverResult<StepEvaluation> {
1552        // Apply parameter updates
1553        let step_norm = optimizer::apply_parameter_step(
1554            &mut state.variables,
1555            step_result.step.as_ref(),
1556            &state.sorted_vars,
1557        );
1558
1559        // Compute new cost (residual only, no Jacobian needed for step evaluation)
1560        let new_residual = problem.compute_residual_sparse(&state.variables)?;
1561        let new_residual_norm = new_residual.norm_l2();
1562        let new_cost = new_residual_norm * new_residual_norm;
1563
1564        // Compute step quality
1565        let rho = self.compute_step_quality(
1566            state.current_cost,
1567            new_cost,
1568            step_result.predicted_reduction,
1569        );
1570
1571        // Update trust region and decide acceptance
1572        // Filter out numerical noise with small threshold
1573        let accepted = rho > 1e-4;
1574        let _trust_region_updated = self.update_trust_region(rho, step_norm);
1575
1576        let cost_reduction = if accepted {
1577            let reduction = state.current_cost - new_cost;
1578            state.current_cost = new_cost;
1579            reduction
1580        } else {
1581            // Reject step - revert changes
1582            optimizer::apply_negative_parameter_step(
1583                &mut state.variables,
1584                step_result.step.as_ref(),
1585                &state.sorted_vars,
1586            );
1587            0.0
1588        };
1589
1590        Ok(StepEvaluation {
1591            accepted,
1592            cost_reduction,
1593            rho,
1594        })
1595    }
1596
    /// Create optimization summary
    #[allow(clippy::too_many_arguments)]
    fn create_summary(
        &self,
        initial_cost: f64,
        final_cost: f64,
        iterations: usize,
        successful_steps: usize,
        unsuccessful_steps: usize,
        max_gradient_norm: f64,
        final_gradient_norm: f64,
        max_parameter_update_norm: f64,
        final_parameter_update_norm: f64,
        total_cost_reduction: f64,
        total_time: time::Duration,
        iteration_history: Vec<IterationStats>,
        convergence_status: optimizer::OptimizationStatus,
    ) -> DogLegSummary {
        DogLegSummary {
            initial_cost,
            final_cost,
            iterations,
            successful_steps,
            unsuccessful_steps,
            final_trust_region_radius: self.config.trust_region_radius,
            average_cost_reduction: if iterations > 0 {
                total_cost_reduction / iterations as f64
            } else {
                0.0
            },
            max_gradient_norm,
            final_gradient_norm,
            max_parameter_update_norm,
            final_parameter_update_norm,
            total_time,
            average_time_per_iteration: if iterations > 0 {
                total_time / iterations as u32
            } else {
                time::Duration::from_secs(0)
            },
            iteration_history,
            convergence_status,
        }
    }

    /// Minimize the optimization problem using the Dog Leg algorithm
    pub fn optimize(
        &mut self,
        problem: &problem::Problem,
        initial_params: &collections::HashMap<
            String,
            (manifold::ManifoldType, nalgebra::DVector<f64>),
        >,
    ) -> Result<
        optimizer::SolverResult<collections::HashMap<String, problem::VariableEnum>>,
        error::ApexSolverError,
    > {
        let start_time = time::Instant::now();
        let mut iteration = 0;
        let mut cost_evaluations = 1;
        let mut jacobian_evaluations = 0;
        let mut successful_steps = 0;
        let mut unsuccessful_steps = 0;

        let mut state = self.initialize_optimization_state(problem, initial_params)?;
        let mut linear_solver = self.create_linear_solver();

        let mut max_gradient_norm: f64 = 0.0;
        let mut max_parameter_update_norm: f64 = 0.0;
        let mut total_cost_reduction = 0.0;
        let mut final_gradient_norm;
        let mut final_parameter_update_norm;

        // Initialize iteration statistics tracking
        let mut iteration_stats = Vec::with_capacity(self.config.max_iterations);
        let mut previous_cost = state.current_cost;

        // Print configuration and header only if debug level is enabled
        if tracing::enabled!(tracing::Level::DEBUG) {
            self.config.print_configuration();
            IterationStats::print_header();
        }

        loop {
            let iter_start = time::Instant::now();
            // Evaluate residuals and Jacobian
            let (residuals, jacobian) = problem.compute_residual_and_jacobian_sparse(
                &state.variables,
                &state.variable_index_map,
                &state.symbolic_structure,
            )?;
            jacobian_evaluations += 1;

            // Process Jacobian (apply scaling if enabled)
            let scaled_jacobian = if self.config.use_jacobi_scaling {
                self.process_jacobian(&jacobian, iteration)?
            } else {
                jacobian
            };

            // Compute the dog leg step
            let step_result = match self.compute_optimization_step(
                &residuals,
                &scaled_jacobian,
                &mut linear_solver,
            ) {
                Some(result) => result,
                None => {
                    return Err(optimizer::OptimizerError::LinearSolveFailed(
                        "Linear solver failed to solve system".to_string(),
                    )
                    .into());
                }
            };

            // Update tracking
            max_gradient_norm = max_gradient_norm.max(step_result.gradient_norm);
            final_gradient_norm = step_result.gradient_norm;
            let step_norm = step_result.step.norm_l2();
            max_parameter_update_norm = max_parameter_update_norm.max(step_norm);
            final_parameter_update_norm = step_norm;

            // Evaluate and apply step
            let step_eval = self.evaluate_and_apply_step(&step_result, &mut state, problem)?;
            cost_evaluations += 1;

            if step_eval.accepted {
                successful_steps += 1;
                total_cost_reduction += step_eval.cost_reduction;
            } else {
                unsuccessful_steps += 1;
            }

            // OPTIMIZATION: Only collect iteration statistics if debug level is enabled.
            // This eliminates ~2-5ms of overhead per iteration for non-debug runs.
            if tracing::enabled!(tracing::Level::DEBUG) {
                let iter_elapsed_ms = iter_start.elapsed().as_secs_f64() * 1000.0;
                let total_elapsed_ms = start_time.elapsed().as_secs_f64() * 1000.0;

                let stats = IterationStats {
                    iteration,
                    cost: state.current_cost,
                    cost_change: previous_cost - state.current_cost,
                    gradient_norm: step_result.gradient_norm,
                    step_norm,
                    tr_ratio: step_eval.rho,
                    tr_radius: self.config.trust_region_radius,
                    ls_iter: 0, // Direct solver (Cholesky) has no iterations
                    iter_time_ms: iter_elapsed_ms,
                    total_time_ms: total_elapsed_ms,
                    accepted: step_eval.accepted,
                };

                iteration_stats.push(stats.clone());
                stats.print_line();
            }

            previous_cost = state.current_cost;

            // Notify all observers with the current state:
            // first set metrics data, then notify observers
            self.observers.set_iteration_metrics(
                state.current_cost,
                step_result.gradient_norm,
                Some(self.config.trust_region_radius), // Dog Leg uses trust region radius
                step_norm,
                Some(step_eval.rho),
            );

            // Set matrix data if available and there are observers
            if !self.observers.is_empty()
                && let (Some(hessian), Some(gradient)) =
                    (linear_solver.get_hessian(), linear_solver.get_gradient())
            {
                self.observers
                    .set_matrix_data(Some(hessian.clone()), Some(gradient.clone()));
            }

            // Notify observers with current variable values and iteration number
            self.observers.notify(&state.variables, iteration);

            // Check convergence
            let elapsed = start_time.elapsed();

            // Compute parameter norm for the relative parameter tolerance check
            let parameter_norm = Self::compute_parameter_norm(&state.variables);

            // Compute costs for the convergence check; if the step was accepted,
            // current_cost was already updated, so recover the pre-step cost
            let new_cost = state.current_cost;

            let cost_before_step = if step_eval.accepted {
                state.current_cost + step_eval.cost_reduction
            } else {
                state.current_cost
            };

            if let Some(status) = self.check_convergence(
                iteration,
                cost_before_step,
                new_cost,
                parameter_norm,
                step_norm,
                step_result.gradient_norm,
                self.config.trust_region_radius,
                elapsed,
                step_eval.accepted,
            ) {
                let summary = self.create_summary(
                    state.initial_cost,
                    state.current_cost,
                    iteration + 1,
                    successful_steps,
                    unsuccessful_steps,
                    max_gradient_norm,
                    final_gradient_norm,
                    max_parameter_update_norm,
                    final_parameter_update_norm,
                    total_cost_reduction,
                    elapsed,
                    iteration_stats.clone(),
                    status.clone(),
                );

                // Print the summary only if debug level is enabled
                if tracing::enabled!(tracing::Level::DEBUG) {
                    debug!("{}", summary);
                }

                // Compute covariances if enabled
                let covariances = if self.config.compute_covariances {
                    problem.compute_and_set_covariances(
                        &mut linear_solver,
                        &mut state.variables,
                        &state.variable_index_map,
                    )
                } else {
                    None
                };

                return Ok(optimizer::SolverResult {
                    status,
                    iterations: iteration + 1,
                    initial_cost: state.initial_cost,
                    final_cost: state.current_cost,
                    parameters: state.variables.into_iter().collect(),
                    elapsed_time: elapsed,
                    convergence_info: Some(optimizer::ConvergenceInfo {
                        final_gradient_norm,
                        final_parameter_update_norm,
                        cost_evaluations,
                        jacobian_evaluations,
                    }),
                    covariances,
                });
            }

            // Note: max-iteration and timeout checks are handled inside check_convergence()

            iteration += 1;
        }
    }
}

impl optimizer::Solver for DogLeg {
    type Config = DogLegConfig;
    type Error = error::ApexSolverError;

    fn new() -> Self {
        Self::default()
    }

    fn optimize(
        &mut self,
        problem: &problem::Problem,
        initial_params: &collections::HashMap<
            String,
            (manifold::ManifoldType, nalgebra::DVector<f64>),
        >,
    ) -> Result<
        optimizer::SolverResult<collections::HashMap<String, problem::VariableEnum>>,
        Self::Error,
    > {
        self.optimize(problem, initial_params)
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::{factors, manifold};
    use nalgebra;

    /// Custom Rosenbrock Factor 1: r1 = 10(x2 - x1²)
    /// Demonstrates extensibility: custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor1;

    impl factors::Factor for RosenbrockFactor1 {
        fn linearize(
            &self,
            params: &[nalgebra::DVector<f64>],
            compute_jacobian: bool,
        ) -> (nalgebra::DVector<f64>, Option<nalgebra::DMatrix<f64>>) {
            let x1 = params[0][0];
            let x2 = params[1][0];

            // Residual: r1 = 10(x2 - x1²)
            let residual = nalgebra::dvector![10.0 * (x2 - x1 * x1)];

            // Jacobian: ∂r1/∂x1 = -20·x1, ∂r1/∂x2 = 10
            let jacobian = if compute_jacobian {
                let mut jac = nalgebra::DMatrix::zeros(1, 2);
                jac[(0, 0)] = -20.0 * x1;
                jac[(0, 1)] = 10.0;
                Some(jac)
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    /// Custom Rosenbrock Factor 2: r2 = 1 - x1
    /// Demonstrates extensibility: custom factors can be defined outside of factors.rs
    #[derive(Debug, Clone)]
    struct RosenbrockFactor2;

    impl factors::Factor for RosenbrockFactor2 {
        fn linearize(
            &self,
            params: &[nalgebra::DVector<f64>],
            compute_jacobian: bool,
        ) -> (nalgebra::DVector<f64>, Option<nalgebra::DMatrix<f64>>) {
            let x1 = params[0][0];

            // Residual: r2 = 1 - x1
            let residual = nalgebra::dvector![1.0 - x1];

            // Jacobian: ∂r2/∂x1 = -1
            let jacobian = if compute_jacobian {
                Some(nalgebra::DMatrix::from_element(1, 1, -1.0))
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    #[test]
    fn test_rosenbrock_optimization() -> Result<(), Box<dyn std::error::Error>> {
        // Rosenbrock function test:
        // Minimize r1² + r2², where
        //   r1 = 10(x2 - x1²)
        //   r2 = 1 - x1
        // Starting point: [-1.2, 1.0]
        // Expected minimum: [1.0, 1.0]

        let mut problem = problem::Problem::new();
        let mut initial_values = collections::HashMap::new();

        // Add variables using the Rn manifold (Euclidean space)
        initial_values.insert(
            "x1".to_string(),
            (manifold::ManifoldType::RN, nalgebra::dvector![-1.2]),
        );
        initial_values.insert(
            "x2".to_string(),
            (manifold::ManifoldType::RN, nalgebra::dvector![1.0]),
        );

        // Add custom factors (demonstrates extensibility)
        problem.add_residual_block(&["x1", "x2"], Box::new(RosenbrockFactor1), None);
        problem.add_residual_block(&["x1"], Box::new(RosenbrockFactor2), None);

        // Configure the Dog Leg optimizer with an appropriate trust region
        let config = DogLegConfig::new()
            .with_max_iterations(100)
            .with_cost_tolerance(1e-8)
            .with_parameter_tolerance(1e-8)
            .with_gradient_tolerance(1e-10)
            .with_trust_region_radius(10.0); // Start with a larger trust region

        let mut solver = DogLeg::with_config(config);
        let result = solver.optimize(&problem, &initial_values)?;

        // Extract final values
        let x1_final = result
            .parameters
            .get("x1")
            .ok_or("x1 not found")?
            .to_vector()[0];
        let x2_final = result
            .parameters
            .get("x2")
            .ok_or("x2 not found")?
            .to_vector()[0];

        // Verify convergence to [1.0, 1.0]
        assert!(
            matches!(
                result.status,
                optimizer::OptimizationStatus::Converged
                    | optimizer::OptimizationStatus::CostToleranceReached
                    | optimizer::OptimizationStatus::ParameterToleranceReached
                    | optimizer::OptimizationStatus::GradientToleranceReached
            ),
            "Optimization should converge"
        );
        assert!(
            (x1_final - 1.0).abs() < 1e-4,
            "x1 should converge to 1.0, got {}",
            x1_final
        );
        assert!(
            (x2_final - 1.0).abs() < 1e-4,
            "x2 should converge to 1.0, got {}",
            x2_final
        );
        assert!(
            result.final_cost < 1e-6,
            "Final cost should be near zero, got {}",
            result.final_cost
        );
        Ok(())
    }
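
    // A minimal sanity check on a linear least squares problem, added as a
    // sketch for illustration. It assumes only the `Factor`, `Problem`,
    // `DogLegConfig`, and `DogLeg` APIs already exercised by the Rosenbrock
    // test above; the factor and test names are new here. For a linear
    // residual the Gauss-Newton step is exact, so this exercises Case 1 of
    // the step selection (GN step inside the trust region) and should
    // converge in very few iterations.

    /// Linear factor: r = x - 3, minimized at x = 3
    #[derive(Debug, Clone)]
    struct LinearFactor;

    impl factors::Factor for LinearFactor {
        fn linearize(
            &self,
            params: &[nalgebra::DVector<f64>],
            compute_jacobian: bool,
        ) -> (nalgebra::DVector<f64>, Option<nalgebra::DMatrix<f64>>) {
            let x = params[0][0];

            // Residual: r = x - 3
            let residual = nalgebra::dvector![x - 3.0];

            // Jacobian: ∂r/∂x = 1
            let jacobian = if compute_jacobian {
                Some(nalgebra::DMatrix::from_element(1, 1, 1.0))
            } else {
                None
            };

            (residual, jacobian)
        }

        fn get_dimension(&self) -> usize {
            1
        }
    }

    #[test]
    fn test_linear_problem_converges() -> Result<(), Box<dyn std::error::Error>> {
        let mut problem = problem::Problem::new();
        let mut initial_values = collections::HashMap::new();

        initial_values.insert(
            "x".to_string(),
            (manifold::ManifoldType::RN, nalgebra::dvector![0.0]),
        );
        problem.add_residual_block(&["x"], Box::new(LinearFactor), None);

        let config = DogLegConfig::new()
            .with_max_iterations(50)
            .with_cost_tolerance(1e-12);

        let mut solver = DogLeg::with_config(config);
        let result = solver.optimize(&problem, &initial_values)?;

        let x_final = result.parameters.get("x").ok_or("x not found")?.to_vector()[0];

        assert!(
            (x_final - 3.0).abs() < 1e-6,
            "x should converge to 3.0, got {}",
            x_final
        );
        Ok(())
    }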
}