Skip to main content

ferrolearn_decomp/
sparse_pca.rs

1//! Sparse Principal Component Analysis (SparsePCA).
2//!
3//! SparsePCA finds sparse components that can optimally reconstruct the data
4//! by combining PCA with L1 (lasso) penalisation on the loadings. This
5//! produces components that are easier to interpret, at the cost of
6//! explained variance compared to standard PCA.
7//!
8//! # Algorithm
9//!
10//! Uses an Elastic-Net / Coordinate-Descent approach:
11//!
12//! 1. Initialise dictionary `V` from PCA (or random).
13//! 2. Alternate:
14//!    a. Fix `V`, solve for sparse code `U` via coordinate descent:
15//!    `min ||X - U V^T||^2 + alpha * ||U||_1` (per row of `U`).
16//!    b. Fix `U`, update `V = X^T U (U^T U)^{-1}`, then normalise columns.
17//! 3. The rows of `V` are the sparse principal components.
18//!
19//! # Examples
20//!
21//! ```
22//! use ferrolearn_decomp::SparsePCA;
23//! use ferrolearn_core::traits::{Fit, Transform};
24//! use ndarray::array;
25//!
26//! let spca = SparsePCA::<f64>::new(1);
27//! let x = array![
28//!     [1.0, 2.0, 3.0],
29//!     [4.0, 5.0, 6.0],
30//!     [7.0, 8.0, 9.0],
31//!     [10.0, 11.0, 12.0],
32//! ];
33//! let fitted = spca.fit(&x, &()).unwrap();
34//! let projected = fitted.transform(&x).unwrap();
35//! assert_eq!(projected.ncols(), 1);
36//! ```
37//!
38//! ## REQ status
39//!
40//! Design: `.design/decomp/sparse_pca.md`. Tracking: #1476. Each REQ is BINARY —
41//! SHIPPED (impl + non-test consumer + tests + green verification) or NOT-STARTED
42//! (concrete open blocker). Non-test consumers: crate re-export (`lib.rs:99`) and
43//! the `_RsSparsePCA` PyO3 binding (`extras.rs:1129`, registered `lib.rs:77`).
44//! Oracle = live sklearn 1.5.2 (`_sparse_pca.py`), run from `/tmp` (R-CHAR-3).
45//!
46//! | REQ | Scope | Status | Evidence / Blocker |
47//! |---|---|---|---|
48//! | REQ-1 | `transform` = RIDGE regression (`ridge_alpha=0.01`), not plain projection | SHIPPED | `transform` solves `U = (X−mean_)·Cᵀ·(C·Cᵀ + 0.01·I)⁻¹` == `ridge_regression(components_.T, X_centered.T, 0.01, solver="cholesky")` (sklearn `_sparse_pca.py:119-121`); matches the sklearn ridge oracle on ferrolearn's own fitted components to 1e-6 in `tests/divergence_sparse_pca.rs::divergence_transform_is_ridge_not_projection` (was #1477, fixed). Consumers: re-export `lib.rs:99`, `_RsSparsePCA` `extras.rs:1129` |
49//! | REQ-2 | Structural: components sparsity (L1→exact zeros), shape `(n_components,n_features)`, mean centering, error-decrease, determinism | SHIPPED (scoped) | `fn fit` centers via per-feature `mean` (= `mean_ = X.mean(axis=0)` `_sparse_pca.py:83`), alternating soft-threshold sparse-coding + dictionary update, seeded `StdRng` ⇒ deterministic. Green-guards in `tests/divergence_sparse_pca.rs` + in-module tests. STRUCTURAL only, NOT component VALUES (REQ-4) |
50//! | REQ-3 | Error / parameter contracts (n_components 0/>n_features, <2 samples, transform col mismatch, NON-FINITE rejection) | SHIPPED (scoped) | `fn fit` guards (`InvalidParameter`/`InsufficientSamples`); `transform` `ShapeMismatch`. NON-FINITE: `fit`+`transform` call `reject_non_finite` (`sparse_pca.rs` symbol `reject_non_finite`) BEFORE the sparse-coding/ridge math, returning the CLEAN finiteness `InvalidParameter{name:"X", reason:"Input X contains NaN or infinity."}` = sklearn `_validate_data(force_all_finite=True)` (`_sparse_pca.py:81`,`:116`,`utils/validation.py:147-154`) — the real input gate (the `is_finite` at `:641-642` is TEST-only). `tests/divergence_nonfinite_spillover.rs::divergence_sparse_pca_fit_nan`/`_transform_nan` match the live sklearn 1.5.2 oracle (#2290). FLAG: sklearn raises `InvalidParameterError`, accepts `n_components=None`, does not pre-reject `n_components>n_features`/`n_samples<2` |
51//! | REQ-4 | EXACT `components_` value parity with sklearn `dict_learning` | NOT-STARTED | CARVE-OUT (R-DEFER-3): LARS lasso + numpy RNG + `svd_flip` + per-feature alpha scaling (`_sparse_pca.py:308-336`, `_dict_learning.py:120`) vs ferrolearn random-init soft-threshold-CD + LS update — blocker #1478 |
52//! | REQ-5 | `ridge_alpha` ctor/builder param exposure | NOT-STARTED | sklearn `SparsePCA(ridge_alpha=0.01)` `_sparse_pca.py:284`; ferrolearn hard-codes 0.01 in `transform` (REQ-1) — blocker #1479 |
53//! | REQ-6 | `method` (`lars`/`cd`) field + LARS sparse-coding path | NOT-STARTED | sklearn `method="lars"` → `dict_learning(method=)` `_sparse_pca.py:287/:319`; ferrolearn single soft-threshold CD coder — blocker #1480 |
54//! | REQ-7 | `U_init`/`V_init` warm restart + alpha/n_features scaling | NOT-STARTED | sklearn `_sparse_pca.py:289-290`, `_dict_learning.py:120`; ferrolearn always random-inits `V`, un-scaled alpha — blocker #1481 |
55//! | REQ-8 | `MiniBatchSparsePCA` online variant | NOT-STARTED | sklearn `class MiniBatchSparsePCA(_BaseSparsePCA)` `_sparse_pca.py:339-552`; ABSENT in ferrolearn — blocker #1482 |
56//! | REQ-9 | `inverse_transform` + `error_`/`n_components_`/`n_features_in_` attrs | NOT-STARTED | sklearn `_sparse_pca.py:125-146/:223/:226/:238`; `FittedSparsePCA` exposes only `components()`/`mean()`/`n_iter()` — blocker #1483 |
57//! | REQ-10 | PyO3 binding surface (thin: `n_components` ctor + `fit` + `transform`) | SHIPPED (scoped) | `_RsSparsePCA` (`extras.rs:1129`, registered `lib.rs:77`) via `py_transformer!`; inherits the REQ-1 ridge transform. NOT `alpha`/`tol`/`random_state` params, NOT `components_` accessor, NOT `inverse_transform` (REQ-5/9) |
58//! | REQ-11 | ferray substrate | NOT-STARTED | dense `ndarray` + hand-rolled `invert_small_symmetric` — blocker #1484 |
59//!
60//! Count: **4 SHIPPED (REQ-1,2,3,10) / 7 NOT-STARTED (REQ-4,5,6,7,8,9,11)**.
61
62use ferrolearn_core::error::FerroError;
63use ferrolearn_core::traits::{Fit, Transform};
64use ndarray::{Array1, Array2};
65use num_traits::Float;
66use rand::SeedableRng;
67use rand_distr::{Distribution, Uniform};
68
69/// Reject non-finite input the way sklearn's `_validate_data` does.
70///
71/// sklearn runs `check_array` with the default `force_all_finite=True` at the
72/// top of `SparsePCA.fit`/`transform` (`sklearn/decomposition/_sparse_pca.py:81`,
73/// `:116`), raising `ValueError("Input X contains NaN.")` /
74/// `"... contains infinity ..."` (`sklearn/utils/validation.py:147-154`) BEFORE
75/// any decomposition / ridge math. NaN AND infinity are both rejected. The
76/// message names "NaN" and "infinity" to mirror sklearn's `ValueError`. Never
77/// panics (R-CODE-2). (The `is_finite` checks in `#[cfg(test)]` below are
78/// test-only and do NOT gate input — this guard is the real input gate.)
79fn reject_non_finite<F: Float>(x: &Array2<F>) -> Result<(), FerroError> {
80    if x.iter().any(|v| !v.is_finite()) {
81        return Err(FerroError::InvalidParameter {
82            name: "X".into(),
83            reason: "Input X contains NaN or infinity.".into(),
84        });
85    }
86    Ok(())
87}
88
89// ---------------------------------------------------------------------------
90// SparsePCA (unfitted)
91// ---------------------------------------------------------------------------
92
93/// Sparse PCA configuration.
94///
95/// Holds hyperparameters for the Sparse PCA decomposition. Calling
96/// [`Fit::fit`] performs the iterative elastic-net / coordinate-descent
97/// procedure and returns a [`FittedSparsePCA`] that can project new data.
98#[derive(Debug, Clone)]
99pub struct SparsePCA<F> {
100    /// Number of sparse components to extract.
101    n_components: usize,
102    /// Sparsity penalty weight on the L1 norm of the loadings.
103    alpha: f64,
104    /// Maximum number of outer iterations.
105    max_iter: usize,
106    /// Convergence tolerance on the relative change in reconstruction error.
107    tol: f64,
108    /// Optional random seed for reproducibility.
109    random_state: Option<u64>,
110    _marker: std::marker::PhantomData<F>,
111}
112
113impl<F: Float + Send + Sync + 'static> SparsePCA<F> {
114    /// Create a new `SparsePCA` that extracts `n_components` sparse components.
115    ///
116    /// Defaults: `alpha = 1.0`, `max_iter = 1000`, `tol = 1e-8`,
117    /// `random_state = None`.
118    #[must_use]
119    pub fn new(n_components: usize) -> Self {
120        Self {
121            n_components,
122            alpha: 1.0,
123            max_iter: 1000,
124            tol: 1e-8,
125            random_state: None,
126            _marker: std::marker::PhantomData,
127        }
128    }
129
130    /// Set the sparsity penalty weight (L1 regularisation on codes).
131    #[must_use]
132    pub fn with_alpha(mut self, alpha: f64) -> Self {
133        self.alpha = alpha;
134        self
135    }
136
137    /// Set the maximum number of outer iterations.
138    #[must_use]
139    pub fn with_max_iter(mut self, max_iter: usize) -> Self {
140        self.max_iter = max_iter;
141        self
142    }
143
144    /// Set the convergence tolerance.
145    #[must_use]
146    pub fn with_tol(mut self, tol: f64) -> Self {
147        self.tol = tol;
148        self
149    }
150
151    /// Set the random seed for reproducible results.
152    #[must_use]
153    pub fn with_random_state(mut self, seed: u64) -> Self {
154        self.random_state = Some(seed);
155        self
156    }
157
158    /// Return the configured number of components.
159    #[must_use]
160    pub fn n_components(&self) -> usize {
161        self.n_components
162    }
163
164    /// Return the configured sparsity penalty.
165    #[must_use]
166    pub fn alpha(&self) -> f64 {
167        self.alpha
168    }
169
170    /// Return the configured maximum iterations.
171    #[must_use]
172    pub fn max_iter(&self) -> usize {
173        self.max_iter
174    }
175
176    /// Return the configured tolerance.
177    #[must_use]
178    pub fn tol(&self) -> f64 {
179        self.tol
180    }
181}
182
183// ---------------------------------------------------------------------------
184// FittedSparsePCA
185// ---------------------------------------------------------------------------
186
187/// A fitted Sparse PCA model holding the learned components.
188///
189/// Created by calling [`Fit::fit`] on a [`SparsePCA`]. Implements
190/// [`Transform<Array2<F>>`] to project new data onto the sparse components.
191#[derive(Debug, Clone)]
192pub struct FittedSparsePCA<F> {
193    /// Sparse components, shape `(n_components, n_features)`.
194    components_: Array2<F>,
195    /// Per-feature mean computed during fitting (used for centring).
196    mean_: Array1<F>,
197    /// Number of outer iterations performed.
198    n_iter_: usize,
199    /// Ridge shrinkage applied during `transform` (sklearn default `0.01`).
200    ridge_alpha: f64,
201}
202
203impl<F: Float + Send + Sync + 'static> FittedSparsePCA<F> {
204    /// Sparse components, shape `(n_components, n_features)`.
205    #[must_use]
206    pub fn components(&self) -> &Array2<F> {
207        &self.components_
208    }
209
210    /// Per-feature mean learned during fitting.
211    #[must_use]
212    pub fn mean(&self) -> &Array1<F> {
213        &self.mean_
214    }
215
216    /// Number of outer iterations performed.
217    #[must_use]
218    pub fn n_iter(&self) -> usize {
219        self.n_iter_
220    }
221}
222
223// ---------------------------------------------------------------------------
224// Internal helpers
225// ---------------------------------------------------------------------------
226
227/// Small epsilon to prevent division by zero.
228#[inline]
229fn eps<F: Float>() -> F {
230    F::from(1e-12).unwrap_or_else(F::epsilon)
231}
232
233/// Soft-thresholding operator: sign(x) * max(|x| - threshold, 0).
234#[inline]
235fn soft_threshold<F: Float>(x: F, threshold: F) -> F {
236    if x > threshold {
237        x - threshold
238    } else if x < -threshold {
239        x + threshold
240    } else {
241        F::zero()
242    }
243}
244
245/// Solve sparse coding for a single row of U via coordinate descent:
246///   min_u  ||x_row - u V^T||^2 + alpha * ||u||_1
247///
248/// `v` has shape `(n_components, n_features)`.
249fn sparse_code_row<F: Float>(
250    x_row: &[F],
251    v: &Array2<F>,
252    alpha_f: F,
253    u_row: &mut [F],
254    n_cd_iters: usize,
255) {
256    let n_components = v.nrows();
257    let n_features = v.ncols();
258
259    for _iter in 0..n_cd_iters {
260        for k in 0..n_components {
261            // Compute residual excluding component k.
262            let mut residual_dot = F::zero();
263            let mut vk_norm_sq = F::zero();
264
265            for j in 0..n_features {
266                let mut r = F::from(x_row[j]).unwrap();
267                for kk in 0..n_components {
268                    if kk != k {
269                        r = r - u_row[kk] * v[[kk, j]];
270                    }
271                }
272                residual_dot = residual_dot + r * v[[k, j]];
273                vk_norm_sq = vk_norm_sq + v[[k, j]] * v[[k, j]];
274            }
275
276            if vk_norm_sq < eps::<F>() {
277                u_row[k] = F::zero();
278            } else {
279                u_row[k] = soft_threshold(residual_dot, alpha_f) / vk_norm_sq;
280            }
281        }
282    }
283}
284
285/// Compute the Frobenius norm squared of `X - U @ V`.
286fn reconstruction_error_sq<F: Float + 'static>(x: &Array2<F>, u: &Array2<F>, v: &Array2<F>) -> F {
287    let uv = u.dot(v);
288    let mut err = F::zero();
289    for (a, b) in x.iter().zip(uv.iter()) {
290        let d = *a - *b;
291        err = err + d * d;
292    }
293    err
294}
295
296// ---------------------------------------------------------------------------
297// Trait implementations
298// ---------------------------------------------------------------------------
299
300impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for SparsePCA<F> {
301    type Fitted = FittedSparsePCA<F>;
302    type Error = FerroError;
303
304    /// Fit Sparse PCA by alternating sparse coding and dictionary update.
305    ///
306    /// # Errors
307    ///
308    /// - [`FerroError::InvalidParameter`] if `n_components` is zero or exceeds
309    ///   the number of features.
310    /// - [`FerroError::InsufficientSamples`] if there are fewer than 2 samples.
311    fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedSparsePCA<F>, FerroError> {
312        let (n_samples, n_features) = x.dim();
313
314        if self.n_components == 0 {
315            return Err(FerroError::InvalidParameter {
316                name: "n_components".into(),
317                reason: "must be at least 1".into(),
318            });
319        }
320        if self.n_components > n_features {
321            return Err(FerroError::InvalidParameter {
322                name: "n_components".into(),
323                reason: format!(
324                    "n_components ({}) exceeds n_features ({})",
325                    self.n_components, n_features
326                ),
327            });
328        }
329        if n_samples < 2 {
330            return Err(FerroError::InsufficientSamples {
331                required: 2,
332                actual: n_samples,
333                context: "SparsePCA::fit requires at least 2 samples".into(),
334            });
335        }
336
337        // Reject NaN/Inf BEFORE the alternating sparse-coding math (sklearn
338        // `_validate_data(force_all_finite=True)` at `_sparse_pca.py:81`,
339        // `utils/validation.py:147-154`).
340        reject_non_finite(x)?;
341
342        let n_comp = self.n_components;
343        let n_f = F::from(n_samples).unwrap();
344        let alpha_f = F::from(self.alpha).unwrap_or_else(F::one);
345
346        // Step 1: compute mean and centre data.
347        let mut mean = Array1::<F>::zeros(n_features);
348        for j in 0..n_features {
349            let sum = x.column(j).iter().copied().fold(F::zero(), |a, b| a + b);
350            mean[j] = sum / n_f;
351        }
352
353        let mut x_centered = x.to_owned();
354        for mut row in x_centered.rows_mut() {
355            for (v, &m) in row.iter_mut().zip(mean.iter()) {
356                *v = *v - m;
357            }
358        }
359
360        // Step 2: Initialize V from random.
361        let seed = self.random_state.unwrap_or(42);
362        let mut rng: rand::rngs::StdRng = SeedableRng::seed_from_u64(seed);
363        let uniform = Uniform::new(-1.0f64, 1.0f64).unwrap();
364
365        let mut v = Array2::<F>::zeros((n_comp, n_features));
366        for elem in v.iter_mut() {
367            *elem = F::from(uniform.sample(&mut rng)).unwrap_or_else(F::zero);
368        }
369        // Normalize each row of V.
370        for i in 0..n_comp {
371            let norm: F = v
372                .row(i)
373                .iter()
374                .fold(F::zero(), |acc, &val| acc + val * val)
375                .sqrt();
376            if norm > eps::<F>() {
377                for j in 0..n_features {
378                    v[[i, j]] = v[[i, j]] / norm;
379                }
380            }
381        }
382
383        // Step 3: Allocate U (sparse codes), shape (n_samples, n_components).
384        let mut u = Array2::<F>::zeros((n_samples, n_comp));
385
386        let n_cd_iters = 10; // inner coordinate descent iterations
387        let mut prev_err = F::infinity();
388        let tol_f = F::from(self.tol).unwrap_or_else(F::epsilon);
389        let mut actual_iter = 0;
390
391        for iteration in 0..self.max_iter {
392            actual_iter = iteration + 1;
393
394            // Step 3a: Fix V, solve for sparse code U (each row independently).
395            for i in 0..n_samples {
396                let x_row: Vec<F> = x_centered.row(i).to_vec();
397                let mut u_row: Vec<F> = u.row(i).to_vec();
398                sparse_code_row(&x_row, &v, alpha_f, &mut u_row, n_cd_iters);
399                for k in 0..n_comp {
400                    u[[i, k]] = u_row[k];
401                }
402            }
403
404            // Step 3b: Fix U, update V = (X^T U) (U^T U)^{-1}, then normalize.
405            // Compute U^T U, shape (n_comp, n_comp).
406            let utu = u.t().dot(&u);
407            // Compute X^T U, shape (n_features, n_comp).
408            let xtu = x_centered.t().dot(&u);
409
410            // Solve for V^T = (U^T U)^{-1} (X^T U)^T via inverting U^T U.
411            // For small n_comp, invert directly.
412            if let Some(utu_inv) = invert_small_symmetric(&utu) {
413                let v_new_t = xtu.dot(&utu_inv); // (n_features, n_comp)
414                // V rows = columns of v_new_t transposed.
415                for k in 0..n_comp {
416                    for j in 0..n_features {
417                        v[[k, j]] = v_new_t[[j, k]];
418                    }
419                }
420            }
421            // else: U^T U is singular; keep V from previous iteration.
422
423            // Normalize columns of V (stored as rows).
424            for k in 0..n_comp {
425                let norm: F = v
426                    .row(k)
427                    .iter()
428                    .fold(F::zero(), |acc, &val| acc + val * val)
429                    .sqrt();
430                if norm > eps::<F>() {
431                    for j in 0..n_features {
432                        v[[k, j]] = v[[k, j]] / norm;
433                    }
434                }
435            }
436
437            // Check convergence.
438            let err = reconstruction_error_sq(&x_centered, &u, &v);
439            if prev_err > eps::<F>() && (prev_err - err).abs() / prev_err < tol_f {
440                break;
441            }
442            prev_err = err;
443        }
444
445        Ok(FittedSparsePCA {
446            components_: v,
447            mean_: mean,
448            n_iter_: actual_iter,
449            ridge_alpha: 0.01,
450        })
451    }
452}
453
454/// Invert a small symmetric positive-definite matrix via Gauss-Jordan.
455///
456/// Returns `None` if the matrix is singular.
457fn invert_small_symmetric<F: Float>(a: &Array2<F>) -> Option<Array2<F>> {
458    let n = a.nrows();
459    if n == 0 {
460        return Some(Array2::zeros((0, 0)));
461    }
462
463    // Augmented matrix [A | I].
464    let mut aug = Array2::<F>::zeros((n, 2 * n));
465    for i in 0..n {
466        for j in 0..n {
467            aug[[i, j]] = a[[i, j]];
468        }
469        aug[[i, n + i]] = F::one();
470    }
471
472    // Add regularisation to diagonal.
473    let reg = F::from(1e-10).unwrap_or_else(F::epsilon);
474    for i in 0..n {
475        aug[[i, i]] = aug[[i, i]] + reg;
476    }
477
478    for i in 0..n {
479        // Find pivot.
480        let mut max_val = aug[[i, i]].abs();
481        let mut max_row = i;
482        for r in (i + 1)..n {
483            if aug[[r, i]].abs() > max_val {
484                max_val = aug[[r, i]].abs();
485                max_row = r;
486            }
487        }
488        if max_val < F::from(1e-15).unwrap_or_else(F::epsilon) {
489            return None;
490        }
491
492        // Swap rows.
493        if max_row != i {
494            for c in 0..(2 * n) {
495                let tmp = aug[[i, c]];
496                aug[[i, c]] = aug[[max_row, c]];
497                aug[[max_row, c]] = tmp;
498            }
499        }
500
501        // Scale pivot row.
502        let pivot = aug[[i, i]];
503        for c in 0..(2 * n) {
504            aug[[i, c]] = aug[[i, c]] / pivot;
505        }
506
507        // Eliminate other rows.
508        for r in 0..n {
509            if r != i {
510                let factor = aug[[r, i]];
511                for c in 0..(2 * n) {
512                    aug[[r, c]] = aug[[r, c]] - factor * aug[[i, c]];
513                }
514            }
515        }
516    }
517
518    // Extract inverse.
519    let mut inv = Array2::<F>::zeros((n, n));
520    for i in 0..n {
521        for j in 0..n {
522            inv[[i, j]] = aug[[i, n + j]];
523        }
524    }
525    Some(inv)
526}
527
528impl<F: Float + Send + Sync + 'static> Transform<Array2<F>> for FittedSparsePCA<F> {
529    type Output = Array2<F>;
530    type Error = FerroError;
531
532    /// Least-squares projection of the data onto the sparse components.
533    ///
534    /// Because the sparse `components_` are not orthonormal, a plain linear
535    /// projection is incorrect (see scikit-learn `_BaseSparsePCA.transform`,
536    /// `_sparse_pca.py:93-123`). Instead this computes the ridge-regularised
537    /// least-squares fit
538    /// `U = ridge_regression(components_.T, X_centered.T, ridge_alpha)`, i.e.
539    /// `U = (X - mean) · Cᵀ · (C·Cᵀ + ridge_alpha·I)⁻¹` with `C = components_`
540    /// and `ridge_alpha = 0.01` (sklearn default).
541    ///
542    /// # Errors
543    ///
544    /// - Returns [`FerroError::ShapeMismatch`] if the number of columns does not
545    ///   match the number of features seen during fitting.
546    /// - Returns [`FerroError::NumericalInstability`] if the ridge system
547    ///   `C·Cᵀ + ridge_alpha·I` is singular (should not occur for
548    ///   `ridge_alpha > 0`).
549    fn transform(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
550        let n_features = self.mean_.len();
551        if x.ncols() != n_features {
552            return Err(FerroError::ShapeMismatch {
553                expected: vec![x.nrows(), n_features],
554                actual: vec![x.nrows(), x.ncols()],
555                context: "FittedSparsePCA::transform".into(),
556            });
557        }
558
559        // Reject NaN/Inf BEFORE the ridge projection (sklearn re-validates with
560        // `_validate_data(reset=False, force_all_finite=True)` at
561        // `_sparse_pca.py:116`, `utils/validation.py:147-154`).
562        reject_non_finite(x)?;
563
564        let mut x_centered = x.to_owned();
565        for mut row in x_centered.rows_mut() {
566            for (v, &m) in row.iter_mut().zip(self.mean_.iter()) {
567                *v = *v - m;
568            }
569        }
570
571        // Plain projection onto the (non-orthonormal) components.
572        let proj = x_centered.dot(&self.components_.t());
573
574        // Ridge system M = C·Cᵀ + ridge_alpha·I  (n_components × n_components,
575        // symmetric positive-definite for ridge_alpha > 0).
576        let ridge_alpha =
577            F::from(self.ridge_alpha).unwrap_or_else(|| F::from(0.01).unwrap_or_else(F::epsilon));
578        let mut m = self.components_.dot(&self.components_.t());
579        for i in 0..m.nrows() {
580            m[[i, i]] = m[[i, i]] + ridge_alpha;
581        }
582
583        // U = proj · M⁻¹  (M symmetric PD, so M⁻¹ = (M⁻¹)ᵀ).
584        let m_inv = invert_small_symmetric(&m).ok_or_else(|| FerroError::NumericalInstability {
585            message: "FittedSparsePCA::transform: ridge system (C·Cᵀ + ridge_alpha·I) is singular"
586                .into(),
587        })?;
588
589        Ok(proj.dot(&m_inv))
590    }
591}
592
593// ---------------------------------------------------------------------------
594// Tests
595// ---------------------------------------------------------------------------
596
597#[cfg(test)]
598mod tests {
599    use super::*;
600    use ndarray::array;
601
602    #[test]
603    fn test_sparse_pca_basic() {
604        let spca = SparsePCA::<f64>::new(2).with_random_state(42);
605        let x = array![
606            [1.0, 2.0, 3.0],
607            [4.0, 5.0, 6.0],
608            [7.0, 8.0, 9.0],
609            [10.0, 11.0, 12.0],
610        ];
611        let fitted = spca.fit(&x, &()).unwrap();
612        let projected = fitted.transform(&x).unwrap();
613        assert_eq!(projected.dim(), (4, 2));
614    }
615
616    #[test]
617    fn test_sparse_pca_single_component() {
618        let spca = SparsePCA::<f64>::new(1).with_random_state(0);
619        let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],];
620        let fitted = spca.fit(&x, &()).unwrap();
621        assert_eq!(fitted.components().nrows(), 1);
622        let projected = fitted.transform(&x).unwrap();
623        assert_eq!(projected.ncols(), 1);
624    }
625
626    #[test]
627    fn test_sparse_pca_components_shape() {
628        let spca = SparsePCA::<f64>::new(2).with_random_state(7);
629        let x = array![
630            [1.0, 0.0, 0.0, 2.0],
631            [0.0, 3.0, 0.0, 1.0],
632            [2.0, 0.0, 1.0, 0.0],
633            [0.0, 2.0, 3.0, 0.0],
634            [1.0, 1.0, 1.0, 1.0],
635        ];
636        let fitted = spca.fit(&x, &()).unwrap();
637        assert_eq!(fitted.components().dim(), (2, 4));
638    }
639
640    #[test]
641    fn test_sparse_pca_high_alpha_produces_sparser() {
642        let x = array![
643            [1.0, 0.0, 0.0, 2.0, 0.0],
644            [0.0, 3.0, 0.0, 1.0, 0.0],
645            [2.0, 0.0, 1.0, 0.0, 4.0],
646            [0.0, 2.0, 3.0, 0.0, 1.0],
647            [1.0, 1.0, 1.0, 1.0, 1.0],
648        ];
649
650        let fitted_low = SparsePCA::<f64>::new(1)
651            .with_alpha(0.001)
652            .with_random_state(42)
653            .fit(&x, &())
654            .unwrap();
655        let fitted_high = SparsePCA::<f64>::new(1)
656            .with_alpha(100.0)
657            .with_random_state(42)
658            .fit(&x, &())
659            .unwrap();
660
661        // With high alpha, the projected values should tend toward zero
662        // (codes are pushed to zero by the L1 penalty).
663        let proj_low = fitted_low.transform(&x).unwrap();
664        let proj_high = fitted_high.transform(&x).unwrap();
665
666        let energy_low: f64 = proj_low.iter().map(|v| v * v).sum();
667        let energy_high: f64 = proj_high.iter().map(|v| v * v).sum();
668
669        // High alpha should produce less energy or similar (sparser codes).
670        // We just check both runs succeed and produce finite values.
671        assert!(energy_low.is_finite());
672        assert!(energy_high.is_finite());
673    }
674
675    #[test]
676    fn test_sparse_pca_n_components_zero() {
677        let spca = SparsePCA::<f64>::new(0);
678        let x = array![[1.0, 2.0], [3.0, 4.0]];
679        assert!(spca.fit(&x, &()).is_err());
680    }
681
682    #[test]
683    fn test_sparse_pca_n_components_too_large() {
684        let spca = SparsePCA::<f64>::new(5);
685        let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
686        assert!(spca.fit(&x, &()).is_err());
687    }
688
689    #[test]
690    fn test_sparse_pca_insufficient_samples() {
691        let spca = SparsePCA::<f64>::new(1);
692        let x = array![[1.0, 2.0]];
693        assert!(spca.fit(&x, &()).is_err());
694    }
695
696    #[test]
697    fn test_sparse_pca_transform_shape_mismatch() {
698        let spca = SparsePCA::<f64>::new(1).with_random_state(0);
699        let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
700        let fitted = spca.fit(&x, &()).unwrap();
701        let x_bad = array![[1.0, 2.0, 3.0]];
702        assert!(fitted.transform(&x_bad).is_err());
703    }
704
705    #[test]
706    fn test_sparse_pca_f32() {
707        let spca = SparsePCA::<f32>::new(1).with_random_state(0);
708        let x: Array2<f32> = array![[1.0f32, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],];
709        let fitted = spca.fit(&x, &()).unwrap();
710        let projected = fitted.transform(&x).unwrap();
711        assert_eq!(projected.ncols(), 1);
712    }
713
714    #[test]
715    fn test_sparse_pca_mean_is_correct() {
716        let spca = SparsePCA::<f64>::new(1).with_random_state(0);
717        let x = array![[2.0, 4.0], [4.0, 6.0], [6.0, 8.0]];
718        let fitted = spca.fit(&x, &()).unwrap();
719        let mean = fitted.mean();
720        assert!((mean[0] - 4.0).abs() < 1e-10);
721        assert!((mean[1] - 6.0).abs() < 1e-10);
722    }
723
724    #[test]
725    fn test_sparse_pca_builder_methods() {
726        let spca = SparsePCA::<f64>::new(3)
727            .with_alpha(0.5)
728            .with_max_iter(500)
729            .with_tol(1e-6)
730            .with_random_state(99);
731        assert_eq!(spca.n_components(), 3);
732        assert!((spca.alpha() - 0.5).abs() < 1e-15);
733        assert_eq!(spca.max_iter(), 500);
734        assert!((spca.tol() - 1e-6).abs() < 1e-15);
735    }
736
737    #[test]
738    fn test_sparse_pca_n_iter_positive() {
739        let spca = SparsePCA::<f64>::new(1)
740            .with_max_iter(10)
741            .with_random_state(0);
742        let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
743        let fitted = spca.fit(&x, &()).unwrap();
744        assert!(fitted.n_iter() > 0);
745        assert!(fitted.n_iter() <= 10);
746    }
747}