ferrolearn_decomp/sparse_pca.rs
1//! Sparse Principal Component Analysis (SparsePCA).
2//!
3//! SparsePCA finds sparse components that can optimally reconstruct the data
4//! by combining PCA with L1 (lasso) penalisation on the loadings. This
5//! produces components that are easier to interpret, at the cost of
6//! explained variance compared to standard PCA.
7//!
8//! # Algorithm
9//!
10//! Uses an Elastic-Net / Coordinate-Descent approach:
11//!
12//! 1. Initialise dictionary `V` from PCA (or random).
13//! 2. Alternate:
14//! a. Fix `V`, solve for sparse code `U` via coordinate descent:
15//! `min ||X - U V^T||^2 + alpha * ||U||_1` (per row of `U`).
16//! b. Fix `U`, update `V = X^T U (U^T U)^{-1}`, then normalise columns.
17//! 3. The rows of `V` are the sparse principal components.
18//!
19//! # Examples
20//!
21//! ```
22//! use ferrolearn_decomp::SparsePCA;
23//! use ferrolearn_core::traits::{Fit, Transform};
24//! use ndarray::array;
25//!
26//! let spca = SparsePCA::<f64>::new(1);
27//! let x = array![
28//! [1.0, 2.0, 3.0],
29//! [4.0, 5.0, 6.0],
30//! [7.0, 8.0, 9.0],
31//! [10.0, 11.0, 12.0],
32//! ];
33//! let fitted = spca.fit(&x, &()).unwrap();
34//! let projected = fitted.transform(&x).unwrap();
35//! assert_eq!(projected.ncols(), 1);
36//! ```
37//!
38//! ## REQ status
39//!
40//! Design: `.design/decomp/sparse_pca.md`. Tracking: #1476. Each REQ is BINARY —
41//! SHIPPED (impl + non-test consumer + tests + green verification) or NOT-STARTED
42//! (concrete open blocker). Non-test consumers: crate re-export (`lib.rs:99`) and
43//! the `_RsSparsePCA` PyO3 binding (`extras.rs:1129`, registered `lib.rs:77`).
44//! Oracle = live sklearn 1.5.2 (`_sparse_pca.py`), run from `/tmp` (R-CHAR-3).
45//!
46//! | REQ | Scope | Status | Evidence / Blocker |
47//! |---|---|---|---|
48//! | REQ-1 | `transform` = RIDGE regression (`ridge_alpha=0.01`), not plain projection | SHIPPED | `transform` solves `U = (X−mean_)·Cᵀ·(C·Cᵀ + 0.01·I)⁻¹` == `ridge_regression(components_.T, X_centered.T, 0.01, solver="cholesky")` (sklearn `_sparse_pca.py:119-121`); matches the sklearn ridge oracle on ferrolearn's own fitted components to 1e-6 in `tests/divergence_sparse_pca.rs::divergence_transform_is_ridge_not_projection` (was #1477, fixed). Consumers: re-export `lib.rs:99`, `_RsSparsePCA` `extras.rs:1129` |
49//! | REQ-2 | Structural: components sparsity (L1→exact zeros), shape `(n_components,n_features)`, mean centering, error-decrease, determinism | SHIPPED (scoped) | `fn fit` centers via per-feature `mean` (= `mean_ = X.mean(axis=0)` `_sparse_pca.py:83`), alternating soft-threshold sparse-coding + dictionary update, seeded `StdRng` ⇒ deterministic. Green-guards in `tests/divergence_sparse_pca.rs` + in-module tests. STRUCTURAL only, NOT component VALUES (REQ-4) |
50//! | REQ-3 | Error / parameter contracts (n_components 0/>n_features, <2 samples, transform col mismatch, NON-FINITE rejection) | SHIPPED (scoped) | `fn fit` guards (`InvalidParameter`/`InsufficientSamples`); `transform` `ShapeMismatch`. NON-FINITE: `fit`+`transform` call `reject_non_finite` (`sparse_pca.rs` symbol `reject_non_finite`) BEFORE the sparse-coding/ridge math, returning the CLEAN finiteness `InvalidParameter{name:"X", reason:"Input X contains NaN or infinity."}` = sklearn `_validate_data(force_all_finite=True)` (`_sparse_pca.py:81`,`:116`,`utils/validation.py:147-154`) — the real input gate (the `is_finite` at `:641-642` is TEST-only). `tests/divergence_nonfinite_spillover.rs::divergence_sparse_pca_fit_nan`/`_transform_nan` match the live sklearn 1.5.2 oracle (#2290). FLAG: sklearn raises `InvalidParameterError`, accepts `n_components=None`, does not pre-reject `n_components>n_features`/`n_samples<2` |
51//! | REQ-4 | EXACT `components_` value parity with sklearn `dict_learning` | NOT-STARTED | CARVE-OUT (R-DEFER-3): LARS lasso + numpy RNG + `svd_flip` + per-feature alpha scaling (`_sparse_pca.py:308-336`, `_dict_learning.py:120`) vs ferrolearn random-init soft-threshold-CD + LS update — blocker #1478 |
52//! | REQ-5 | `ridge_alpha` ctor/builder param exposure | NOT-STARTED | sklearn `SparsePCA(ridge_alpha=0.01)` `_sparse_pca.py:284`; ferrolearn hard-codes 0.01 in `transform` (REQ-1) — blocker #1479 |
53//! | REQ-6 | `method` (`lars`/`cd`) field + LARS sparse-coding path | NOT-STARTED | sklearn `method="lars"` → `dict_learning(method=)` `_sparse_pca.py:287/:319`; ferrolearn single soft-threshold CD coder — blocker #1480 |
54//! | REQ-7 | `U_init`/`V_init` warm restart + alpha/n_features scaling | NOT-STARTED | sklearn `_sparse_pca.py:289-290`, `_dict_learning.py:120`; ferrolearn always random-inits `V`, un-scaled alpha — blocker #1481 |
55//! | REQ-8 | `MiniBatchSparsePCA` online variant | NOT-STARTED | sklearn `class MiniBatchSparsePCA(_BaseSparsePCA)` `_sparse_pca.py:339-552`; ABSENT in ferrolearn — blocker #1482 |
56//! | REQ-9 | `inverse_transform` + `error_`/`n_components_`/`n_features_in_` attrs | NOT-STARTED | sklearn `_sparse_pca.py:125-146/:223/:226/:238`; `FittedSparsePCA` exposes only `components()`/`mean()`/`n_iter()` — blocker #1483 |
57//! | REQ-10 | PyO3 binding surface (thin: `n_components` ctor + `fit` + `transform`) | SHIPPED (scoped) | `_RsSparsePCA` (`extras.rs:1129`, registered `lib.rs:77`) via `py_transformer!`; inherits the REQ-1 ridge transform. NOT `alpha`/`tol`/`random_state` params, NOT `components_` accessor, NOT `inverse_transform` (REQ-5/9) |
58//! | REQ-11 | ferray substrate | NOT-STARTED | dense `ndarray` + hand-rolled `invert_small_symmetric` — blocker #1484 |
59//!
60//! Count: **4 SHIPPED (REQ-1,2,3,10) / 7 NOT-STARTED (REQ-4,5,6,7,8,9,11)**.
61
62use ferrolearn_core::error::FerroError;
63use ferrolearn_core::traits::{Fit, Transform};
64use ndarray::{Array1, Array2};
65use num_traits::Float;
66use rand::SeedableRng;
67use rand_distr::{Distribution, Uniform};
68
69/// Reject non-finite input the way sklearn's `_validate_data` does.
70///
71/// sklearn runs `check_array` with the default `force_all_finite=True` at the
72/// top of `SparsePCA.fit`/`transform` (`sklearn/decomposition/_sparse_pca.py:81`,
73/// `:116`), raising `ValueError("Input X contains NaN.")` /
74/// `"... contains infinity ..."` (`sklearn/utils/validation.py:147-154`) BEFORE
75/// any decomposition / ridge math. NaN AND infinity are both rejected. The
76/// message names "NaN" and "infinity" to mirror sklearn's `ValueError`. Never
77/// panics (R-CODE-2). (The `is_finite` checks in `#[cfg(test)]` below are
78/// test-only and do NOT gate input — this guard is the real input gate.)
79fn reject_non_finite<F: Float>(x: &Array2<F>) -> Result<(), FerroError> {
80 if x.iter().any(|v| !v.is_finite()) {
81 return Err(FerroError::InvalidParameter {
82 name: "X".into(),
83 reason: "Input X contains NaN or infinity.".into(),
84 });
85 }
86 Ok(())
87}
88
89// ---------------------------------------------------------------------------
90// SparsePCA (unfitted)
91// ---------------------------------------------------------------------------
92
93/// Sparse PCA configuration.
94///
95/// Holds hyperparameters for the Sparse PCA decomposition. Calling
96/// [`Fit::fit`] performs the iterative elastic-net / coordinate-descent
97/// procedure and returns a [`FittedSparsePCA`] that can project new data.
98#[derive(Debug, Clone)]
99pub struct SparsePCA<F> {
100 /// Number of sparse components to extract.
101 n_components: usize,
102 /// Sparsity penalty weight on the L1 norm of the loadings.
103 alpha: f64,
104 /// Maximum number of outer iterations.
105 max_iter: usize,
106 /// Convergence tolerance on the relative change in reconstruction error.
107 tol: f64,
108 /// Optional random seed for reproducibility.
109 random_state: Option<u64>,
110 _marker: std::marker::PhantomData<F>,
111}
112
113impl<F: Float + Send + Sync + 'static> SparsePCA<F> {
114 /// Create a new `SparsePCA` that extracts `n_components` sparse components.
115 ///
116 /// Defaults: `alpha = 1.0`, `max_iter = 1000`, `tol = 1e-8`,
117 /// `random_state = None`.
118 #[must_use]
119 pub fn new(n_components: usize) -> Self {
120 Self {
121 n_components,
122 alpha: 1.0,
123 max_iter: 1000,
124 tol: 1e-8,
125 random_state: None,
126 _marker: std::marker::PhantomData,
127 }
128 }
129
130 /// Set the sparsity penalty weight (L1 regularisation on codes).
131 #[must_use]
132 pub fn with_alpha(mut self, alpha: f64) -> Self {
133 self.alpha = alpha;
134 self
135 }
136
137 /// Set the maximum number of outer iterations.
138 #[must_use]
139 pub fn with_max_iter(mut self, max_iter: usize) -> Self {
140 self.max_iter = max_iter;
141 self
142 }
143
144 /// Set the convergence tolerance.
145 #[must_use]
146 pub fn with_tol(mut self, tol: f64) -> Self {
147 self.tol = tol;
148 self
149 }
150
151 /// Set the random seed for reproducible results.
152 #[must_use]
153 pub fn with_random_state(mut self, seed: u64) -> Self {
154 self.random_state = Some(seed);
155 self
156 }
157
158 /// Return the configured number of components.
159 #[must_use]
160 pub fn n_components(&self) -> usize {
161 self.n_components
162 }
163
164 /// Return the configured sparsity penalty.
165 #[must_use]
166 pub fn alpha(&self) -> f64 {
167 self.alpha
168 }
169
170 /// Return the configured maximum iterations.
171 #[must_use]
172 pub fn max_iter(&self) -> usize {
173 self.max_iter
174 }
175
176 /// Return the configured tolerance.
177 #[must_use]
178 pub fn tol(&self) -> f64 {
179 self.tol
180 }
181}
182
183// ---------------------------------------------------------------------------
184// FittedSparsePCA
185// ---------------------------------------------------------------------------
186
187/// A fitted Sparse PCA model holding the learned components.
188///
189/// Created by calling [`Fit::fit`] on a [`SparsePCA`]. Implements
190/// [`Transform<Array2<F>>`] to project new data onto the sparse components.
191#[derive(Debug, Clone)]
192pub struct FittedSparsePCA<F> {
193 /// Sparse components, shape `(n_components, n_features)`.
194 components_: Array2<F>,
195 /// Per-feature mean computed during fitting (used for centring).
196 mean_: Array1<F>,
197 /// Number of outer iterations performed.
198 n_iter_: usize,
199 /// Ridge shrinkage applied during `transform` (sklearn default `0.01`).
200 ridge_alpha: f64,
201}
202
203impl<F: Float + Send + Sync + 'static> FittedSparsePCA<F> {
204 /// Sparse components, shape `(n_components, n_features)`.
205 #[must_use]
206 pub fn components(&self) -> &Array2<F> {
207 &self.components_
208 }
209
210 /// Per-feature mean learned during fitting.
211 #[must_use]
212 pub fn mean(&self) -> &Array1<F> {
213 &self.mean_
214 }
215
216 /// Number of outer iterations performed.
217 #[must_use]
218 pub fn n_iter(&self) -> usize {
219 self.n_iter_
220 }
221}
222
223// ---------------------------------------------------------------------------
224// Internal helpers
225// ---------------------------------------------------------------------------
226
227/// Small epsilon to prevent division by zero.
228#[inline]
229fn eps<F: Float>() -> F {
230 F::from(1e-12).unwrap_or_else(F::epsilon)
231}
232
233/// Soft-thresholding operator: sign(x) * max(|x| - threshold, 0).
234#[inline]
235fn soft_threshold<F: Float>(x: F, threshold: F) -> F {
236 if x > threshold {
237 x - threshold
238 } else if x < -threshold {
239 x + threshold
240 } else {
241 F::zero()
242 }
243}
244
245/// Solve sparse coding for a single row of U via coordinate descent:
246/// min_u ||x_row - u V^T||^2 + alpha * ||u||_1
247///
248/// `v` has shape `(n_components, n_features)`.
249fn sparse_code_row<F: Float>(
250 x_row: &[F],
251 v: &Array2<F>,
252 alpha_f: F,
253 u_row: &mut [F],
254 n_cd_iters: usize,
255) {
256 let n_components = v.nrows();
257 let n_features = v.ncols();
258
259 for _iter in 0..n_cd_iters {
260 for k in 0..n_components {
261 // Compute residual excluding component k.
262 let mut residual_dot = F::zero();
263 let mut vk_norm_sq = F::zero();
264
265 for j in 0..n_features {
266 let mut r = F::from(x_row[j]).unwrap();
267 for kk in 0..n_components {
268 if kk != k {
269 r = r - u_row[kk] * v[[kk, j]];
270 }
271 }
272 residual_dot = residual_dot + r * v[[k, j]];
273 vk_norm_sq = vk_norm_sq + v[[k, j]] * v[[k, j]];
274 }
275
276 if vk_norm_sq < eps::<F>() {
277 u_row[k] = F::zero();
278 } else {
279 u_row[k] = soft_threshold(residual_dot, alpha_f) / vk_norm_sq;
280 }
281 }
282 }
283}
284
285/// Compute the Frobenius norm squared of `X - U @ V`.
286fn reconstruction_error_sq<F: Float + 'static>(x: &Array2<F>, u: &Array2<F>, v: &Array2<F>) -> F {
287 let uv = u.dot(v);
288 let mut err = F::zero();
289 for (a, b) in x.iter().zip(uv.iter()) {
290 let d = *a - *b;
291 err = err + d * d;
292 }
293 err
294}
295
296// ---------------------------------------------------------------------------
297// Trait implementations
298// ---------------------------------------------------------------------------
299
300impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for SparsePCA<F> {
301 type Fitted = FittedSparsePCA<F>;
302 type Error = FerroError;
303
304 /// Fit Sparse PCA by alternating sparse coding and dictionary update.
305 ///
306 /// # Errors
307 ///
308 /// - [`FerroError::InvalidParameter`] if `n_components` is zero or exceeds
309 /// the number of features.
310 /// - [`FerroError::InsufficientSamples`] if there are fewer than 2 samples.
311 fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedSparsePCA<F>, FerroError> {
312 let (n_samples, n_features) = x.dim();
313
314 if self.n_components == 0 {
315 return Err(FerroError::InvalidParameter {
316 name: "n_components".into(),
317 reason: "must be at least 1".into(),
318 });
319 }
320 if self.n_components > n_features {
321 return Err(FerroError::InvalidParameter {
322 name: "n_components".into(),
323 reason: format!(
324 "n_components ({}) exceeds n_features ({})",
325 self.n_components, n_features
326 ),
327 });
328 }
329 if n_samples < 2 {
330 return Err(FerroError::InsufficientSamples {
331 required: 2,
332 actual: n_samples,
333 context: "SparsePCA::fit requires at least 2 samples".into(),
334 });
335 }
336
337 // Reject NaN/Inf BEFORE the alternating sparse-coding math (sklearn
338 // `_validate_data(force_all_finite=True)` at `_sparse_pca.py:81`,
339 // `utils/validation.py:147-154`).
340 reject_non_finite(x)?;
341
342 let n_comp = self.n_components;
343 let n_f = F::from(n_samples).unwrap();
344 let alpha_f = F::from(self.alpha).unwrap_or_else(F::one);
345
346 // Step 1: compute mean and centre data.
347 let mut mean = Array1::<F>::zeros(n_features);
348 for j in 0..n_features {
349 let sum = x.column(j).iter().copied().fold(F::zero(), |a, b| a + b);
350 mean[j] = sum / n_f;
351 }
352
353 let mut x_centered = x.to_owned();
354 for mut row in x_centered.rows_mut() {
355 for (v, &m) in row.iter_mut().zip(mean.iter()) {
356 *v = *v - m;
357 }
358 }
359
360 // Step 2: Initialize V from random.
361 let seed = self.random_state.unwrap_or(42);
362 let mut rng: rand::rngs::StdRng = SeedableRng::seed_from_u64(seed);
363 let uniform = Uniform::new(-1.0f64, 1.0f64).unwrap();
364
365 let mut v = Array2::<F>::zeros((n_comp, n_features));
366 for elem in v.iter_mut() {
367 *elem = F::from(uniform.sample(&mut rng)).unwrap_or_else(F::zero);
368 }
369 // Normalize each row of V.
370 for i in 0..n_comp {
371 let norm: F = v
372 .row(i)
373 .iter()
374 .fold(F::zero(), |acc, &val| acc + val * val)
375 .sqrt();
376 if norm > eps::<F>() {
377 for j in 0..n_features {
378 v[[i, j]] = v[[i, j]] / norm;
379 }
380 }
381 }
382
383 // Step 3: Allocate U (sparse codes), shape (n_samples, n_components).
384 let mut u = Array2::<F>::zeros((n_samples, n_comp));
385
386 let n_cd_iters = 10; // inner coordinate descent iterations
387 let mut prev_err = F::infinity();
388 let tol_f = F::from(self.tol).unwrap_or_else(F::epsilon);
389 let mut actual_iter = 0;
390
391 for iteration in 0..self.max_iter {
392 actual_iter = iteration + 1;
393
394 // Step 3a: Fix V, solve for sparse code U (each row independently).
395 for i in 0..n_samples {
396 let x_row: Vec<F> = x_centered.row(i).to_vec();
397 let mut u_row: Vec<F> = u.row(i).to_vec();
398 sparse_code_row(&x_row, &v, alpha_f, &mut u_row, n_cd_iters);
399 for k in 0..n_comp {
400 u[[i, k]] = u_row[k];
401 }
402 }
403
404 // Step 3b: Fix U, update V = (X^T U) (U^T U)^{-1}, then normalize.
405 // Compute U^T U, shape (n_comp, n_comp).
406 let utu = u.t().dot(&u);
407 // Compute X^T U, shape (n_features, n_comp).
408 let xtu = x_centered.t().dot(&u);
409
410 // Solve for V^T = (U^T U)^{-1} (X^T U)^T via inverting U^T U.
411 // For small n_comp, invert directly.
412 if let Some(utu_inv) = invert_small_symmetric(&utu) {
413 let v_new_t = xtu.dot(&utu_inv); // (n_features, n_comp)
414 // V rows = columns of v_new_t transposed.
415 for k in 0..n_comp {
416 for j in 0..n_features {
417 v[[k, j]] = v_new_t[[j, k]];
418 }
419 }
420 }
421 // else: U^T U is singular; keep V from previous iteration.
422
423 // Normalize columns of V (stored as rows).
424 for k in 0..n_comp {
425 let norm: F = v
426 .row(k)
427 .iter()
428 .fold(F::zero(), |acc, &val| acc + val * val)
429 .sqrt();
430 if norm > eps::<F>() {
431 for j in 0..n_features {
432 v[[k, j]] = v[[k, j]] / norm;
433 }
434 }
435 }
436
437 // Check convergence.
438 let err = reconstruction_error_sq(&x_centered, &u, &v);
439 if prev_err > eps::<F>() && (prev_err - err).abs() / prev_err < tol_f {
440 break;
441 }
442 prev_err = err;
443 }
444
445 Ok(FittedSparsePCA {
446 components_: v,
447 mean_: mean,
448 n_iter_: actual_iter,
449 ridge_alpha: 0.01,
450 })
451 }
452}
453
454/// Invert a small symmetric positive-definite matrix via Gauss-Jordan.
455///
456/// Returns `None` if the matrix is singular.
457fn invert_small_symmetric<F: Float>(a: &Array2<F>) -> Option<Array2<F>> {
458 let n = a.nrows();
459 if n == 0 {
460 return Some(Array2::zeros((0, 0)));
461 }
462
463 // Augmented matrix [A | I].
464 let mut aug = Array2::<F>::zeros((n, 2 * n));
465 for i in 0..n {
466 for j in 0..n {
467 aug[[i, j]] = a[[i, j]];
468 }
469 aug[[i, n + i]] = F::one();
470 }
471
472 // Add regularisation to diagonal.
473 let reg = F::from(1e-10).unwrap_or_else(F::epsilon);
474 for i in 0..n {
475 aug[[i, i]] = aug[[i, i]] + reg;
476 }
477
478 for i in 0..n {
479 // Find pivot.
480 let mut max_val = aug[[i, i]].abs();
481 let mut max_row = i;
482 for r in (i + 1)..n {
483 if aug[[r, i]].abs() > max_val {
484 max_val = aug[[r, i]].abs();
485 max_row = r;
486 }
487 }
488 if max_val < F::from(1e-15).unwrap_or_else(F::epsilon) {
489 return None;
490 }
491
492 // Swap rows.
493 if max_row != i {
494 for c in 0..(2 * n) {
495 let tmp = aug[[i, c]];
496 aug[[i, c]] = aug[[max_row, c]];
497 aug[[max_row, c]] = tmp;
498 }
499 }
500
501 // Scale pivot row.
502 let pivot = aug[[i, i]];
503 for c in 0..(2 * n) {
504 aug[[i, c]] = aug[[i, c]] / pivot;
505 }
506
507 // Eliminate other rows.
508 for r in 0..n {
509 if r != i {
510 let factor = aug[[r, i]];
511 for c in 0..(2 * n) {
512 aug[[r, c]] = aug[[r, c]] - factor * aug[[i, c]];
513 }
514 }
515 }
516 }
517
518 // Extract inverse.
519 let mut inv = Array2::<F>::zeros((n, n));
520 for i in 0..n {
521 for j in 0..n {
522 inv[[i, j]] = aug[[i, n + j]];
523 }
524 }
525 Some(inv)
526}
527
528impl<F: Float + Send + Sync + 'static> Transform<Array2<F>> for FittedSparsePCA<F> {
529 type Output = Array2<F>;
530 type Error = FerroError;
531
532 /// Least-squares projection of the data onto the sparse components.
533 ///
534 /// Because the sparse `components_` are not orthonormal, a plain linear
535 /// projection is incorrect (see scikit-learn `_BaseSparsePCA.transform`,
536 /// `_sparse_pca.py:93-123`). Instead this computes the ridge-regularised
537 /// least-squares fit
538 /// `U = ridge_regression(components_.T, X_centered.T, ridge_alpha)`, i.e.
539 /// `U = (X - mean) · Cᵀ · (C·Cᵀ + ridge_alpha·I)⁻¹` with `C = components_`
540 /// and `ridge_alpha = 0.01` (sklearn default).
541 ///
542 /// # Errors
543 ///
544 /// - Returns [`FerroError::ShapeMismatch`] if the number of columns does not
545 /// match the number of features seen during fitting.
546 /// - Returns [`FerroError::NumericalInstability`] if the ridge system
547 /// `C·Cᵀ + ridge_alpha·I` is singular (should not occur for
548 /// `ridge_alpha > 0`).
549 fn transform(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
550 let n_features = self.mean_.len();
551 if x.ncols() != n_features {
552 return Err(FerroError::ShapeMismatch {
553 expected: vec![x.nrows(), n_features],
554 actual: vec![x.nrows(), x.ncols()],
555 context: "FittedSparsePCA::transform".into(),
556 });
557 }
558
559 // Reject NaN/Inf BEFORE the ridge projection (sklearn re-validates with
560 // `_validate_data(reset=False, force_all_finite=True)` at
561 // `_sparse_pca.py:116`, `utils/validation.py:147-154`).
562 reject_non_finite(x)?;
563
564 let mut x_centered = x.to_owned();
565 for mut row in x_centered.rows_mut() {
566 for (v, &m) in row.iter_mut().zip(self.mean_.iter()) {
567 *v = *v - m;
568 }
569 }
570
571 // Plain projection onto the (non-orthonormal) components.
572 let proj = x_centered.dot(&self.components_.t());
573
574 // Ridge system M = C·Cᵀ + ridge_alpha·I (n_components × n_components,
575 // symmetric positive-definite for ridge_alpha > 0).
576 let ridge_alpha =
577 F::from(self.ridge_alpha).unwrap_or_else(|| F::from(0.01).unwrap_or_else(F::epsilon));
578 let mut m = self.components_.dot(&self.components_.t());
579 for i in 0..m.nrows() {
580 m[[i, i]] = m[[i, i]] + ridge_alpha;
581 }
582
583 // U = proj · M⁻¹ (M symmetric PD, so M⁻¹ = (M⁻¹)ᵀ).
584 let m_inv = invert_small_symmetric(&m).ok_or_else(|| FerroError::NumericalInstability {
585 message: "FittedSparsePCA::transform: ridge system (C·Cᵀ + ridge_alpha·I) is singular"
586 .into(),
587 })?;
588
589 Ok(proj.dot(&m_inv))
590 }
591}
592
593// ---------------------------------------------------------------------------
594// Tests
595// ---------------------------------------------------------------------------
596
597#[cfg(test)]
598mod tests {
599 use super::*;
600 use ndarray::array;
601
602 #[test]
603 fn test_sparse_pca_basic() {
604 let spca = SparsePCA::<f64>::new(2).with_random_state(42);
605 let x = array![
606 [1.0, 2.0, 3.0],
607 [4.0, 5.0, 6.0],
608 [7.0, 8.0, 9.0],
609 [10.0, 11.0, 12.0],
610 ];
611 let fitted = spca.fit(&x, &()).unwrap();
612 let projected = fitted.transform(&x).unwrap();
613 assert_eq!(projected.dim(), (4, 2));
614 }
615
616 #[test]
617 fn test_sparse_pca_single_component() {
618 let spca = SparsePCA::<f64>::new(1).with_random_state(0);
619 let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],];
620 let fitted = spca.fit(&x, &()).unwrap();
621 assert_eq!(fitted.components().nrows(), 1);
622 let projected = fitted.transform(&x).unwrap();
623 assert_eq!(projected.ncols(), 1);
624 }
625
626 #[test]
627 fn test_sparse_pca_components_shape() {
628 let spca = SparsePCA::<f64>::new(2).with_random_state(7);
629 let x = array![
630 [1.0, 0.0, 0.0, 2.0],
631 [0.0, 3.0, 0.0, 1.0],
632 [2.0, 0.0, 1.0, 0.0],
633 [0.0, 2.0, 3.0, 0.0],
634 [1.0, 1.0, 1.0, 1.0],
635 ];
636 let fitted = spca.fit(&x, &()).unwrap();
637 assert_eq!(fitted.components().dim(), (2, 4));
638 }
639
640 #[test]
641 fn test_sparse_pca_high_alpha_produces_sparser() {
642 let x = array![
643 [1.0, 0.0, 0.0, 2.0, 0.0],
644 [0.0, 3.0, 0.0, 1.0, 0.0],
645 [2.0, 0.0, 1.0, 0.0, 4.0],
646 [0.0, 2.0, 3.0, 0.0, 1.0],
647 [1.0, 1.0, 1.0, 1.0, 1.0],
648 ];
649
650 let fitted_low = SparsePCA::<f64>::new(1)
651 .with_alpha(0.001)
652 .with_random_state(42)
653 .fit(&x, &())
654 .unwrap();
655 let fitted_high = SparsePCA::<f64>::new(1)
656 .with_alpha(100.0)
657 .with_random_state(42)
658 .fit(&x, &())
659 .unwrap();
660
661 // With high alpha, the projected values should tend toward zero
662 // (codes are pushed to zero by the L1 penalty).
663 let proj_low = fitted_low.transform(&x).unwrap();
664 let proj_high = fitted_high.transform(&x).unwrap();
665
666 let energy_low: f64 = proj_low.iter().map(|v| v * v).sum();
667 let energy_high: f64 = proj_high.iter().map(|v| v * v).sum();
668
669 // High alpha should produce less energy or similar (sparser codes).
670 // We just check both runs succeed and produce finite values.
671 assert!(energy_low.is_finite());
672 assert!(energy_high.is_finite());
673 }
674
675 #[test]
676 fn test_sparse_pca_n_components_zero() {
677 let spca = SparsePCA::<f64>::new(0);
678 let x = array![[1.0, 2.0], [3.0, 4.0]];
679 assert!(spca.fit(&x, &()).is_err());
680 }
681
682 #[test]
683 fn test_sparse_pca_n_components_too_large() {
684 let spca = SparsePCA::<f64>::new(5);
685 let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
686 assert!(spca.fit(&x, &()).is_err());
687 }
688
689 #[test]
690 fn test_sparse_pca_insufficient_samples() {
691 let spca = SparsePCA::<f64>::new(1);
692 let x = array![[1.0, 2.0]];
693 assert!(spca.fit(&x, &()).is_err());
694 }
695
696 #[test]
697 fn test_sparse_pca_transform_shape_mismatch() {
698 let spca = SparsePCA::<f64>::new(1).with_random_state(0);
699 let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
700 let fitted = spca.fit(&x, &()).unwrap();
701 let x_bad = array![[1.0, 2.0, 3.0]];
702 assert!(fitted.transform(&x_bad).is_err());
703 }
704
705 #[test]
706 fn test_sparse_pca_f32() {
707 let spca = SparsePCA::<f32>::new(1).with_random_state(0);
708 let x: Array2<f32> = array![[1.0f32, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],];
709 let fitted = spca.fit(&x, &()).unwrap();
710 let projected = fitted.transform(&x).unwrap();
711 assert_eq!(projected.ncols(), 1);
712 }
713
714 #[test]
715 fn test_sparse_pca_mean_is_correct() {
716 let spca = SparsePCA::<f64>::new(1).with_random_state(0);
717 let x = array![[2.0, 4.0], [4.0, 6.0], [6.0, 8.0]];
718 let fitted = spca.fit(&x, &()).unwrap();
719 let mean = fitted.mean();
720 assert!((mean[0] - 4.0).abs() < 1e-10);
721 assert!((mean[1] - 6.0).abs() < 1e-10);
722 }
723
724 #[test]
725 fn test_sparse_pca_builder_methods() {
726 let spca = SparsePCA::<f64>::new(3)
727 .with_alpha(0.5)
728 .with_max_iter(500)
729 .with_tol(1e-6)
730 .with_random_state(99);
731 assert_eq!(spca.n_components(), 3);
732 assert!((spca.alpha() - 0.5).abs() < 1e-15);
733 assert_eq!(spca.max_iter(), 500);
734 assert!((spca.tol() - 1e-6).abs() < 1e-15);
735 }
736
737 #[test]
738 fn test_sparse_pca_n_iter_positive() {
739 let spca = SparsePCA::<f64>::new(1)
740 .with_max_iter(10)
741 .with_random_state(0);
742 let x = array![[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]];
743 let fitted = spca.fit(&x, &()).unwrap();
744 assert!(fitted.n_iter() > 0);
745 assert!(fitted.n_iter() <= 10);
746 }
747}