//! Spectral Clustering via graph Laplacian eigenmaps.
//!
//! This module provides [`SpectralClustering`], a graph-based clustering
//! algorithm that embeds the data into a low-dimensional space defined by the
//! eigenvectors of the graph Laplacian and then clusters the embedded points
//! with K-Means.
//!
//! # Algorithm
//!
//! 1. **Affinity matrix**: compute the pairwise RBF (Gaussian) kernel
//! `A[i,j] = exp(-gamma * ||x_i - x_j||^2)`.
//! 2. **Normalized affinity**: `M = D^{-1/2} A D^{-1/2}`, where
//! `D[i,i] = sum_j A[i,j]` is the degree matrix. The normalized graph
//! Laplacian is `L_sym = I - M`; this module works with `M` directly.
//! 3. **Eigendecomposition**: compute the `n_clusters` eigenvectors of `M`
//! that correspond to the **largest** eigenvalues. These are the eigenvectors
//! of `L_sym` with the **smallest** eigenvalues (the smoothest graph
//! signals).
//! 4. **Row-normalize** the embedding matrix so each row has unit L2 norm.
//! 5. **K-Means** clustering on the embedded points with `n_init` restarts.
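//!
//! As a rough illustration of steps 1, 2 and 4, the sketch below works on plain
//! `Vec`s instead of `ndarray`. The helper names (`rbf_affinity`,
//! `normalize_affinity`, `row_normalize`) are hypothetical and are not part of
//! this crate's API; the eigendecomposition (step 3) and K-Means (step 5) are
//! omitted.
//!
//! ```rust
//! /// Step 1: RBF affinity `A[i][j] = exp(-gamma * ||x_i - x_j||^2)`.
//! fn rbf_affinity(x: &[Vec<f64>], gamma: f64) -> Vec<Vec<f64>> {
//!     x.iter()
//!         .map(|xi| {
//!             x.iter()
//!                 .map(|xj| {
//!                     // Squared Euclidean distance between the two points.
//!                     let sq: f64 = xi.iter().zip(xj).map(|(a, b)| (a - b) * (a - b)).sum();
//!                     (-gamma * sq).exp()
//!                 })
//!                 .collect()
//!         })
//!         .collect()
//! }
//!
//! /// Step 2: `M[i][j] = A[i][j] / sqrt(d_i * d_j)` with `d_i = sum_j A[i][j]`.
//! fn normalize_affinity(a: &[Vec<f64>]) -> Vec<Vec<f64>> {
//!     // Inverse square root of each degree; zero-degree rows map to 0.
//!     let d_inv_sqrt: Vec<f64> = a
//!         .iter()
//!         .map(|row| {
//!             let d: f64 = row.iter().sum();
//!             if d > 0.0 { 1.0 / d.sqrt() } else { 0.0 }
//!         })
//!         .collect();
//!     a.iter()
//!         .enumerate()
//!         .map(|(i, row)| {
//!             row.iter()
//!                 .enumerate()
//!                 .map(|(j, &v)| d_inv_sqrt[i] * v * d_inv_sqrt[j])
//!                 .collect()
//!         })
//!         .collect()
//! }
//!
//! /// Step 4: scale each row to unit L2 norm; all-zero rows are left unchanged.
//! fn row_normalize(m: &[Vec<f64>]) -> Vec<Vec<f64>> {
//!     m.iter()
//!         .map(|row| {
//!             let norm = row.iter().map(|v| v * v).sum::<f64>().sqrt();
//!             if norm > 0.0 {
//!                 row.iter().map(|v| v / norm).collect()
//!             } else {
//!                 row.clone()
//!             }
//!         })
//!         .collect()
//! }
//! ```
//!
//! In this sketch, `gamma = 0.0` makes every entry of `rbf_affinity` equal to
//! `1.0`, which is the all-ones affinity that the gamma validation contract
//! accepts.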
//!
//! Spectral Clustering does **not** implement [`ferrolearn_core::Predict`]
//! because there is no simple way to embed new points into the learned eigenspace
//! without refitting.
//!
//! # Notes
//!
//! The eigendecomposition is performed in `f64` via `ferrolearn_core::Backend`
//! (`NdarrayFaerBackend`) regardless of the input float type `F`, because
//! `faer`'s solver only supports `f64`. The results are then cast back to `F`.
//!
//! # Examples
//!
//! ```
//! use ferrolearn_cluster::SpectralClustering;
//! use ferrolearn_core::Fit;
//! use ndarray::Array2;
//!
//! let x = Array2::from_shape_vec((6, 2), vec![
//! 1.0_f64, 1.0, 1.1, 1.0, 1.0, 1.1,
//! 9.0, 9.0, 9.1, 9.0, 9.0, 9.1,
//! ]).unwrap();
//!
//! let model = SpectralClustering::<f64>::new(2).with_random_state(42);
//! let fitted = model.fit(&x, &()).unwrap();
//! assert_eq!(fitted.labels().len(), 6);
//! ```
//!
//! # `## REQ status`
//!
//! Binary classification (R-DEFER-2): two states only — SHIPPED needs impl + a
//! non-test production consumer + green verification; NOT-STARTED carries the
//! open prereq blocker. **`SpectralClustering` has NO PyO3 binding** — there is
//! no `_RsSpectralClustering` and no `ferrolearn.SpectralClustering` (confirmed
//! by grep: zero `SpectralClustering`/`RsSpectral` hits under
//! `ferrolearn-python/`). The non-test production consumer is therefore the
//! crate re-export at the crate root (`pub use
//! spectral::{FittedSpectralClustering, SpectralClustering}` in
//! `ferrolearn-cluster/src/lib.rs`), exposing `fit` / `fit_predict` /
//! `labels()`. **Honest underclaim (R-HONEST-3): this unit is a SIMPLIFIED
//! spectral-clustering VARIANT and does NOT achieve `SpectralClustering` label
//! parity** — the embedding diverges (ferrolearn row-L2-normalized top-k of
//! `D^{-1/2}AD^{-1/2}` vs sklearn `_spectral_embedding` bottom-k of
//! `I − D^{-1/2}AD^{-1/2}` scaled by `1/dd`), affinity/assign_labels modes and
//! most params are missing, and there is no binding. The ONLY contract that
//! VALUE-matches the live oracle with a real consumer is gamma/`n_clusters`
//! parameter validation (the gamma `[0, inf)` interval just landed via #930).
//! Green verification = the in-tree `spectral` lib tests plus the live-sklearn
//! pins / guards (`ferrolearn-cluster/tests/divergence_spectral.rs`): the
//! formerly RED pin `divergence_spectral_gamma_zero_allowed` (#930), now
//! PASSING after the over-rejection fix; plus the green guards
//! `green_spectral_gamma_negative_rejected` (#930),
//! `green_spectral_n_clusters_zero_rejected`,
//! `green_spectral_insufficient_samples`. Cites use symbol anchors (ferrolearn)
//! / `file:line` (sklearn 1.5.2, commit 156ef14). Live oracle = installed
//! sklearn 1.5.2. (REQ numbering follows `.design/cluster/spectral.md`.)
//!
//! | REQ | Status | Evidence |
//! |---|---|---|
//! | REQ-7 (gamma `[0, inf)` validation: `gamma >= 0` accepted / `gamma < 0` rejected; `n_clusters >= 1`; `n_samples >= n_clusters`) | SHIPPED (validation contract only) | `fn fit` for `SpectralClustering` rejects ONLY `self.gamma < F::zero()` with `FerroError::InvalidParameter { name: "gamma", reason: "gamma must be >= 0 (sklearn Interval[0, inf))" }`, mirroring `SpectralClustering._parameter_constraints["gamma"] = Interval(Real, 0, None, closed="left")` (`_spectral.py:612`) — `gamma = 0.0` is INSIDE the closed-left interval so it is ACCEPTED (RBF collapses to all-ones affinity); `n_clusters == 0` rejected per `Interval(Integral, 1, None, closed="left")` (`_spectral.py:607`); `n_samples < n_clusters` rejected with `FerroError::InsufficientSamples`. Non-test consumer: crate re-export of `fit` / `fit_predict` / `labels()` (`lib.rs`). Verified: green pin `divergence_spectral_gamma_zero_allowed` (#930, now PASSING after the over-rejection fix): `with_gamma(0.0).fit(X, &())` returns `Ok` (sklearn `SpectralClustering(gamma=0.0).fit(X)` runs, NO error); green guards `green_spectral_gamma_negative_rejected` (`with_gamma(-1.0).fit` → `Err`), `green_spectral_n_clusters_zero_rejected` (`new(0).fit` → `Err`), `green_spectral_insufficient_samples` (`new(3).fit(X_1row)` → `Err`). GAP (NOT-STARTED, REQ-6): `n_clusters=8` default and the sklearn `InvalidParameterError`/`ValueError` error ABI are absent — ferrolearn requires `n_clusters` and uses `FerroError`. |
//! | REQ-1 (RBF affinity VALUE) | NOT-STARTED | open prereq blocker **#934**. `fn affinity_matrix` computes `exp(-gamma*||xi-xj||^2)` (diagonal forced `1.0`), VALUE-matching `pairwise_kernels(X, metric='rbf')` (`_spectral.py:730`) to full f64 precision (`rbf_kernel(blobs, gamma=0.1)[0,1] = 0.9950124791926823`). BUT `fn affinity_matrix` is a PRIVATE helper with no public accessor and no non-test consumer (no `affinity_matrix_`, REQ-8) — cannot be SHIPPED standalone (R-HONEST-2/R-DEFER-1). |
//! | REQ-2 (`labels_` VALUE parity) | NOT-STARTED | open prereq blocker **#929** (depends on #934 KMeans). `fit` / `fit_predict` produce `labels_` via a DIFFERENT pipeline (`fn normalized_laplacian` + `fn top_k_eigenvectors` + `fn row_normalize` + ferrolearn `KMeans`) than sklearn `_spectral_embedding` + `k_means` (`_spectral.py:756-766`). On `make_circles(…, random_state=0)` at `gamma=10`, sklearn vs ferrolearn `fit_predict` differ at indices 10,19 (2/30). Agreement on blobs / circles@0.1 is COINCIDENTAL (shared subspace) — NOT value parity. |
//! | REQ-3 (spectral embedding algorithm — `I − D^{-1/2}AD^{-1/2}` + `embedding/dd` + sign-flip) | NOT-STARTED | open prereq blocker **#929**. sklearn `_spectral_embedding` (`_spectral_embedding.py:300-469`): `L = csgraph_laplacian(A, normed=True) = I − D^{-1/2}AD^{-1/2}` (`:333`), SMALLEST-eigenvalue eigenvectors, `embedding = embedding / dd` (`:378`/`:443`), `_deterministic_vector_sign_flip` (`:465`). ferrolearn `fn normalized_laplacian` builds `D^{-1/2}AD^{-1/2}` (NO `I−`), `fn top_k_eigenvectors` takes the LARGEST, `fn row_normalize` does unit-L2 per row (NOT `/dd`), no sign flip — blobs `gamma=0.1` sklearn embedding row-magnitude ≈ `0.158` (`=1/dd`) vs ferrolearn `1.0`. The root-cause divergence. |
//! | REQ-4 (affinity modes nearest_neighbors/precomputed/poly/sigmoid) | NOT-STARTED | open prereq blocker **#931**. sklearn builds affinity via `kneighbors_graph` (`'nearest_neighbors'`, `:709-713`), passthrough (`'precomputed'`, `:720-721`), or any `pairwise_kernels` metric (`:730`). ferrolearn `fn affinity_matrix` is hard-coded RBF only; no `affinity` parameter. |
//! | REQ-5 (assign_labels discretize/cluster_qr) | NOT-STARTED | open prereq blocker **#932**. sklearn `assign_labels ∈ {'kmeans','discretize','cluster_qr'}` (`_spectral.py:625`, branch `:755-766`). ferrolearn `fn fit` hard-codes `KMeans`; no `assign_labels` parameter. |
//! | REQ-6 (missing params eigen_solver/n_components/eigen_tol/n_neighbors/degree/coef0/kernel_params/n_jobs/verbose/affinity + `n_clusters=8` default) | NOT-STARTED | open prereq blocker **#933**. sklearn `__init__` (`_spectral.py:633-666`) takes 16 params with `n_clusters=8` default (`:635`). `SpectralClustering<F>` (`fn new` + builders) has only `n_clusters` (REQUIRED, no default) / `gamma` / `n_init` / `random_state`. |
//! | REQ-8 (fitted attrs `affinity_matrix_`/`n_features_in_`) | NOT-STARTED | open prereq blocker **#934**. sklearn exposes `affinity_matrix_` + `n_features_in_` (`_spectral.py:524-538`). `FittedSpectralClustering<F>` stores only private `labels_` (+ `PhantomData`), exposes only `labels()`. No `affinity_matrix_` accessor — so the value-matching affinity (REQ-1) has no consumer. |
//! | REQ-9 (KMeans assign-labels parity) | NOT-STARTED | open prereq blocker **#934** (kmeans.rs unit). sklearn `k_means(maps, n_clusters, n_init, random_state)` uses k-means++ init with NumPy RNG (`_spectral.py:756`). ferrolearn `KMeans::new(k).with_n_init` is a separate unit with its own init/RNG — even a matched embedding would not guarantee identical labels. |
//! | REQ-10 (PyO3 binding) | NOT-STARTED | open prereq blocker **#935**. `grep -rln SpectralClustering ferrolearn-python/` is EMPTY — no `_RsSpectralClustering`, so `import ferrolearn` cannot reach `SpectralClustering`. The only non-test consumer of `fit` / `fit_predict` / `labels()` is the crate re-export (`lib.rs`). |
//! | REQ-11 (ferray substrate) | NOT-STARTED | open prereq blocker **#935**. `spectral.rs` imports `ndarray::{Array1, Array2}` + `num_traits::Float` + `ferrolearn_core::NdarrayFaerBackend` (`eigh`); not migrated to `ferray-core` / `ferray::linalg` (R-SUBSTRATE-1/2). |
//! | REQ-12 (reject non-finite input) | SHIPPED | `fn reject_non_finite` called at the top of `fn fit` (after the param/sample checks, before the RBF affinity / eigendecomposition) rejects NaN AND infinity with `FerroError::InvalidParameter{name:"X"}`, mirroring sklearn's `SpectralClustering.fit` → `_validate_data(force_all_finite=True)` default (`_spectral.py:691`), which raises `ValueError` (`validation.py:147-154`). ferrolearn previously rejected NaN (NaN affinities fail the eigendecomposition) but SILENTLY ACCEPTED +Inf; now both reject. Consumer: the existing `fit` entry — crate re-export `pub use spectral::{FittedSpectralClustering, SpectralClustering}` (`lib.rs`). Pinned by `divergence_nonfinite_reject_spillover.rs` (`divergence_spectral_fit_rejects_inf`) — live sklearn 1.5.2 raises, ferrolearn now `Err`. Finite input byte-identical (the module's oracle pins stay green). Closes #2286 for this estimator. |
use crate::kmeans::KMeans;
use ferrolearn_core::NdarrayFaerBackend;
use ferrolearn_core::backend::Backend;
use ferrolearn_core::error::FerroError;
use ferrolearn_core::traits::{Fit, Predict};
use ndarray::{Array1, Array2};
use num_traits::Float;
// ─────────────────────────────────────────────────────────────────────────────
// Configuration struct
// ─────────────────────────────────────────────────────────────────────────────
/// Spectral Clustering configuration (unfitted).
///
/// Holds hyperparameters. Call [`Fit::fit`] to run the algorithm and produce a
/// [`FittedSpectralClustering`].
///
/// # Type Parameters
///
/// - `F`: The floating-point type (`f32` or `f64`).
#[derive(Debug, Clone)]
pub struct SpectralClustering<F> {
/// Number of clusters.
pub n_clusters: usize,
/// RBF kernel parameter: `A[i,j] = exp(-gamma * ||x_i - x_j||^2)`.
/// Defaults to `1.0`.
pub gamma: F,
/// Number of K-Means restarts in the final clustering step.
pub n_init: usize,
/// Optional random seed for the K-Means restarts.
pub random_state: Option<u64>,
}
impl<F: Float> SpectralClustering<F> {
/// Create a new `SpectralClustering` with the given number of clusters.
///
/// Defaults: `gamma = 1.0`, `n_init = 10`, `random_state = None`.
#[must_use]
pub fn new(n_clusters: usize) -> Self {
Self {
n_clusters,
gamma: F::one(),
n_init: 10,
random_state: None,
}
}
/// Set the RBF kernel parameter `gamma`.
///
/// Must be non-negative (`>= 0`). The value is checked in [`Fit::fit`],
/// not by this builder.
#[must_use]
pub fn with_gamma(mut self, gamma: F) -> Self {
self.gamma = gamma;
self
}
/// Set the number of K-Means restarts.
#[must_use]
pub fn with_n_init(mut self, n_init: usize) -> Self {
self.n_init = n_init;
self
}
/// Set the random seed for reproducibility.
#[must_use]
pub fn with_random_state(mut self, seed: u64) -> Self {
self.random_state = Some(seed);
self
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Fitted struct
// ─────────────────────────────────────────────────────────────────────────────
/// Fitted Spectral Clustering model.
///
/// Stores the cluster labels for the training data.
///
/// Spectral Clustering does **not** implement [`ferrolearn_core::Predict`].
#[derive(Debug, Clone)]
pub struct FittedSpectralClustering<F> {
/// Cluster label for each training sample (0-indexed).
labels_: Array1<usize>,
/// Phantom for the float type.
_marker: std::marker::PhantomData<F>,
}
impl<F: Float> FittedSpectralClustering<F> {
/// Return the cluster labels for the training data.
#[must_use]
pub fn labels(&self) -> &Array1<usize> {
&self.labels_
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Internal helpers
// ─────────────────────────────────────────────────────────────────────────────
/// Reject `X` containing any non-finite value (NaN or infinity).
///
/// Mirrors sklearn's `SpectralClustering.fit` →
/// `self._validate_data(X, accept_sparse=["csr","csc","coo"], dtype=np.float64, ensure_min_samples=2)`
/// (`sklearn/cluster/_spectral.py:691`), which keeps the `force_all_finite=True`
/// default and raises `ValueError("Input X contains NaN.")` /
/// `"... contains infinity ..."` (`sklearn/utils/validation.py:147-154`). NaN AND
/// infinity are both rejected up front so neither reaches the RBF affinity /
/// eigendecomposition. Never panics (R-CODE-2).
fn reject_non_finite<F: Float>(x: &Array2<F>) -> Result<(), FerroError> {
if x.iter().any(|v| !v.is_finite()) {
return Err(FerroError::InvalidParameter {
name: "X".into(),
reason: "Input X contains NaN or infinity.".into(),
});
}
Ok(())
}
/// Build the RBF affinity matrix in `f64`.
fn affinity_matrix<F: Float>(x: &Array2<F>, gamma: f64) -> Array2<f64> {
let n = x.nrows();
Array2::from_shape_fn((n, n), |(i, j)| {
if i == j {
1.0_f64
} else {
let sq: F = x
.row(i)
.iter()
.zip(x.row(j).iter())
.fold(F::zero(), |acc, (&a, &b)| acc + (a - b) * (a - b));
let sq64 = sq.to_f64().unwrap_or(0.0);
(-gamma * sq64).exp()
}
})
}
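/// Compute the symmetrically normalized affinity `D^{-1/2} A D^{-1/2}`.
///
/// Despite the function name, this is not the Laplacian
/// `I - D^{-1/2} A D^{-1/2}`; its largest-eigenvalue eigenvectors are the
/// Laplacian's smallest-eigenvalue eigenvectors.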
fn normalized_laplacian(a: &Array2<f64>) -> Array2<f64> {
let n = a.nrows();
// Degree vector: d[i] = sum_j A[i,j]
let d: Vec<f64> = (0..n).map(|i| a.row(i).iter().sum()).collect();
// D^{-1/2}: avoid division by zero.
let d_inv_sqrt: Vec<f64> = d
.iter()
.map(|&di| if di > 0.0 { 1.0 / di.sqrt() } else { 0.0 })
.collect();
// Entry (i, j) of D^{-1/2} A D^{-1/2}: d_inv_sqrt[i] * A[i,j] * d_inv_sqrt[j]
Array2::from_shape_fn((n, n), |(i, j)| d_inv_sqrt[i] * a[[i, j]] * d_inv_sqrt[j])
}
/// Extract the `k` eigenvectors of a symmetric matrix that belong to its
/// algebraically largest eigenvalues. Returns an `(n, k)` matrix whose columns
/// follow the ascending eigenvalue order returned by `eigh`.
fn top_k_eigenvectors(sym: &Array2<f64>, k: usize) -> Result<Array2<f64>, FerroError> {
let (eigenvalues, eigenvectors) = NdarrayFaerBackend::eigh(sym)?;
// `eigh` returns eigenvalues in non-decreasing order; the top-k are at the end.
let n = eigenvalues.len();
let start = n.saturating_sub(k);
let n_rows = eigenvectors.nrows();
let mut result = Array2::<f64>::zeros((n_rows, k));
for (new_col, old_col) in (start..n).enumerate() {
for row in 0..n_rows {
result[[row, new_col]] = eigenvectors[[row, old_col]];
}
}
Ok(result)
}
/// Row-normalize a matrix so each row has unit L2 norm.
/// Rows with zero norm are left as-is.
fn row_normalize(m: &Array2<f64>) -> Array2<f64> {
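let (n, d) = m.dim();
// Compute each row norm once instead of once per element.
let norms: Vec<f64> = (0..n)
    .map(|i| m.row(i).iter().map(|&v| v * v).sum::<f64>().sqrt())
    .collect();
Array2::from_shape_fn((n, d), |(i, j)| {
    if norms[i] > 0.0 {
        m[[i, j]] / norms[i]
    } else {
        m[[i, j]]
    }
})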
}
// ─────────────────────────────────────────────────────────────────────────────
// Fit implementation
// ─────────────────────────────────────────────────────────────────────────────
impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for SpectralClustering<F> {
type Fitted = FittedSpectralClustering<F>;
type Error = FerroError;
/// Fit the Spectral Clustering model to the data.
///
/// # Errors
///
/// - [`FerroError::InvalidParameter`] if `n_clusters == 0`, `gamma < 0`,
/// `n_init == 0`, or `x` contains NaN or infinity.
/// - [`FerroError::InsufficientSamples`] if `n_samples < n_clusters`.
/// - [`FerroError::NumericalInstability`] if the eigendecomposition fails.
fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedSpectralClustering<F>, FerroError> {
let n_samples = x.nrows();
// Validate parameters.
if self.n_clusters == 0 {
return Err(FerroError::InvalidParameter {
name: "n_clusters".into(),
reason: "must be at least 1".into(),
});
}
if self.gamma < F::zero() {
return Err(FerroError::InvalidParameter {
name: "gamma".into(),
reason: "gamma must be >= 0 (sklearn Interval[0, inf))".into(),
});
}
// `n_clusters >= 1` at this point, so this check also covers empty input
// (`n_samples == 0`).
if n_samples < self.n_clusters {
    return Err(FerroError::InsufficientSamples {
        required: self.n_clusters,
        actual: n_samples,
        context: "SpectralClustering requires at least n_clusters samples".into(),
    });
}
// Reject non-finite X up front (NaN AND Inf), mirroring sklearn's
// `_validate_data(force_all_finite=True)` reached from
// `SpectralClustering.fit` (`_spectral.py:691`), which raises
// `ValueError` (R-DEV-1, R-CODE-2). ferrolearn previously rejected NaN
// (NaN affinities propagate to a failed eigendecomposition) but
// SILENTLY ACCEPTED +Inf — this rejects both.
reject_non_finite(x)?;
let gamma64 = self.gamma.to_f64().unwrap_or(1.0);
// Step 1: affinity matrix.
let aff = affinity_matrix(x, gamma64);
// Step 2: normalized Laplacian.
let lap = normalized_laplacian(&aff);
// Step 3: top-k eigenvectors.
let k = self.n_clusters;
let embed = top_k_eigenvectors(&lap, k)?;
// Step 4: row-normalize.
let embed_norm = row_normalize(&embed);
// Step 5: K-Means on the embedded points.
// Convert embedding to F.
let embed_f: Array2<F> = Array2::from_shape_fn(embed_norm.dim(), |(i, j)| {
F::from(embed_norm[[i, j]]).unwrap_or_else(F::zero)
});
let mut km = KMeans::<F>::new(k).with_n_init(self.n_init);
if let Some(seed) = self.random_state {
km = km.with_random_state(seed);
}
let fitted_km = km.fit(&embed_f, &())?;
let labels = fitted_km.predict(&embed_f)?;
Ok(FittedSpectralClustering {
labels_: labels,
_marker: std::marker::PhantomData,
})
}
}
impl<F: Float + Send + Sync + 'static> SpectralClustering<F> {
/// Fit on `x` and return the cluster labels for those samples in one
/// call. Equivalent to sklearn `ClusterMixin.fit_predict`.
///
/// # Errors
///
/// Forwards any error from [`Fit::fit`].
pub fn fit_predict(&self, x: &Array2<F>) -> Result<Array1<usize>, FerroError> {
let fitted = self.fit(x, &())?;
Ok(fitted.labels().clone())
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Tests
// ─────────────────────────────────────────────────────────────────────────────
#[cfg(test)]
mod tests {
use super::*;
/// Two well-separated 2-D blobs.
fn two_blobs() -> Array2<f64> {
Array2::from_shape_vec(
(10, 2),
vec![
0.0, 0.0, 0.2, 0.1, -0.1, 0.2, 0.1, -0.1, 0.0, 0.1, 10.0, 10.0, 10.2, 10.1, 9.9,
10.2, 10.1, 9.9, 10.0, 10.1,
],
)
.unwrap()
}
#[test]
fn test_two_blobs_two_clusters() {
let x = two_blobs();
let model = SpectralClustering::<f64>::new(2)
.with_gamma(0.1)
.with_random_state(42);
let fitted = model.fit(&x, &()).unwrap();
let labels = fitted.labels();
assert_eq!(labels.len(), 10);
// Points 0-4 should share a label.
assert_eq!(labels[0], labels[1]);
assert_eq!(labels[0], labels[2]);
assert_eq!(labels[0], labels[3]);
assert_eq!(labels[0], labels[4]);
// Points 5-9 should share a different label.
assert_eq!(labels[5], labels[6]);
assert_eq!(labels[5], labels[7]);
assert_eq!(labels[5], labels[8]);
assert_eq!(labels[5], labels[9]);
assert_ne!(labels[0], labels[5]);
}
#[test]
fn test_labels_length_matches_n_samples() {
let x = two_blobs();
let fitted = SpectralClustering::<f64>::new(2)
.with_random_state(0)
.fit(&x, &())
.unwrap();
assert_eq!(fitted.labels().len(), x.nrows());
}
#[test]
fn test_labels_in_valid_range() {
let x = two_blobs();
let k = 2usize;
let fitted = SpectralClustering::<f64>::new(k)
.with_random_state(1)
.fit(&x, &())
.unwrap();
for &l in fitted.labels() {
assert!(l < k, "label {l} >= n_clusters {k}");
}
}
#[test]
fn test_single_cluster() {
let x = two_blobs();
let fitted = SpectralClustering::<f64>::new(1)
.with_random_state(0)
.fit(&x, &())
.unwrap();
for &l in fitted.labels() {
assert_eq!(l, 0);
}
}
#[test]
fn test_invalid_n_clusters_zero() {
let x = two_blobs();
let result = SpectralClustering::<f64>::new(0).fit(&x, &());
assert!(result.is_err());
}
#[test]
fn test_gamma_zero_allowed() {
// sklearn `gamma: Interval(Real, 0, None, closed="left")`
// (_spectral.py:612) → [0.0, inf), so gamma=0.0 is ACCEPTED (RBF
// collapses to an all-ones affinity). Mirror that contract (R-HONEST-4):
// the prior assertion `is_err()` pinned a non-sklearn over-rejection.
let x = two_blobs();
let result = SpectralClustering::<f64>::new(2)
.with_gamma(0.0)
.fit(&x, &());
assert!(result.is_ok());
}
#[test]
fn test_invalid_gamma_negative() {
let x = two_blobs();
let result = SpectralClustering::<f64>::new(2)
.with_gamma(-1.0)
.fit(&x, &());
assert!(result.is_err());
}
#[test]
fn test_empty_data_error() {
let x = Array2::<f64>::zeros((0, 2));
let result = SpectralClustering::<f64>::new(2).fit(&x, &());
assert!(result.is_err());
}
#[test]
fn test_insufficient_samples_error() {
let x = Array2::from_shape_vec((1, 2), vec![0.0, 0.0]).unwrap();
let result = SpectralClustering::<f64>::new(3).fit(&x, &());
assert!(result.is_err());
}
#[test]
fn test_n_clusters_equals_n_samples() {
// k == n should be valid (each point its own cluster).
let x = Array2::from_shape_vec((3, 2), vec![0.0, 0.0, 5.0, 5.0, 10.0, 10.0]).unwrap();
let fitted = SpectralClustering::<f64>::new(3)
.with_random_state(0)
.fit(&x, &())
.unwrap();
assert_eq!(fitted.labels().len(), 3);
}
#[test]
fn test_f32_support() {
let x = Array2::from_shape_vec(
(6, 2),
vec![
0.0f32, 0.0, 0.1, 0.1, -0.1, 0.1, 10.0, 10.0, 10.1, 10.1, 9.9, 10.1,
],
)
.unwrap();
let fitted = SpectralClustering::<f32>::new(2)
.with_gamma(0.1)
.with_random_state(42)
.fit(&x, &())
.unwrap();
assert_eq!(fitted.labels().len(), 6);
}
#[test]
fn test_reproducibility_with_seed() {
let x = two_blobs();
let model = SpectralClustering::<f64>::new(2)
.with_gamma(0.1)
.with_random_state(7);
let fitted1 = model.fit(&x, &()).unwrap();
let fitted2 = model.fit(&x, &()).unwrap();
assert_eq!(fitted1.labels(), fitted2.labels());
}
}