ferrolearn-cluster 0.5.0

//! Feature Agglomeration — hierarchical clustering of features.
//!
//! [`FeatureAgglomeration`] transposes the data and applies agglomerative
//! clustering to the *features* (columns) rather than the samples (rows).
//! After fitting, features within each cluster are pooled (by mean or max)
//! to produce a reduced-dimensionality representation.
//!
//! This is useful for supervised dimensionality reduction: correlated
//! features are grouped, and the group summary (e.g. mean) replaces
//! the original features.
//!
//! # Algorithm
//!
//! 1. **Fit**: transpose `X`, run agglomerative clustering on the columns
//!    (treated as data points) to obtain `n_clusters` feature groups.
//! 2. **Transform**: for each feature group, apply the pooling function
//!    (mean or max) across the grouped columns.  The output has
//!    `n_clusters` columns.
//!
//! # Examples
//!
//! ```
//! use ferrolearn_cluster::{FeatureAgglomeration, PoolingFunc};
//! use ferrolearn_core::traits::{Fit, Transform};
//! use ndarray::Array2;
//!
//! let x = Array2::from_shape_vec((4, 6), vec![
//!     1.0, 1.1, 5.0, 5.1, 9.0, 9.1,
//!     2.0, 2.1, 6.0, 6.1, 8.0, 8.1,
//!     3.0, 3.1, 7.0, 7.1, 7.0, 7.1,
//!     4.0, 4.1, 8.0, 8.1, 6.0, 6.1,
//! ]).unwrap();
//!
//! let fa = FeatureAgglomeration::<f64>::new(3);
//! let fitted = fa.fit(&x, &()).unwrap();
//! let reduced = fitted.transform(&x).unwrap();
//! assert_eq!(reduced.ncols(), 3);
//! ```
//!
//! # `## REQ status`
//!
//! Binary classification (R-DEFER-2): two states only — SHIPPED needs impl + a
//! non-test production consumer + green verification + symbol-anchor + sklearn
//! `file:line`; NOT-STARTED carries the open prereq blocker. **`FeatureAgglomeration`
//! has NO PyO3 binding** — `grep -rln FeatureAgglomeration ferrolearn-python/` is
//! EMPTY (no `_RsFeatureAgglomeration`, no `ferrolearn.FeatureAgglomeration`). The
//! non-test production consumer is therefore the crate re-export at the crate root
//! (`pub use feature_agglomeration::{AgglomerativeLinkage, FeatureAgglomeration,
//! FittedFeatureAgglomeration, PoolingFunc}` in `ferrolearn-cluster/src/lib.rs`),
//! exposing `fit` / `transform` / `labels()` (+ `feature_labels()` /
//! `children()` / `distances()` / `n_leaves()` / `n_connected_components()`).
//! **End-to-end VALUE parity is now achieved (post-#938):** the AgglomerativeClustering
//! unit ships bit-exact `_hc_cut` `labels_` (commit 3e001cf4b), and
//! `FeatureAgglomeration::fit` delegates to `AgglomerativeClustering::fit(X.T)`
//! (`_agglomerative.py:1339`), so the now-correct `labels_` flow through unchanged
//! (NO fit-side relabel: the inner `labels_` are stored verbatim into
//! `feature_labels_`). `feature_labels_`/`labels()` is integer-EXACT vs sklearn
//! `FeatureAgglomeration(...).fit(X).labels_` for all four linkages, k∈{2,3}; and
//! because `fn transform` groups by ASCENDING label index (it accumulates into
//! `result[[i, label]]`, column j = pool over features with `label == j`), the
//! `transform` output is now VALUE-EXACT and COLUMN-ORDERED vs sklearn (mean +
//! max), `_feature_agglomeration.py:51-64`. Fitted attrs `children_`/`distances_`/
//! `n_leaves_`/`n_connected_components_` are delegated from the inner fit over `X.T`.
//! Green verification = the in-tree `feature_agglom` lib tests plus the live-sklearn
//! pins/guards: `divergence_feature_agglom_min_features_two` (#944) +
//! `green_feature_agglom_*` (`tests/divergence_feature_agglomeration.rs`) and the new
//! VALUE tests (`tests/divergence_feature_agglomeration_value.rs`):
//! `value_labels_exact_all_linkages_k2_k3`, `value_transform_mean_column_ordered`,
//! `value_transform_max_column_ordered`, `value_fitted_attrs_delegated`. Cites use
//! symbol anchors (ferrolearn) / `file:line` (sklearn 1.5.2, commit 156ef14). Live
//! oracle = installed sklearn 1.5.2. (REQ numbering follows
//! `.design/cluster/feature_agglomeration.md`.)
//!
//! | REQ | Status | Evidence |
//! |---|---|---|
//! | REQ-4 (validation guards: `ensure_min_features=2`; `n_clusters >= 1`; `n_features >= n_clusters`; non-empty `X` — SHAPE/validation contract only) | SHIPPED (validation + shape + pooling-as-set contract only) | `fn fit` for `FeatureAgglomeration` rejects a 1-feature `X` with `FerroError::InvalidParameter { name: "X", reason: "Found array with {n} feature(s) while a minimum of 2 is required by FeatureAgglomeration" }`, mirroring `self._validate_data(X, ensure_min_features=2)` (`_agglomerative.py:1338`); rejects `n_clusters == 0` per `Interval(Integral, 1, None, closed="left")` (`:1281`); rejects `n_features < n_clusters` ("Cannot extract more clusters than samples", from `_hc_cut`); rejects empty `X` with `FerroError::InsufficientSamples`. The `Transform` impl (`fn transform`) returns output of SHAPE `(n_samples, n_clusters)`. Non-test consumer: crate re-export of `fit` / `transform` / `labels()` (`lib.rs`). Verified: now-PASSING pin `divergence_feature_agglom_min_features_two` (#944): `new(1).fit(X_(4,1))` returns `Err` (sklearn `FeatureAgglomeration(n_clusters=1).fit(X_1col)` raises `ValueError`); green guards `green_feature_agglom_transform_shape`, `green_feature_agglom_mean_pooling_as_set`, `green_feature_agglom_n_clusters_zero_rejected`, `green_feature_agglom_too_many_clusters_rejected`. GAP (NOT-STARTED): the sklearn `ValueError`/`InvalidParameterError` error ABI is NOT matched — ferrolearn uses `FerroError`. |
//! | REQ-1 (`transform` mean-pooling VALUE) | SHIPPED | `fn transform` (`PoolingFunc::Mean`) computes per-cluster mean (`sum/count`) into `result[[i, label]]`, column j ordered by ASCENDING label index, mirroring the `bincount` fast path of `AgglomerationTransform.transform` (`_feature_agglomeration.py:51-57`: `nX[i] = np.bincount(labels_, X[i,:]) / size`, column = label index 0..n_clusters-1). With the now-correct delegated `labels_` (#938 shipped, commit 3e001cf4b), the output is VALUE-EXACT and COLUMN-ORDERED vs sklearn. Non-test consumer: crate re-export (`lib.rs`); also consumed by `api_proof.rs`/`conformance_wave4.rs`. Verified: `value_transform_mean_column_ordered` — ferrolearn `transform(X)` == sklearn `FeatureAgglomeration(n_clusters=3, pooling_func=np.mean).fit(X).transform(X)` (full matrix, ~1e-9) for ward+single; sklearn row0 = `[1.05, 9.05, 5.05]`, ferrolearn row0 = `[1.05, 9.05, 5.05]`. |
//! | REQ-2 (`transform` max-pooling VALUE) | SHIPPED | `fn transform` (`PoolingFunc::Max`) takes per-cluster max into `result[[i, label]]`, ascending label-index column order, mirroring the general pooling path (`_feature_agglomeration.py:58-63`: `[pooling_func(X[:, labels_==l], axis=1) for l in np.unique(labels_)]`, columns by sorted unique label = 0..n_clusters-1). VALUE-EXACT and COLUMN-ORDERED post-#938. Non-test consumer: crate re-export (`lib.rs`). Verified: `value_transform_max_column_ordered` — full-matrix equality vs sklearn `pooling_func=np.max` (ward+complete); sklearn row0 = `[1.1, 9.1, 5.1]`, ferrolearn row0 = `[1.1, 9.1, 5.1]`. |
//! | REQ-3 (`labels_` feature-cluster VALUE parity) | SHIPPED | `fn fit` delegates to `AgglomerativeClustering::new(n_clusters).with_linkage(...)` on `X.T` and stores `fitted_agg.labels_` verbatim into `feature_labels_` (NO post-relabel), mirroring `super()._fit(X.T)` (`_agglomerative.py:1339`). The AgglomerativeClustering unit now ships bit-exact `_hc_cut` `labels_` (`agglomerative.rs fn hc_cut`, `:731-775`; `np.searchsorted` numbering `:1105`; commit 3e001cf4b), so `feature_labels_`/`labels()` is integer-EXACT: sklearn `[0,0,2,2,1,1]` (k=3) / `[1,1,0,0,0,0]` (k=2) == ferrolearn. Accessor `fn labels` (sklearn name) returns the same data as `fn feature_labels`. Non-test consumer: crate re-export (`lib.rs`). Verified: `value_labels_exact_all_linkages_k2_k3` — integer-exact for {ward,complete,average,single} × {2,3}. |
//! | REQ-5 (linkage variants ward/complete/average/single) | SHIPPED | `fn map_linkage` maps all four `AgglomerativeLinkage` variants to `Linkage`, delegated to `AgglomerativeClustering`, mirroring `_TREE_BUILDERS` (`_agglomerative.py:720-725`/`:1290`). VALUE parity (labels_ + transform) holds across all four post-#938. Non-test consumer: crate re-export (`lib.rs`). Verified: `value_labels_exact_all_linkages_k2_k3` (all four linkages) + `value_transform_mean_column_ordered` (ward+single) + `value_transform_max_column_ordered` (ward+complete). |
//! | REQ-6 (`n_clusters=2` default + missing params metric/memory/connectivity/compute_full_tree/distance_threshold/compute_distances) | NOT-STARTED | open prereq blocker **#941**. sklearn `__init__` (`_agglomerative.py:1296-1319`) takes 9 params with `n_clusters=2` default. `FeatureAgglomeration<F>` REQUIRES `n_clusters` (`fn new`, no default) and has only `linkage`/`pooling_func`. `distance_threshold` (cut by distance with `n_clusters=None`, `:1281`/`:1091-1092`) is materially absent. |
//! | REQ-7 (`pooling_func` as arbitrary callable) | NOT-STARTED | open prereq blocker **#941**. sklearn `_parameter_constraints["pooling_func"] = [callable]` (`_agglomerative.py:1291`) accepts any callable (default `np.mean`, `:1305`). ferrolearn offers only the closed `PoolingFunc::{Mean, Max}` enum (`fn with_pooling_func`). |
//! | REQ-8 (`inverse_transform`) | SHIPPED | `FittedFeatureAgglomeration::inverse_transform` broadcasts each cluster's pooled value back to its member features: `result[i, f] = xred[i, labels_[f]]`, mirroring sklearn `AgglomerationTransform.inverse_transform` (`_feature_agglomeration.py:66-92`: `unil, inverse = np.unique(labels_, return_inverse=True); return X[..., inverse]` — `inverse == labels_` for the contiguous `_hc_cut` numbering). Output shape `(n_samples, n_features)`; accepts `xred.ncols() >= n_clusters` (ignoring trailing extra columns, matching sklearn's numpy fancy-index `X[..., inverse]`, #2187) and rejects `xred.ncols() < n_clusters` with `FerroError::ShapeMismatch` (sklearn raises `IndexError`). Non-test consumer: crate re-export (`lib.rs`). Verified: `value_inverse_transform_roundtrip_and_broadcast` (full-matrix ~1e-12, all four linkages) + `divergence_inverse_transform_extra_cols_ignored` (#2187: `(2,4)` xred with n_clusters=3 ignores the 4th column, matching sklearn). |
//! | REQ-9 (fitted attrs labels_/n_leaves_/n_connected_components_/children_/distances_) | SHIPPED | `fn fit` stores the inner `AgglomerativeClustering::fit(X.T)` attributes into `FittedFeatureAgglomeration` (`children_`, `distances_`, `n_leaves_`, `n_connected_components_`), delegating exactly as sklearn `FeatureAgglomeration._fit` → `AgglomerativeClustering._fit(X.T)` (`_agglomerative.py:1339`, sets `labels_`/`n_clusters_`/`n_leaves_`/`n_connected_components_`/`children_`/`distances_` at `:1083-1095`). Accessors: `fn labels` (sklearn name, alias of `feature_labels`), `fn children`, `fn distances` (`Option`, `Some` iff `with_compute_distances(true)` — mirrors sklearn `compute_distances`, `:1319`), `fn n_leaves` (= `n_features`), `fn n_connected_components` (= 1 for the unstructured path). `with_compute_distances` passthrough sets the inner `compute_distances`. Non-test consumer: crate re-export (`lib.rs`). Verified: `value_fitted_attrs_delegated` — `children_` == sklearn `FeatureAgglomeration(compute_distances=True).fit(X).children_`; `distances_` == sklearn `distances_` (~1e-9); `n_leaves_ == 6 == n_features`; `n_connected_components_ == 1`. |
//! | REQ-10 (PyO3 binding) | NOT-STARTED | open prereq blocker **#943**. `grep -rln FeatureAgglomeration ferrolearn-python/` is EMPTY — no `_RsFeatureAgglomeration`, so `import ferrolearn` cannot reach `FeatureAgglomeration`. The only non-test consumer of `fit` / `transform` / `feature_labels()` is the crate re-export (`lib.rs`). |
//! | REQ-11 (ferray substrate) | NOT-STARTED | open prereq blocker **#943**. `feature_agglomeration.rs` imports `ndarray::{Array1, Array2}` + `num_traits::Float`; the delegated `agglomerative.rs` is likewise on `ndarray`. Not migrated to `ferray-core` / `ferray::linalg` (R-SUBSTRATE-1/2). |

use ferrolearn_core::error::FerroError;
use ferrolearn_core::traits::{Fit, Transform};
use ndarray::{Array1, Array2};
use num_traits::Float;

use crate::agglomerative::{AgglomerativeClustering, Linkage};

// ─────────────────────────────────────────────────────────────────────────────
// Public enums
// ─────────────────────────────────────────────────────────────────────────────

/// The linkage criterion used by [`FeatureAgglomeration`].
///
/// Re-uses the same linkage strategies as [`AgglomerativeClustering`].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgglomerativeLinkage {
    /// Ward linkage (minimises within-cluster variance).
    Ward,
    /// Complete linkage (max pairwise distance).
    Complete,
    /// Average linkage (mean pairwise distance).
    Average,
    /// Single linkage (min pairwise distance).
    Single,
}

/// The pooling function applied to grouped features during transformation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PoolingFunc {
    /// Take the mean of features in each cluster.
    Mean,
    /// Take the maximum of features in each cluster.
    Max,
}

// ─────────────────────────────────────────────────────────────────────────────
// FeatureAgglomeration (unfitted)
// ─────────────────────────────────────────────────────────────────────────────

/// Feature Agglomeration configuration (unfitted).
///
/// Call [`Fit::fit`] to cluster the features and obtain a
/// [`FittedFeatureAgglomeration`].
///
/// # Type Parameters
///
/// - `F`: floating-point scalar type (`f32` or `f64`).
#[derive(Debug, Clone)]
pub struct FeatureAgglomeration<F> {
    /// Target number of feature clusters.
    n_clusters: usize,
    /// Linkage strategy for the agglomerative clustering of features.
    linkage: AgglomerativeLinkage,
    /// Pooling function applied during transformation.
    pooling_func: PoolingFunc,
    /// Whether to compute and store the per-merge `distances_` of the inner
    /// dendrogram, mirroring sklearn `compute_distances` (default `false`,
    /// `_agglomerative.py:1319`).
    compute_distances: bool,
    _marker: std::marker::PhantomData<F>,
}

impl<F: Float + Send + Sync + 'static> FeatureAgglomeration<F> {
    /// Create a new `FeatureAgglomeration` that reduces to `n_clusters` features.
    ///
    /// Defaults: `linkage = Ward`, `pooling_func = Mean`, `compute_distances = false`.
    #[must_use]
    pub fn new(n_clusters: usize) -> Self {
        Self {
            n_clusters,
            linkage: AgglomerativeLinkage::Ward,
            pooling_func: PoolingFunc::Mean,
            compute_distances: false,
            _marker: std::marker::PhantomData,
        }
    }

    /// Set the linkage criterion.
    #[must_use]
    pub fn with_linkage(mut self, linkage: AgglomerativeLinkage) -> Self {
        self.linkage = linkage;
        self
    }

    /// Set the pooling function.
    #[must_use]
    pub fn with_pooling_func(mut self, pooling: PoolingFunc) -> Self {
        self.pooling_func = pooling;
        self
    }

    /// Compute and store the per-merge linkage `distances_` of the inner
    /// dendrogram (over the transposed feature matrix), mirroring sklearn
    /// `FeatureAgglomeration(compute_distances=True)` (`_agglomerative.py:1319`).
    ///
    /// When set, [`FittedFeatureAgglomeration::distances`] returns `Some`.
    #[must_use]
    pub fn with_compute_distances(mut self, compute_distances: bool) -> Self {
        self.compute_distances = compute_distances;
        self
    }

    /// Return the configured number of feature clusters.
    #[must_use]
    pub fn n_clusters(&self) -> usize {
        self.n_clusters
    }

    /// Return the configured linkage.
    #[must_use]
    pub fn linkage(&self) -> AgglomerativeLinkage {
        self.linkage
    }

    /// Return the configured pooling function.
    #[must_use]
    pub fn pooling_func(&self) -> PoolingFunc {
        self.pooling_func
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// FittedFeatureAgglomeration
// ─────────────────────────────────────────────────────────────────────────────

/// Fitted Feature Agglomeration model.
///
/// Stores the cluster label for each original feature and the pooling
/// strategy.  Implements [`Transform`] to reduce the dimensionality of
/// new data.
#[derive(Debug, Clone)]
pub struct FittedFeatureAgglomeration<F> {
    /// Cluster label for each original feature, length `n_features`.
    ///
    /// This is sklearn's `labels_` attribute: it is exactly the inner
    /// [`AgglomerativeClustering`]'s `labels_` over the transposed feature
    /// matrix `X.T` (`_agglomerative.py:1339`, `FeatureAgglomeration._fit`
    /// → `AgglomerativeClustering._fit(X.T)`).
    feature_labels_: Array1<usize>,
    /// Number of feature clusters.
    n_clusters_: usize,
    /// Number of original features.
    n_features_: usize,
    /// Pooling function to aggregate features within each cluster.
    pooling_func_: PoolingFunc,
    /// The merge history (dendrogram) of the inner clustering over `X.T`,
    /// length `n_features - 1`. Mirrors sklearn `children_`
    /// (`_agglomerative.py:1083-1095`, delegated from `AgglomerativeClustering`).
    children_: Vec<(usize, usize)>,
    /// The per-merge linkage distances of the inner clustering over `X.T`,
    /// in `children_` row order. `Some` when `compute_distances` was set.
    /// Mirrors sklearn `distances_`.
    distances_: Option<Array1<F>>,
    /// Number of leaves in the inner hierarchical tree (`== n_features` for the
    /// unstructured path). Mirrors sklearn `n_leaves_`.
    n_leaves_: usize,
    /// Estimated number of connected components in the inner graph (always `1`
    /// for the unstructured `connectivity=None` path). Mirrors sklearn
    /// `n_connected_components_`.
    n_connected_components_: usize,
    _marker: std::marker::PhantomData<F>,
}

impl<F: Float + Send + Sync + 'static> FittedFeatureAgglomeration<F> {
    /// Return the cluster label assigned to each feature.
    ///
    /// This is the sklearn name for the per-feature cluster assignment
    /// (`labels_`). Identical data to [`feature_labels`](Self::feature_labels).
    #[must_use]
    pub fn labels(&self) -> &Array1<usize> {
        &self.feature_labels_
    }

    /// Return the cluster label assigned to each feature.
    ///
    /// Alias of [`labels`](Self::labels) retained for backward compatibility;
    /// sklearn names this attribute `labels_`.
    #[must_use]
    pub fn feature_labels(&self) -> &Array1<usize> {
        &self.feature_labels_
    }

    /// Return the number of feature clusters.
    #[must_use]
    pub fn n_clusters(&self) -> usize {
        self.n_clusters_
    }

    /// Return the number of original features.
    #[must_use]
    pub fn n_features(&self) -> usize {
        self.n_features_
    }

    /// Return the merge history (dendrogram) of the inner clustering over `X.T`.
    ///
    /// Mirrors sklearn's `children_`, length `n_features - 1`. Delegated from
    /// the inner [`AgglomerativeClustering`] fit (`_agglomerative.py:1339`).
    #[must_use]
    pub fn children(&self) -> &[(usize, usize)] {
        &self.children_
    }

    /// Return the per-merge linkage distances of the inner dendrogram (in
    /// `children_` row order), or `None` if `compute_distances` was not set.
    ///
    /// Mirrors sklearn's `distances_` attribute.
    #[must_use]
    pub fn distances(&self) -> Option<&Array1<F>> {
        self.distances_.as_ref()
    }

    /// Return the number of leaves in the inner hierarchical tree
    /// (`== n_features` for the unstructured path). Mirrors sklearn `n_leaves_`.
    #[must_use]
    pub fn n_leaves(&self) -> usize {
        self.n_leaves_
    }

    /// Return the estimated number of connected components in the inner graph
    /// (always `1` for the unstructured `connectivity=None` path). Mirrors
    /// sklearn `n_connected_components_`.
    #[must_use]
    pub fn n_connected_components(&self) -> usize {
        self.n_connected_components_
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Map between linkage types
// ─────────────────────────────────────────────────────────────────────────────

/// Convert [`AgglomerativeLinkage`] to the internal [`Linkage`] enum.
fn map_linkage(l: AgglomerativeLinkage) -> Linkage {
    match l {
        AgglomerativeLinkage::Ward => Linkage::Ward,
        AgglomerativeLinkage::Complete => Linkage::Complete,
        AgglomerativeLinkage::Average => Linkage::Average,
        AgglomerativeLinkage::Single => Linkage::Single,
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Trait impls
// ─────────────────────────────────────────────────────────────────────────────

impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for FeatureAgglomeration<F> {
    type Fitted = FittedFeatureAgglomeration<F>;
    type Error = FerroError;

    /// Fit feature agglomeration by clustering the features (columns).
    ///
    /// Transposes `x` so each column becomes a row (data point) and
    /// runs agglomerative clustering.
    ///
    /// # Errors
    ///
    /// - [`FerroError::InvalidParameter`] if `n_clusters == 0`.
    /// - [`FerroError::InsufficientSamples`] if `n_features < n_clusters`.
    fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedFeatureAgglomeration<F>, FerroError> {
        let n_features = x.ncols();

        if self.n_clusters == 0 {
            return Err(FerroError::InvalidParameter {
                name: "n_clusters".into(),
                reason: "must be at least 1".into(),
            });
        }
        if n_features < 2 {
            return Err(FerroError::InvalidParameter {
                name: "X".into(),
                reason: format!(
                    "Found array with {n_features} feature(s) while a minimum of 2 is required by FeatureAgglomeration"
                ),
            });
        }
        if n_features < self.n_clusters {
            return Err(FerroError::InvalidParameter {
                name: "n_clusters".into(),
                reason: format!(
                    "n_clusters ({}) exceeds n_features ({})",
                    self.n_clusters, n_features
                ),
            });
        }
        if x.nrows() == 0 {
            return Err(FerroError::InsufficientSamples {
                required: 1,
                actual: 0,
                context: "FeatureAgglomeration::fit requires at least 1 sample".into(),
            });
        }

        // Transpose: each feature (column) becomes a row.
        // Use as_standard_layout() to ensure row-major (C) order, which
        // is required for AgglomerativeClustering's internal as_slice().
        let x_t = x.t().as_standard_layout().into_owned();

        // Run agglomerative clustering on the transposed data.
        //
        // sklearn `FeatureAgglomeration._fit` delegates verbatim to
        // `AgglomerativeClustering._fit(X.T)` (`_agglomerative.py:1339`), so
        // `labels_`/`children_`/`distances_`/`n_leaves_`/`n_connected_components_`
        // are exactly the inner estimator's attributes over `X.T`.
        let agg = AgglomerativeClustering::<F>::new(self.n_clusters)
            .with_linkage(map_linkage(self.linkage))
            .with_compute_distances(self.compute_distances);
        let fitted_agg = agg.fit(&x_t, &())?;

        Ok(FittedFeatureAgglomeration {
            feature_labels_: fitted_agg.labels_,
            n_clusters_: self.n_clusters,
            n_features_: n_features,
            pooling_func_: self.pooling_func,
            children_: fitted_agg.children_,
            distances_: fitted_agg.distances_,
            n_leaves_: fitted_agg.n_leaves_,
            n_connected_components_: fitted_agg.n_connected_components_,
            _marker: std::marker::PhantomData,
        })
    }
}

impl<F: Float + Send + Sync + 'static> Transform<Array2<F>> for FittedFeatureAgglomeration<F> {
    type Output = Array2<F>;
    type Error = FerroError;

    /// Transform data by pooling features within each cluster.
    ///
    /// The output has shape `(n_samples, n_clusters)`.
    ///
    /// # Errors
    ///
    /// Returns [`FerroError::ShapeMismatch`] if the number of columns does not
    /// match the number of features seen during fitting.
    fn transform(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
        if x.ncols() != self.n_features_ {
            return Err(FerroError::ShapeMismatch {
                expected: vec![x.nrows(), self.n_features_],
                actual: vec![x.nrows(), x.ncols()],
                context: "FittedFeatureAgglomeration::transform".into(),
            });
        }

        let n_samples = x.nrows();
        let mut result = Array2::<F>::zeros((n_samples, self.n_clusters_));

        match self.pooling_func_ {
            PoolingFunc::Mean => {
                // Count features per cluster.
                let mut counts = vec![0usize; self.n_clusters_];
                for &label in &self.feature_labels_ {
                    counts[label] += 1;
                }

                // Sum features per cluster.
                for i in 0..n_samples {
                    for (j, &label) in self.feature_labels_.iter().enumerate() {
                        result[[i, label]] = result[[i, label]] + x[[i, j]];
                    }
                }

                // Divide by count to get mean.
                for i in 0..n_samples {
                    for c in 0..self.n_clusters_ {
                        if counts[c] > 0 {
                            result[[i, c]] = result[[i, c]] / F::from(counts[c]).unwrap();
                        }
                    }
                }
            }
            PoolingFunc::Max => {
                // Initialize with negative infinity.
                result.fill(F::neg_infinity());

                for i in 0..n_samples {
                    for (j, &label) in self.feature_labels_.iter().enumerate() {
                        if x[[i, j]] > result[[i, label]] {
                            result[[i, label]] = x[[i, j]];
                        }
                    }
                }
            }
        }

        Ok(result)
    }
}

impl<F: Float + Send + Sync + 'static> FittedFeatureAgglomeration<F> {
    /// Inverse the pooling transformation: broadcast each cluster's pooled
    /// value back to every feature in that cluster.
    ///
    /// Mirrors sklearn `AgglomerationTransform.inverse_transform`
    /// (`_feature_agglomeration.py:66-92`): `unil, inverse =
    /// np.unique(labels_, return_inverse=True); return X[..., inverse]`. Since
    /// `labels_` is the contiguous `_hc_cut` numbering `0..n_clusters`,
    /// `inverse == labels_`, so output column `f` is reduced column
    /// `labels_[f]`: `result[i, f] = xred[i, labels_[f]]`.
    ///
    /// `xred` has shape `(n_samples, n_clusters)`; the result has shape
    /// `(n_samples, n_features)`.
    ///
    /// # Errors
    ///
    /// Returns [`FerroError::ShapeMismatch`] if `xred.ncols() < n_clusters`.
    /// Mirroring sklearn's numpy fancy-index `X[..., inverse]` (which reads only
    /// columns `0..n_clusters` and ignores any trailing columns, #2187), a wider
    /// `xred` is accepted and the extra columns are ignored; too few columns is
    /// the only error (sklearn raises `IndexError`).
    pub fn inverse_transform(&self, xred: &Array2<F>) -> Result<Array2<F>, FerroError> {
        if xred.ncols() < self.n_clusters_ {
            return Err(FerroError::ShapeMismatch {
                expected: vec![xred.nrows(), self.n_clusters_],
                actual: vec![xred.nrows(), xred.ncols()],
                context: "FittedFeatureAgglomeration::inverse_transform".into(),
            });
        }

        let n_samples = xred.nrows();
        let mut result = Array2::<F>::zeros((n_samples, self.n_features_));
        for i in 0..n_samples {
            for (f, &label) in self.feature_labels_.iter().enumerate() {
                result[[i, f]] = xred[[i, label]];
            }
        }
        Ok(result)
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Tests
// ─────────────────────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;
    use approx::assert_abs_diff_eq;

    fn make_correlated_features() -> Array2<f64> {
        // 6 features: (0,1) correlated, (2,3) correlated, (4,5) correlated.
        Array2::from_shape_vec(
            (5, 6),
            vec![
                1.0, 1.1, 5.0, 5.1, 9.0, 9.1, 2.0, 2.1, 6.0, 6.1, 8.0, 8.1, 3.0, 3.1, 7.0, 7.1,
                7.0, 7.1, 4.0, 4.1, 8.0, 8.1, 6.0, 6.1, 5.0, 5.1, 9.0, 9.1, 5.0, 5.1,
            ],
        )
        .unwrap()
    }

    #[test]
    fn test_feature_agglom_basic() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.dim(), (5, 3));
    }

    #[test]
    fn test_feature_agglom_output_shape() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(2);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 2);
        assert_eq!(reduced.nrows(), 5);
    }

    #[test]
    fn test_feature_agglom_labels_valid_range() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3);
        let fitted = fa.fit(&x, &()).unwrap();
        for &l in fitted.feature_labels() {
            assert!(l < 3, "label {l} out of range");
        }
    }

    #[test]
    fn test_feature_agglom_correlated_grouped() {
        // With 4 features merging to 2 clusters, pairs of nearly-identical
        // features should end up together.  Use Single linkage to guarantee
        // nearest-neighbor pairing.
        let x = Array2::from_shape_vec(
            (5, 4),
            vec![
                // feat 0  feat 1  feat 2   feat 3
                1.0, 1.001, 100.0, 100.001, 2.0, 2.001, 90.0, 90.001, 3.0, 3.001, 80.0, 80.001, 4.0,
                4.001, 70.0, 70.001, 5.0, 5.001, 60.0, 60.001,
            ],
        )
        .unwrap();
        let fa = FeatureAgglomeration::<f64>::new(2).with_linkage(AgglomerativeLinkage::Single);
        let fitted = fa.fit(&x, &()).unwrap();
        let labels = fitted.feature_labels();
        // Features 0 and 1 are nearly identical, should be in the same cluster.
        assert_eq!(labels[0], labels[1]);
        // Features 2 and 3 are nearly identical.
        assert_eq!(labels[2], labels[3]);
        // The two pairs should be in different clusters.
        assert_ne!(labels[0], labels[2]);
    }

    #[test]
    fn test_feature_agglom_mean_pooling() {
        // Simple case: two features that should be grouped.
        let x = Array2::from_shape_vec((3, 2), vec![2.0, 4.0, 6.0, 8.0, 10.0, 12.0]).unwrap();
        let fa = FeatureAgglomeration::<f64>::new(1);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 1);
        // Mean of (2, 4) = 3, (6, 8) = 7, (10, 12) = 11.
        assert_abs_diff_eq!(reduced[[0, 0]], 3.0, epsilon = 1e-10);
        assert_abs_diff_eq!(reduced[[1, 0]], 7.0, epsilon = 1e-10);
        assert_abs_diff_eq!(reduced[[2, 0]], 11.0, epsilon = 1e-10);
    }

    #[test]
    fn test_feature_agglom_max_pooling() {
        let x = Array2::from_shape_vec((3, 2), vec![2.0, 4.0, 6.0, 8.0, 10.0, 12.0]).unwrap();
        let fa = FeatureAgglomeration::<f64>::new(1).with_pooling_func(PoolingFunc::Max);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 1);
        // Max of (2, 4) = 4, (6, 8) = 8, (10, 12) = 12.
        assert_abs_diff_eq!(reduced[[0, 0]], 4.0, epsilon = 1e-10);
        assert_abs_diff_eq!(reduced[[1, 0]], 8.0, epsilon = 1e-10);
        assert_abs_diff_eq!(reduced[[2, 0]], 12.0, epsilon = 1e-10);
    }

    #[test]
    fn test_feature_agglom_complete_linkage() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3).with_linkage(AgglomerativeLinkage::Complete);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 3);
    }

    #[test]
    fn test_feature_agglom_average_linkage() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3).with_linkage(AgglomerativeLinkage::Average);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 3);
    }

    #[test]
    fn test_feature_agglom_single_linkage() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3).with_linkage(AgglomerativeLinkage::Single);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 3);
    }

    #[test]
    fn test_feature_agglom_n_clusters_equals_n_features() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(6);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        // No reduction; each feature is its own cluster.
        assert_eq!(reduced.ncols(), 6);
    }

    #[test]
    fn test_feature_agglom_zero_clusters_error() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(0);
        assert!(fa.fit(&x, &()).is_err());
    }

    #[test]
    fn test_feature_agglom_too_many_clusters_error() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(10); // only 6 features
        assert!(fa.fit(&x, &()).is_err());
    }

    #[test]
    fn test_feature_agglom_empty_data_error() {
        let x = Array2::<f64>::zeros((0, 4));
        let fa = FeatureAgglomeration::<f64>::new(2);
        assert!(fa.fit(&x, &()).is_err());
    }

    #[test]
    fn test_feature_agglom_transform_shape_mismatch() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3);
        let fitted = fa.fit(&x, &()).unwrap();
        let x_bad =
            Array2::from_shape_vec((2, 4), vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]).unwrap();
        assert!(fitted.transform(&x_bad).is_err());
    }

    #[test]
    fn test_feature_agglom_f32() {
        let x = Array2::<f32>::from_shape_vec(
            (4, 4),
            vec![
                1.0, 1.1, 5.0, 5.1, 2.0, 2.1, 6.0, 6.1, 3.0, 3.1, 7.0, 7.1, 4.0, 4.1, 8.0, 8.1,
            ],
        )
        .unwrap();
        let fa = FeatureAgglomeration::<f32>::new(2);
        let fitted = fa.fit(&x, &()).unwrap();
        let reduced = fitted.transform(&x).unwrap();
        assert_eq!(reduced.ncols(), 2);
    }

    #[test]
    fn test_feature_agglom_getters() {
        let fa = FeatureAgglomeration::<f64>::new(3)
            .with_linkage(AgglomerativeLinkage::Complete)
            .with_pooling_func(PoolingFunc::Max);
        assert_eq!(fa.n_clusters(), 3);
        assert_eq!(fa.linkage(), AgglomerativeLinkage::Complete);
        assert_eq!(fa.pooling_func(), PoolingFunc::Max);
    }

    #[test]
    fn test_feature_agglom_n_features_getter() {
        let x = make_correlated_features();
        let fa = FeatureAgglomeration::<f64>::new(3);
        let fitted = fa.fit(&x, &()).unwrap();
        assert_eq!(fitted.n_features(), 6);
        assert_eq!(fitted.n_clusters(), 3);
    }
}