ferrolearn-cluster 0.5.0

//! Agglomerative (bottom-up) hierarchical clustering.
//!
//! This module provides [`AgglomerativeClustering`], a hierarchical clustering
//! algorithm that builds a dendrogram by successively merging the two closest
//! clusters.  The merge criterion is determined by the [`Linkage`] strategy.
//!
//! # Algorithm
//!
//! This is a bit-exact translation of scikit-learn's *unstructured*
//! (`connectivity=None`) path, which delegates the dendrogram construction to
//! SciPy (`sklearn/cluster/_agglomerative.py:298-321, 532-592`):
//!
//! * **Ward / Complete / Average** use the **nearest-neighbour chain**
//!   (`nn-chain`) algorithm (`scipy.cluster.hierarchy.linkage`), producing a set
//!   of `n-1` merges. The merges are then **stably sorted by merge distance** and
//!   relabelled through a **union-find** so the resulting `children_` matches
//!   `scipy.cluster.hierarchy.linkage(X, method, 'euclidean')[:, :2]` bit-for-bit.
//! * **Single** uses **Prim's MST** (`mst_linkage_core`,
//!   `_hierarchical_fast.pyx`) sorted by weight, then a union-find relabel
//!   (`_single_linkage_label`) that — unlike the other linkages — does **not**
//!   reorder the merged pair.
//!
//! The full dendrogram has node IDs `0..n-1` for the leaves and `n+i` for the
//! cluster formed by the `i`-th sorted merge. `children_` therefore has shape
//! `(n_samples - 1, 2)` regardless of `n_clusters` (the FULL tree), exactly like
//! sklearn's fitted `children_` attribute.
//!
//! `labels_` are then produced by cutting the full tree with
//! [`_hc_cut`](https://github.com/scikit-learn/scikit-learn/blob/1.5.2/sklearn/cluster/_agglomerative.py#L731):
//! a negated-id max-heap pops the top `n_clusters` dendrogram nodes and numbers
//! each leaf by the heap position of its ancestor.
//!
//! The overall complexity is **O(n³)** in time and **O(n²)** in space, which
//! is practical for datasets up to a few thousand samples.
//!
//! # Linkage strategies
//!
//! | [`Linkage`]  | Distance formula | Properties |
//! |--------------|------------------|------------|
//! | `Single`     | `min d(a, b)`    | Chaining effect; handles non-convex shapes |
//! | `Complete`   | `max d(a, b)`    | Compact clusters |
//! | `Average`    | mean of pairwise | Compromise |
//! | `Ward`       | increase in SSE  | Minimises within-cluster variance |
//!
//! # Note
//!
//! [`AgglomerativeClustering`] implements [`Fit`] only.  There is no
//! `predict` method (mirroring scikit-learn's design).
//!
//! # Examples
//!
//! ```
//! use ferrolearn_cluster::AgglomerativeClustering;
//! use ferrolearn_core::Fit;
//! use ndarray::Array2;
//!
//! let x = Array2::from_shape_vec((6, 2), vec![
//!     0.0, 0.0,  0.1, 0.0,  0.0, 0.1,
//!     8.0, 8.0,  8.1, 8.0,  8.0, 8.1,
//! ]).unwrap();
//!
//! let model = AgglomerativeClustering::<f64>::new(2);
//! let fitted = model.fit(&x, &()).unwrap();
//! assert_eq!(fitted.labels().len(), 6);
//! ```
//!
//! # `## REQ status`
//!
//! Binary (R-DEFER-2), translating `sklearn/cluster/_agglomerative.py`
//! (`class AgglomerativeClustering(ClusterMixin, BaseEstimator)` `:781`).
//! Design doc: `.design/cluster/agglomerative.md`. Cites use ferrolearn symbol
//! anchors / sklearn `file:line` (commit 156ef14); expected values from the live
//! sklearn 1.5.2 oracle (R-CHAR-3). The `labels_` PARTITION (REQ-1), the full
//! `children_` dendrogram (REQ-6) and the absolute `_hc_cut` `labels_` numbering
//! (REQ-7) all SHIP bit-exact against scipy/sklearn 1.5.2 for the unstructured
//! (`connectivity=None`) case across all four linkages (the #938 structural
//! carve-out), via the nn-chain / MST dendrogram builder + `_hc_cut`. The
//! consumers `birch.rs` / `feature_agglomeration.rs` use `labels()` purely as a
//! partition and are unaffected by the (now sklearn-exact) renumbering.
//!
//! | REQ | Status | Evidence |
//! |---|---|---|
//! | REQ-1 (`labels_` PARTITION, separable data) | SHIPPED | impl `fn agglomerate` (`fn full_dendrogram` → `fn hc_cut`) → `Fit::fit` builds `labels_`, mirroring sklearn `_fit` (`_agglomerative.py:1083-1105`). Now ABSOLUTE-value-matches the oracle for all 4 linkages (was up-to-permutation; see REQ-7). Consumers: `_RsAgglomerativeClustering` (`ferrolearn-python/src/extras.rs`), `birch.rs fn fit`, `feature_agglomeration.rs fn fit`. Guards: `green_two_blobs_partition_all_linkages`, `green_three_blobs_partition_all_linkages` in `tests/divergence_agglomerative.rs`. |
//! | REQ-2 (`n_clusters_` == requested) | SHIPPED | `Fit::fit` sets `n_clusters_: self.n_clusters`, mirroring `self.n_clusters_ = self.n_clusters` when `distance_threshold is None` (`_agglomerative.py:1095`). Guards above assert `n_clusters_` 2 and 3; `n_clusters()` accessor consumed by `birch.rs`/`feature_agglomeration.rs`. |
//! | REQ-3 (four linkage criteria) | SHIPPED | `enum Linkage` + `fn lance_williams` (Single=min, Complete=max, Average=size-weighted mean, Ward=Ward update on Euclidean distances) in the `fn nn_chain` builder, with `fn mst_single`+`fn single_linkage_relabel` for single, mirror `_TREE_BUILDERS` (`_agglomerative.py:720-725`). Now the `children_`/`labels_` BIT-EXACT-match the oracle for all four (REQ-6/REQ-7), not just the partition. Guards above + the parity tests in REQ-6/REQ-7. |
//! | REQ-4 (`n_clusters=2` ctor default + sklearn error ABI) | NOT-STARTED | open prereq blocker #963. sklearn `__init__` defaults `n_clusters=2` (`_agglomerative.py:951`); ferrolearn `fn new(n_clusters)` requires it. Validation errors are `FerroError::InvalidParameter`/`InsufficientSamples` (crate-wide port convention), not sklearn's `ValueError`/`InvalidParameterError`. |
//! | REQ-5 (`ensure_min_samples=2` validation) | NOT-STARTED | open prereq blocker #964. sklearn `fit` → `_validate_data(X, ensure_min_samples=2)` (`_agglomerative.py:989`) rejects `n_samples < 2`; ferrolearn `fn fit` accepts a single sample when `n_clusters <= 1` (`test_single_sample_single_cluster`). Coupled fix: `birch.rs fn fit` calls `AgglomerativeClustering::new(1)` on a 1-row matrix in the single-subcluster path, so this is a multi-file change, not minimal in `agglomerative.rs` alone. |
//! | REQ-6 (`children_` full-dendrogram format) | SHIPPED | impl `fn full_dendrogram in agglomerative.rs` (nn-chain `fn nn_chain` + stable distance sort + union-find `fn union_find_relabel` for ward/complete/average; Prim MST `fn mst_single` + `fn single_linkage_relabel` for single) produces `children_` of shape `(n_samples-1, 2)` with leaves `0..n-1` and internal-node IDs `n+i`, BIT-EXACT-equal to `scipy.cluster.hierarchy.linkage(X, method, 'euclidean')[:, :2]` for ward/complete/average (`_agglomerative.py:314`/`:586`); for `single`, sklearn uses `mst_linkage_core` + `_single_linkage_label` (`:567-584`, R-DEV-7) whose pair order differs from `scipy.linkage('single')`, so `children_` matches sklearn's OWN `AgglomerativeClustering.children_` bit-exact. Live-oracle tests in `tests/divergence_agglomerative_dendrogram.rs`: `children_exact_scipy_6pt_nn_chain_linkages`, `children_exact_scipy_10pt_nn_chain_linkages`, `children_exact_sklearn_single_6pt`, `children_exact_sklearn_single_10pt`, and the pinned `divergence_children_full_dendrogram_format`. Consumers: `Fit::fit` → `children_`, `RsAgglomerativeClustering`, `birch.rs`/`feature_agglomeration.rs`. |
//! | REQ-7 (`labels_` ABSOLUTE numbering via `_hc_cut`) | SHIPPED | impl `fn hc_cut in agglomerative.rs` (negated-id max-heap over the top-`n_clusters` dendrogram nodes + `fn hc_get_descendent`, mirroring `_hc_cut`, `_agglomerative.py:731-775`) builds `labels_` from the REQ-6 full `children_`, BIT-EXACT-equal to `sklearn.cluster.AgglomerativeClustering(n_clusters=k, linkage=…).fit(X).labels_`. Live-oracle tests: `labels_exact_sklearn_6pt_all_linkages`/`labels_exact_sklearn_10pt_all_linkages` (k∈{2,3}, all four linkages) + the pinned `divergence_labels_absolute_hc_cut_numbering`. Consumer: `Fit::fit` → `labels_` surfaced through `RsAgglomerativeClustering::labels_` + `birch.rs`/`feature_agglomeration.rs` (partition use). |
//! | REQ-8 (`metric` / `connectivity`) | NOT-STARTED | open prereq blocker #965. sklearn `metric` ∈ {euclidean,l1,l2,manhattan,cosine,precomputed} default `'euclidean'` with the ward-requires-euclidean rule (`_agglomerative.py:795-799`, `:1034-1038`) and `connectivity` for structured clustering (`:812-822`). ferrolearn `fn sq_euclidean`/`fn pairwise_sq_dists` are Euclidean-only, unstructured. |
//! | REQ-9 (`distance_threshold`/`compute_distances`/`distances_`) | SHIPPED | ctor `n_clusters: Option<usize>` (default `Some(2)`) + `distance_threshold: Option<F>` + `with_distance_threshold` (clears `n_clusters`, mirroring `n_clusters=None,distance_threshold=t`) + `with_compute_distances`; `Fit::fit` enforces the XOR `not ((n_clusters is None) ^ (distance_threshold is None))` (`_agglomerative.py:1022-1027`) as `FerroError::InvalidParameter`. `fn full_dendrogram`/`fn union_find_relabel`/`fn single_linkage_relabel` now also return the per-merge distances in `children_` row order; `fn agglomerate` surfaces them as `distances_` when `return_distance = distance_threshold.is_some() || compute_distances` (`:1074`, `:1087-1088`) and derives `n_clusters_ = count(distances_ >= t) + 1` (`:1090-1093`) then `labels_ = hc_cut(n_clusters_, …)` (`:1099`). `distances_` EXACT-equals `scipy.linkage(X, L, 'euclidean')[:, 2]` / sklearn `distances_` (~1e-12, all 4 linkages). Live-oracle tests `tests/divergence_agglomerative_threshold.rs`: `distances_exact_6pt_all_linkages`, `distances_exact_10pt_all_linkages`, `threshold_cut_exact_6pt_all_linkages`, `threshold_cut_exact_10pt_all_linkages`, `threshold_inclusive_boundary_6pt_ward`, `xor_*`. Consumers: `Fit::fit` → `distances_`/`n_clusters_`/`labels_`, `RsAgglomerativeClustering`, `birch.rs`/`feature_agglomeration.rs` (via `new()` default `Some(2)`). NOTE `compute_full_tree` is a no-op here: the unstructured path always builds the FULL dendrogram, so the `'auto'`/partial-tree PERF optimisation is unimplemented and not observable — every observable attribute matches the full-tree path. `memory` caching stays NOT-STARTED. |
//! | REQ-10 (`n_leaves_`/`n_connected_components_`) | SHIPPED | `FittedAgglomerativeClustering` now stores `pub n_leaves_: usize` / `pub n_connected_components_: usize` with accessors `n_leaves()`/`n_connected_components()`; `Fit::fit` sets `n_leaves_ = n_samples`, `n_connected_components_ = 1` for the unstructured (`connectivity=None`) path (`_agglomerative.py:1083-1085`: `ward_tree`/`linkage_tree` return `(children_, 1, n_samples, None[, distances])`). Live-oracle test `n_leaves_and_connected_components_unstructured` (`tests/divergence_agglomerative_threshold.rs`, both fixtures, all 4 linkages). Consumer: `Fit::fit` (production path) + the accessors. `memory` caching (`:1006`/`:1076`) stays NOT-STARTED (open blocker #967): it is an opt-in perf/persistence wrapper with no observable attribute divergence in the default `memory=None` path. |
//! | REQ-11 (PyO3 binding parity) | SHIPPED | `#[pyclass(name="_RsAgglomerativeClustering")]` (`ferrolearn-python/src/extras.rs`): `fn new(n_clusters=2)`, `fn fit` → `ferrolearn_cluster::AgglomerativeClustering::<f64>::new(self.n_clusters)`, `#[getter] labels_`; registered in `ferrolearn-python/src/lib.rs`, wrapped `class AgglomerativeClustering(_ClusterWrapper)` in `python/ferrolearn/_extras.py`, exported in `__init__.py`. `ferrolearn.AgglomerativeClustering(n_clusters=2).fit(X).labels_` matches sklearn up to label permutation (REQ-1). Hard-wires Ward (no `linkage`/`metric`/`distance_threshold` arg). |
//! | REQ-12 (ferray substrate) | NOT-STARTED | open prereq blocker #968. `agglomerative.rs` imports `ndarray::{Array1, Array2}` + `num_traits::Float`, not `ferray-core`; the PyO3 boundary uses `numpy2_to_ndarray`, not `ferray::numpy_interop` (R-SUBSTRATE). |
//! | REQ-13 (reject non-finite input — R-CODE-2 panic fix) | SHIPPED | impl `Fit::fit` checks `x.iter().any(\|v\| !v.is_finite())` BEFORE the `agglomerate(...)` call and returns `Err(FerroError::InvalidParameter{name:"X"})`, rejecting NaN AND infinity. This eliminates the #2282 PANIC: NaN/Inf distances previously mis-routed the nn-chain so `condensed_index`'s `usize` arithmetic underflowed (`agglomerative.rs` `fn condensed_index`). Mirrors sklearn's `_validate_data(X, ensure_min_samples=2)` (`force_all_finite=True` default) reached from `AgglomerativeClustering.fit` (`_agglomerative.py:989`), which raises `ValueError` (`validation.py:147-154`). Consumers: the existing `fit` entry — PyO3 `RsAgglomerativeClustering` + crate re-export + `birch.rs`/`feature_agglomeration.rs`. Pinned by `divergence_nonfinite_panic.rs` (`divergence_agglomerative_fit_nan_no_panic`, `divergence_agglomerative_fit_inf_no_panic`: fit(NaN/Inf) is a clean `Err`, no abort) — live sklearn 1.5.2 raises, ferrolearn now `Err`. Finite input byte-identical (the threshold/dendrogram oracle pins stay green). |

use ferrolearn_core::error::FerroError;
use ferrolearn_core::traits::Fit;
use ndarray::{Array1, Array2};
use num_traits::Float;

// ─────────────────────────────────────────────────────────────────────────────
// Public enums & configuration struct
// ─────────────────────────────────────────────────────────────────────────────

/// The linkage criterion used to measure distances between clusters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Linkage {
    /// Ward linkage: merge the pair that minimises the increase in
    /// within-cluster sum of squared errors.
    Ward,
    /// Complete linkage: the distance between two clusters is the
    /// *maximum* distance between any pair of their members.
    Complete,
    /// Average linkage (UPGMA): the distance is the mean of all pairwise
    /// distances between the two clusters.
    Average,
    /// Single linkage: the distance between two clusters is the *minimum*
    /// pairwise distance between their members.
    Single,
}

/// Agglomerative clustering configuration (unfitted).
///
/// Call [`Fit::fit`] to run the algorithm and obtain a
/// [`FittedAgglomerativeClustering`].
///
/// # Cluster target: `n_clusters` XOR `distance_threshold`
///
/// Mirroring sklearn's `__init__` (`_agglomerative.py:949-960`), the number of
/// clusters is expressed as EITHER a fixed `n_clusters` OR a
/// `distance_threshold` (the linkage distance above which clusters are not
/// merged). Exactly one must be set — sklearn enforces this as the XOR
/// `not ((n_clusters is None) ^ (distance_threshold is None))`
/// (`_agglomerative.py:1022-1027`). Rust has no `None`-vs-`int` overload, so we
/// store `n_clusters: Option<usize>` (default `Some(2)`, mirroring sklearn's
/// `n_clusters=2` default) and `distance_threshold: Option<F>` (default
/// `None`). [`with_distance_threshold`](Self::with_distance_threshold) sets the
/// threshold and clears `n_clusters` to mirror the sklearn idiom
/// `AgglomerativeClustering(n_clusters=None, distance_threshold=t)`.
///
/// # Type Parameters
///
/// - `F`: floating-point scalar type (`f32` or `f64`).
#[derive(Debug, Clone)]
pub struct AgglomerativeClustering<F> {
    /// Target number of clusters. `None` when [`distance_threshold`] drives the
    /// cut instead (mirrors sklearn `n_clusters=None`).
    ///
    /// [`distance_threshold`]: Self::distance_threshold
    pub n_clusters: Option<usize>,
    /// Linkage strategy for computing inter-cluster distances.
    pub linkage: Linkage,
    /// The linkage distance threshold at or above which clusters will not be
    /// merged. `None` when [`n_clusters`] drives the cut (mirrors sklearn
    /// `distance_threshold=None`).
    ///
    /// [`n_clusters`]: Self::n_clusters
    pub distance_threshold: Option<F>,
    /// Whether to compute and store the per-merge `distances_` even when
    /// `distance_threshold` is not set (mirrors sklearn `compute_distances`,
    /// default `false`). `distances_` is always computed when
    /// `distance_threshold` is set (`return_distance`, `_agglomerative.py:1074`).
    pub compute_distances: bool,
    /// Phantom to retain the float type parameter.
    _marker: std::marker::PhantomData<F>,
}

impl<F: Float> AgglomerativeClustering<F> {
    /// Create a new `AgglomerativeClustering` with the given number of clusters.
    ///
    /// Uses default `linkage = Ward`, `distance_threshold = None`,
    /// `compute_distances = false`.
    #[must_use]
    pub fn new(n_clusters: usize) -> Self {
        Self {
            n_clusters: Some(n_clusters),
            linkage: Linkage::Ward,
            distance_threshold: None,
            compute_distances: false,
            _marker: std::marker::PhantomData,
        }
    }

    /// Set the linkage criterion.
    #[must_use]
    pub fn with_linkage(mut self, linkage: Linkage) -> Self {
        self.linkage = linkage;
        self
    }

    /// Drive the cut by a linkage-distance threshold instead of a fixed cluster
    /// count, mirroring sklearn
    /// `AgglomerativeClustering(n_clusters=None, distance_threshold=t)`.
    ///
    /// Sets `distance_threshold = Some(threshold)` and clears `n_clusters` to
    /// `None` (the XOR contract, `_agglomerative.py:1022-1027`). After fitting,
    /// `n_clusters_ = count(distances_ >= threshold) + 1`
    /// (`_agglomerative.py:1090-1093`). Setting the threshold also forces
    /// `distances_` to be computed (`return_distance`, `:1074`).
    #[must_use]
    pub fn with_distance_threshold(mut self, threshold: F) -> Self {
        self.distance_threshold = Some(threshold);
        self.n_clusters = None;
        self
    }

    /// Compute and store the per-merge `distances_` array even when no
    /// `distance_threshold` is set, mirroring sklearn `compute_distances=True`
    /// (`_agglomerative.py:959`, `:1074`, `:1087-1088`).
    #[must_use]
    pub fn with_compute_distances(mut self, compute_distances: bool) -> Self {
        self.compute_distances = compute_distances;
        self
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Fitted model
// ─────────────────────────────────────────────────────────────────────────────

/// Fitted Agglomerative Clustering model.
///
/// Stores per-sample cluster labels, the actual cluster count, and the
/// merge history (dendrogram).
///
/// There is intentionally **no** `predict` method: new data cannot be
/// assigned to clusters without re-running the full algorithm.
#[derive(Debug, Clone)]
pub struct FittedAgglomerativeClustering<F> {
    /// Cluster label for each training sample, shape `(n_samples,)`.
    /// Labels are in the range `0 .. n_clusters_`.
    pub labels_: Array1<usize>,
    /// The actual number of clusters formed.
    pub n_clusters_: usize,
    /// The full dendrogram, shape `(n_samples - 1, 2)`.
    ///
    /// Each element `(a, b)` records that nodes `a` and `b` were merged to form
    /// node `n_samples + i` (where `i` is the row index). Values `< n_samples`
    /// are leaves (original samples); values `>= n_samples` are internal nodes.
    /// This matches scikit-learn's `children_` attribute and
    /// `scipy.cluster.hierarchy.linkage(X, method, 'euclidean')[:, :2]`.
    pub children_: Vec<(usize, usize)>,
    /// The per-merge linkage distances, in `children_` row order, length
    /// `n_samples - 1`. `Some` when `distance_threshold` was set OR
    /// `compute_distances` was `true` (mirroring sklearn's `return_distance`
    /// path, `_agglomerative.py:1074`, `:1087-1088`), `None` otherwise.
    ///
    /// Equal to `scipy.cluster.hierarchy.linkage(X, method, 'euclidean')[:, 2]`
    /// for ward/complete/average and to sklearn's own `distances_` for single.
    pub distances_: Option<Array1<F>>,
    /// The number of leaves in the hierarchical tree, equal to `n_samples` for
    /// the unstructured (`connectivity=None`) path
    /// (`_agglomerative.py:1083-1085`).
    pub n_leaves_: usize,
    /// The estimated number of connected components in the graph. Always `1` for
    /// the unstructured (`connectivity=None`) path
    /// (`_agglomerative.py:1083-1085`).
    pub n_connected_components_: usize,
    /// Phantom to retain the float type parameter.
    _marker: std::marker::PhantomData<F>,
}

impl<F: Float> FittedAgglomerativeClustering<F> {
    /// Return the cluster label for each training sample.
    #[must_use]
    pub fn labels(&self) -> &Array1<usize> {
        &self.labels_
    }

    /// Return the number of clusters formed.
    #[must_use]
    pub fn n_clusters(&self) -> usize {
        self.n_clusters_
    }

    /// Return the full dendrogram, shape `(n_samples - 1, 2)`.
    ///
    /// Row `i` is the pair of node IDs merged to form node `n_samples + i`
    /// (leaves are `0..n_samples`, internal nodes are `>= n_samples`). This
    /// matches scikit-learn's `children_` /
    /// `scipy.cluster.hierarchy.linkage(...)[:, :2]`.
    #[must_use]
    pub fn children(&self) -> &[(usize, usize)] {
        &self.children_
    }

    /// Return the per-merge linkage distances (in `children_` row order), or
    /// `None` if they were not computed (neither `distance_threshold` nor
    /// `compute_distances` was set). Mirrors sklearn's `distances_` attribute
    /// (`_agglomerative.py:1087-1088`).
    #[must_use]
    pub fn distances(&self) -> Option<&Array1<F>> {
        self.distances_.as_ref()
    }

    /// Return the number of leaves in the hierarchical tree (`= n_samples` for
    /// the unstructured path). Mirrors sklearn's `n_leaves_`
    /// (`_agglomerative.py:1083-1085`).
    #[must_use]
    pub fn n_leaves(&self) -> usize {
        self.n_leaves_
    }

    /// Return the number of connected components (`= 1` for the unstructured
    /// path). Mirrors sklearn's `n_connected_components_`
    /// (`_agglomerative.py:1083-1085`).
    #[must_use]
    pub fn n_connected_components(&self) -> usize {
        self.n_connected_components_
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Internal helpers
// ─────────────────────────────────────────────────────────────────────────────

/// Compute the *Euclidean* distance between two row slices.
///
/// scipy's `linkage`/`ward` operate on actual (non-squared) Euclidean distances;
/// the Ward Lance–Williams update below works on these directly.
#[inline]
fn euclidean<F: Float>(a: &[F], b: &[F]) -> f64 {
    a.iter()
        .zip(b.iter())
        .fold(0.0_f64, |acc, (&ai, &bi)| {
            let d = ai.to_f64().unwrap_or(0.0) - bi.to_f64().unwrap_or(0.0);
            acc + d * d
        })
        .sqrt()
}

/// Index into the condensed (upper-triangular) distance vector for the pair
/// `(i, j)` over `n` observations, mirroring scipy's `condensed_index`.
#[inline]
fn condensed_index(n: usize, i: usize, j: usize) -> usize {
    if i < j {
        n * i - (i * (i + 1)) / 2 + (j - i - 1)
    } else {
        n * j - (j * (j + 1)) / 2 + (i - j - 1)
    }
}

/// Lance–Williams distance update for the nn-chain linkages, matching scipy's
/// `linkage_distance_update` family (operating on Euclidean distances).
#[inline]
fn lance_williams(
    linkage: Linkage,
    d_xi: f64,
    d_yi: f64,
    size_x: f64,
    size_y: f64,
    size_i: f64,
    d_xy: f64,
) -> f64 {
    match linkage {
        Linkage::Single => d_xi.min(d_yi),
        Linkage::Complete => d_xi.max(d_yi),
        Linkage::Average => (size_x * d_xi + size_y * d_yi) / (size_x + size_y),
        Linkage::Ward => {
            let t = 1.0 / (size_x + size_y + size_i);
            ((size_i + size_x) * t * d_xi * d_xi + (size_i + size_y) * t * d_yi * d_yi
                - size_i * t * d_xy * d_xy)
                .sqrt()
        }
    }
}

/// A single merge produced by the chain/MST builders, before relabelling.
/// `(node_a, node_b, distance)` with the original observation indices.
type RawMerge = (usize, usize, f64);

/// Nearest-neighbour-chain dendrogram builder for ward / complete / average,
/// mirroring scipy's `nn_chain` (`scipy/cluster/_hierarchy.pyx`). Returns the
/// `n-1` merges (unordered) as `(x, y, dist)` with `x < y` original-index roots
/// at merge time — exactly scipy's `Z[:, :3]` before the final sort/relabel.
fn nn_chain(condensed: &[f64], n: usize, linkage: Linkage) -> Vec<RawMerge> {
    let mut d = condensed.to_vec();
    let mut size = vec![1.0_f64; n];
    let mut chain: Vec<usize> = vec![0; n];
    let mut chain_len = 0usize;
    let mut merges: Vec<RawMerge> = Vec::with_capacity(n.saturating_sub(1));

    for _ in 0..n.saturating_sub(1) {
        if chain_len == 0 {
            chain_len = 1;
            // First active cluster (size > 0).
            for (i, &s) in size.iter().enumerate() {
                if s > 0.0 {
                    chain[0] = i;
                    break;
                }
            }
        }

        let mut x;
        let mut y;
        let mut current_min;
        loop {
            x = chain[chain_len - 1];
            if chain_len > 1 {
                y = chain[chain_len - 2];
                current_min = d[condensed_index(n, x, y)];
            } else {
                y = usize::MAX;
                current_min = f64::INFINITY;
            }
            for (i, &si) in size.iter().enumerate() {
                if si == 0.0 || x == i {
                    continue;
                }
                let dist = d[condensed_index(n, x, i)];
                if dist < current_min {
                    current_min = dist;
                    y = i;
                }
            }
            if chain_len > 1 && y == chain[chain_len - 2] {
                break;
            }
            chain[chain_len] = y;
            chain_len += 1;
        }

        chain_len -= 2;

        // Merge x and y; ensure x < y (scipy's convention).
        if x > y {
            std::mem::swap(&mut x, &mut y);
        }
        let nx = size[x];
        let ny = size[y];
        merges.push((x, y, current_min));

        size[x] = 0.0;
        size[y] = nx + ny;

        // Lance–Williams update of distances to the merged cluster (stored in y).
        for i in 0..n {
            let ni = size[i];
            if ni == 0.0 || i == y {
                continue;
            }
            let d_ix = d[condensed_index(n, i, x)];
            let d_iy = d[condensed_index(n, i, y)];
            d[condensed_index(n, i, y)] =
                lance_williams(linkage, d_ix, d_iy, nx, ny, ni, current_min);
        }
    }

    merges
}

/// Prim's MST builder for single linkage, mirroring sklearn's
/// `mst_linkage_core` (`_hierarchical_fast.pyx`). Distances are generated on the
/// fly (Euclidean). Returns `n-1` merges as `(current_node, new_node, dist)` in
/// MST-construction order (NOT sorted).
fn mst_single<F: Float>(x: &Array2<F>, n: usize) -> Vec<RawMerge> {
    let rows: Vec<&[F]> = (0..n).map(|i| row_slice(x, i)).collect();
    let mut in_tree = vec![false; n];
    let mut current_distances = vec![f64::INFINITY; n];
    let mut current_node = 0usize;
    let mut merges: Vec<RawMerge> = Vec::with_capacity(n.saturating_sub(1));

    for _ in 0..n.saturating_sub(1) {
        in_tree[current_node] = true;
        let mut new_distance = f64::INFINITY;
        let mut new_node = 0usize;

        for j in 0..n {
            if in_tree[j] {
                continue;
            }
            let left_value = euclidean(rows[current_node], rows[j]);
            if left_value < current_distances[j] {
                current_distances[j] = left_value;
            }
            if current_distances[j] < new_distance {
                new_distance = current_distances[j];
                new_node = j;
            }
        }

        merges.push((current_node, new_node, new_distance));
        current_node = new_node;
    }

    merges
}

/// The full dendrogram plus the per-merge distances in `children_` row order.
/// `distances[i]` is the linkage distance at which `children[i]` merged — the
/// 3rd column of the scipy linkage matrix / sklearn's `distances_`
/// (`_agglomerative.py:1087-1088`).
type Dendrogram = (Vec<(usize, usize)>, Vec<f64>);

/// Union-find relabel for the nn-chain linkages, mirroring scipy's `label`
/// (`scipy/cluster/_hierarchy.pyx`). Merges must already be **stably sorted by
/// distance**. Emits the full dendrogram with the smaller root first, plus the
/// per-merge distances in that same (sorted) row order — `scipy.linkage(...)[:, 2]`.
fn union_find_relabel(mut merges: Vec<RawMerge>, n: usize) -> Dendrogram {
    // Stable sort by merge distance (scipy uses `argsort(kind='stable')`).
    merges.sort_by(|a, b| a.2.partial_cmp(&b.2).unwrap_or(std::cmp::Ordering::Equal));

    let mut parent: Vec<usize> = (0..(2 * n).saturating_sub(1)).collect();
    let mut out: Vec<(usize, usize)> = Vec::with_capacity(n.saturating_sub(1));
    let mut dists: Vec<f64> = Vec::with_capacity(n.saturating_sub(1));

    for (i, (a, b, dist)) in merges.into_iter().enumerate() {
        let next_label = n + i;
        let ra = uf_find(&mut parent, a);
        let rb = uf_find(&mut parent, b);
        let (lo, hi) = if ra < rb { (ra, rb) } else { (rb, ra) };
        out.push((lo, hi));
        dists.push(dist);
        parent[ra] = next_label;
        parent[rb] = next_label;
    }

    (out, dists)
}

/// Union-find relabel for single linkage, mirroring sklearn's
/// `_single_linkage_label` + `UnionFind` (`_hierarchical_fast.pyx`). The MST must
/// already be **stably sorted by weight**. Unlike [`union_find_relabel`], the
/// pair is emitted in `(left_root, right_root)` order WITHOUT reordering. The
/// per-merge distances are returned in that same (sorted) row order, equal to
/// sklearn's `distances_`.
fn single_linkage_relabel(mut merges: Vec<RawMerge>, n: usize) -> Dendrogram {
    merges.sort_by(|a, b| a.2.partial_cmp(&b.2).unwrap_or(std::cmp::Ordering::Equal));

    let mut parent: Vec<usize> = (0..(2 * n).saturating_sub(1)).collect();
    let mut out: Vec<(usize, usize)> = Vec::with_capacity(n.saturating_sub(1));
    let mut dists: Vec<f64> = Vec::with_capacity(n.saturating_sub(1));

    for (i, (left, right, dist)) in merges.into_iter().enumerate() {
        let next_label = n + i;
        let lc = uf_find(&mut parent, left);
        let rc = uf_find(&mut parent, right);
        out.push((lc, rc));
        dists.push(dist);
        parent[lc] = next_label;
        parent[rc] = next_label;
    }

    (out, dists)
}

/// Find with path compression over a parent array where a self-parent denotes a
/// root. (Matches the union-find semantics of scipy `label` / sklearn
/// `UnionFind.fast_find`.)
fn uf_find(parent: &mut [usize], mut node: usize) -> usize {
    let mut root = node;
    while parent[root] != root {
        root = parent[root];
    }
    // Path compression.
    while parent[node] != root {
        let next = parent[node];
        parent[node] = root;
        node = next;
    }
    root
}

/// Collect a contiguous row slice for sample `i`, falling back to a copy when
/// the row is not contiguous.
fn row_slice<F: Float>(x: &Array2<F>, i: usize) -> &[F] {
    x.row(i).to_slice().unwrap_or(&[])
}

/// Build the FULL dendrogram (shape `(n_samples - 1, 2)`) for the given linkage,
/// bit-exact with `scipy.cluster.hierarchy.linkage(X, method, 'euclidean')[:, :2]`
/// (ward == `method='ward'`). Leaves are `0..n`, the `i`-th sorted merge forms
/// node `n + i`. Also returns the per-merge distances in `children_` row order
/// (the 3rd column of the scipy linkage matrix / sklearn's `distances_`).
fn full_dendrogram<F: Float>(x: &Array2<F>, linkage: Linkage) -> Dendrogram {
    let n = x.nrows();
    if n < 2 {
        return (Vec::new(), Vec::new());
    }

    match linkage {
        Linkage::Single => {
            let merges = mst_single(x, n);
            single_linkage_relabel(merges, n)
        }
        _ => {
            // Build the condensed Euclidean distance vector.
            let rows: Vec<&[F]> = (0..n).map(|i| row_slice(x, i)).collect();
            let mut condensed = vec![0.0_f64; n * (n - 1) / 2];
            let mut k = 0;
            for i in 0..n {
                for j in (i + 1)..n {
                    condensed[k] = euclidean(rows[i], rows[j]);
                    k += 1;
                }
            }
            let merges = nn_chain(&condensed, n, linkage);
            union_find_relabel(merges, n)
        }
    }
}

/// Collect all descendant leaves of `node` in the dendrogram, mirroring
/// sklearn's `_hc_get_descendent` (`_hierarchical_fast.pyx`). Leaves are
/// `< n_leaves`; an internal node `i` has children `children[i - n_leaves]`.
fn hc_get_descendent(node: usize, children: &[(usize, usize)], n_leaves: usize) -> Vec<usize> {
    if node < n_leaves {
        return vec![node];
    }
    let mut stack = vec![node];
    let mut descendents = Vec::new();
    while let Some(i) = stack.pop() {
        if i < n_leaves {
            descendents.push(i);
        } else {
            let (a, b) = children[i - n_leaves];
            stack.push(a);
            stack.push(b);
        }
    }
    descendents
}

/// Cut the full dendrogram into `n_clusters` clusters, numbering each leaf by
/// the heap position of its top-level ancestor — a faithful, operation-for-
/// operation translation of sklearn's `_hc_cut` (`_agglomerative.py:731-775`).
///
/// `nodes` stores NEGATED node ids in a Python-`heapq` MIN-heap (so the smallest
/// element is the largest node id). We replicate `heappush` and `heappushpop`
/// exactly, then `enumerate` the heap array in its final layout order — both the
/// heap-array layout and the enumeration order are reproduced verbatim, so the
/// absolute `labels_` match sklearn bit-for-bit.
fn hc_cut(n_clusters: usize, children: &[(usize, usize)], n_leaves: usize) -> Array1<usize> {
    let mut labels = Array1::<usize>::zeros(n_leaves);
    if n_leaves == 0 || children.is_empty() {
        // Single leaf (or no merges) → one cluster (label 0 everywhere).
        return labels;
    }

    // nodes = [-(max(children[-1]) + 1)]  (negated root id).
    let last = children[children.len() - 1];
    let root = (last.0.max(last.1) + 1) as i64;
    let mut nodes: Vec<i64> = vec![-root];

    for _ in 0..n_clusters.saturating_sub(1) {
        // these_children = children[-nodes[0] - n_leaves]; nodes[0] is the
        // min (= most-negative) element of the Python heap.
        let smallest = nodes[0]; // negated => smallest negated == largest id
        let node_id = (-smallest) as usize;
        let these = children[node_id - n_leaves];
        // heappush(nodes, -these[0]); heappushpop(nodes, -these[1]);
        heappush_min(&mut nodes, -(these.0 as i64));
        heappushpop_min(&mut nodes, -(these.1 as i64));
    }

    // for i, node in enumerate(nodes): label[descendents(-node)] = i
    for (i, &neg_node) in nodes.iter().enumerate() {
        let node = (-neg_node) as usize;
        for leaf in hc_get_descendent(node, children, n_leaves) {
            labels[leaf] = i;
        }
    }
    labels
}

/// CPython `heapq.heappush` (sift-down of the new last element toward the root),
/// reproducing the exact array layout so `_hc_cut`'s `enumerate(nodes)` order
/// matches.
fn heappush_min(heap: &mut Vec<i64>, item: i64) {
    heap.push(item);
    let last = heap.len() - 1;
    sift_down(heap, 0, last);
}

/// CPython `heapq.heappushpop`: push `item`, then pop-and-return the smallest.
/// If `item` is no larger than the current min, it is returned unchanged
/// (matching CPython's fast path) and the heap is untouched.
fn heappushpop_min(heap: &mut [i64], item: i64) -> i64 {
    if !heap.is_empty() && heap[0] < item {
        let returned = heap[0];
        heap[0] = item;
        sift_up(heap, 0);
        return returned;
    }
    item
}

/// CPython `heapq._siftdown(heap, startpos, pos)`: bubble `heap[pos]` up toward
/// `startpos` while it is smaller than its parent.
fn sift_down(heap: &mut [i64], startpos: usize, pos: usize) {
    let mut pos = pos;
    let new_item = heap[pos];
    while pos > startpos {
        let parentpos = (pos - 1) >> 1;
        let parent = heap[parentpos];
        if new_item < parent {
            heap[pos] = parent;
            pos = parentpos;
        } else {
            break;
        }
    }
    heap[pos] = new_item;
}

/// CPython `heapq._siftup(heap, pos)`: move `heap[pos]` down to a leaf along the
/// path of smaller children, then sift it back up to its correct spot. This is
/// CPython's exact (non-obvious) layout-preserving variant.
fn sift_up(heap: &mut [i64], pos: usize) {
    let endpos = heap.len();
    let startpos = pos;
    let mut pos = pos;
    let new_item = heap[pos];
    let mut childpos = 2 * pos + 1;
    while childpos < endpos {
        let rightpos = childpos + 1;
        if rightpos < endpos && heap[childpos] >= heap[rightpos] {
            childpos = rightpos;
        }
        heap[pos] = heap[childpos];
        pos = childpos;
        childpos = 2 * pos + 1;
    }
    heap[pos] = new_item;
    sift_down(heap, startpos, pos);
}

/// The realised cut: `(labels_, children_, n_clusters_, distances_)`.
/// `distances_` is `Some` iff the distances were requested (`return_distance`).
type AgglomerateResult<F> =
    Result<(Array1<usize>, Vec<(usize, usize)>, usize, Option<Array1<F>>), FerroError>;

/// Run the full unstructured agglomerative pipeline for `x`, mirroring sklearn's
/// `_fit` (`_agglomerative.py:1066-1106`) for the `connectivity=None` case:
///
/// 1. build the full dendrogram (`children_` + per-merge `distances_`);
/// 2. `return_distance = (distance_threshold is not None) or compute_distances`
///    (`:1074`) decides whether `distances_` is surfaced;
/// 3. `n_clusters_` is either `count(distances_ >= threshold) + 1`
///    (`:1090-1093`) or the requested `n_clusters` (`:1095`);
/// 4. `labels_ = _hc_cut(n_clusters_, children_, n_leaves)` (`:1099`).
fn agglomerate<F: Float>(
    x: &Array2<F>,
    n_clusters: Option<usize>,
    distance_threshold: Option<F>,
    compute_distances: bool,
    linkage: Linkage,
) -> AgglomerateResult<F> {
    let n_samples = x.nrows();
    let (children, raw_dists) = full_dendrogram(x, linkage);

    // return_distance (`_agglomerative.py:1074`).
    let return_distance = distance_threshold.is_some() || compute_distances;

    // Convert the per-merge distances to F (computed in f64 internally).
    let distances_f: Vec<F> = raw_dists
        .iter()
        .map(|&d| F::from(d).unwrap_or_else(F::zero))
        .collect();

    // n_clusters_ (`_agglomerative.py:1090-1095`).
    let n_clusters_ = match distance_threshold {
        Some(t) => {
            // np.count_nonzero(distances_ >= distance_threshold) + 1.
            // sklearn's `distances_` and the threshold are float64 (scipy's
            // `linkage`/`ward_tree` return float64 heights regardless of
            // `X.dtype`), so count against the RAW f64 heights and an
            // f64-promoted threshold — NOT the F-downcast copy, whose f32
            // rounding of a merge height would flip the count near the
            // threshold (#2185).
            let t_f64 = t.to_f64().unwrap_or(f64::NAN);
            let count = raw_dists.iter().filter(|&&d| d >= t_f64).count();
            count + 1
        }
        None => n_clusters.unwrap_or(0),
    };

    let labels = hc_cut(n_clusters_, &children, n_samples);

    let distances_out = if return_distance {
        Some(Array1::from(distances_f))
    } else {
        None
    };

    Ok((labels, children, n_clusters_, distances_out))
}

// ─────────────────────────────────────────────────────────────────────────────
// Trait impl: Fit
// ─────────────────────────────────────────────────────────────────────────────

impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for AgglomerativeClustering<F> {
    type Fitted = FittedAgglomerativeClustering<F>;
    type Error = FerroError;

    /// Run agglomerative clustering on `x`.
    ///
    /// # Errors
    ///
    /// - [`FerroError::InvalidParameter`] (`"n_clusters"` / `"distance_threshold"`)
    ///   when neither or both of `n_clusters`/`distance_threshold` is set
    ///   (sklearn's XOR, `_agglomerative.py:1022-1027`), or when `n_clusters == 0`.
    /// - [`FerroError::InsufficientSamples`] if `n_samples == 0`, or (in the
    ///   fixed-`n_clusters` path) `n_samples < n_clusters`.
    fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedAgglomerativeClustering<F>, FerroError> {
        // XOR: exactly one of n_clusters / distance_threshold must be set.
        // Mirrors `not ((n_clusters is None) ^ (distance_threshold is None))`
        // (`_agglomerative.py:1022-1027`).
        match (self.n_clusters, self.distance_threshold) {
            (Some(_), Some(_)) | (None, None) => {
                return Err(FerroError::InvalidParameter {
                    name: "distance_threshold".into(),
                    reason: "Exactly one of n_clusters and distance_threshold has to be set, \
                             and the other needs to be None"
                        .into(),
                });
            }
            _ => {}
        }

        if let Some(k) = self.n_clusters
            && k == 0
        {
            return Err(FerroError::InvalidParameter {
                name: "n_clusters".into(),
                reason: "must be at least 1".into(),
            });
        }

        let n_samples = x.nrows();

        if n_samples == 0 {
            return Err(FerroError::InsufficientSamples {
                required: self.n_clusters.unwrap_or(1),
                actual: 0,
                context: "AgglomerativeClustering requires at least n_clusters samples".into(),
            });
        }

        if let Some(k) = self.n_clusters
            && n_samples < k
        {
            return Err(FerroError::InsufficientSamples {
                required: k,
                actual: n_samples,
                context: "AgglomerativeClustering requires at least n_clusters samples".into(),
            });
        }

        // Reject non-finite X up front (NaN AND Inf), mirroring sklearn's
        // `_validate_data(X, ensure_min_samples=2)` (`force_all_finite=True`
        // default) reached from `AgglomerativeClustering.fit`
        // (`sklearn/cluster/_agglomerative.py:989`), which raises
        // `ValueError("Input X contains NaN.")` (`validation.py:147-154`). This
        // MUST precede `agglomerate`: NaN/Inf distances otherwise mis-route the
        // nn-chain and underflow `condensed_index`'s `usize` arithmetic (the
        // #2282 panic). Never panics (R-CODE-2).
        if x.iter().any(|v| !v.is_finite()) {
            return Err(FerroError::InvalidParameter {
                name: "X".into(),
                reason: "Input X contains NaN or infinity.".into(),
            });
        }

        let (labels, children, n_clusters_, distances) = agglomerate(
            x,
            self.n_clusters,
            self.distance_threshold,
            self.compute_distances,
            self.linkage,
        )?;

        Ok(FittedAgglomerativeClustering {
            labels_: labels,
            n_clusters_,
            children_: children,
            distances_: distances,
            // Unstructured (connectivity=None): n_leaves_ == n_samples,
            // n_connected_components_ == 1 (`_agglomerative.py:1083-1085`).
            n_leaves_: n_samples,
            n_connected_components_: 1,
            _marker: std::marker::PhantomData,
        })
    }
}

impl<F: Float + Send + Sync + 'static> AgglomerativeClustering<F> {
    /// Fit on `x` and return the cluster labels for those samples in one
    /// call. Equivalent to sklearn `ClusterMixin.fit_predict`.
    ///
    /// # Errors
    ///
    /// Forwards any error from [`Fit::fit`].
    pub fn fit_predict(&self, x: &Array2<F>) -> Result<Array1<usize>, FerroError> {
        let fitted = self.fit(x, &())?;
        Ok(fitted.labels().clone())
    }
}

// ─────────────────────────────────────────────────────────────────────────────
// Tests
// ─────────────────────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;

    /// Two well-separated blobs.
    fn make_two_blobs() -> Array2<f64> {
        Array2::from_shape_vec(
            (8, 2),
            vec![
                0.0, 0.0, 0.1, 0.0, 0.0, 0.1, 0.05, 0.05, 10.0, 10.0, 10.1, 10.0, 10.0, 10.1,
                10.05, 10.05,
            ],
        )
        .unwrap()
    }

    /// Three well-separated blobs.
    fn make_three_blobs() -> Array2<f64> {
        Array2::from_shape_vec(
            (9, 2),
            vec![
                0.0, 0.0, 0.1, 0.1, -0.1, 0.1, 10.0, 10.0, 10.1, 10.1, 9.9, 10.1, 0.0, 10.0, 0.1,
                10.1, -0.1, 9.9,
            ],
        )
        .unwrap()
    }

    /// Three collinear points, well separated (infallible `arr2` fixture).
    fn three_points() -> Array2<f64> {
        ndarray::arr2(&[[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
    }

    // ── Construction ────────────────────────────────────────────────────────

    #[test]
    fn test_new_defaults() {
        let model = AgglomerativeClustering::<f64>::new(3);
        assert_eq!(model.n_clusters, Some(3));
        assert_eq!(model.linkage, Linkage::Ward);
    }

    #[test]
    fn test_with_linkage() {
        let model = AgglomerativeClustering::<f64>::new(2).with_linkage(Linkage::Complete);
        assert_eq!(model.linkage, Linkage::Complete);
    }

    // ── Error conditions ────────────────────────────────────────────────────

    #[test]
    fn test_zero_clusters_error() {
        let x = make_two_blobs();
        let result = AgglomerativeClustering::<f64>::new(0).fit(&x, &());
        assert!(result.is_err());
    }

    #[test]
    fn test_empty_data_error() {
        let x = Array2::<f64>::zeros((0, 2));
        let result = AgglomerativeClustering::<f64>::new(2).fit(&x, &());
        assert!(result.is_err());
    }

    #[test]
    fn test_more_clusters_than_samples_error() {
        let x = Array2::from_shape_vec((2, 2), vec![1.0, 1.0, 2.0, 2.0]).unwrap();
        let result = AgglomerativeClustering::<f64>::new(5).fit(&x, &());
        assert!(result.is_err());
    }

    // ── Ward linkage ────────────────────────────────────────────────────────

    #[test]
    fn test_ward_two_blobs() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(2)
            .with_linkage(Linkage::Ward)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        // First 4 should be in the same cluster; last 4 in another.
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[0], labels[2]);
        assert_eq!(labels[0], labels[3]);
        assert_eq!(labels[4], labels[5]);
        assert_eq!(labels[4], labels[6]);
        assert_eq!(labels[4], labels[7]);
        assert_ne!(labels[0], labels[4]);
    }

    #[test]
    fn test_ward_three_blobs() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3)
            .with_linkage(Linkage::Ward)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[3], labels[4]);
        assert_eq!(labels[6], labels[7]);
        assert_ne!(labels[0], labels[3]);
        assert_ne!(labels[0], labels[6]);
        assert_ne!(labels[3], labels[6]);
    }

    // ── Complete linkage ────────────────────────────────────────────────────

    #[test]
    fn test_complete_two_blobs() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(2)
            .with_linkage(Linkage::Complete)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[4], labels[5]);
        assert_ne!(labels[0], labels[4]);
    }

    #[test]
    fn test_complete_three_blobs() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3)
            .with_linkage(Linkage::Complete)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_ne!(labels[0], labels[3]);
        assert_ne!(labels[0], labels[6]);
    }

    // ── Average linkage ─────────────────────────────────────────────────────

    #[test]
    fn test_average_two_blobs() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(2)
            .with_linkage(Linkage::Average)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[4], labels[5]);
        assert_ne!(labels[0], labels[4]);
    }

    #[test]
    fn test_average_three_blobs() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3)
            .with_linkage(Linkage::Average)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_ne!(labels[0], labels[3]);
        assert_ne!(labels[0], labels[6]);
    }

    // ── Single linkage ──────────────────────────────────────────────────────

    #[test]
    fn test_single_two_blobs() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(2)
            .with_linkage(Linkage::Single)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[4], labels[5]);
        assert_ne!(labels[0], labels[4]);
    }

    #[test]
    fn test_single_three_blobs() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3)
            .with_linkage(Linkage::Single)
            .fit(&x, &())
            .unwrap();
        let labels = fitted.labels();
        assert_ne!(labels[0], labels[3]);
        assert_ne!(labels[0], labels[6]);
    }

    // ── Label properties ─────────────────────────────────────────────────────

    #[test]
    fn test_label_count_equals_n_samples() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(2).fit(&x, &()).unwrap();
        assert_eq!(fitted.labels().len(), x.nrows());
    }

    #[test]
    fn test_labels_in_valid_range() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3).fit(&x, &()).unwrap();
        for &l in fitted.labels() {
            assert!(l < 3, "label {l} out of range");
        }
    }

    #[test]
    fn test_n_clusters_matches_config() {
        let x = make_three_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(3).fit(&x, &()).unwrap();
        assert_eq!(fitted.n_clusters(), 3);
    }

    // ── Children (merge tree) ────────────────────────────────────────────────

    #[test]
    fn test_children_length() {
        // Full dendrogram: children_ has n_samples - 1 rows regardless of
        // n_clusters (matches sklearn / scipy linkage output).
        let x = make_two_blobs(); // 8 samples → 7 dendrogram rows.
        let fitted = AgglomerativeClustering::<f64>::new(2).fit(&x, &());
        let fitted = fitted.as_ref().ok();
        assert!(fitted.is_some());
        if let Some(f) = fitted {
            assert_eq!(f.children().len(), x.nrows() - 1);
        }
    }

    #[test]
    fn test_children_full_tree_when_n_clusters_equals_n_samples() {
        // children_ is the FULL dendrogram (n_samples - 1 rows) even when
        // n_clusters == n_samples — sklearn always builds the full tree.
        let x = three_points();
        let fitted = AgglomerativeClustering::<f64>::new(3).fit(&x, &());
        let fitted = fitted.as_ref().ok();
        assert!(fitted.is_some());
        if let Some(f) = fitted {
            assert_eq!(f.children().len(), 2);
        }
    }

    // ── Special cases ─────────────────────────────────────────────────────────

    #[test]
    fn test_single_cluster() {
        let x = make_two_blobs();
        let fitted = AgglomerativeClustering::<f64>::new(1).fit(&x, &()).unwrap();
        // All samples should be in cluster 0.
        for &l in fitted.labels() {
            assert_eq!(l, 0);
        }
    }

    #[test]
    fn test_n_clusters_equals_n_samples() {
        let x = Array2::from_shape_vec((3, 2), vec![0.0, 0.0, 5.0, 5.0, 10.0, 10.0]).unwrap();
        let fitted = AgglomerativeClustering::<f64>::new(3).fit(&x, &()).unwrap();
        // Each sample is its own cluster; labels should all be distinct.
        let labels = fitted.labels();
        assert_ne!(labels[0], labels[1]);
        assert_ne!(labels[0], labels[2]);
        assert_ne!(labels[1], labels[2]);
    }

    #[test]
    fn test_single_sample_single_cluster() {
        let x = Array2::from_shape_vec((1, 2), vec![3.0, 4.0]).unwrap();
        let fitted = AgglomerativeClustering::<f64>::new(1).fit(&x, &()).unwrap();
        assert_eq!(fitted.labels()[0], 0);
        assert_eq!(fitted.n_clusters(), 1);
        assert!(fitted.children().is_empty());
    }

    #[test]
    fn test_1d_data() {
        let x = Array2::from_shape_vec((6, 1), vec![0.0, 0.1, -0.1, 100.0, 100.1, 99.9]).unwrap();
        let fitted = AgglomerativeClustering::<f64>::new(2).fit(&x, &()).unwrap();
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[0], labels[2]);
        assert_eq!(labels[3], labels[4]);
        assert_ne!(labels[0], labels[3]);
    }

    #[test]
    fn test_f32_support() {
        let x = Array2::<f32>::from_shape_vec(
            (6, 2),
            vec![
                0.0, 0.0, 0.1, 0.0, 0.0, 0.1, 10.0, 10.0, 10.1, 10.0, 10.0, 10.1,
            ],
        )
        .unwrap();
        let fitted = AgglomerativeClustering::<f32>::new(2).fit(&x, &()).unwrap();
        assert_eq!(fitted.labels().len(), 6);
        let labels = fitted.labels();
        assert_eq!(labels[0], labels[1]);
        assert_eq!(labels[3], labels[4]);
        assert_ne!(labels[0], labels[3]);
    }

    #[test]
    fn test_identical_points() {
        // All points identical → all should be in the same cluster.
        let x =
            Array2::from_shape_vec((4, 2), vec![1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]).unwrap();
        let fitted = AgglomerativeClustering::<f64>::new(1).fit(&x, &()).unwrap();
        for &l in fitted.labels() {
            assert_eq!(l, 0);
        }
    }
}