Skip to main content

ferrolearn_preprocess/
polynomial_features.rs

1//! Polynomial features: generate polynomial and interaction features.
2//!
3//! Given input features `[a, b]` and degree 2, this transformer generates:
4//! - `[1, a, b, a², a·b, b²]` (default — `interaction_only = false`)
5//! - `[1, a, b, a·b]` (with `interaction_only = true`)
6//!
7//! With `include_bias = false`, the constant column `1` is omitted.
8//!
9//! This transformer is stateless for value generation — call
10//! [`Transform::transform`] directly. For scikit-learn API parity it ALSO
11//! supports the stateful [`Fit`](ferrolearn_core::traits::Fit) →
12//! [`FittedPolynomialFeatures`] path, which records `n_features_in_` /
13//! `n_output_features_` / `powers_` and validates the input feature count in
14//! `transform`; the fitted type's `transform` REUSES the very same value math as
15//! the stateless path, so both paths are bit-identical to each other. That value
16//! math (FIXED #2210) reproduces scikit-learn's INCREMENTAL degree-by-degree
17//! column build (`_polynomial.py:525-564`) — each degree-`d` column is a stored
18//! degree-`(d-1)` column times a single feature, in sklearn's `np.multiply` arg
19//! order — so the polynomial VALUES are now bit-identical to the sklearn oracle
20//! at EVERY degree, including degree≥3 (the prior independent per-column left-fold
21//! diverged by 1 ULP on degree≥3 terms because FP multiplication is
22//! non-associative).
23//!
24//! # `## REQ status`
25//!
26//! Binary (R-DEFER-2), translating `sklearn/preprocessing/_polynomial.py` (`class
27//! PolynomialFeatures` `:99`). Design doc: `.design/preprocess/polynomial_features.md`. Expected
28//! values from the live sklearn 1.5.2 oracle (R-CHAR-3). Consumers: `PipelineTransformer` impl +
29//! crate re-export (`lib.rs`, grandfathered S5) + the `FittedPolynomialFeatures::transform` fitted
30//! path. HONEST (R-HONEST-3): a dense int-degree transformer — the polynomial VALUES + column ORDER
31//! match sklearn exactly, and the stateful `fit` → `FittedPolynomialFeatures` surface
32//! (`n_features_in_`/`n_output_features_`/`powers_` + the transform feature-count check) SHIPS
33//! (REQ-4/REQ-5), and the `ferrolearn-python` PyO3 binding (`_RsPolynomialFeatures` + the
34//! `ferrolearn.PolynomialFeatures` stateful wrapper surfacing the fitted attributes) NOW SHIPS
35//! (REQ-10); `get_feature_names_out`, degree-tuple, `order` (C/F memory layout), sparse, full-ctor,
36//! ferray stay NOT-STARTED.
37//!
38//! | REQ | Status | Evidence |
39//! |---|---|---|
40//! | REQ-1 (int-degree dense values + column order) | SHIPPED | Column order: `fn feature_combinations` (bias=empty combo, DFS interaction_only strictly-increasing / else non-decreasing, sort by degree-then-lex) reproduces sklearn `_combinations` itertools order (`_polynomial.py:209-220`) column-for-column (this drives `powers_`/`n_output_features_`). Value math: `fn generate_poly_features` builds the columns DEGREE-BY-DEGREE via sklearn's incremental block recurrence (`_polynomial.py:525-564`) — each degree-`d` column = a stored degree-`(d-1)` column × a single feature in sklearn's `np.multiply` arg order — so values are bit-identical to the live oracle at EVERY degree. FIXED #2210: the prior independent per-column left-fold (`combo.iter().fold(1, |acc,j| acc*x[j])`) diverged by 1 ULP on degree≥3 terms (FP multiplication is non-associative); the incremental build now matches bit-for-bit (pin `divergence_polynomial_higher_degree_term_ulp_vs_sklearn`, x1*x2*x3 == `0x1.7b645a1cac082p+2`). Green guards (`tests/divergence_polynomial_features.rs`): 2-feat default `[1,2,3,4,6,9]`, 3-feat default `[1,2,3,5,4,6,10,9,15,25]`, 3-feat interaction `[1,2,3,5,6,10,15]`, deg-3 interaction no-bias `[2,3,5,6,10,15,30]`, multi-row. Consumers: `FittedPipelineTransformer` + re-export. |
41//! | REQ-8 (input validation per check_array) | SHIPPED | FIXED #1180. `transform` guards (sklearn order) zero-samples → `InsufficientSamples`, zero-features → `InvalidParameter`, non-finite NaN/±inf → `InvalidParameter` — matching sklearn `transform` → `_validate_data` (`_polynomial.py:433-435`). Mirrors converged binarizer/normalizer. Critic two-round CLEAN: 13 tests incl. finite input (1e308) with overflowing product correctly ACCEPTED (input-only validation). |
42//! | REQ-2 (degree tuple / min_degree) | NOT-STARTED | open prereq blocker #1181. Single `usize` degree, always starts at degree 1 (sklearn `:334-360`). |
43//! | REQ-3 (order C/F param) | NOT-STARTED | open prereq blocker #1182. No `order` (sklearn `:201`,`:132-134`). |
44//! | REQ-4 (stateful fit + n_features_in_/n_output_features_ + count check) | SHIPPED | FIXED #1183. `impl Fit<Array2<F>, ()> for PolynomialFeatures<F>` (`fit`): runs the SAME REQ-8 `validate_poly_input` guard (zero-samples → `InsufficientSamples`, zero-features/non-finite → `InvalidParameter`, sklearn `_validate_data` `_polynomial.py:323`), enumerates `feature_combinations(x.ncols())` (REUSED, not reimplemented), records `n_features_in_ = x.ncols()` and `n_output_features_ = combinations.len()` (== `_num_combinations`, sklearn `:362`), returns `FittedPolynomialFeatures`. `FittedPolynomialFeatures::transform` validates X (REQ-8) FIRST, generates via the SHARED `generate_poly_features` value math (delegated — byte-identical to the stateless path), THEN checks `x.ncols() != n_features_in_` → `ShapeMismatch` (sklearn `X has N features, but ... expecting M`, `:402-435`). The X-validation-before-n_features ORDER matches sklearn `_validate_data(reset=False)` → `check_array` before `_check_n_features` (#2207). Accessors `n_features_in()`/`n_output_features()`. Live-oracle tests: `fit_n_features_in_matches_ncols`, `fit_n_output_features_*`, `fitted_transform_matches_stateless_and_sklearn`, `fitted_transform_more/fewer_features_shape_mismatch`, `fitted_transform_nan_validation_before_n_features`. Consumers: `FittedPolynomialFeatures::transform` (the fitted path) + crate re-export `pub use polynomial_features::{PolynomialFeatures, FittedPolynomialFeatures}` (`lib.rs`). |
45//! | REQ-5 (powers_ attribute) | SHIPPED | FIXED #1184. `FittedPolynomialFeatures<F>` carries `powers_: Array2<usize>` of shape `(n_output_features_, n_features_in_)` (matching `n_features_in_`'s `usize` storage idiom; sklearn `powers_` is int), built in `fit` from the SAME `feature_combinations` the value math uses (so row order == output column order): each combination → a `bincount` row counting occurrences of each feature index `0..n_features_in_`, the bias empty-combo → an all-zeros row (sklearn `np.vstack([np.bincount(c, minlength=n_features_in_) for c in combinations])`, `_polynomial.py:262-264`). Accessor `powers(&self) -> &Array2<usize>`. Live-oracle tests: `fit_powers_2feat/3feat_*` EXACT-match sklearn `.powers_` for default / `interaction_only` / `include_bias=False`, asserting shape + every entry. Consumer: the `powers_` accessor on the re-exported `FittedPolynomialFeatures`. |
46//! | REQ-6 (get_feature_names_out) | NOT-STARTED | open prereq blocker #1185. None (sklearn `:266-303`). Depends on REQ-5. |
47//! | REQ-7 (sparse CSR/CSC) | NOT-STARTED | open prereq blocker #1186. Dense-only (sklearn `:402-`,`:38-96`). |
48//! | REQ-9 (full ctor + _parameter_constraints) | NOT-STARTED | open prereq blocker #1187. Positional `new`; validates only the `degree==0 && !include_bias` empty-output case (FIXED #2216: sklearn `_polynomial.py:326-330` rejects ONLY that case — `degree==0 && include_bias` is valid, emitting just the bias column). No keyword-only `*`-args, no `order`, no full `_parameter_constraints` (sklearn `:194-207`). |
49//! | REQ-10 (PyO3 binding) | SHIPPED | FIXED #1188. `_RsPolynomialFeatures` (hand `#[pyclass]`, `ferrolearn-python/src/extras.rs`) over `ferrolearn_preprocess::{PolynomialFeatures, FittedPolynomialFeatures}<f64>` exposes `fit(x)` (ctor `PolynomialFeatures::new(degree, interaction_only, include_bias)`, `degree==0 && !include_bias`/NaN/±inf → `PyValueError`; FIXED #2216 `degree==0 && include_bias` is valid → bias-only column), `transform(x)` (`FittedPolynomialFeatures::transform` — wrong col count → `ShapeMismatch` → `PyValueError`), and `#[getter]`s `n_features_in_` (sklearn `_polynomial.py:323`), `n_output_features_` (`:362`), `powers_` (`:250-264`, `usize` `Array2` → numpy i64). Registered in `lib.rs`. The `_extras.py::PolynomialFeatures(_TransformerWrapper)` wrapper mirrors sklearn's ctor `(degree=2, *, interaction_only=False, include_bias=True, order="C")` (`_polynomial.py:201`, `degree` positional-or-keyword), is STATEFUL (overridden `fit` builds/fits `_rs`; pre-fit `transform`/attribute access → `NotFittedError` via `check_is_fitted`), does the FLOAT-only dtype cast-back (`check_array(dtype=FLOAT_DTYPES)`, `:432`: int→f64, float32→float32, float64→float64), and surfaces `n_features_in_`/`n_output_features_`/`powers_` as `@property`s reading from `_rs`. `order` is validated against sklearn's `StrOptions({"C","F"})` (bad string → `ValueError`); `'F'` is accepted-but-documented (the returned array is always C-contiguous — F memory-layout is REQ-3 #1182 NOT-STARTED — but the VALUES match sklearn's 'F' output, which is a layout-only knob). #2215-CLASS f64-ABI CAVEAT: for FLOAT32 input the polynomial products are computed in float64 (the f64 ABI) then cast back to float32 (sklearn computes them in float32), so the cast-back float32 VALUES can diverge from sklearn's by ~1e-6 on high-degree terms (the same reduced-precision caveat class as #2215/#2205) — float64 input is bit-exact; the float32 value pin uses a TOLERANT `atol=1e-5`. Exported in `__init__.py`. Non-test consumer: `ferrolearn.PolynomialFeatures(...).fit_transform(X)` + `lib.rs` `add_class` + `__init__.py` re-export. Live oracle: `tests/divergence_polynomial_features_py.py` (33 pass, sklearn 1.5.2) — fit_transform VALUES (d∈{2,3} × default/interaction_only/no-bias, float64 bit-exact), `powers_` int-exact, `n_output_features_`/`n_features_in_`, pre-fit `NotFittedError`, wrong-n_features `ValueError`, dtype int→f64/float32-tolerant/float64-bit-exact, get_params keys=={degree,interaction_only,include_bias,order}/defaults/clone/set_params, pipeline, order='F' values + bad-order `ValueError`. `get_feature_names_out` stays NOT-STARTED (REQ-6 #1185). |
50//! | REQ-11 (ferray substrate) | NOT-STARTED | open prereq blocker #1189. `ndarray`+`num_traits`, not `ferray-core` (R-SUBSTRATE-1/2). |
51
52use ferrolearn_core::error::FerroError;
53use ferrolearn_core::pipeline::{FittedPipelineTransformer, PipelineTransformer};
54use ferrolearn_core::traits::{Fit, Transform};
55use ndarray::{Array1, Array2};
56use num_traits::Float;
57
58// ---------------------------------------------------------------------------
59// PolynomialFeatures
60// ---------------------------------------------------------------------------
61
62/// A stateless polynomial feature generator.
63///
64/// Generates all polynomial combinations of the input features up to the
65/// specified degree.
66///
67/// # Configuration
68///
69/// - `degree`: maximum polynomial degree (default `2`).
70/// - `interaction_only`: if `true`, only cross-product terms are generated
71///   (no pure powers like `a²`). Default `false`.
72/// - `include_bias`: if `true`, a constant column of ones is prepended.
73///   Default `true`.
74///
75/// # Examples
76///
77/// ```
78/// use ferrolearn_preprocess::polynomial_features::PolynomialFeatures;
79/// use ferrolearn_core::traits::Transform;
80/// use ndarray::array;
81///
82/// let poly = PolynomialFeatures::<f64>::new(2, false, true).unwrap();
83/// let x = array![[2.0, 3.0]];
84/// let out = poly.transform(&x).unwrap();
85/// // out = [[1, 2, 3, 4, 6, 9]]
86/// ```
87#[derive(Debug, Clone)]
88pub struct PolynomialFeatures<F> {
89    /// Maximum polynomial degree.
90    pub(crate) degree: usize,
91    /// If `true`, only interaction terms are produced (no pure powers).
92    pub(crate) interaction_only: bool,
93    /// If `true`, prepend a bias (constant ones) column.
94    pub(crate) include_bias: bool,
95    _marker: std::marker::PhantomData<F>,
96}
97
98impl<F: Float + Send + Sync + 'static> PolynomialFeatures<F> {
99    /// Create a new `PolynomialFeatures` transformer.
100    ///
101    /// # Errors
102    ///
103    /// Returns [`FerroError::InvalidParameter`] if `degree == 0 && !include_bias`.
104    /// `degree == 0` is ALLOWED when `include_bias == true` — the output is a
105    /// single all-ones bias column (`powers_ == [[0,..,0]]`,
106    /// `n_output_features_ == 1`), mirroring scikit-learn, which only rejects the
107    /// `degree == 0 && include_bias == False` case
108    /// (`sklearn/preprocessing/_polynomial.py:326-330`: "Setting degree to zero
109    /// and include_bias to False would result in an empty output array.").
110    pub fn new(
111        degree: usize,
112        interaction_only: bool,
113        include_bias: bool,
114    ) -> Result<Self, FerroError> {
115        // sklearn `_polynomial.py:326-330`: ONLY reject degree==0 when
116        // include_bias is False (an empty output array); degree==0 with
117        // include_bias=True is valid and emits just the bias column.
118        if degree == 0 && !include_bias {
119            return Err(FerroError::InvalidParameter {
120                name: "degree".into(),
121                reason: "Setting degree to zero and include_bias to False would \
122                         result in an empty output array."
123                    .into(),
124            });
125        }
126        Ok(Self {
127            degree,
128            interaction_only,
129            include_bias,
130            _marker: std::marker::PhantomData,
131        })
132    }
133
134    /// Create a `PolynomialFeatures` with default settings:
135    /// degree=2, interaction_only=false, include_bias=true.
136    #[must_use]
137    pub fn default_config() -> Self {
138        Self {
139            degree: 2,
140            interaction_only: false,
141            include_bias: true,
142            _marker: std::marker::PhantomData,
143        }
144    }
145
146    /// Return the configured degree.
147    #[must_use]
148    pub fn degree(&self) -> usize {
149        self.degree
150    }
151
152    /// Return whether only interaction terms are generated.
153    #[must_use]
154    pub fn interaction_only(&self) -> bool {
155        self.interaction_only
156    }
157
158    /// Return whether a bias column is included.
159    #[must_use]
160    pub fn include_bias(&self) -> bool {
161        self.include_bias
162    }
163
164    /// Generate all combinations (with repetition unless `interaction_only`)
165    /// of feature indices up to `degree`.
166    ///
167    /// Returns a list of index-tuples, where each tuple specifies which
168    /// feature indices to multiply together to produce one output column.
169    fn feature_combinations(&self, n_features: usize) -> Vec<Vec<usize>> {
170        let mut combos: Vec<Vec<usize>> = Vec::new();
171
172        // Bias term: empty product = 1
173        if self.include_bias {
174            combos.push(vec![]);
175        }
176
177        // Generate combinations of degrees 1..=self.degree
178        let mut stack: Vec<(Vec<usize>, usize)> = Vec::new();
179
180        // Start with each feature at degree 1. For `degree == 0` there are NO
181        // degree>=1 combinations (sklearn `_combinations` ranges over
182        // `range(max(1, min_degree), max_degree + 1)` = `range(1, 1)` = empty,
183        // `_polynomial.py:215-217`), so only the bias empty-combo (if
184        // `include_bias`) is emitted.
185        if self.degree >= 1 {
186            for i in 0..n_features {
187                stack.push((vec![i], i));
188            }
189        }
190
191        while let Some((combo, last_idx)) = stack.pop() {
192            combos.push(combo.clone());
193
194            if combo.len() < self.degree {
195                // Extend with another feature
196                let start = if self.interaction_only {
197                    // Strictly increasing indices — no repeated features
198                    last_idx + 1
199                } else {
200                    // Non-decreasing indices — allows repeated features (pure powers)
201                    last_idx
202                };
203                for i in start..n_features {
204                    let mut new_combo = combo.clone();
205                    new_combo.push(i);
206                    stack.push((new_combo, i));
207                }
208            }
209        }
210
211        // Sort: bias first, then by combo length, then lexicographically
212        combos.sort_by(|a, b| a.len().cmp(&b.len()).then_with(|| a.cmp(b)));
213
214        combos
215    }
216}
217
218/// Run the shared `check_array` input validation (REQ-8) used by the stateless
219/// [`Transform::transform`], the stateful [`Fit::fit`], and the
220/// [`FittedPolynomialFeatures`] transform path, in sklearn's `check_array`
221/// order: zero-samples → zero-features → non-finite
222/// (`sklearn/utils/validation.py:1084`, `:1093`, `:1063`). Mirrors sklearn
223/// `PolynomialFeatures.fit`/`.transform` → `_validate_data`
224/// (`_polynomial.py:323`, `:433-435`), whose default `force_all_finite=True`
225/// REJECTS NaN/±inf.
226///
227/// `context` names the calling site for diagnostics (e.g.
228/// `"PolynomialFeatures::transform"` vs `"PolynomialFeatures::fit"`).
229fn validate_poly_input<F: Float>(x: &Array2<F>, context: &str) -> Result<(), FerroError> {
230    if x.nrows() == 0 {
231        return Err(FerroError::InsufficientSamples {
232            required: 1,
233            actual: 0,
234            context: context.into(),
235        });
236    }
237    if x.ncols() == 0 {
238        return Err(FerroError::InvalidParameter {
239            name: "X".to_string(),
240            reason: "Found array with 0 feature(s); a minimum of 1 is required \
241                     by PolynomialFeatures"
242                .to_string(),
243        });
244    }
245    if x.iter().any(|v| !v.is_finite()) {
246        return Err(FerroError::InvalidParameter {
247            name: "X".to_string(),
248            reason: "Input X contains non-finite values (NaN or infinity); \
249                     PolynomialFeatures requires all-finite input"
250                .to_string(),
251        });
252    }
253    Ok(())
254}
255
256/// Generate the dense polynomial feature matrix (REQ-1 value math), shared by
257/// the stateless [`Transform::transform`] and the stateful
258/// [`FittedPolynomialFeatures`] transform path so the two are byte-identical.
259///
260/// Builds the output columns DEGREE-BY-DEGREE using scikit-learn's incremental
261/// block recurrence (`_polynomial.py:525-564`), NOT an independent per-column
262/// left-fold: the degree-1 block is `X` (copied in feature order); each degree-`d`
263/// block is built feature-by-feature, where for `feature_idx` the new columns are
264/// the just-built degree-`(d-1)` columns in slice `[start, end)` each multiplied
265/// by `X[:, feature_idx]` — i.e. `previous_degree_column * X[:, feature_idx]` in
266/// THAT multiply order, matching `np.multiply(XP[:, start:end], X[:, feature_idx],
267/// casting="no")` bit-for-bit (FP multiplication is non-associative, so the
268/// association order is load-bearing for the last ULP on degree≥3 terms). The
269/// resulting column order is exactly sklearn's (bias, then degree-1, then each
270/// degree-`d` block by `feature_idx`), which equals the [`feature_combinations`]
271/// order the `powers_`/REQ-1 guards pin. The bias column (if `include_bias`) is
272/// `1`. The input is assumed already validated by [`validate_poly_input`].
273fn generate_poly_features<F: Float>(
274    x: &Array2<F>,
275    degree: usize,
276    interaction_only: bool,
277    include_bias: bool,
278) -> Array2<F> {
279    let n_samples = x.nrows();
280    let n_features = x.ncols();
281    let n_out = num_output_columns(n_features, degree, interaction_only, include_bias);
282    let mut out = Array2::zeros((n_samples, n_out));
283
284    // degree 0 (bias) term
285    let mut current_col = 0usize;
286    if include_bias {
287        for i in 0..n_samples {
288            out[[i, 0]] = F::one();
289        }
290        current_col = 1;
291    }
292
293    if n_features == 0 {
294        return out;
295    }
296
297    if degree == 0 {
298        // sklearn `_polynomial.py:532-533`: for `_max_degree == 0` the build
299        // returns right after the bias column — NO degree-1 block, NO higher
300        // blocks. degree==0 here implies include_bias (new() rejects
301        // degree==0 && !include_bias), so `out` already holds the single
302        // all-ones bias column.
303        return out;
304    }
305
306    // degree 1 block: copy X column-for-column.
307    // `index[j]` is the output column where degree-(d-1) block for feature `j`
308    // begins; the trailing `index[n_features]` marks one past the block end.
309    let mut index: Vec<usize> = Vec::with_capacity(n_features + 1);
310    for j in 0..n_features {
311        index.push(current_col);
312        for i in 0..n_samples {
313            out[[i, current_col]] = x[[i, j]];
314        }
315        current_col += 1;
316    }
317    index.push(current_col);
318
319    // degree >= 2 blocks, built incrementally from the previous degree block.
320    for _ in 2..=degree {
321        let mut new_index: Vec<usize> = Vec::with_capacity(n_features + 1);
322        // sklearn `end = index[-1]` (`_polynomial.py:544`): the LAST element of
323        // `index`, NOT `index[n_features]`. When `interaction_only` runs out of
324        // combinations at a high degree the inner loop `break`s early, leaving
325        // `index` (the previous `new_index`) with FEWER than `n_features+1`
326        // entries — so indexing `[n_features]` would panic (#2211, R-CODE-2). The
327        // last element is the correct degree-(d-1) block end either way.
328        let end = index.last().copied().unwrap_or(0);
329        for feature_idx in 0..n_features {
330            let mut start = index[feature_idx];
331            new_index.push(current_col);
332            if interaction_only {
333                start += index[feature_idx + 1] - index[feature_idx];
334            }
335            if end <= start {
336                // next_col <= current_col → no new columns for this feature.
337                break;
338            }
339            // new column = (stored degree-(d-1) column) * X[:, feature_idx],
340            // in that np.multiply arg order (prev FIRST, single feature SECOND).
341            for src_col in start..end {
342                for i in 0..n_samples {
343                    out[[i, current_col]] = out[[i, src_col]] * x[[i, feature_idx]];
344                }
345                current_col += 1;
346            }
347        }
348        new_index.push(current_col);
349        index = new_index;
350    }
351
352    out
353}
354
355/// Number of output columns for the incremental build (== number of
356/// [`feature_combinations`] entries / `n_output_features_`): the count of
357/// degree-`1..=degree` combinations (with repetition unless `interaction_only`)
358/// over `n_features` features, plus one for the bias if `include_bias`.
359fn num_output_columns(
360    n_features: usize,
361    degree: usize,
362    interaction_only: bool,
363    include_bias: bool,
364) -> usize {
365    let mut total = usize::from(include_bias);
366    if n_features == 0 {
367        return total;
368    }
369    if degree == 0 {
370        // sklearn `_combinations` emits no degree>=1 terms for degree==0
371        // (`range(max(1,0), 0+1)` = `range(1,1)` = empty); the only output
372        // column is the bias (`_polynomial.py:215-219`). degree==0 reaching here
373        // implies include_bias (new() rejects degree==0 && !include_bias).
374        return total;
375    }
376    // Count combinations per degree by mirroring the incremental block sizes.
377    // degree-1 block has `n_features` columns; subsequent block sizes follow the
378    // same start/end recurrence as the value build.
379    let mut index: Vec<usize> = (0..=n_features).collect();
380    total += n_features;
381    let mut current = n_features;
382    for _ in 2..=degree {
383        let mut new_index: Vec<usize> = Vec::with_capacity(n_features + 1);
384        // sklearn `index[-1]` (the last element), not `index[n_features]` —
385        // robust to the `interaction_only` high-degree early break (#2211).
386        let end = index.last().copied().unwrap_or(0);
387        for feature_idx in 0..n_features {
388            let mut start = index[feature_idx];
389            new_index.push(current);
390            if interaction_only {
391                start += index[feature_idx + 1] - index[feature_idx];
392            }
393            if end <= start {
394                break;
395            }
396            current += end - start;
397            total += end - start;
398        }
399        new_index.push(current);
400        index = new_index;
401    }
402    total
403}
404
405/// Build the `powers_` exponent matrix from a precomputed combination list
406/// (REQ-5). Returns an `Array2<usize>` of shape `(combos.len(), n_features)`
407/// where `powers_[k, j]` is the number of occurrences of input feature `j` in
408/// combination `k` (its exponent) — sklearn's
409/// `np.vstack([np.bincount(c, minlength=n_features_in_) for c in combinations])`
410/// (`_polynomial.py:262-264`). The bias empty-combo → an all-zeros row. Built
411/// from the SAME combinations the value math uses, so the row order equals the
412/// output column order.
413fn build_powers(combos: &[Vec<usize>], n_features: usize) -> Array2<usize> {
414    let mut powers = Array2::<usize>::zeros((combos.len(), n_features));
415    for (k, combo) in combos.iter().enumerate() {
416        for &j in combo {
417            powers[[k, j]] += 1;
418        }
419    }
420    powers
421}
422
423impl<F: Float + Send + Sync + 'static> Default for PolynomialFeatures<F> {
424    fn default() -> Self {
425        Self::default_config()
426    }
427}
428
429// ---------------------------------------------------------------------------
430// Trait implementations
431// ---------------------------------------------------------------------------
432
433impl<F: Float + Send + Sync + 'static> Transform<Array2<F>> for PolynomialFeatures<F> {
434    type Output = Array2<F>;
435    type Error = FerroError;
436
437    /// Generate polynomial and interaction features.
438    ///
439    /// # Errors
440    ///
441    /// Mirrors scikit-learn's `PolynomialFeatures.transform`
442    /// (`sklearn/preprocessing/_polynomial.py:433`, `self._validate_data(...)` ->
443    /// `check_array`), validating in `check_array` order
444    /// (samples -> features -> finite):
445    ///
446    /// - Returns [`FerroError::InsufficientSamples`] if `x` has zero rows
447    ///   (`check_array` `ensure_min_samples=1`; sklearn raises `ValueError:
448    ///   Found array with 0 sample(s) ... a minimum of 1 is required`).
449    /// - Returns [`FerroError::InvalidParameter`] if `x` has zero columns
450    ///   (`check_array` `ensure_min_features=1`; sklearn raises `ValueError:
451    ///   Found array with 0 feature(s) ... a minimum of 1 is required`).
452    /// - Returns [`FerroError::InvalidParameter`] if `x` contains any non-finite
453    ///   value (NaN, +inf, or -inf; `check_array` `force_all_finite=True`;
454    ///   sklearn raises `ValueError: Input X contains NaN` / `infinity ...`).
455    fn transform(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
456        validate_poly_input(x, "PolynomialFeatures::transform")?;
457        Ok(generate_poly_features(
458            x,
459            self.degree,
460            self.interaction_only,
461            self.include_bias,
462        ))
463    }
464}
465
466// ---------------------------------------------------------------------------
467// FittedPolynomialFeatures (sklearn stateful `fit` -> fitted estimator path)
468// ---------------------------------------------------------------------------
469
470/// A fitted [`PolynomialFeatures`].
471///
472/// `PolynomialFeatures.fit` (sklearn `_polynomial.py:306-400`, "Compute number
473/// of output features") learns NO numeric statistics — it records only the
474/// shape metadata derivable from `n_features_in_` and the configured params:
475/// `n_features_in_`, `n_output_features_` (the number of output columns), and
476/// the `powers_` exponent matrix. The fitted type's [`Transform::transform`]
477/// REUSES the very same combination enumeration ([`PolynomialFeatures::
478/// feature_combinations`]) and value math ([`generate_poly_features`]) as the
479/// stateless [`PolynomialFeatures`] path, so the two paths are bit-identical;
480/// it additionally enforces the transform-time feature-count check against
481/// `n_features_in_` (sklearn `X has N features, but ... expecting M`).
482#[derive(Debug, Clone)]
483pub struct FittedPolynomialFeatures<F> {
484    /// Maximum polynomial degree (carried from the unfitted [`PolynomialFeatures`]).
485    pub(crate) degree: usize,
486    /// If `true`, only interaction terms are produced (no pure powers).
487    pub(crate) interaction_only: bool,
488    /// If `true`, a bias (constant ones) column is prepended.
489    pub(crate) include_bias: bool,
490    /// Number of features (columns) seen during [`Fit::fit`] — sklearn's
491    /// `n_features_in_` (`_polynomial.py:323`, set by `_validate_data`).
492    pub(crate) n_features_in_: usize,
493    /// Total number of polynomial output columns — sklearn's
494    /// `n_output_features_` (`_polynomial.py:362`, == `_num_combinations`).
495    pub(crate) n_output_features_: usize,
496    /// The exponent matrix of shape `(n_output_features_, n_features_in_)`:
497    /// `powers_[i, j]` is the exponent of input feature `j` in output feature
498    /// `i` — sklearn's `powers_` (`_polynomial.py:250-264`). The bias empty-combo
499    /// row is all zeros.
500    pub(crate) powers_: Array2<usize>,
501    _marker: std::marker::PhantomData<F>,
502}
503
504impl<F: Float + Send + Sync + 'static> FittedPolynomialFeatures<F> {
505    /// Return the number of features (columns) seen during [`Fit::fit`].
506    ///
507    /// Mirrors scikit-learn's `PolynomialFeatures.n_features_in_`
508    /// (`_polynomial.py:323`).
509    #[must_use]
510    pub fn n_features_in(&self) -> usize {
511        self.n_features_in_
512    }
513
514    /// Return the total number of polynomial output columns.
515    ///
516    /// Mirrors scikit-learn's `PolynomialFeatures.n_output_features_`
517    /// (`_polynomial.py:362`).
518    #[must_use]
519    pub fn n_output_features(&self) -> usize {
520        self.n_output_features_
521    }
522
523    /// Return the `powers_` exponent matrix of shape
524    /// `(n_output_features_, n_features_in_)`.
525    ///
526    /// `powers_[i, j]` is the exponent of input feature `j` in output feature
527    /// `i`, mirroring scikit-learn's `PolynomialFeatures.powers_`
528    /// (`_polynomial.py:250-264`).
529    #[must_use]
530    pub fn powers(&self) -> &Array2<usize> {
531        &self.powers_
532    }
533
534    /// Return the configured degree.
535    #[must_use]
536    pub fn degree(&self) -> usize {
537        self.degree
538    }
539
540    /// Return whether only interaction terms are generated.
541    #[must_use]
542    pub fn interaction_only(&self) -> bool {
543        self.interaction_only
544    }
545
546    /// Return whether a bias column is included.
547    #[must_use]
548    pub fn include_bias(&self) -> bool {
549        self.include_bias
550    }
551}
552
553impl<F: Float + Send + Sync + 'static> Fit<Array2<F>, ()> for PolynomialFeatures<F> {
554    type Fitted = FittedPolynomialFeatures<F>;
555    type Error = FerroError;
556
557    /// Validate the input and record `n_features_in_` / `n_output_features_` /
558    /// `powers_`, returning a [`FittedPolynomialFeatures`].
559    ///
560    /// `PolynomialFeatures.fit` (sklearn `_polynomial.py:306-400`, "Compute
561    /// number of output features") learns NO numeric statistics. This runs the
562    /// SAME REQ-8 `check_array` validation as [`Transform::transform`] (via the
563    /// shared [`validate_poly_input`] helper; sklearn's `_validate_data` default
564    /// `force_all_finite=True` REJECTS NaN/±inf), enumerates the polynomial
565    /// combinations by REUSING [`PolynomialFeatures::feature_combinations`] (no
566    /// reimplementation), and records `n_features_in_ = x.ncols()`,
567    /// `n_output_features_ = combinations.len()` (== sklearn's
568    /// `_num_combinations`, `:362`), and the `powers_` exponent matrix
569    /// (`build_powers` = sklearn's `np.bincount` over the combinations, `:262`).
570    /// `powers_` is built from the SAME combinations the transform value math
571    /// uses, so its row order equals the output column order.
572    ///
573    /// # Errors
574    ///
575    /// Returns [`FerroError::InsufficientSamples`] for zero rows and
576    /// [`FerroError::InvalidParameter`] for zero features or any non-finite
577    /// value (NaN, +inf, -inf) — matching `check_array`
578    /// (`sklearn/utils/validation.py:1084`, `:1093`, `:1063`) as routed through
579    /// `PolynomialFeatures.fit` → `_validate_data` (`_polynomial.py:323`).
580    fn fit(&self, x: &Array2<F>, _y: &()) -> Result<FittedPolynomialFeatures<F>, FerroError> {
581        validate_poly_input(x, "PolynomialFeatures::fit")?;
582        let n_features_in_ = x.ncols();
583        // REUSE the SHIPPED combination enumeration (REQ-1) — do NOT reimplement.
584        let combos = self.feature_combinations(n_features_in_);
585        let n_output_features_ = combos.len();
586        let powers_ = build_powers(&combos, n_features_in_);
587        Ok(FittedPolynomialFeatures {
588            degree: self.degree,
589            interaction_only: self.interaction_only,
590            include_bias: self.include_bias,
591            n_features_in_,
592            n_output_features_,
593            powers_,
594            _marker: std::marker::PhantomData,
595        })
596    }
597}
598
599impl<F: Float + Send + Sync + 'static> Transform<Array2<F>> for FittedPolynomialFeatures<F> {
600    type Output = Array2<F>;
601    type Error = FerroError;
602
603    /// Generate polynomial features, delegating to the SAME combination
604    /// enumeration + value math as the stateless [`PolynomialFeatures`] path.
605    ///
606    /// First applies the REQ-8 `check_array` guards (min-samples / min-features
607    /// / finite) and generates the polynomial matrix, THEN validates that `x`
608    /// has the same number of columns recorded during [`Fit::fit`]. This ORDER
609    /// matches sklearn's `_validate_data(reset=False)`, which runs `check_array`
610    /// BEFORE the `n_features_in_` consistency check (`_polynomial.py:433-435`,
611    /// #2207): a NaN / ±inf / zero-sample / zero-feature input raises its
612    /// `check_array` error EVEN when the column count is also wrong. Only after
613    /// that does the feature-count comparison fire. Because the combinations are
614    /// re-enumerated from the SAME params (degree / interaction_only /
615    /// include_bias) and the value math is the shared [`generate_poly_features`],
616    /// the output is byte-identical to `PolynomialFeatures::transform`.
617    ///
618    /// # Errors
619    ///
620    /// Returns [`FerroError::ShapeMismatch`] if the column count differs from
621    /// `n_features_in_` (sklearn `ValueError: X has N features, but
622    /// PolynomialFeatures is expecting M features as input.`,
623    /// `_polynomial.py:402-435`). Returns [`FerroError::InsufficientSamples`]
624    /// for zero rows and [`FerroError::InvalidParameter`] for zero features or
625    /// any non-finite value (REQ-8, via [`validate_poly_input`]).
626    fn transform(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
627        // sklearn `_validate_data(reset=False)` runs `check_array` (finite /
628        // min-samples / min-features) BEFORE the `n_features_in_` check (#2207).
629        // So validate + generate FIRST; a NaN / ±inf / zero-sample / zero-feature
630        // input must raise its check_array error EVEN when the column count is
631        // also wrong. Only then does the feature-count comparison fire.
632        validate_poly_input(x, "FittedPolynomialFeatures::transform")?;
633        // REUSE the SHARED incremental value math — build from the configured
634        // params so the column order matches the stateless path (and `powers_`)
635        // exactly.
636        let out = generate_poly_features(x, self.degree, self.interaction_only, self.include_bias);
637        if x.ncols() != self.n_features_in_ {
638            return Err(FerroError::ShapeMismatch {
639                expected: vec![x.nrows(), self.n_features_in_],
640                actual: vec![x.nrows(), x.ncols()],
641                context: "FittedPolynomialFeatures::transform".into(),
642            });
643        }
644        Ok(out)
645    }
646}
647
648// ---------------------------------------------------------------------------
649// Pipeline integration (generic)
650// ---------------------------------------------------------------------------
651
652impl<F: Float + Send + Sync + 'static> PipelineTransformer<F> for PolynomialFeatures<F> {
653    /// Fit the polynomial features transformer using the pipeline interface.
654    ///
655    /// Because `PolynomialFeatures` is stateless, this simply boxes `self`
656    /// as a [`FittedPipelineTransformer`].
657    ///
658    /// # Errors
659    ///
660    /// This implementation never returns an error.
661    fn fit_pipeline(
662        &self,
663        _x: &Array2<F>,
664        _y: &Array1<F>,
665    ) -> Result<Box<dyn FittedPipelineTransformer<F>>, FerroError> {
666        Ok(Box::new(self.clone()))
667    }
668}
669
670impl<F: Float + Send + Sync + 'static> FittedPipelineTransformer<F> for PolynomialFeatures<F> {
671    /// Transform data using the pipeline interface.
672    ///
673    /// # Errors
674    ///
675    /// Propagates errors from [`Transform::transform`].
676    fn transform_pipeline(&self, x: &Array2<F>) -> Result<Array2<F>, FerroError> {
677        self.transform(x)
678    }
679}
680
681// ---------------------------------------------------------------------------
682// Tests
683// ---------------------------------------------------------------------------
684
685#[cfg(test)]
686mod tests {
687    use super::*;
688    use approx::assert_abs_diff_eq;
689    use ndarray::array;
690
691    #[test]
692    fn test_degree2_two_features_with_bias() {
693        // degree=2, interaction_only=false, include_bias=true
694        // Expected: [1, a, b, a², a·b, b²]
695        let poly = PolynomialFeatures::<f64>::new(2, false, true).unwrap();
696        let x = array![[2.0, 3.0]];
697        let out = poly.transform(&x).unwrap();
698        assert_eq!(out.shape()[0], 1);
699        assert_eq!(out.shape()[1], 6); // 1 + 2 + 3 combinations
700        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10); // bias
701        assert_abs_diff_eq!(out[[0, 1]], 2.0, epsilon = 1e-10); // a
702        assert_abs_diff_eq!(out[[0, 2]], 3.0, epsilon = 1e-10); // b
703        assert_abs_diff_eq!(out[[0, 3]], 4.0, epsilon = 1e-10); // a²
704        assert_abs_diff_eq!(out[[0, 4]], 6.0, epsilon = 1e-10); // a·b
705        assert_abs_diff_eq!(out[[0, 5]], 9.0, epsilon = 1e-10); // b²
706    }
707
708    #[test]
709    fn test_degree2_interaction_only() {
710        // degree=2, interaction_only=true, include_bias=true
711        // Expected: [1, a, b, a·b]
712        let poly = PolynomialFeatures::<f64>::new(2, true, true).unwrap();
713        let x = array![[2.0, 3.0]];
714        let out = poly.transform(&x).unwrap();
715        assert_eq!(out.shape()[1], 4);
716        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10); // bias
717        assert_abs_diff_eq!(out[[0, 1]], 2.0, epsilon = 1e-10); // a
718        assert_abs_diff_eq!(out[[0, 2]], 3.0, epsilon = 1e-10); // b
719        assert_abs_diff_eq!(out[[0, 3]], 6.0, epsilon = 1e-10); // a·b
720    }
721
722    #[test]
723    fn test_no_bias() {
724        // degree=2, interaction_only=false, include_bias=false
725        // Expected: [a, b, a², a·b, b²]
726        let poly = PolynomialFeatures::<f64>::new(2, false, false).unwrap();
727        let x = array![[2.0, 3.0]];
728        let out = poly.transform(&x).unwrap();
729        assert_eq!(out.shape()[1], 5);
730        assert_abs_diff_eq!(out[[0, 0]], 2.0, epsilon = 1e-10); // a
731    }
732
733    #[test]
734    fn test_degree1_only_linear() {
735        let poly = PolynomialFeatures::<f64>::new(1, false, true).unwrap();
736        let x = array![[2.0, 3.0]];
737        let out = poly.transform(&x).unwrap();
738        // [1, a, b]
739        assert_eq!(out.shape()[1], 3);
740        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10);
741        assert_abs_diff_eq!(out[[0, 1]], 2.0, epsilon = 1e-10);
742        assert_abs_diff_eq!(out[[0, 2]], 3.0, epsilon = 1e-10);
743    }
744
745    #[test]
746    fn test_multiple_rows() {
747        let poly = PolynomialFeatures::<f64>::new(2, false, false).unwrap();
748        let x = array![[1.0, 2.0], [3.0, 4.0]];
749        let out = poly.transform(&x).unwrap();
750        assert_eq!(out.shape(), &[2, 5]);
751        // Row 0: a=1, b=2 → [1, 2, 1, 2, 4]
752        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10);
753        assert_abs_diff_eq!(out[[0, 1]], 2.0, epsilon = 1e-10);
754        assert_abs_diff_eq!(out[[0, 2]], 1.0, epsilon = 1e-10);
755        assert_abs_diff_eq!(out[[0, 3]], 2.0, epsilon = 1e-10);
756        assert_abs_diff_eq!(out[[0, 4]], 4.0, epsilon = 1e-10);
757    }
758
759    #[test]
760    fn test_single_feature_degree2() {
761        // [a] → [1, a, a²]
762        let poly = PolynomialFeatures::<f64>::new(2, false, true).unwrap();
763        let x = array![[3.0]];
764        let out = poly.transform(&x).unwrap();
765        assert_eq!(out.shape()[1], 3);
766        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10);
767        assert_abs_diff_eq!(out[[0, 1]], 3.0, epsilon = 1e-10);
768        assert_abs_diff_eq!(out[[0, 2]], 9.0, epsilon = 1e-10);
769    }
770
771    #[test]
772    fn test_degree0_include_bias_false_is_err() {
773        // sklearn `_polynomial.py:326-330`: degree==0 with include_bias=False is
774        // the ONLY degree==0 case that raises ("Setting degree to zero and
775        // include_bias to False would result in an empty output array.").
776        assert!(PolynomialFeatures::<f64>::new(0, false, false).is_err());
777        assert!(PolynomialFeatures::<f64>::new(0, true, false).is_err());
778    }
779
780    #[test]
781    fn test_degree0_include_bias_true_bias_only() -> Result<(), FerroError> {
782        // sklearn ORACLE (degree=0, include_bias=True): a single all-ones bias
783        // column, powers_ == [[0, 0]] (one all-zeros row), n_output_features_ == 1.
784        //   python3 -c "import numpy as np; from sklearn.preprocessing import \
785        //   PolynomialFeatures as P; m=P(degree=0, include_bias=True).fit( \
786        //   np.array([[2.,3.]])); print(m.transform(np.array([[2.,3.]])).tolist(), \
787        //   m.powers_.tolist(), m.n_output_features_)"
788        //   -> [[1.0]] [[0, 0]] 1
789        let poly = PolynomialFeatures::<f64>::new(0, false, true)?;
790        let x = array![[2.0, 3.0]];
791
792        // Stateless transform → single all-ones bias column.
793        let out = poly.transform(&x)?;
794        assert_eq!(out.shape(), &[1, 1]);
795        assert_abs_diff_eq!(out[[0, 0]], 1.0, epsilon = 1e-10);
796
797        // Fitted attributes: powers_ == [[0, 0]], n_output_features_ == 1.
798        let fitted = poly.fit(&x, &())?;
799        assert_eq!(fitted.n_output_features(), 1);
800        assert_eq!(fitted.powers().shape(), &[1, 2]);
801        assert_eq!(fitted.powers()[[0, 0]], 0);
802        assert_eq!(fitted.powers()[[0, 1]], 0);
803
804        // Fitted transform is byte-identical to the stateless path.
805        let out_fitted = fitted.transform(&x)?;
806        assert_eq!(out_fitted, out);
807        Ok(())
808    }
809
810    #[test]
811    fn test_default_config() {
812        let poly = PolynomialFeatures::<f64>::default();
813        assert_eq!(poly.degree(), 2);
814        assert!(!poly.interaction_only());
815        assert!(poly.include_bias());
816    }
817
818    #[test]
819    fn test_pipeline_integration() {
820        use ferrolearn_core::pipeline::PipelineTransformer;
821        let poly = PolynomialFeatures::<f64>::new(2, false, true).unwrap();
822        let x = array![[1.0, 2.0], [3.0, 4.0]];
823        let y = Array1::zeros(2);
824        let fitted = poly.fit_pipeline(&x, &y).unwrap();
825        let result = fitted.transform_pipeline(&x).unwrap();
826        assert_eq!(result.shape(), &[2, 6]);
827    }
828
829    #[test]
830    fn test_degree3_single_feature() {
831        // [a] with degree=3, no bias → [a, a², a³]
832        let poly = PolynomialFeatures::<f64>::new(3, false, false).unwrap();
833        let x = array![[2.0]];
834        let out = poly.transform(&x).unwrap();
835        assert_eq!(out.shape()[1], 3);
836        assert_abs_diff_eq!(out[[0, 0]], 2.0, epsilon = 1e-10);
837        assert_abs_diff_eq!(out[[0, 1]], 4.0, epsilon = 1e-10);
838        assert_abs_diff_eq!(out[[0, 2]], 8.0, epsilon = 1e-10);
839    }
840}