ferrolearn-preprocess

Data preprocessing transformers for the ferrolearn machine learning framework.

Scalers

Transformer	Description
`StandardScaler`	Zero-mean, unit-variance scaling
`MinMaxScaler`	Scale features to a given range (default [0, 1])
`RobustScaler`	Median/IQR-based scaling, robust to outliers
`MaxAbsScaler`	Scale by maximum absolute value to [-1, 1]
`Normalizer`	Normalize each sample (row) to unit norm
`PowerTransformer`	Yeo-Johnson power transform for Gaussian-like distributions

Encoders

Transformer	Description
`OneHotEncoder`	Encode categorical columns as binary indicator columns
`OrdinalEncoder`	Map string categories to integers by order of appearance
`LabelEncoder`	Map string labels to integer indices

Imputers

Transformer	Description
`SimpleImputer`	Fill missing (NaN) values using mean, median, most frequent, or constant

Feature selection

Transformer	Description
`VarianceThreshold`	Remove features with variance below a threshold
`SelectKBest`	Keep the K features with highest ANOVA F-scores
`SelectFromModel`	Keep features whose model-derived importance exceeds a threshold

Feature engineering

Transformer	Description
`PolynomialFeatures`	Generate polynomial and interaction features
`Binarizer`	Threshold features to binary values
`FunctionTransformer`	Apply a user-provided function element-wise
`ColumnTransformer`	Apply different transformers to different column subsets

Example

use ferrolearn_preprocess::StandardScaler;
use ferrolearn_core::FitTransform;
use ndarray::array;

let x = array![[1.0_f64, 10.0], [2.0, 20.0], [3.0, 30.0]];
let scaled = StandardScaler::<f64>::new().fit_transform(&x).unwrap();
// Each column now has mean ~= 0 and std ~= 1

All transformers implement PipelineTransformer for use inside a Pipeline.

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.

ferrolearn-preprocess 0.1.2