Module preprocessing

Feature preprocessing: scalers, encoders, imputers, discretizers, feature selectors, and decompositions such as PCA.

Modules

binarizer
cca
Canonical Correlation Analysis.
fast_ica
FastICA — fixed-point Independent Component Analysis with deflation.
kbins_discretizer
kernel_pca
Kernel PCA.
label_encoder
max_abs_scaler
minmax_scaler
mutual_information
nmf
Non-negative Matrix Factorisation.
normalizer
one_hot_encoder
ordinal_encoder
pca
pls
Partial Least Squares Regression (PLS1).
polynomial_features
power_transformer
quantile_transformer
rfe
Recursive Feature Elimination (RFE).
robust_scaler
select_from_model
select_k_best
simple_imputer
standard_scaler
truncated_svd
Truncated SVD (a.k.a. LSA when applied to a term-document matrix).
variance_threshold

Structs

Binarizer
Parameters for Binarizer (unfitted state).
Cca
FastIca
FittedBinarizer
Fitted Binarizer — stateless, stores only the threshold.
FittedCca
FittedFastIca
FittedKBinsDiscretizer
Fitted KBinsDiscretizer – holds bin edges per feature.
FittedKernelPca
FittedLabelEncoder
Fitted LabelEncoder — holds the learned vocabulary and mapping.
FittedMaxAbsScaler
Fitted MaxAbsScaler — holds the maximum absolute value per feature.
FittedMinMaxScaler
Fitted MinMaxScaler — holds learned min/max per feature.
FittedMutualInformationSelector
Fitted MutualInformationSelector — holds per-feature MI scores and the indices of the selected top-k features.
FittedNmf
FittedNormalizer
Fitted Normalizer — stateless (fit is a validation-only no-op).
FittedOneHotEncoder
Fitted OneHotEncoder — holds the number of unique categories per column.
FittedOrdinalEncoder
Fitted OrdinalEncoder — holds per-column vocabularies and mappings.
FittedPca
Fitted PCA — holds learned principal components, explained variance, and mean.
FittedPlsRegression
FittedPolynomialFeatures
Fitted PolynomialFeatures — stores the number of input features.
FittedPowerTransformer
Fitted PowerTransformer – holds learned lambdas per feature and optional standardization parameters (mean and std).
FittedQuantileTransformer
Fitted QuantileTransformer – holds quantile references per feature.
FittedRfe
FittedRobustScaler
Fitted RobustScaler — holds learned median and IQR per feature.
FittedSelectFromModel
Fitted SelectFromModel – holds the original importances and the indices of the selected features.
FittedSelectKBest
Fitted SelectKBest – holds per-feature scores and the indices of the selected top-k features.
FittedSequentialFeatureSelector
FittedSimpleImputer
Fitted SimpleImputer — holds one fill value per column.
FittedStandardScaler
Fitted StandardScaler — holds learned mean and std per feature.
FittedTruncatedSvd
FittedVarianceThreshold
Fitted VarianceThreshold — holds learned per-feature variances and the indices of features that exceeded the threshold.
KBinsDiscretizer
Parameters for KBinsDiscretizer (unfitted state).
KernelPca
LabelEncoder
Encodes string labels as integer indices.
MaxAbsScaler
Parameters for MaxAbsScaler (unfitted state).
MinMaxScaler
Parameters for MinMaxScaler (unfitted state).
MutualInformationSelector
Parameters for MutualInformationSelector (unfitted state).
Nmf
Normalizer
Parameters for Normalizer (unfitted state).
OneHotEncoder
One-hot encoder for integer-encoded categorical features.
OrdinalEncoder
Encodes string categories as ordinal (integer) values per column.
Pca
Parameters for PCA (unfitted state).
PlsRegression
PolynomialFeatures
Generates polynomial and interaction features.
PowerTransformer
Parameters for PowerTransformer (unfitted state).
QuantileTransformer
Parameters for QuantileTransformer (unfitted state).
Rfe
RobustScaler
Parameters for RobustScaler (unfitted state).
SelectFromModel
Parameters for SelectFromModel feature selector (unfitted state).
SelectKBest
Parameters for SelectKBest feature selector (unfitted state).
SequentialFeatureSelector
SimpleImputer
Parameters for SimpleImputer (unfitted state).
StandardScaler
Parameters for StandardScaler (unfitted state).
TruncatedSvd
VarianceThreshold
Parameters for VarianceThreshold feature selector (unfitted state).
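Most entries above come in pairs: a parameter struct for the unfitted state and a `Fitted…` struct that holds what was learned. The sketch below illustrates that split for a standard scaler, which per the listing stores a mean and std per feature. It is a self-contained stand-in, not this crate's code: the `fit` and `transform` method names, their signatures, the `Vec<Vec<f64>>` row-major input, the use of the population standard deviation, and the handling of zero-variance columns are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the unfitted parameter struct.
#[derive(Debug, Clone, Default)]
pub struct StandardScaler;

/// Hypothetical stand-in for the fitted state: mean and std per feature.
#[derive(Debug, Clone)]
pub struct FittedStandardScaler {
    pub mean: Vec<f64>,
    pub std: Vec<f64>,
}

impl StandardScaler {
    /// Learn per-feature mean and population standard deviation from
    /// row-major data. Returns `None` for empty or ragged input.
    pub fn fit(&self, x: &[Vec<f64>]) -> Option<FittedStandardScaler> {
        let n_features = x.first()?.len();
        if x.iter().any(|row| row.len() != n_features) {
            return None;
        }
        let n = x.len() as f64;

        // First pass: per-feature mean.
        let mut mean = vec![0.0; n_features];
        for row in x {
            for (m, v) in mean.iter_mut().zip(row) {
                *m += v / n;
            }
        }

        // Second pass: per-feature population variance.
        let mut var = vec![0.0; n_features];
        for row in x {
            for ((s, v), m) in var.iter_mut().zip(row).zip(&mean) {
                *s += (v - m).powi(2) / n;
            }
        }

        let std = var.into_iter().map(f64::sqrt).collect();
        Some(FittedStandardScaler { mean, std })
    }
}

impl FittedStandardScaler {
    /// Centre and scale each feature. Zero-variance features map to 0.0
    /// (an assumption of this sketch) to avoid dividing by zero.
    pub fn transform(&self, x: &[Vec<f64>]) -> Vec<Vec<f64>> {
        x.iter()
            .map(|row| {
                row.iter()
                    .zip(&self.mean)
                    .zip(&self.std)
                    .map(|((v, m), s)| if *s > 0.0 { (v - m) / s } else { 0.0 })
                    .collect()
            })
            .collect()
    }
}
```

Because `transform` exists only on the fitted type, calling it before `fit` is a compile-time error rather than a runtime one, which is the usual motivation for separating the two states.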

Enums

BinStrategy
Strategy for computing bin edges.
EncodeStrategy
Encoding strategy for transformed output.
ImputeStrategy
Strategy used to compute the fill value for missing (NaN) entries.
KpcaKernel
NormType
The type of norm used to normalize each sample (row).
OutputDistribution
Output distribution for the quantile transformer.
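As a rough illustration of how a strategy enum can drive a fitted transformer, the sketch below computes one fill value per column for NaN entries, matching the descriptions of `ImputeStrategy` and `FittedSimpleImputer` above. The enum `FillStrategy`, its variants, and the functions `fill_values` and `impute` are hypothetical names invented for this example; the variants this crate actually offers are not listed on this page.

```rust
/// Hypothetical stand-in for `ImputeStrategy`; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FillStrategy {
    Mean,
    Median,
    Constant(f64),
}

/// Compute one fill value per column, ignoring NaN entries.
/// Columns with no observed values fall back to 0.0, which is an
/// arbitrary choice made for this sketch.
pub fn fill_values(x: &[Vec<f64>], n_cols: usize, strategy: FillStrategy) -> Vec<f64> {
    (0..n_cols)
        .map(|j| {
            // Collect the observed (non-NaN) values of column j.
            let mut col: Vec<f64> = x
                .iter()
                .filter_map(|row| row.get(j).copied())
                .filter(|v| !v.is_nan())
                .collect();
            match strategy {
                FillStrategy::Constant(c) => c,
                _ if col.is_empty() => 0.0,
                FillStrategy::Mean => col.iter().sum::<f64>() / col.len() as f64,
                FillStrategy::Median => {
                    col.sort_by(|a, b| a.total_cmp(b));
                    let mid = col.len() / 2;
                    if col.len() % 2 == 0 {
                        (col[mid - 1] + col[mid]) / 2.0
                    } else {
                        col[mid]
                    }
                }
            }
        })
        .collect()
}

/// Replace each NaN entry with the fill value of its column.
pub fn impute(x: &[Vec<f64>], fill: &[f64]) -> Vec<Vec<f64>> {
    x.iter()
        .map(|row| {
            row.iter()
                .zip(fill)
                .map(|(v, f)| if v.is_nan() { *f } else { *v })
                .collect()
        })
        .collect()
}
```

Computing the fill values once and storing them mirrors the fitted-state idea: the values learned from training data are reused unchanged when imputing new data.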