Streaming preprocessing utilities for feature transformation.
These transformers process features incrementally, maintaining running statistics that update with each sample – no batch recomputation needed.
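
To make "running statistics that update with each sample" concrete, here is a minimal sketch of Welford's online mean/variance update, the technique the `normalizer` module is described as using. `RunningStats` and its methods are hypothetical names for illustration, not irithyll's `IncrementalNormalizer` API, and the choice of population variance is an assumption.

```rust
/// Hypothetical single-feature tracker; not part of irithyll.
#[derive(Debug, Default)]
struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the current mean
}

impl RunningStats {
    /// Fold one sample into the running mean and M2 (Welford's update).
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        let delta2 = x - self.mean;
        self.m2 += delta * delta2;
    }

    /// Population variance (assumption); 0.0 before any sample is seen.
    fn variance(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            self.m2 / self.count as f64
        }
    }

    /// Standardize `x` with the statistics seen so far.
    /// Returns 0.0 while the variance is still zero.
    fn standardize(&self, x: f64) -> f64 {
        let std = self.variance().sqrt();
        if std > 0.0 {
            (x - self.mean) / std
        } else {
            0.0
        }
    }
}
```

Each call to `update` costs O(1) time and memory per feature, so no earlier samples need to be stored or revisited.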
§Modules
| Module | Purpose |
|---|---|
| normalizer | Welford-based online standardization (zero-mean, unit-variance) |
| feature_selector | EWMA importance tracking with dynamic feature masking |
| ccipca | Candid Covariance-free Incremental PCA – streaming dimensionality reduction |
| feature_hasher | Feature hashing (hashing trick) for fixed-size dimensionality reduction |
| min_max | Streaming min-max scaler for feature normalization to a target range |
| one_hot | Streaming one-hot encoder with online category discovery |
| target_encoder | Streaming target encoder with Bayesian smoothing |
| polynomial | Polynomial and interaction feature generation |
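
As an illustration of the "target encoder with Bayesian smoothing" row, the sketch below shows one common way to smooth a per-category target mean toward the global mean. The formula `(sum_c + m * global_mean) / (n_c + m)`, the type `SmoothedTargetEncoder`, and the `smoothing` parameter are assumptions for illustration; the source does not state which smoothing scheme irithyll's `TargetEncoder` uses.

```rust
use std::collections::HashMap;

/// Hypothetical streaming target encoder; not irithyll's `TargetEncoder`.
struct SmoothedTargetEncoder {
    smoothing: f64, // prior weight `m` (assumed parameter)
    global_sum: f64,
    global_count: u64,
    per_category: HashMap<String, (f64, u64)>, // (target sum, count)
}

impl SmoothedTargetEncoder {
    fn new(smoothing: f64) -> Self {
        Self {
            smoothing,
            global_sum: 0.0,
            global_count: 0,
            per_category: HashMap::new(),
        }
    }

    /// Record one (category, target) observation.
    fn update(&mut self, category: &str, target: f64) {
        self.global_sum += target;
        self.global_count += 1;
        let entry = self
            .per_category
            .entry(category.to_string())
            .or_insert((0.0, 0));
        entry.0 += target;
        entry.1 += 1;
    }

    /// Smoothed category mean; unseen categories fall back toward the global mean.
    fn encode(&self, category: &str) -> f64 {
        if self.global_count == 0 {
            return 0.0;
        }
        let prior = self.global_sum / self.global_count as f64;
        let (sum, n) = self.per_category.get(category).copied().unwrap_or((0.0, 0));
        let denom = n as f64 + self.smoothing;
        if denom == 0.0 {
            prior
        } else {
            (sum + self.smoothing * prior) / denom
        }
    }
}
```

With a larger `smoothing` value, rarely seen categories stay closer to the global mean, which limits the influence of a handful of noisy observations.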
§Example
use irithyll::preprocessing::{IncrementalNormalizer, OnlineFeatureSelector};
let mut norm = IncrementalNormalizer::new();
let standardized = norm.update_and_transform(&[100.0, 0.5, -3.0]);
let mut selector = OnlineFeatureSelector::new(3, 0.5, 0.1, 10);
selector.update_importances(&[0.9, 0.1, 0.8]);
let masked = selector.mask_features(&standardized);
Re-exports§
pub use ccipca::CCIPCA;
pub use feature_hasher::FeatureHasher;
pub use feature_selector::OnlineFeatureSelector;
pub use min_max::MinMaxScaler;
pub use normalizer::IncrementalNormalizer;
pub use one_hot::OneHotEncoder;
pub use polynomial::PolynomialFeatures;
pub use target_encoder::TargetEncoder;
pub use crate::pipeline::StreamingPreprocessor;
Modules§
- ccipca
- Candid Covariance-free Incremental PCA (CCIPCA) for streaming dimensionality reduction.
- feature_hasher
- Feature hashing (hashing trick) for dimensionality reduction.
- feature_selector
- EWMA-based online feature selector with dynamic importance masking.
- min_max
- Streaming min-max scaler for feature normalization.
- normalizer
- Welford online mean/variance normalizer for incremental standardization.
- one_hot
- Streaming one-hot encoder for categorical features.
- polynomial
- Degree-2 polynomial feature generation for interaction modeling.
- target_encoder
- Streaming target encoder for categorical features with Bayesian smoothing.
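
For the `polynomial` entry in the list above, the following is a minimal sketch of what degree-2 feature generation can look like: the original features followed by every pairwise product, squares included. The function name `degree2_features`, the output ordering, and the absence of a bias term are assumptions for illustration, not a description of irithyll's `PolynomialFeatures`.

```rust
/// Hypothetical degree-2 expansion; not irithyll's `PolynomialFeatures`.
/// Output layout (assumed): [x_0, ..., x_{n-1}, x_i * x_j for all i <= j].
fn degree2_features(x: &[f64]) -> Vec<f64> {
    let n = x.len();
    // n linear terms plus n * (n + 1) / 2 squares and interactions.
    let mut out = Vec::with_capacity(n + n * (n + 1) / 2);
    out.extend_from_slice(x);
    for i in 0..n {
        for j in i..n {
            out.push(x[i] * x[j]);
        }
    }
    out
}
```

The expansion is stateless, so it can be applied to each sample independently; the output width grows quadratically with the number of input features.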