Crate scirs2_transform


§SciRS2 Transform - Data Transformation and Preprocessing

scirs2-transform provides data transformation utilities for machine learning: normalization, feature engineering, dimensionality reduction, encoding, imputation, and pipelines. It supports SIMD acceleration and out-of-core processing for large datasets.

§🎯 Key Features

  • Normalization: Min-max, Z-score, robust scaling, quantile normalization
  • Feature Engineering: Polynomial features, interaction terms, binning
  • Dimensionality Reduction: PCA, SVD, t-SNE, UMAP, LDA
  • Encoding: One-hot, label, ordinal, target encoding
  • Imputation: Mean, median, mode, KNN, iterative imputation
  • Pipelines: Chained transformations with fit/transform API
  • Performance: SIMD operations, streaming, out-of-core processing
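
The fit/transform pattern named in the Pipelines item can be sketched in plain Rust without the crate. This is a minimal illustration, not the scirs2-transform API: the names `Step`, `MinMax`, `Center`, and `Chain` are hypothetical, and the sketch works on a single feature stored as a slice rather than on an ndarray.

```rust
// Hypothetical names throughout: `Step`, `MinMax`, `Center`, and `Chain`
// are illustrative only and are not part of scirs2-transform.

/// A transformation that learns parameters from data, then applies them.
trait Step {
    fn fit(&mut self, data: &[f64]);
    fn transform(&self, data: &[f64]) -> Vec<f64>;
}

/// Rescales values to [0, 1] using the min and max seen during `fit`.
struct MinMax {
    min: f64,
    max: f64,
}

impl Step for MinMax {
    fn fit(&mut self, data: &[f64]) {
        self.min = data.iter().cloned().fold(f64::INFINITY, f64::min);
        self.max = data.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    }

    fn transform(&self, data: &[f64]) -> Vec<f64> {
        let range = self.max - self.min;
        data.iter()
            // A constant feature has no range; map it to 0.0 (sketch choice).
            .map(|&x| if range == 0.0 { 0.0 } else { (x - self.min) / range })
            .collect()
    }
}

/// Subtracts the mean seen during `fit`.
struct Center {
    mean: f64,
}

impl Step for Center {
    fn fit(&mut self, data: &[f64]) {
        self.mean = if data.is_empty() {
            0.0
        } else {
            data.iter().sum::<f64>() / data.len() as f64
        };
    }

    fn transform(&self, data: &[f64]) -> Vec<f64> {
        data.iter().map(|&x| x - self.mean).collect()
    }
}

/// Runs steps in order; each step is fitted on the previous step's output.
struct Chain {
    steps: Vec<Box<dyn Step>>,
}

impl Chain {
    fn fit(&mut self, data: &[f64]) {
        let mut current = data.to_vec();
        for step in self.steps.iter_mut() {
            step.fit(&current);
            current = step.transform(&current);
        }
    }

    fn transform(&self, data: &[f64]) -> Vec<f64> {
        self.steps
            .iter()
            .fold(data.to_vec(), |acc, step| step.transform(&acc))
    }
}
```

Fitting on training data and then calling `transform` on new data reuses the learned parameters, which is the point of separating the two calls: held-out data is scaled with the training statistics rather than its own.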

§📦 Module Overview

SciRS2 Module | scikit-learn Equivalent                  | Description
normalize     | sklearn.preprocessing.StandardScaler     | Data normalization/standardization
features      | sklearn.preprocessing.PolynomialFeatures | Feature engineering
reduction     | sklearn.decomposition.PCA                | Dimensionality reduction
encoding      | sklearn.preprocessing.OneHotEncoder      | Categorical encoding
impute        | sklearn.impute.SimpleImputer             | Missing value imputation
pipeline      | sklearn.pipeline.Pipeline                | Transformation pipelines

§🚀 Quick Start

# Cargo.toml
[dependencies]
scirs2-transform = "0.1.0-rc.1"

use scirs2_transform::normalize::{normalize_array, NormalizationMethod};
use scirs2_core::ndarray::Array2;

// Standardize data (Z-score normalization)
let data = Array2::<f64>::zeros((100, 5));
let normalized = normalize_array(&data, NormalizationMethod::ZScore, 0).unwrap();
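
For reference, the arithmetic behind Z-score normalization can be written out with the standard library alone. The sketch below is not the crate's implementation. It assumes per-column statistics, the population standard deviation, equal-length rows, and that zero-variance columns map to 0.0; the source does not state which of these conventions `normalize_array` follows. The function name `zscore_columns` is hypothetical.

```rust
/// Standardizes each column of a row-major matrix to zero mean and unit
/// variance, using the population standard deviation.
/// Assumptions of this sketch: all rows have the same length, and a column
/// with zero variance is mapped to 0.0.
fn zscore_columns(rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
    if rows.is_empty() {
        return Vec::new();
    }
    let n_cols = rows[0].len();
    let n = rows.len() as f64;

    // Per-column mean.
    let mut means = vec![0.0_f64; n_cols];
    for row in rows {
        for (j, &x) in row.iter().enumerate() {
            means[j] += x;
        }
    }
    for m in means.iter_mut() {
        *m /= n;
    }

    // Per-column population standard deviation.
    let mut stds = vec![0.0_f64; n_cols];
    for row in rows {
        for (j, &x) in row.iter().enumerate() {
            stds[j] += (x - means[j]).powi(2);
        }
    }
    for s in stds.iter_mut() {
        *s = (*s / n).sqrt();
    }

    // z = (x - mean) / std, column by column.
    rows.iter()
        .map(|row| {
            row.iter()
                .enumerate()
                .map(|(j, &x)| {
                    if stds[j] == 0.0 {
                        0.0
                    } else {
                        (x - means[j]) / stds[j]
                    }
                })
                .collect()
        })
        .collect()
}
```

The zero-variance guard matters for inputs like the all-zeros array in the Quick Start, where a naive division would produce NaN.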

§🔒 Version: 0.1.0-rc.1 (October 03, 2025)

§Re-exports

pub use decomposition::DictionaryLearning;
pub use decomposition::NMF;
pub use encoding::BinaryEncoder;
pub use encoding::EncodedOutput;
pub use encoding::FrequencyEncoder;
pub use encoding::OneHotEncoder;
pub use encoding::OrdinalEncoder;
pub use encoding::SparseMatrix;
pub use encoding::TargetEncoder;
pub use encoding::WOEEncoder;
pub use error::Result;
pub use error::TransformError;
pub use features::binarize;
pub use features::discretize_equal_frequency;
pub use features::discretize_equal_width;
pub use features::log_transform;
pub use features::power_transform;
pub use features::PolynomialFeatures;
pub use features::PowerTransformer;
pub use impute::DistanceMetric;
pub use impute::ImputeStrategy;
pub use impute::IterativeImputer;
pub use impute::KNNImputer;
pub use impute::MissingIndicator;
pub use impute::SimpleImputer;
pub use impute::WeightingScheme;
pub use normalize::normalize_array;
pub use normalize::normalize_vector;
pub use normalize::NormalizationMethod;
pub use normalize::Normalizer;
pub use pipeline::make_column_transformer;
pub use pipeline::make_pipeline;
pub use pipeline::ColumnTransformer;
pub use pipeline::Pipeline;
pub use pipeline::RemainderOption;
pub use pipeline::Transformer;
pub use reduction::trustworthiness;
pub use reduction::AffinityMethod;
pub use reduction::Isomap;
pub use reduction::SpectralEmbedding;
pub use reduction::TruncatedSVD;
pub use reduction::LDA;
pub use reduction::LLE;
pub use reduction::PCA;
pub use reduction::TSNE;
pub use reduction::UMAP;
pub use scaling::MaxAbsScaler;
pub use scaling::QuantileTransformer;
pub use selection::MutualInfoSelector;
pub use selection::RecursiveFeatureElimination;
pub use selection::VarianceThreshold;
pub use time_series::FourierFeatures;
pub use time_series::LagFeatures;
pub use time_series::TimeSeriesFeatures;
pub use time_series::WaveletFeatures;
pub use graph::adjacency_to_edge_list;
pub use graph::edge_list_to_adjacency;
pub use graph::ActivationType;
pub use graph::DeepWalk;
pub use graph::GraphAutoencoder;
pub use graph::LaplacianType;
pub use graph::Node2Vec;
pub use image::resize_images;
pub use image::rgb_to_grayscale;
pub use image::BlockNorm;
pub use image::HOGDescriptor;
pub use image::ImageNormMethod;
pub use image::ImageNormalizer;
pub use image::PatchExtractor;
pub use optimization_config::AdaptiveParameterTuner;
pub use optimization_config::AdvancedConfigOptimizer;
pub use optimization_config::AutoTuner;
pub use optimization_config::ConfigurationPredictor;
pub use optimization_config::DataCharacteristics;
pub use optimization_config::OptimizationConfig;
pub use optimization_config::OptimizationReport;
pub use optimization_config::PerformanceMetric;
pub use optimization_config::SystemMonitor;
pub use optimization_config::SystemResources;
pub use optimization_config::TransformationRecommendation;
pub use out_of_core::csv_chunks;
pub use out_of_core::ChunkedArrayReader;
pub use out_of_core::ChunkedArrayWriter;
pub use out_of_core::OutOfCoreConfig;
pub use out_of_core::OutOfCoreNormalizer;
pub use out_of_core::OutOfCoreTransformer;
pub use performance::EnhancedPCA;
pub use performance::EnhancedStandardScaler;
pub use streaming::OutlierMethod;
pub use streaming::StreamingFeatureSelector;
pub use streaming::StreamingMinMaxScaler;
pub use streaming::StreamingOutlierDetector;
pub use streaming::StreamingPCA;
pub use streaming::StreamingQuantileTracker;
pub use streaming::StreamingStandardScaler;
pub use streaming::StreamingTransformer;
pub use streaming::WindowedStreamingTransformer;
pub use text::CountVectorizer;
pub use text::HashingVectorizer;
pub use text::StreamingCountVectorizer;
pub use text::TfidfVectorizer;
pub use utils::ArrayMemoryPool;
pub use utils::DataChunker;
pub use utils::PerfUtils;
pub use utils::ProcessingStrategy;
pub use utils::StatUtils;
pub use utils::TypeConverter;
pub use utils::ValidationUtils;
pub use auto_feature_engineering::AdvancedMetaLearningSystem;
pub use auto_feature_engineering::AutoFeatureEngineer;
pub use auto_feature_engineering::DatasetMetaFeatures;
pub use auto_feature_engineering::EnhancedMetaFeatures;
pub use auto_feature_engineering::MultiObjectiveRecommendation;
pub use auto_feature_engineering::TransformationConfig;
pub use auto_feature_engineering::TransformationType;
pub use quantum_optimization::AdvancedQuantumMetrics;
pub use quantum_optimization::AdvancedQuantumOptimizer;
pub use quantum_optimization::AdvancedQuantumParams;
pub use quantum_optimization::QuantumHyperparameterTuner;
pub use quantum_optimization::QuantumInspiredOptimizer;
pub use quantum_optimization::QuantumParticle;
pub use quantum_optimization::QuantumTransformationOptimizer;
pub use neuromorphic_adaptation::AdvancedNeuromorphicMetrics;
pub use neuromorphic_adaptation::AdvancedNeuromorphicProcessor;
pub use neuromorphic_adaptation::NeuromorphicAdaptationNetwork;
pub use neuromorphic_adaptation::NeuromorphicMemorySystem;
pub use neuromorphic_adaptation::NeuromorphicTransformationSystem;
pub use neuromorphic_adaptation::SpikingNeuron;
pub use neuromorphic_adaptation::SystemState;
pub use neuromorphic_adaptation::TransformationEpisode;

§Modules

auto_feature_engineering
Automated feature engineering with meta-learning
decomposition
Matrix decomposition techniques
encoding
Categorical data encoding utilities
error
Error types for the data transformation module
features
Feature engineering utilities
graph
Graph embedding transformers for graph-based feature extraction
image
Image processing transformers for feature extraction
impute
Missing value imputation utilities
neuromorphic_adaptation
Neuromorphic computing integration for real-time transformation adaptation
normalize
Data normalization and standardization utilities
optimization_config
Optimization configuration and auto-tuning system
out_of_core
Out-of-core processing for large datasets
performance
Performance optimizations and enhanced implementations
pipeline
Pipeline API for chaining transformations
quantum_optimization
Quantum-inspired optimization for data transformations
reduction
Dimensionality reduction algorithms
scaling
Advanced scaling and transformation methods
selection
Feature selection utilities
streaming
Streaming transformations for continuous data processing
text
Text processing transformers for feature extraction
time_series
Time series feature extraction
utils
Utility functions and helpers for data transformation
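
To give a sense of what a streaming transformation involves, here is a std-only sketch of a running mean and variance using Welford's online algorithm, which lets values be standardized one at a time without holding the dataset in memory. The `RunningStats` type is hypothetical and is not taken from the `streaming` module; whether the crate's streaming scalers use this particular algorithm is an assumption.

```rust
/// Running mean and variance via Welford's online algorithm.
/// Hypothetical type for illustration; not part of scirs2-transform.
#[derive(Default)]
struct RunningStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the current mean
}

impl RunningStats {
    /// Folds one new observation into the running statistics.
    fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        // Uses the updated mean, which keeps the recurrence numerically stable.
        self.m2 += delta * (x - self.mean);
    }

    /// Population variance of everything seen so far (0.0 if empty).
    fn variance(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            self.m2 / self.count as f64
        }
    }

    /// Standardizes `x` against the statistics seen so far.
    /// Returns 0.0 while the variance is zero (sketch choice).
    fn standardize(&self, x: f64) -> f64 {
        let std = self.variance().sqrt();
        if std == 0.0 {
            0.0
        } else {
            (x - self.mean) / std
        }
    }
}
```

Each call to `update` costs constant time and memory, so the same structure works for data arriving in chunks or as an unbounded stream.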