Core utilities for working with datasets
This module provides the Dataset struct together with utility functions and data structures for manipulating and transforming datasets. It covers data serialization, dataset structures, splitting, sampling, balancing, scaling, feature engineering, and trait extensions.
Re-exports§
pub use dataset::Dataset;
pub use splitting::k_fold_split;
pub use splitting::stratified_k_fold_split;
pub use splitting::time_series_split;
pub use splitting::train_test_split;
pub use splitting::CrossValidationFolds;
pub use sampling::bootstrap_sample;
pub use sampling::importance_sample;
pub use sampling::multiple_bootstrap_samples;
pub use sampling::random_sample;
pub use sampling::stratified_sample;
pub use balancing::create_balanced_dataset;
pub use balancing::generate_synthetic_samples;
pub use balancing::random_oversample;
pub use balancing::random_undersample;
pub use balancing::BalancingStrategy;
pub use scaling::min_max_scale;
pub use scaling::normalize;
pub use scaling::robust_scale;
pub use scaling::StatsExt;
pub use feature_engineering::create_binned_features;
pub use feature_engineering::polynomial_features;
pub use feature_engineering::statistical_features;
pub use feature_engineering::BinningStrategy;
pub use serialization::*;
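To make the idea behind the splitting re-exports concrete, here is a minimal, standard-library-only sketch of k-fold index generation. The function `k_fold_indices` and its signature are hypothetical and written purely for illustration; this page does not show the signature or behavior of the crate's `k_fold_split`, which may differ (for example by shuffling or by operating on a Dataset directly).

```rust
/// Hypothetical helper (not the crate's `k_fold_split`): splits the indices
/// `0..n_samples` into `k` contiguous folds and returns one
/// `(train_indices, validation_indices)` pair per fold.
fn k_fold_indices(n_samples: usize, k: usize) -> Vec<(Vec<usize>, Vec<usize>)> {
    assert!(k >= 2 && k <= n_samples, "need 2 <= k <= n_samples");

    let base = n_samples / k;
    // The first `extra` folds receive one additional sample each,
    // so fold sizes differ by at most one.
    let extra = n_samples % k;

    let mut folds = Vec::with_capacity(k);
    let mut start = 0;
    for fold in 0..k {
        let len = base + usize::from(fold < extra);
        let end = start + len;

        // Validation set: the current contiguous block of indices.
        let validation: Vec<usize> = (start..end).collect();
        // Training set: every index outside that block.
        let train: Vec<usize> = (0..start).chain(end..n_samples).collect();

        folds.push((train, validation));
        start = end;
    }
    folds
}
```

Under these assumptions, calling `k_fold_indices(10, 3)` yields three folds whose validation sets hold 4, 3, and 3 indices, and every index appears in exactly one validation set.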
Modules§
- balancing
- Data balancing utilities for handling imbalanced datasets
- dataset
- Core Dataset structure and basic methods
- extensions
- Trait extensions for ndarray and other types
- feature_engineering
- Feature engineering utilities for creating and transforming features
- sampling
- Data sampling utilities for statistical analysis and machine learning
- scaling
- Data scaling and normalization utilities
- serialization
- Serialization utilities for ndarray types with serde
- splitting
- Data splitting utilities for machine learning workflows
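As an illustration of what the scaling module is concerned with, the following sketch performs min-max scaling on a plain slice using only the standard library. The name `min_max_scale_slice` and its signature are assumptions made for this example; the crate's own `min_max_scale` is not documented on this page and presumably works on ndarray types rather than slices.

```rust
/// Hypothetical helper (not the crate's `min_max_scale`): linearly rescales
/// the values so that the minimum maps to 0.0 and the maximum to 1.0.
/// Empty or constant input yields a vector of zeros, avoiding division by zero.
fn min_max_scale_slice(values: &[f64]) -> Vec<f64> {
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;

    if values.is_empty() || range == 0.0 {
        return vec![0.0; values.len()];
    }

    values.iter().map(|v| (v - min) / range).collect()
}
```

With this sketch, `min_max_scale_slice(&[2.0, 4.0, 6.0])` returns `[0.0, 0.5, 1.0]`. Mapping constant input to zeros is one possible design choice; a library might instead return an error in that case.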