§Ensemble Learning Algorithms
Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability and robustness over a single estimator.
This crate (linfa-ensemble) provides pure Rust implementations of the popular ensemble techniques described below.
§Bootstrap Aggregation (aka Bagging)
A typical example of an ensemble method is Bootstrap Aggregation, which combines the predictions of
several decision trees (see linfa-trees), each trained on a different bootstrap sample of the training dataset.
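The two ingredients of bagging, drawing a bootstrap sample and combining the learners' votes, can be sketched in plain Rust. This is a minimal illustration using only the standard library, not the crate's implementation: the `Lcg` generator, `bootstrap_indices`, and `majority_vote` are hypothetical helpers, and sampling with replacement plus tie-breaking toward the smallest label are assumptions of the sketch.

```rust
/// Tiny linear congruential generator so the sketch needs no external crates.
/// Hypothetical helper; real code would normally use the `rand` crate.
struct Lcg(u64);

impl Lcg {
    /// Return a pseudo-random index in `0..bound` (`bound` must be non-zero).
    fn next_index(&mut self, bound: usize) -> usize {
        // Multiplier and increment from Knuth's MMIX generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the high bits, which are better distributed than the low ones.
        ((self.0 >> 33) as usize) % bound
    }
}

/// Draw `round(proportion * n)` row indices from `0..n` with replacement.
fn bootstrap_indices(n: usize, proportion: f64, rng: &mut Lcg) -> Vec<usize> {
    let sample_size = ((n as f64) * proportion).round() as usize;
    (0..sample_size).map(|_| rng.next_index(n)).collect()
}

/// Combine class predictions from several learners by majority vote.
/// Ties are broken in favour of the smallest class label (an assumption).
fn majority_vote(votes: &[usize]) -> Option<usize> {
    let max_label = *votes.iter().max()?;
    let mut counts = vec![0usize; max_label + 1];
    for &v in votes {
        counts[v] += 1;
    }
    // `max_by_key` keeps the last maximum, so iterate in reverse
    // to make the smallest label win a tie.
    counts
        .iter()
        .enumerate()
        .rev()
        .max_by_key(|&(_, &count)| count)
        .map(|(label, _)| label)
}
```

Each learner would be fitted on the rows selected by its own call to `bootstrap_indices`, and `majority_vote` would then be applied per test sample to the labels the learners predict.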
§Random Forest
A special case of Bootstrap Aggregation using decision trees (see linfa-trees) with random feature
selection. A typical number of features to select at random is $\sqrt{p}$, with $p$ being
the number of available features.
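The $\sqrt{p}$ rule of thumb is easy to turn into a concrete number. The sketch below uses hypothetical helper names that are not part of this crate, and its rounding to the nearest integer with a minimum of one feature is an assumption rather than documented behaviour.

```rust
/// Number of features to draw under the sqrt(p) rule of thumb.
/// Rounding to the nearest integer and keeping at least one feature
/// are assumptions of this sketch.
fn sqrt_feature_count(p: usize) -> usize {
    if p == 0 {
        return 0;
    }
    ((p as f64).sqrt().round() as usize).max(1)
}

/// The same quantity expressed as a fraction of all available features.
fn sqrt_feature_proportion(p: usize) -> f64 {
    if p == 0 {
        return 0.0;
    }
    sqrt_feature_count(p) as f64 / p as f64
}
```

For a dataset with 4 features such as Iris, this gives 2 features, or a proportion of 0.5. How a proportion-style hyperparameter is rounded internally is not stated here, so treat the result as a starting point rather than an exact equivalent.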
§AdaBoost
AdaBoost (Adaptive Boosting) is a boosting ensemble method that trains weak learners sequentially. Each subsequent learner focuses on the examples that previous learners misclassified by increasing their sample weights. The final prediction is a weighted vote of all learners, where better-performing learners receive higher weights. Unlike bagging methods, boosting creates a strong classifier from weak learners (typically shallow decision trees or “stumps”).
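The reweighting step can be illustrated with the classic binary (discrete) AdaBoost update. This is a sketch under stated assumptions, not necessarily the variant this crate implements: labels and predictions are taken to be in {-1.0, +1.0}, sample weights are assumed to sum to 1, and `adaboost_round` and `weighted_vote` are hypothetical function names.

```rust
/// One round of classic binary AdaBoost.
/// `labels` and `predictions` hold values in {-1.0, +1.0}; `weights` sums to 1.
/// Returns the learner weight (alpha) and updates the sample weights in place.
fn adaboost_round(labels: &[f64], predictions: &[f64], weights: &mut [f64]) -> f64 {
    // Weighted error: total weight of the misclassified samples.
    let err: f64 = labels
        .iter()
        .zip(predictions)
        .zip(weights.iter())
        .filter(|((y, h), _)| y != h)
        .map(|(_, w)| *w)
        .sum();
    // Clamp to avoid ln(0) or division by zero for a perfect or useless learner.
    let err = err.clamp(1e-10, 1.0 - 1e-10);
    // Better learners (lower error) receive a larger alpha.
    let alpha = 0.5 * ((1.0 - err) / err).ln();
    // y * h is -1 for a mistake, so misclassified samples are scaled up
    // by exp(alpha) and correctly classified ones down by exp(-alpha).
    for ((y, h), w) in labels.iter().zip(predictions).zip(weights.iter_mut()) {
        *w *= (-alpha * y * h).exp();
    }
    // Renormalise so the weights form a distribution again.
    let total: f64 = weights.iter().sum();
    for w in weights.iter_mut() {
        *w /= total;
    }
    alpha
}

/// Final prediction for one sample: sign of the alpha-weighted sum of votes.
fn weighted_vote(alphas: &[f64], predictions: &[f64]) -> f64 {
    let score: f64 = alphas.iter().zip(predictions).map(|(a, h)| a * h).sum();
    if score >= 0.0 { 1.0 } else { -1.0 }
}
```

With four equally weighted samples and one mistake, the weighted error is 0.25; after the update the misclassified sample carries half of the total weight, which is what forces the next learner to focus on it.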
§Example
This example shows how to train a bagging model using 100 decision trees, each trained on 70% of the training data (bootstrap sampling).
use linfa::prelude::{Fit, Predict};
use linfa_ensemble::EnsembleLearnerParams;
use linfa_trees::DecisionTree;
use ndarray_rand::rand::SeedableRng;
use rand::rngs::SmallRng;

// Load the Iris dataset, shuffle it with a fixed seed, and split it 80/20
let mut rng = SmallRng::seed_from_u64(42);
let (train, test) = linfa_datasets::iris()
    .shuffle(&mut rng)
    .split_with_ratio(0.8);

// Train the model on the training split
let bagging_model = EnsembleLearnerParams::new(DecisionTree::params())
    .ensemble_size(100) // Number of decision trees to fit
    .bootstrap_proportion(0.7) // Select only 70% of the data via bootstrap
    .fit(&train)
    .unwrap();

// Make predictions on the test set
let predictions = bagging_model.predict(&test);

This example shows how to train a Random Forest model using 100 decision trees, each trained on 70% of the training data (bootstrap sampling) and using only 30% of the available features.
use linfa::prelude::{Fit, Predict};
use linfa_ensemble::RandomForestParams;
use linfa_trees::DecisionTree;
use ndarray_rand::rand::SeedableRng;
use rand::rngs::SmallRng;

// Load the Iris dataset, shuffle it with a fixed seed, and split it 80/20
let mut rng = SmallRng::seed_from_u64(42);
let (train, test) = linfa_datasets::iris()
    .shuffle(&mut rng)
    .split_with_ratio(0.8);

// Train the model on the training split
let random_forest = RandomForestParams::new(DecisionTree::params())
    .ensemble_size(100) // Number of decision trees to fit
    .bootstrap_proportion(0.7) // Select only 70% of the data via bootstrap
    .feature_proportion(0.3) // Select only 30% of the features
    .fit(&train)
    .unwrap();

// Make predictions on the test set
let predictions = random_forest.predict(&test);

Structs§
- AdaBoost - A fitted AdaBoost ensemble classifier.
- AdaBoostParams - A helper struct for building AdaBoost hyperparameters.
- AdaBoostValidParams - The set of valid hyperparameters for the AdaBoost algorithm.
- EnsembleLearner - A fitted ensemble of learners for classification.
- EnsembleLearnerParams - A helper struct for building a set of Ensemble Learner hyper-parameters.
- EnsembleLearnerValidParams - The set of valid hyper-parameters that can be specified for the fitting procedure of the Ensemble Learner.
Type Aliases§
- RandomForest - A fitted ensemble of Decision Trees trained on a random subset of features.
- RandomForestParams - A helper struct for building a set of Random Forest hyper-parameters.