Crate linfa_ensemble


§Ensemble Learning Algorithms

Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator.

This crate (linfa-ensemble) provides pure Rust implementations of popular ensemble techniques, including the following.

§Bootstrap Aggregation (aka Bagging)

A typical example of an ensemble method is Bootstrap Aggregation, which combines the predictions of several decision trees (see linfa-trees), each trained on a different bootstrap sample of the training dataset.
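The two ingredients of bagging, drawing a bootstrap sample and aggregating the learners' predictions, can be sketched in plain Rust. This is a minimal illustration, not the crate's implementation: it assumes sampling with replacement and a simple majority vote, and `XorShift64`, `bootstrap_indices`, and `majority_vote` are hypothetical helpers introduced only for this sketch.

```rust
use std::collections::HashMap;

/// Tiny deterministic pseudo-random generator (xorshift64), used only to keep
/// this sketch dependency-free. Hypothetical helper, not part of linfa.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Draw `proportion * n_samples` row indices with replacement
/// (assumption: this is what "bootstrap sampling" means here).
fn bootstrap_indices(n_samples: usize, proportion: f64, rng: &mut XorShift64) -> Vec<usize> {
    let n_draws = (n_samples as f64 * proportion).round() as usize;
    (0..n_draws)
        .map(|_| (rng.next_u64() % n_samples as u64) as usize)
        .collect()
}

/// Aggregate the predictions of all learners for one sample by majority vote.
/// Ties are broken in favor of the smallest label so the result is deterministic.
fn majority_vote(votes: &[usize]) -> Option<usize> {
    let mut counts: HashMap<usize, usize> = HashMap::new();
    for &label in votes {
        *counts.entry(label).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
        .map(|(label, _)| label)
}

fn main() {
    let mut rng = XorShift64(42);
    // Rows that one learner would be trained on (70% of 10 rows).
    let indices = bootstrap_indices(10, 0.7, &mut rng);
    println!("bootstrap sample: {:?}", indices);
    // Five learners voted for classes 0, 2, 2, 1 and 2.
    println!("aggregated label: {:?}", majority_vote(&[0, 2, 2, 1, 2]));
}
```

Because indices are drawn with replacement, a row can appear several times in one sample and not at all in another, which is what makes the individual trees differ.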

§Random Forest

Random Forest is a special case of Bootstrap Aggregation in which each decision tree (see linfa-trees) is also trained on a random subset of the features. A typical number of features to select is $\sqrt{p}$, where $p$ is the number of available features.
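Since the Random Forest example on this page passes a fraction to `feature_proportion` rather than a feature count, the $\sqrt{p}$ rule of thumb has to be converted into a proportion. The helper below is a hypothetical convenience for illustration only and is not provided by the crate.

```rust
/// Fraction of features that corresponds to the sqrt(p) rule of thumb.
/// Hypothetical helper: sqrt(p) features out of p is a proportion of sqrt(p) / p.
fn sqrt_feature_proportion(n_features: usize) -> f64 {
    assert!(n_features > 0, "need at least one feature");
    let p = n_features as f64;
    p.sqrt() / p
}

fn main() {
    // Iris has 4 features: sqrt(4) = 2 features, i.e. half of them.
    println!("iris: {}", sqrt_feature_proportion(4));
    // With 100 features the rule selects 10 of them.
    println!("100 features: {}", sqrt_feature_proportion(100));
}
```

The proportion shrinks as the number of features grows, so a fixed value such as 0.3 selects more features than the rule of thumb would on wide datasets.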

§AdaBoost

AdaBoost (Adaptive Boosting) is a boosting ensemble method that trains weak learners sequentially. Each subsequent learner focuses on the examples that previous learners misclassified by increasing their sample weights. The final prediction is a weighted vote of all learners, where better-performing learners receive higher weights. Unlike bagging methods, boosting creates a strong classifier from weak learners (typically shallow decision trees or “stumps”).
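The reweighting step can be made concrete with a short sketch of one boosting round. It follows the textbook binary AdaBoost formulation with labels in {-1, +1}; the crate may well use a different variant (for example a multi-class one), so `adaboost_round` and `weighted_vote` are hypothetical names for illustration, not the crate's API.

```rust
/// One boosting round (textbook binary AdaBoost, an assumption of this sketch).
/// `misclassified[i]` is true when the current weak learner got sample `i` wrong.
/// Updates the sample weights in place and returns the learner's vote weight.
fn adaboost_round(weights: &mut [f64], misclassified: &[bool]) -> f64 {
    assert_eq!(weights.len(), misclassified.len());
    let total: f64 = weights.iter().sum();

    // Weighted error rate of the current weak learner.
    let mut err = 0.0;
    for (w, &wrong) in weights.iter().zip(misclassified) {
        if wrong {
            err += *w;
        }
    }
    // Clamp to avoid ln(0) or division by zero for a perfect or useless learner.
    let err = (err / total).clamp(1e-10, 1.0 - 1e-10);

    // Better learners (lower error) receive a larger vote weight.
    let alpha = 0.5 * ((1.0 - err) / err).ln();

    // Increase the weight of misclassified samples, decrease the others.
    for (w, &wrong) in weights.iter_mut().zip(misclassified) {
        *w *= if wrong { alpha.exp() } else { (-alpha).exp() };
    }
    // Renormalize so the weights sum to 1 again.
    let new_total: f64 = weights.iter().sum();
    for w in weights.iter_mut() {
        *w /= new_total;
    }
    alpha
}

/// Final prediction for one sample: sign of the alpha-weighted sum of the
/// learners' votes, where each vote is +1 or -1.
fn weighted_vote(alphas: &[f64], votes: &[i8]) -> i8 {
    let score: f64 = alphas
        .iter()
        .zip(votes)
        .map(|(a, &v)| a * f64::from(v))
        .sum();
    if score >= 0.0 { 1 } else { -1 }
}

fn main() {
    // Four samples with uniform weights; the learner misclassifies the first one.
    let mut weights = vec![0.25; 4];
    let alpha = adaboost_round(&mut weights, &[true, false, false, false]);
    println!("alpha = {alpha:.4}, new weights = {weights:?}");
    println!("ensemble vote = {}", weighted_vote(&[0.2, 0.9, 0.3], &[1, -1, 1]));
}
```

In this formulation the misclassified samples hold exactly half of the total weight after the update, which forces the next weak learner to concentrate on them.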

§Example

This example shows how to train a bagging model using 100 decision trees, each trained on 70% of the training data (bootstrap sampling).

use linfa::prelude::{Fit, Predict};
use linfa_ensemble::EnsembleLearnerParams;
use linfa_trees::DecisionTree;
use ndarray_rand::rand::SeedableRng;
use rand::rngs::SmallRng;

// Seed the RNG for reproducibility, then load, shuffle, and split the Iris dataset (80% train, 20% test)
let mut rng = SmallRng::seed_from_u64(42);
let (train, test) = linfa_datasets::iris()
    .shuffle(&mut rng)
    .split_with_ratio(0.8);

// Train the model on the iris dataset
let bagging_model = EnsembleLearnerParams::new(DecisionTree::params())
    .ensemble_size(100)        // Number of decision trees to fit
    .bootstrap_proportion(0.7) // Select only 70% of the data via bootstrap
    .fit(&train)
    .unwrap();

// Make predictions on the test set
let predictions = bagging_model.predict(&test);

This example shows how to train a Random Forest model using 100 decision trees, each trained on 70% of the training data (bootstrap sampling) and using only 30% of the available features.

use linfa::prelude::{Fit, Predict};
use linfa_ensemble::RandomForestParams;
use linfa_trees::DecisionTree;
use ndarray_rand::rand::SeedableRng;
use rand::rngs::SmallRng;

// Seed the RNG for reproducibility, then load, shuffle, and split the Iris dataset (80% train, 20% test)
let mut rng = SmallRng::seed_from_u64(42);
let (train, test) = linfa_datasets::iris()
    .shuffle(&mut rng)
    .split_with_ratio(0.8);

// Train the model on the iris dataset
let random_forest = RandomForestParams::new(DecisionTree::params())
    .ensemble_size(100)        // Number of decision trees to fit
    .bootstrap_proportion(0.7) // Select only 70% of the data via bootstrap
    .feature_proportion(0.3)   // Select only 30% of the features
    .fit(&train)
    .unwrap();

// Make predictions on the test set
let predictions = random_forest.predict(&test);

Structs§

AdaBoost
A fitted AdaBoost ensemble classifier.
AdaBoostParams
A helper struct for building AdaBoost hyperparameters.
AdaBoostValidParams
The set of valid hyperparameters for the AdaBoost algorithm.
EnsembleLearner
A fitted ensemble of learners for classification.
EnsembleLearnerParams
A helper struct for building a set of Ensemble Learner hyper-parameters.
EnsembleLearnerValidParams
The set of valid hyper-parameters that can be specified for the fitting procedure of the Ensemble Learner.

Type Aliases§

RandomForest
A fitted ensemble of Decision Trees trained on a random subset of features.
RandomForestParams
A helper struct for building a set of Random Forest hyper-parameters.