A crate that provides some boosting algorithms.
All the boosting algorithms in this crate, except LPBoost,
have a theoretical bound on the number of iterations
needed to find a combined hypothesis.
This crate includes three types of boosting algorithms:
- Empirical risk minimizing (ERM) boosting
- Hard margin maximizing boosting
- Soft margin maximizing boosting
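For example, AdaBoost is commonly used as an ERM-style booster, while LPBoost (used in the example below) maximizes the soft margin. The following sketch, not verbatim crate code, assumes that AdaBoost exposes the same init/tolerance builder methods that LPBoost uses in the example below; switching between algorithm families then only changes the initialization.
use miniboosts::prelude::*;
// A sketch: assumes `AdaBoost` follows the same builder pattern as `LPBoost`.
fn build_boosters(sample: &Sample, n_sample: f64) {
    // Empirical risk minimizing booster.
    let _erm = AdaBoost::init(sample)
        .tolerance(0.01);
    // Soft margin maximizing booster with capping parameter `nu`.
    let _soft_margin = LPBoost::init(sample)
        .tolerance(0.01)
        .nu(0.1 * n_sample);
}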
This crate also includes some weak learners.
- Classification: DecisionTree, NeuralNetwork, GaussianNB, and BadBaseLearner (the bad base learner for LPBoost).
- Regression: RegressionTree. Note that the current implementation is not efficient.
§Example
The following code shows a small example of running LPBoost.
use miniboosts::prelude::*;
// Read the training sample from the CSV file.
// We use the column named `class` as the label.
let path = "path/to/dataset.csv";
let sample = SampleReader::new()
.file(path)
.has_header(true)
.target_feature("class")
.read()
.unwrap();
// Get the number of training examples.
let n_sample = sample.shape().0 as f64;
// Initialize `LPBoost` and set the tolerance parameter as `0.01`.
// This means `booster` returns a hypothesis whose training error is
// less than `0.01` if the training examples are linearly separable.
// Note that the default tolerance parameter is set as `1 / n_sample`,
// where `n_sample = sample.shape().0` is
// the number of training examples in `sample`.
// Further, at the end of this chain,
// LPBoost calls `LPBoost::nu` to set the capping parameter
// as `0.1 * n_sample`, which means that,
// at most, `0.1 * n_sample` examples are regarded as outliers.
let booster = LPBoost::init(&sample)
.tolerance(0.01)
.nu(0.1 * n_sample);
// Set the weak learner with setting parameters.
let weak_learner = DecisionTreeBuilder::new(&sample)
.max_depth(2)
.criterion(Criterion::Entropy)
.build();
// Run `LPBoost` and obtain the resulting hypothesis `f`.
let f = booster.run(&weak_learner);
// Get the predictions on the training set.
let predictions = f.predict_all(&sample);
// Calculate the training loss.
let target = sample.target();
let training_loss = target.into_iter()
.zip(predictions)
.map(|(&y, fx)| if y as i64 == fx { 0.0 } else { 1.0 })
.sum::<f64>()
/ n_sample;
println!("Training Loss is: {training_loss}");Re-exports§
Re-exports§
- pub use research::Logger;
- pub use research::LoggerBuilder;
- pub use research::CrossValidation;
- pub use research::objective_functions::SoftMarginObjective;
- pub use research::objective_functions::HardMarginObjective;
- pub use research::objective_functions::ExponentialLoss;
Modules§
- prelude: Exports the standard boosting algorithms and traits.
- research: This module provides some features for research, such as measuring quantities of a boosting algorithm per iteration.
Structs§
- AdaBoost: The AdaBoost algorithm proposed by Robert E. Schapire and Yoav Freund.
- AdaBoostV: The AdaBoostV algorithm, proposed by Rätsch and Warmuth. AdaBoostV, also known as AdaBoost_{ν}^{★}, is a boosting algorithm proposed in the following paper:
- BadBaseLearner: The worst-case weak learner for LPBoost.
- BadBaseLearnerBuilder: A struct that builds BadBaseLearner.
- BadClassifier: A hypothesis returned by BadBaseLearner. This struct is used for demonstrating the worst-case LPBoost behavior.
- CERLPBoost: The Corrective ERLPBoost algorithm, proposed in the following paper:
- DecisionTree: The Decision Tree algorithm. Given a set of training examples for classification and a distribution over the set, DecisionTree outputs a decision tree classifier named DecisionTreeClassifier under the specified parameters.
- DecisionTreeBuilder: A struct that builds DecisionTree. DecisionTreeBuilder keeps parameters for constructing DecisionTree.
- DecisionTreeClassifier: Decision tree classifier. This struct is just a wrapper of Node.
- ERLPBoost: The ERLPBoost algorithm proposed in the following paper:
- GBM: The Gradient Boosting Machine proposed in the following paper:
- GaussianNB: A factory that produces a GaussianNBClassifier for a given distribution over training examples. The struct name comes from scikit-learn.
- GraphSepBoost: The Graph Separation Boosting algorithm proposed by Robert E. Schapire and Yoav Freund.
- LPBoost: The LPBoost algorithm proposed by Demiriz, Bennett, and Shawe-Taylor. LPBoost is originally proposed in the following paper:
- MLPBoost: The MLPBoost algorithm, shorthand for the Modified LPBoost algorithm, proposed in the following paper:
- MadaBoost: The MadaBoost algorithm proposed by Carlos Domingo and Osamu Watanabe, 2000.
- NBayesClassifier: Naive Bayes classifier.
- NNClassifier: A wrapper for NNHypothesis.
- NNHypothesis: A neural network hypothesis, produced by NeuralNetwork.
- NNRegressor: A wrapper for NNHypothesis.
- NaiveAggregation: The naive aggregation rule. See the following paper for example:
- NeuralNetwork: A neural network weak learner. Since this is just a weak learner, a shallow network is preferred. Of course, you can use a deep network if you don’t care about running time.
- RegressionTree: RegressionTree is the factory that generates a RegressionTreeClassifier for a given distribution over examples.
- RegressionTreeBuilder: A struct that builds RegressionTree. RegressionTreeBuilder keeps parameters for constructing RegressionTree.
- RegressionTreeRegressor: Regression Tree regressor. This struct is just a wrapper of Node.
- Sample: Struct Sample holds a batch sample in dense/sparse format.
- SampleReader: A struct that returns Sample. Using this struct, one can read a CSV/SVMLight format file into Sample. Other formats are not supported yet.
- SmoothBoost: SmoothBoost. Variable names, such as kappa, gamma, and theta, come from the original paper. Note that SmoothBoost needs to know the weak learner guarantee gamma. See Figure 1 in this paper: Smooth Boosting and Learning with Malicious Noise by Rocco A. Servedio.
- SoftBoost: The SoftBoost algorithm proposed in the following paper:
- TotalBoost: The TotalBoost algorithm proposed in the following paper: Manfred K. Warmuth, Jun Liao, and Gunnar Rätsch - Totally corrective boosting algorithms that maximize the margin.
- WeightedMajority: A struct that the boosting algorithms in this library return. You can read/write this struct via the Serde traits. (A minimal serialization sketch follows this list.)
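Since WeightedMajority is stated above to support Serde, a combined hypothesis can be written to disk. This is a minimal sketch, assuming serde_json is added as a dependency and that f is the combined hypothesis returned by a booster, as in the example at the top of this page.
// A sketch: assumes `WeightedMajority` implements `serde::Serialize`
// (as stated above) and that the `serde_json` crate is available.
let json = serde_json::to_string(&f)
    .expect("failed to serialize the combined hypothesis");
std::fs::write("hypothesis.json", &json)
    .expect("failed to write the hypothesis to disk");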
Enums§
- Activation: Activation functions available to neural networks.
- Criterion: Splitting criteria for growing decision trees.
- FWType: Frank-Wolfe update types. These options correspond to the Frank-Wolfe strategies.
- Feature: An enumeration of sparse/dense features.
- GBMLoss: Some well-known loss functions.
- NNLoss: Loss functions available to neural networks.
Traits§
- Booster: Booster trait. The trait Booster defines the standard framework of boosting. Here, the standard framework is defined as a repeated game between the Booster and the Weak Learner of the following form:
- Classifier: A trait that defines the behavior of a classifier. You only need to implement the confidence method.
- LossFunction: This trait defines the loss functions.
- Regressor: A trait that defines the behavior of a regressor. You only need to implement the predict method.
- WeakLearner: An interface that returns a struct of type Hypothesis.