MiniBoosts is a library for boosting algorithm researchers.
What is Boosting?
Boosting is a repeated game between a Booster and a Weak Learner.
For each round of the game,
- The Booster chooses a distribution over training examples,
- Then the Weak Learner chooses a hypothesis (function) whose accuracy w.r.t. the distribution is slightly better than random guessing.
After sufficiently many rounds, the Booster outputs a hypothesis that performs significantly better on the training examples.
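The protocol above can be sketched in plain Rust. The following is a minimal, self-contained illustration that uses the AdaBoost update and a threshold-stump weak learner on one-dimensional data. It does not use the MiniBoosts API; all names (`Stump`, `best_stump`, `adaboost`, `predict`) are made up for this sketch.

```rust
/// A decision stump on 1-D data: predicts `sign` if x > threshold, else `-sign`.
#[derive(Clone, Copy, Debug)]
struct Stump {
    threshold: f64,
    sign: f64,
}

impl Stump {
    fn predict(&self, x: f64) -> f64 {
        if x > self.threshold { self.sign } else { -self.sign }
    }
}

/// Weak Learner: returns the stump with the smallest weighted error w.r.t. `dist`.
fn best_stump(xs: &[f64], ys: &[f64], dist: &[f64]) -> (Stump, f64) {
    let mut best = (Stump { threshold: f64::NEG_INFINITY, sign: 1.0 }, f64::INFINITY);
    // Candidate thresholds: below every point, and at each data point.
    let candidates = std::iter::once(f64::NEG_INFINITY).chain(xs.iter().copied());
    for threshold in candidates {
        for sign in [1.0, -1.0] {
            let stump = Stump { threshold, sign };
            // Weighted error: total mass of the misclassified examples.
            let err: f64 = xs.iter().zip(ys).zip(dist)
                .filter(|((x, y), _)| stump.predict(**x) != **y)
                .map(|(_, d)| *d)
                .sum();
            if err < best.1 {
                best = (stump, err);
            }
        }
    }
    best
}

/// Booster: plays up to `rounds` rounds of AdaBoost and returns weighted stumps.
fn adaboost(xs: &[f64], ys: &[f64], rounds: usize) -> Vec<(f64, Stump)> {
    let n = xs.len();
    let mut dist = vec![1.0 / n as f64; n]; // start from the uniform distribution
    let mut ensemble = Vec::new();
    for _ in 0..rounds {
        // The Weak Learner answers the Booster's distribution with a hypothesis.
        let (stump, err) = best_stump(xs, ys, &dist);
        if err >= 0.5 {
            break; // no better than random guessing
        }
        // Clamp the error so the weight stays finite when the stump is perfect.
        let err = err.max(1e-10);
        let alpha = 0.5 * ((1.0 - err) / err).ln();
        ensemble.push((alpha, stump));
        // The Booster re-weights: misclassified examples gain mass.
        for i in 0..n {
            dist[i] *= (-alpha * ys[i] * stump.predict(xs[i])).exp();
        }
        let total: f64 = dist.iter().sum();
        dist.iter_mut().for_each(|d| *d /= total);
    }
    ensemble
}

/// Combined hypothesis: sign of the weighted vote.
fn predict(ensemble: &[(f64, Stump)], x: f64) -> f64 {
    let score: f64 = ensemble.iter().map(|(a, h)| a * h.predict(x)).sum();
    if score > 0.0 { 1.0 } else { -1.0 }
}
```

On the toy sample `xs = [1.0, 2.0, 3.0]` with labels `[1.0, -1.0, 1.0]`, no single stump classifies every point correctly, yet the weighted vote after three rounds does.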
How to use this library
Add the following to your Cargo.toml:
[dependencies]
miniboosts = { version = "0.3.5" }
All boosting algorithms are implemented without Gurobi. However, if you have a Gurobi license, you can use the Gurobi version of the algorithms by setting:
[dependencies]
miniboosts = { version = "0.3.5", features = ["gurobi"] }
[!CAUTION] Since I am no longer a student, I cannot check whether compilation succeeds with the "gurobi" flag.
Currently, the following boosting algorithms are available:
| BOOSTER | FEATURE FLAG |
|---|---|
| AdaBoost by Freund and Schapire, 1997 | |
| MadaBoost by Domingo and Watanabe, 2000 | |
| GBM (Gradient Boosting Machine) by Jerome H. Friedman, 2001 | |
| LPBoost by Demiriz, Bennett, and Shawe-Taylor, 2002 | gurobi |
| SmoothBoost by Servedio, 2003 | |
| AdaBoostV by Rätsch and Warmuth, 2005 | |
| TotalBoost by Warmuth, Liao, and Rätsch, 2006 | gurobi |
| SoftBoost by Warmuth, Glocer, and Rätsch, 2007 | gurobi |
| ERLPBoost by Warmuth, Glocer, and Vishwanathan, 2008 | gurobi |
| CERLPBoost (Corrective ERLPBoost) by Shalev-Shwartz and Singer, 2010 | gurobi |
| MLPBoost by Mitsuboshi, Hatano, and Takimoto, 2022 | gurobi |
| GraphSepBoost (Graph Separation Boosting) by Alon, Gonen, Hazan, and Moran, 2023 | |
If you invent a new boosting algorithm,
you can introduce it by implementing the Booster trait.
Run cargo doc -F gurobi --open for details.
| WEAK LEARNER |
|---|
| Decision Tree |
| Regression Tree |
| A worst-case weak learner for LPBoost |
| Gaussian Naive Bayes |
| Neural Network (Experimental) |
Why MiniBoosts?
If you write a paper about boosting algorithms, you need to compare your algorithm against others. At this point, some issues arise.
- Some boosting algorithms, such as LightGBM or XGBoost, are implemented and freely available. They are very easy to use from Python3 but hard to compare against other algorithms because they are implemented in C++ internally. Implementing your own algorithm in Python3 makes the running-time comparison unfair (Python3 is significantly slower than C++). However, implementing it in C++ is extremely hard (based on my experience).
- Most boosting algorithms are designed for a decision-tree weak learner, even though the boosting protocol does not require one.
- There is no implementation of margin optimization boosting algorithms. Margin optimization is a better goal than empirical risk minimization in binary classification.
MiniBoosts is a crate to address the above issues.
This crate provides the following.
- Two main traits, named Booster and WeakLearner.
  - If you invent a new boosting algorithm, all you need is to implement Booster.
  - If you invent a new weak learning algorithm, all you need is to implement WeakLearner.
- Some famous boosting algorithms, including AdaBoost, LPBoost, ERLPBoost, etc.
- Some weak learners, including Decision-Tree, Regression-Tree, etc.
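To illustrate why two traits are enough to decouple the two roles, here is a minimal sketch. The trait names `SketchBooster` and `SketchWeakLearner`, their method signatures, and the two toy implementations are hypothetical and deliberately simplified; the crate's actual Booster and WeakLearner traits may look different, so consult the generated documentation for the real definitions.

```rust
/// Hypothetical weak-learner interface: given a distribution over examples,
/// return a hypothesis mapping a feature value to a label in {-1, +1}.
trait SketchWeakLearner {
    fn produce(&self, xs: &[f64], ys: &[f64], dist: &[f64]) -> Box<dyn Fn(f64) -> f64>;
}

/// Hypothetical booster interface: any booster can drive any weak learner
/// and returns a list of (weight, hypothesis) pairs.
trait SketchBooster {
    fn run(
        &mut self,
        weak_learner: &dyn SketchWeakLearner,
        xs: &[f64],
        ys: &[f64],
    ) -> Vec<(f64, Box<dyn Fn(f64) -> f64>)>;
}

/// Toy weak learner: always predicts the label that carries more weight.
struct MajorityLearner;

impl SketchWeakLearner for MajorityLearner {
    fn produce(&self, _xs: &[f64], ys: &[f64], dist: &[f64]) -> Box<dyn Fn(f64) -> f64> {
        // Weighted sum of labels: non-negative means +1 carries at least half the mass.
        let edge: f64 = ys.iter().zip(dist).map(|(y, d)| y * d).sum();
        let label = if edge >= 0.0 { 1.0 } else { -1.0 };
        Box::new(move |_x: f64| label)
    }
}

/// Toy booster: plays a single round with the uniform distribution.
struct OneRoundBooster;

impl SketchBooster for OneRoundBooster {
    fn run(
        &mut self,
        weak_learner: &dyn SketchWeakLearner,
        xs: &[f64],
        ys: &[f64],
    ) -> Vec<(f64, Box<dyn Fn(f64) -> f64>)> {
        let n = xs.len();
        let dist = vec![1.0 / n as f64; n];
        vec![(1.0, weak_learner.produce(xs, ys, &dist))]
    }
}
```

Because the booster only sees the weak learner through the trait object, swapping in a different weak learner requires no change to the booster, and vice versa.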
MiniBoosts for research
Sometimes, one wants to log each step of the boosting procedure.
You can use the Logger struct to write the log to a .csv file
while printing the status.
See the Research feature section for details.
Examples
Add the following to Cargo.toml:
miniboosts = { version = "0.3.5" }
If you want to use the gurobi features, enable the flag:
miniboosts = { version = "0.3.5", features = ["gurobi"] }
Here is some sample code:
use miniboosts::prelude::*;
If you use boosting for soft margin optimization, initialize the booster like this:
let n_sample = sample.shape().0; // Get the number of training examples.
let nu = n_sample as f64 * 0.2; // Set the upper bound on the number of outliers.
let lpboost = LPBoost::init(&sample)
    .tolerance(tol)
    .nu(nu); // Set a capping parameter.
Note that the capping parameter must satisfy 1 <= nu && nu <= n_sample.
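As a small illustration of the constraint above, the following sketch derives nu from an outlier fraction and rejects values outside the allowed range. The helper names `capping_parameter` and `weight_cap` are hypothetical and not part of MiniBoosts. The second helper reflects the usual role of nu in soft margin boosting, where each example's weight in the distribution is capped at 1/nu.

```rust
/// Hypothetical helper: turn an outlier fraction into a capping parameter `nu`
/// and reject values outside the range 1 <= nu <= n_sample.
fn capping_parameter(n_sample: usize, outlier_fraction: f64) -> Result<f64, String> {
    let nu = n_sample as f64 * outlier_fraction;
    if nu >= 1.0 && nu <= n_sample as f64 {
        Ok(nu)
    } else {
        Err(format!("nu = {} must lie in [1, {}]", nu, n_sample))
    }
}

/// With capping parameter `nu`, no single example may carry more than 1/nu
/// of the distribution's mass.
fn weight_cap(nu: f64) -> f64 {
    1.0 / nu
}
```

With 100 examples and an outlier fraction of 0.2, nu is 20 and each example's weight is capped at 0.05. Setting nu = 1 lets a single example carry all the mass, which corresponds to the hard margin case.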
Research feature
This crate can write values such as the loss at each boosting step to a CSV file.
Here is an example:
use miniboosts::prelude::*;
use ;
// Define a loss function
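The example above is incomplete in this text. As a stand-in, here is a minimal sketch of the idea in plain Rust: define a loss function, evaluate it on the combined hypothesis after each round, and emit one CSV row per round. The function names and the CSV columns are assumptions made for this illustration; they are not the crate's Logger API.

```rust
/// Zero-one loss: fraction of examples whose predicted sign disagrees with the label.
fn zero_one_loss(predictions: &[f64], labels: &[f64]) -> f64 {
    let mistakes = predictions
        .iter()
        .zip(labels)
        .filter(|(p, y)| p.signum() != y.signum())
        .count();
    mistakes as f64 / labels.len() as f64
}

/// Builds CSV text with one row per round: the round index and the training loss.
/// `scores_per_round[t]` holds the combined hypothesis' scores after round t + 1.
fn loss_log_csv(scores_per_round: &[Vec<f64>], labels: &[f64]) -> String {
    let mut csv = String::from("round,train_loss\n");
    for (t, scores) in scores_per_round.iter().enumerate() {
        let loss = zero_one_loss(scores, labels);
        csv.push_str(&format!("{},{}\n", t + 1, loss));
    }
    csv
}
```

The returned string can be saved with `std::fs::write` and plotted to compare loss curves across algorithms.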
Others
- Currently, this crate mainly supports boosting algorithms for binary classification.
- Some boosting algorithms use the Gurobi optimizer, so you must acquire a license to use them. If you have a license, you can use these boosting algorithms (boosters) by specifying features = ["gurobi"] in Cargo.toml. Compilation fails if you try to use the gurobi feature without a Gurobi license.
- You can log your algorithm by implementing the Research trait.
- Run cargo doc -F gurobi --open to see more information.
- GraphSepBoost only supports the aggregation rule shown in Lemma 4.2 of their paper.
Future work
- Boosters
- Weak Learners
  - Bag of words
  - TF-IDF
  - RBF-Net