# ferrolearn-tree
Decision tree and ensemble tree models for the ferrolearn machine learning framework. Validated head-to-head against scikit-learn 1.8.0; see the workspace `BENCHMARKS.md` for the full report.
## Algorithms
| Model | Description |
|---|---|
| `DecisionTreeClassifier` / `DecisionTreeRegressor` | CART trees with Gini / entropy / MSE / MAE splitting |
| `ExtraTreeClassifier` / `ExtraTreeRegressor` | Extremely randomized single trees |
| `RandomForestClassifier` / `RandomForestRegressor` | Bagging ensemble with per-split random feature sampling (Breiman 2001), parallelized via Rayon |
| `ExtraTreesClassifier` / `ExtraTreesRegressor` | Bagging ensemble of extremely randomized trees |
| `GradientBoostingClassifier` / `GradientBoostingRegressor` | Sequential gradient boosting |
| `HistGradientBoostingClassifier` / `HistGradientBoostingRegressor` | Histogram-based gradient boosting (256-bin; see the binning sketch after this table) |
| `AdaBoostClassifier` / `AdaBoostRegressor` | Adaptive boosting (default `algorithm = SAMME` to match sklearn ≥ 1.4) |
| `BaggingClassifier` / `BaggingRegressor` | Generic bagging meta-estimator |
| `VotingClassifier` / `VotingRegressor` | Hard / soft voting ensembles |
| `IsolationForest` | Outlier / anomaly detection |
| `RandomTreesEmbedding` | Tree-based feature transformation |
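The histogram-based variants speed up split finding by quantizing each feature into at most 256 integer bins up front, so each split scans bin boundaries rather than raw values. The sketch below shows the general idea with simple equal-width binning; it is illustrative only, and the function name and binning scheme are assumptions, not the crate's internals:

```rust
// Illustrative sketch of 256-bin feature quantization as used by
// histogram-based gradient boosting; not ferrolearn-tree's internal code.
fn bin_feature(values: &[f64], n_bins: usize) -> Vec<u8> {
    assert!(n_bins <= 256, "bin indices must fit in a u8");
    let (min, max) = values
        .iter()
        .fold((f64::INFINITY, f64::NEG_INFINITY), |(lo, hi), &v| {
            (lo.min(v), hi.max(v))
        });
    let width = (max - min) / n_bins as f64;
    values
        .iter()
        .map(|&v| {
            // Clamp so the maximum value lands in the last bin.
            let idx = ((v - min) / width) as usize;
            idx.min(n_bins - 1) as u8
        })
        .collect()
}

fn main() {
    let binned = bin_feature(&[0.1, 0.5, 0.9, 1.3, 7.7], 256);
    println!("{binned:?}");
    // Split finding now evaluates at most 256 candidate thresholds per
    // feature, instead of one per distinct raw value.
}
```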
## Example
```rust
use ferrolearn_tree::RandomForestClassifier;
use ndarray::array;
use ndarray::Array2;

// 4 samples, 2 features; the label follows the first feature.
let x = Array2::from_shape_vec((4, 2), vec![0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0]).unwrap();
let y = array![0, 0, 1, 1];

let model = RandomForestClassifier::new()
    .with_n_estimators(100)
    .with_max_features(1); // candidate features per split

let fitted = model.fit(&x, &y).unwrap();
let predictions = fitted.predict(&x).unwrap();
```
All tree hyperparameters (`max_depth`, `min_samples_split`, `min_samples_leaf`, `max_features`, `criterion`, `random_state`, …) are configurable via builder methods.
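For instance, a fully configured tree might look like the following sketch. The builder names below are assumed from the `with_*` pattern in the example above and the hyperparameter list; the exact signatures and criterion values may differ in the real crate:

```rust
use ferrolearn_tree::DecisionTreeClassifier;

// Hypothetical sketch: method names follow the `with_*` builder pattern,
// but their exact signatures are assumptions.
let model = DecisionTreeClassifier::new()
    .with_max_depth(8)
    .with_min_samples_split(4)
    .with_min_samples_leaf(2)
    .with_criterion("entropy")
    .with_random_state(42);
```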
## sklearn parity highlights (0.3.0)
- `RandomForest{Classifier,Regressor}` were fixed to do per-split feature sampling (Breiman 2001) instead of a fixed per-tree subset, closing a −16 pp accuracy gap at medium scale (illustrated in the sketch below).
- The `AdaBoostClassifier` default changed from `SAMME.R` to `SAMME` to match scikit-learn ≥ 1.4 (which deprecated `SAMME.R` in 1.6), closing a −19 pp gap at small scale.
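To make the first fix concrete, the sketch below contrasts the two sampling strategies. It is an illustration of Breiman-style per-split sampling using the `rand` crate, not ferrolearn-tree's actual internals:

```rust
use rand::rngs::StdRng;
use rand::seq::index::sample;
use rand::SeedableRng;

// Illustration only (not ferrolearn-tree internals): per Breiman (2001),
// a random forest draws a fresh candidate-feature subset at EVERY split.
fn grow_tree(rng: &mut StdRng, n_features: usize, max_features: usize, n_splits: usize) {
    // The pre-0.3.0 bug drew the subset once here, outside the loop,
    // so every split in a tree saw the same feature pool.
    for split in 0..n_splits {
        let candidates = sample(rng, n_features, max_features).into_vec();
        println!("split {split}: candidate features {candidates:?}");
        // ... score Gini / entropy / MSE over `candidates`, keep the best
    }
}

fn main() {
    let mut rng = StdRng::seed_from_u64(0);
    grow_tree(&mut rng, 10, 3, 4);
}
```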
## License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.