Available on crate feature
alloc only.Expand description
Hoeffding Tree Classifier for streaming multi-class classification.
HoeffdingTreeClassifier is a streaming decision tree that grows
incrementally using the VFDT algorithm (Domingos & Hulten, 2000). Unlike
the existing super::hoeffding::HoeffdingTree (which is a gradient-based
regressor), this classifier maintains per-leaf class distributions and
splits using information gain.
§Algorithm
For each incoming (features, class_label) pair:
- Route the sample from root to a leaf via threshold comparisons.
- At the leaf, update class counts and per-feature histogram bins.
- Once enough samples arrive (grace period), evaluate candidate splits using information gain (entropy reduction).
- Apply the Hoeffding bound: if the gap between the best and second-best
gain exceeds
epsilon = sqrt(R^2 * ln(1/delta) / (2n)), commit the split. - Split the leaf into two children, partitioning class counts by the chosen threshold.
§Key differences from the regressor
- Leaves store class counts, not gradient/hessian sums.
- Split criterion: Information Gain (reduction in entropy).
- Prediction: majority class or class probability distribution.
- No loss function – direct classification.
§Examples
ⓘ
use irithyll::tree::hoeffding_classifier::HoeffdingClassifierConfig;
use irithyll::tree::hoeffding_classifier::HoeffdingTreeClassifier;
let config = HoeffdingClassifierConfig::builder()
.max_depth(6)
.delta(1e-5)
.grace_period(50)
.n_bins(16)
.build()
.unwrap();
let mut tree = HoeffdingTreeClassifier::new(config);
// Train: class 0 when x[0] < 5, class 1 otherwise
for i in 0..200 {
let x = (i as f64) / 20.0;
let class = if x < 5.0 { 0 } else { 1 };
tree.train_one(&[x], class);
}
// After training, the tree can predict class labels.
let pred = tree.predict_class(&[2.0]);
assert!(pred == 0 || pred == 1);Structs§
- Hoeffding
Classifier Config - Configuration for
HoeffdingTreeClassifier. - Hoeffding
Classifier Config Builder - Builder for
HoeffdingClassifierConfig. - Hoeffding
Tree Classifier - A streaming decision tree classifier based on the VFDT algorithm.