Skip to main content

HoeffdingTree

Struct HoeffdingTree 

Source
pub struct HoeffdingTree { /* private fields */ }
Available on crate feature alloc only.
Expand description

A streaming decision tree that uses Hoeffding-bound split decisions.

The tree grows incrementally: each call to train_one routes one sample to its leaf, updates histograms, and potentially triggers a split when statistical evidence is sufficient.

§Feature subsampling

When config.feature_subsample_rate < 1.0, each split evaluation considers only a random subset of features (selected via a deterministic xorshift64 RNG). This adds diversity when the tree is used inside an ensemble.

Implementations§

Source§

impl HoeffdingTree

Source

pub fn new(config: TreeConfig) -> Self

Create a new HoeffdingTree with the given configuration.

The tree starts with a single root leaf and no feature information; the number of features is inferred from the first training sample.

Source

pub fn from_arena( config: TreeConfig, arena: TreeArena, n_features: Option<usize>, samples_seen: u64, rng_state: u64, ) -> Self

Reconstruct a HoeffdingTree from a pre-built arena.

Used during model deserialization. The tree is restored with node topology and leaf values intact, but histogram accumulators are empty (they will rebuild naturally from continued training).

The root is assumed to be NodeId(0). Leaf states are created empty for all current leaf nodes in the arena.

Source

pub fn root(&self) -> NodeId

Root node identifier.

Source

pub fn arena(&self) -> &TreeArena

Immutable access to the underlying arena.

Source

pub fn tree_config(&self) -> &TreeConfig

Immutable access to the tree configuration.

Source

pub fn n_features(&self) -> Option<usize>

Number of features (learned from the first sample, None before any training).

Source

pub fn rng_state(&self) -> u64

Current RNG state (for deterministic checkpoint/restore).

Source

pub fn leaf_grad_hess(&self, node: NodeId) -> Option<(f64, f64)>

Read-only access to the gradient and hessian sums for a leaf node.

Returns Some((grad_sum, hess_sum)) if node is a leaf with an active leaf state, or None if the node has no state (e.g. internal node or freshly allocated).

These sums enable inverse-hessian confidence estimation: confidence = 1.0 / (hess_sum + lambda). High hessian means the leaf has seen consistent, informative data; low hessian means uncertainty.

Source

pub fn predict_smooth(&self, features: &[f64], bandwidth: f64) -> f64

Predict using sigmoid-blended soft routing for smooth interpolation.

Instead of hard left/right routing at each split node, uses sigmoid blending: alpha = sigmoid((threshold - feature) / bandwidth). The prediction is alpha * left_pred + (1 - alpha) * right_pred, computed recursively from root to leaves.

The result is a continuous function that varies smoothly with every feature change — no bins, no boundaries, no jumps.

§Arguments
  • features - Input feature vector.
  • bandwidth - Controls transition sharpness. Smaller = sharper (closer to hard splits), larger = smoother.
Source

pub fn predict_smooth_auto(&self, features: &[f64], bandwidths: &[f64]) -> f64

Predict using per-feature auto-calibrated bandwidths.

Each feature uses its own bandwidth derived from median split threshold gaps. Features with f64::INFINITY bandwidth fall back to hard routing.

Source

pub fn predict_interpolated(&self, features: &[f64]) -> f64

Predict with parent-leaf linear interpolation.

Routes to the leaf but blends the leaf prediction with the parent node’s preserved prediction based on the leaf’s hessian sum. Fresh leaves (low hess_sum) smoothly transition from parent prediction to their own:

alpha = leaf_hess / (leaf_hess + lambda) pred = alpha * leaf_pred + (1 - alpha) * parent_pred

This fixes static predictions from leaves that split but haven’t accumulated enough samples to outperform their parent.

Source

pub fn predict_sibling_interpolated( &self, features: &[f64], bandwidths: &[f64], ) -> f64

Predict with sibling-based interpolation for feature-continuous predictions.

At the leaf’s parent split, blends the leaf prediction with its sibling’s prediction based on the feature’s distance from the split threshold:

Within the margin m around the threshold: t = (feature - threshold + m) / (2 * m) (0 at left edge, 1 at right edge) pred = (1 - t) * left_pred + t * right_pred

Outside the margin, returns the routed child’s prediction directly. The margin m is derived from auto-bandwidths if available, otherwise defaults to feature_range / n_bins heuristic per feature.

This makes predictions vary continuously as features move near split boundaries, eliminating step-function artifacts.

Source

pub fn collect_split_thresholds_per_feature(&self) -> Vec<Vec<f64>>

Collect all split thresholds per feature from the tree arena.

Returns a Vec<Vec<f64>> indexed by feature, containing all thresholds used in continuous splits. Categorical splits are excluded.

Trait Implementations§

Source§

impl Clone for HoeffdingTree

Source§

fn clone(&self) -> Self

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl StreamingTree for HoeffdingTree

Source§

fn train_one(&mut self, features: &[f64], gradient: f64, hessian: f64)

Train the tree on a single sample.

Routes the sample to its leaf, updates histogram accumulators, and attempts a split if the Hoeffding bound is satisfied.

Source§

fn predict(&self, features: &[f64]) -> f64

Predict the leaf value for a feature vector.

Routes from the root to a leaf via threshold comparisons and returns the leaf’s current weight.

Source§

fn n_leaves(&self) -> usize

Current number of leaf nodes.

Source§

fn n_samples_seen(&self) -> u64

Total number of samples seen since creation.

Source§

fn reset(&mut self)

Reset to initial state with a single root leaf.

Source§

fn split_gains(&self) -> &[f64]

Accumulated split gains per feature for importance tracking. Read more
Source§

fn predict_with_variance(&self, features: &[f64]) -> (f64, f64)

Predict the leaf value and its variance for confidence estimation. Read more
Source§

impl Send for HoeffdingTree

Source§

impl Sync for HoeffdingTree

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.