pub struct SGBTConfig {
pub n_steps: usize,
pub learning_rate: f64,
pub feature_subsample_rate: f64,
pub max_depth: usize,
pub n_bins: usize,
pub lambda: f64,
pub gamma: f64,
pub grace_period: usize,
pub delta: f64,
pub drift_detector: DriftDetectorType,
pub variant: SGBTVariant,
pub seed: u64,
pub initial_target_count: usize,
pub leaf_half_life: Option<usize>,
pub max_tree_samples: Option<u64>,
pub split_reeval_interval: Option<usize>,
pub feature_names: Option<Vec<String>>,
pub feature_types: Option<Vec<FeatureType>>,
pub gradient_clip_sigma: Option<f64>,
pub monotone_constraints: Option<Vec<i8>>,
pub quality_prune_alpha: Option<f64>,
pub quality_prune_threshold: f64,
pub quality_prune_patience: u64,
pub error_weight_alpha: Option<f64>,
pub uncertainty_modulated_lr: bool,
pub scale_mode: ScaleMode,
pub empirical_sigma_alpha: f64,
pub max_leaf_output: Option<f64>,
pub adaptive_leaf_bound: Option<f64>,
pub min_hessian_sum: Option<f64>,
pub huber_k: Option<f64>,
pub shadow_warmup: Option<usize>,
pub leaf_model_type: LeafModelType,
pub packed_refresh_interval: u64,
}
Available with alloc only.
Configuration for the SGBT ensemble.
All numeric parameters are validated at build time via SGBTConfigBuilder.
§ Defaults
| Parameter | Default |
|---|---|
| n_steps | 100 |
| learning_rate | 0.0125 |
| feature_subsample_rate | 0.75 |
| max_depth | 6 |
| n_bins | 64 |
| lambda | 1.0 |
| gamma | 0.0 |
| grace_period | 200 |
| delta | 1e-7 |
| drift_detector | PageHinkley(0.005, 50.0) |
| variant | Standard |
| seed | 0xDEAD_BEEF_CAFE_4242 |
| initial_target_count | 50 |
| leaf_half_life | None (disabled) |
| max_tree_samples | None (disabled) |
| split_reeval_interval | None (disabled) |
§ Fields
n_steps: usize
Number of boosting steps (trees in the ensemble). Default 100.
learning_rate: f64
Learning rate (shrinkage). Default 0.0125.
feature_subsample_rate: f64
Fraction of features to subsample per tree. Default 0.75.
max_depth: usize
Maximum tree depth. Default 6.
n_bins: usize
Number of histogram bins. Default 64.
lambda: f64
L2 regularization parameter (lambda). Default 1.0.
gamma: f64
Minimum split gain (gamma). Default 0.0.
grace_period: usize
Grace period: minimum samples before evaluating splits. Default 200.
delta: f64
Hoeffding bound confidence (delta). Default 1e-7.
drift_detector: DriftDetectorType
Drift detector type for tree replacement. Default: PageHinkley.
variant: SGBTVariant
SGBT computational variant. Default: Standard.
seed: u64
Random seed for deterministic reproducibility. Default: 0xDEAD_BEEF_CAFE_4242.
Controls feature subsampling and variant skip/MI stochastic decisions. Two models with the same seed and the same data will produce identical results.
initial_target_count: usize
Number of initial targets to collect before computing the base prediction. Default: 50.
leaf_half_life: Option<usize>
Half-life for exponential leaf decay (in samples per leaf).
After leaf_half_life samples, a leaf’s accumulated gradient/hessian
statistics have half the weight of the most recent sample. This causes
the model to continuously adapt to changing data distributions rather
than freezing on early observations.
None (default) disables decay – traditional monotonic accumulation.
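The half-life relation above can be sketched with the standard per-sample multiplicative decay (the crate's internal formula is not shown in these docs; this is the usual construction):

```rust
// Each new sample multiplies the accumulated statistic by `d`, chosen so
// that after `half_life` samples the old contribution has exactly halved.
fn decay_factor(half_life: usize) -> f64 {
    0.5_f64.powf(1.0 / half_life as f64)
}

fn main() {
    let d = decay_factor(200);
    // Applying the decay 200 times brings the original weight down to 0.5.
    println!("weight after 200 samples: {:.4}", d.powi(200));
}
```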
max_tree_samples: Option<u64>
Maximum samples a single tree processes before proactive replacement.
After this many samples, the tree is replaced with a fresh one regardless of drift detector state. Prevents stale tree structure from persisting when the drift detector is not sensitive enough.
None (default) disables time-based replacement.
split_reeval_interval: Option<usize>
Interval (in samples per leaf) at which max-depth leaves re-evaluate whether a split would improve them.
Inspired by EFDT (Manapragada et al. 2018). When a leaf has accumulated
split_reeval_interval samples since its last evaluation and has reached
max depth, it re-evaluates whether a split should be performed.
None (default) disables re-evaluation – max-depth leaves are permanent.
feature_names: Option<Vec<String>>
Optional human-readable feature names.
When set, enables named_feature_importances and
train_one_named for production-friendly named access.
Length must match the number of features in training data.
feature_types: Option<Vec<FeatureType>>
Optional per-feature type declarations.
When set, declares which features are categorical vs continuous. Categorical features use one-bin-per-category binning and Fisher optimal binary partitioning for split evaluation. Length must match the number of features in training data.
None (default) treats all features as continuous.
gradient_clip_sigma: Option<f64>
Gradient clipping threshold in standard deviations per leaf.
When enabled, each leaf tracks an EWMA of gradient mean and variance.
Incoming gradients outside mean ± gradient_clip_sigma * sigma are
clamped to the boundary. This prevents outlier samples from corrupting
leaf statistics, which is critical in streaming settings where sudden
label floods can destabilize the model.
Typical value: 3.0 (3-sigma clipping).
None (default) disables gradient clipping.
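A minimal sketch of the mechanism, assuming the leaf clamps against its current EWMA statistics and then updates them with the raw gradient (the crate's exact update order is not documented here):

```rust
// Illustrative per-leaf clipper, not the crate's internal type.
struct GradClipper {
    mean: f64,  // EWMA of gradient mean
    var: f64,   // EWMA of gradient variance
    alpha: f64, // EWMA smoothing factor
    k: f64,     // gradient_clip_sigma multiplier
}

impl GradClipper {
    fn clip(&mut self, g: f64) -> f64 {
        let sd = self.var.sqrt();
        // Clamp to the mean ± k*sigma band before updating statistics.
        let clipped = g.clamp(self.mean - self.k * sd, self.mean + self.k * sd);
        let d = g - self.mean;
        self.mean += self.alpha * d;
        self.var = (1.0 - self.alpha) * self.var + self.alpha * d * d;
        clipped
    }
}

fn main() {
    let mut c = GradClipper { mean: 0.0, var: 1.0, alpha: 0.05, k: 3.0 };
    // A sudden outlier gradient of 100 is clamped to mean + 3*sd = 3.0.
    println!("{}", c.clip(100.0));
}
```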
monotone_constraints: Option<Vec<i8>>
Per-feature monotonic constraints.
Each element specifies the monotonic relationship between a feature and the prediction:
+1: prediction must be non-decreasing as the feature value increases.
-1: prediction must be non-increasing as the feature value increases.
0: no constraint (unconstrained).
During split evaluation, candidate splits that would violate monotonicity (left child value > right child value for +1 constraints, or vice versa) are rejected.
Length must match the number of features in training data.
None (default) means no monotonic constraints.
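The rejection rule above amounts to a simple comparison of candidate child values; the function name here is illustrative, not part of the crate's API:

```rust
// Returns true when a candidate split's child values violate the
// per-feature constraint described above.
fn split_violates_monotonicity(constraint: i8, left_value: f64, right_value: f64) -> bool {
    match constraint {
        1 => left_value > right_value,  // +1: left child may not exceed right
        -1 => left_value < right_value, // -1: right child may not exceed left
        _ => false,                     // 0: unconstrained
    }
}

fn main() {
    // A +1 constraint rejects a split whose left child predicts higher.
    println!("{}", split_violates_monotonicity(1, 0.8, 0.2));
}
```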
quality_prune_alpha: Option<f64>
EWMA smoothing factor for quality-based tree pruning.
When Some(alpha), each boosting step tracks an exponentially weighted
moving average of its marginal contribution to the ensemble. Trees whose
contribution drops below quality_prune_threshold
for quality_prune_patience consecutive
samples are replaced with a fresh tree that can learn the current regime.
This prevents “dead wood” – trees from a past regime that no longer contribute meaningfully to ensemble accuracy.
None (default) disables quality-based pruning.
Suggested value: 0.01.
quality_prune_threshold: f64
Minimum contribution threshold for quality-based pruning.
A tree’s EWMA contribution must stay above this value to avoid being
flagged as dead wood. Only used when quality_prune_alpha is Some.
Default: 1e-6.
quality_prune_patience: u64
Consecutive low-contribution samples before a tree is replaced.
After this many consecutive samples where a tree’s EWMA contribution
is below quality_prune_threshold, the tree is reset. Only used when
quality_prune_alpha is Some.
Default: 500.
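How the three quality-pruning knobs interact can be sketched with an illustrative per-tree tracker (not the crate's internal type):

```rust
// Tracks one tree's EWMA contribution and its low-contribution streak.
struct TreeQuality {
    ewma: f64,
    low_streak: u64,
}

impl TreeQuality {
    /// Returns true when the tree should be replaced as dead wood.
    fn observe(&mut self, contribution: f64, alpha: f64, threshold: f64, patience: u64) -> bool {
        self.ewma = (1.0 - alpha) * self.ewma + alpha * contribution;
        if self.ewma < threshold {
            self.low_streak += 1;
        } else {
            self.low_streak = 0; // any recovery resets the patience counter
        }
        self.low_streak >= patience
    }
}

fn main() {
    let mut q = TreeQuality { ewma: 0.0, low_streak: 0 };
    let mut flagged = false;
    // A tree contributing nothing is flagged after `patience` samples.
    for _ in 0..500 {
        flagged = q.observe(0.0, 0.01, 1e-6, 500);
    }
    println!("replace tree: {}", flagged);
}
```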
error_weight_alpha: Option<f64>
EWMA smoothing factor for error-weighted sample importance.
When Some(alpha), samples the model predicted poorly get higher
effective weight during histogram accumulation. The weight is:
1.0 + |error| / (rolling_mean_error + epsilon), capped at 10x.
This is a streaming version of AdaBoost’s reweighting applied at the gradient level – learning capacity focuses on hard/novel patterns, enabling faster adaptation to regime changes.
None (default) disables error weighting.
Suggested value: 0.01.
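The stated weight formula, sketched directly (the exact epsilon constant is not documented here and is illustrative):

```rust
// Weight = 1.0 + |error| / (rolling_mean_error + epsilon), capped at 10x.
fn sample_weight(abs_error: f64, rolling_mean_error: f64) -> f64 {
    let epsilon = 1e-12; // illustrative; the crate's constant is undocumented
    (1.0 + abs_error / (rolling_mean_error + epsilon)).min(10.0)
}

fn main() {
    println!("{}", sample_weight(0.0, 0.5));   // well-predicted sample: weight 1
    println!("{}", sample_weight(100.0, 0.5)); // hard/novel sample: capped at 10
}
```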
uncertainty_modulated_lr: bool
Enable σ-modulated learning rate for DistributionalSGBT.
When true, the location (μ) ensemble’s learning rate is scaled by
sigma_ratio = current_sigma / rolling_sigma_mean, where rolling_sigma_mean
is an EWMA of the model’s predicted σ (alpha = 0.001).
This means the model learns μ faster when σ is elevated (high uncertainty) and slower when σ is low (confident regime). The scale (σ) ensemble always trains at the unmodulated base rate to prevent positive feedback loops.
Default: false.
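A sketch of the sigma-ratio modulation; the 0.001 EWMA alpha comes from the description above, everything else is illustrative:

```rust
// Tracks the rolling mean of predicted sigma and scales the mu learning rate.
struct SigmaLr {
    rolling_sigma_mean: f64,
}

impl SigmaLr {
    fn effective_lr(&mut self, base_lr: f64, current_sigma: f64) -> f64 {
        let alpha = 0.001; // per the docs above
        self.rolling_sigma_mean =
            (1.0 - alpha) * self.rolling_sigma_mean + alpha * current_sigma;
        // Elevated sigma (high uncertainty) => faster mu learning; low sigma
        // (confident regime) => slower.
        base_lr * (current_sigma / self.rolling_sigma_mean)
    }
}

fn main() {
    let mut m = SigmaLr { rolling_sigma_mean: 1.0 };
    // Sigma doubling roughly doubles the effective mu learning rate.
    println!("{:.4}", m.effective_lr(0.0125, 2.0));
}
```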
scale_mode: ScaleMode
How the scale (σ) is estimated in DistributionalSGBT.
Empirical (default): EWMA of squared prediction errors, σ = sqrt(ewma_sq_err). Always calibrated, zero tuning, O(1).
TreeChain: full dual-chain NGBoost with a separate tree ensemble predicting log(σ) from features.
For σ-modulated learning (uncertainty_modulated_lr = true), Empirical
is strongly recommended — scale tree gradients are inherently weak and
the trees often fail to split.
empirical_sigma_alpha: f64
EWMA smoothing factor for empirical σ estimation.
Controls the adaptation speed of σ = sqrt(ewma_sq_err) when
scale_mode is Empirical.
Higher values react faster to regime changes but are noisier.
Default: 0.01 (~100-sample effective window).
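The empirical estimator as described, σ = sqrt(EWMA of squared errors), sketched as a standalone struct:

```rust
// Streaming sigma estimate: one multiply-add per sample, O(1) memory.
struct EmpiricalSigma {
    ewma_sq_err: f64,
    alpha: f64, // empirical_sigma_alpha
}

impl EmpiricalSigma {
    fn update(&mut self, prediction_error: f64) -> f64 {
        self.ewma_sq_err = (1.0 - self.alpha) * self.ewma_sq_err
            + self.alpha * prediction_error * prediction_error;
        self.ewma_sq_err.sqrt()
    }
}

fn main() {
    let mut s = EmpiricalSigma { ewma_sq_err: 0.0, alpha: 0.01 };
    let mut sigma = 0.0;
    // A sustained error level of 2.0 drives sigma toward 2.0.
    for _ in 0..2000 {
        sigma = s.update(2.0);
    }
    println!("{:.3}", sigma);
}
```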
max_leaf_output: Option<f64>
Maximum absolute leaf output value.
When Some(max), leaf predictions are clamped to [-max, max].
Prevents runaway leaf weights from causing prediction explosions
in feedback loops. None (default) means no clamping.
adaptive_leaf_bound: Option<f64>
Per-leaf adaptive output bound (sigma multiplier).
When Some(k), each leaf tracks an EWMA of its own output weight and
clamps predictions to |output_mean| + k * output_std. The EWMA uses
leaf_decay_alpha when leaf_half_life is set, otherwise a Welford online estimator.
This is strictly superior to max_leaf_output for streaming — the bound
is per-leaf, self-calibrating, and regime-synchronized. A leaf that usually
outputs 0.3 can’t suddenly output 2.9 just because it fits in the global clamp.
Typical value: 3.0 (3-sigma bound).
None (default) disables adaptive bounds (falls back to max_leaf_output).
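A sketch of the per-leaf bound in the non-decaying case, using Welford online mean/variance as mentioned above; clamping before updating the statistics is an assumption:

```rust
// Illustrative per-leaf bound tracker, not the crate's internal type.
struct LeafBound {
    n: u64,
    mean: f64,
    m2: f64, // Welford sum of squared deviations
    k: f64,  // adaptive_leaf_bound multiplier
}

impl LeafBound {
    fn bounded_output(&mut self, weight: f64) -> f64 {
        let out = if self.n > 1 {
            let std = (self.m2 / (self.n - 1) as f64).sqrt();
            let limit = self.mean.abs() + self.k * std;
            weight.clamp(-limit, limit)
        } else {
            weight // not enough history to form a bound yet
        };
        // Welford update with the raw (unclamped) weight.
        self.n += 1;
        let d = weight - self.mean;
        self.mean += d / self.n as f64;
        self.m2 += d * (weight - self.mean);
        out
    }
}

fn main() {
    let mut b = LeafBound { n: 0, mean: 0.0, m2: 0.0, k: 3.0 };
    for _ in 0..100 {
        b.bounded_output(0.3);
    }
    // A leaf that usually outputs 0.3 cannot suddenly output 2.9.
    println!("{:.3}", b.bounded_output(2.9));
}
```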
min_hessian_sum: Option<f64>
Minimum hessian sum before a leaf produces non-zero output.
When Some(min_h), leaves with hess_sum < min_h return 0.0.
Prevents post-replacement spikes from fresh leaves with insufficient
samples. None (default) means all leaves contribute immediately.
huber_k: Option<f64>
Huber loss delta multiplier for DistributionalSGBT.
When Some(k), the distributional location gradient uses Huber loss
with adaptive delta = k * empirical_sigma. This bounds gradients by
construction. Standard value: 1.345 (95% efficiency at Gaussian).
None (default) uses squared loss.
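The bounded-gradient property can be shown directly; the sign convention (gradient of a squared-loss-style objective in the residual) is an assumption:

```rust
// Huber location gradient with adaptive delta = k * empirical_sigma.
fn huber_gradient(residual: f64, k: f64, sigma: f64) -> f64 {
    let delta = k * sigma;
    if residual.abs() <= delta {
        residual // quadratic region: ordinary squared-loss gradient
    } else {
        delta * residual.signum() // linear region: gradient bounded by delta
    }
}

fn main() {
    // With the standard k = 1.345 and sigma = 1.0, an outlier residual of
    // 10 contributes a gradient of only 1.345.
    println!("{}", huber_gradient(10.0, 1.345, 1.0));
}
```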
shadow_warmup: Option<usize>
Shadow warmup for graduated tree handoff.
When Some(n), an always-on shadow (alternate) tree is spawned immediately
alongside every active tree. The shadow trains on the same gradient stream
but does not contribute to predictions until it has seen n samples.
As the active tree ages past 80% of max_tree_samples, its prediction
weight linearly decays to 0 at 120%. The shadow’s weight ramps from 0 to 1
over n samples after warmup. When the active weight reaches 0, the shadow
is promoted and a new shadow is spawned — no cold-start prediction dip.
Requires max_tree_samples to be set for time-based graduated handoff.
Drift-based replacement still uses hard swap (shadow is already warm).
None (default) disables graduated handoff — uses traditional hard swap.
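The linear active-weight schedule described above (full weight until 80% of max_tree_samples, zero at 120%) can be sketched as:

```rust
// Active tree's prediction weight as a function of its age in samples.
fn active_tree_weight(samples_seen: u64, max_tree_samples: u64) -> f64 {
    let start = 0.8 * max_tree_samples as f64; // decay begins at 80%
    let end = 1.2 * max_tree_samples as f64;   // reaches zero at 120%
    let age = samples_seen as f64;
    if age <= start {
        1.0
    } else if age >= end {
        0.0
    } else {
        (end - age) / (end - start) // linear ramp down
    }
}

fn main() {
    // Halfway through the handoff window (age == max_tree_samples),
    // the active tree carries half weight.
    println!("{}", active_tree_weight(10_000, 10_000));
}
```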
leaf_model_type: LeafModelType
Leaf prediction model type.
Controls how each leaf computes its prediction:
ClosedForm (default): constant leaf weight.
Linear: per-leaf online ridge regression with AdaGrad optimization. Optional decay for concept drift. Recommended for low-depth trees (depth 2–4).
MLP: per-leaf single-hidden-layer neural network. Optional decay for concept drift.
Adaptive: starts as closed-form, auto-promotes when the Hoeffding bound confirms a more complex model is better.
Default: ClosedForm.
packed_refresh_interval: u64
Packed cache refresh interval for DistributionalSGBT.
When non-zero, the distributional model maintains a packed f32 cache of
its location ensemble that is re-exported every packed_refresh_interval
training samples. Predictions use the cache for O(1)-per-tree inference
via contiguous memory traversal, falling back to full tree traversal when
the cache is absent or produces non-finite results.
0 (default) disables the packed cache.
§ Implementations
impl SGBTConfig
pub fn builder() -> SGBTConfigBuilder
Start building a configuration via the builder pattern.
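A hypothetical usage sketch; the setter names are assumed to mirror the field names, as is conventional for builder patterns, and build() is assumed to return a Result from the build-time validation mentioned above. Consult SGBTConfigBuilder for the actual methods:

```rust
// Hypothetical sketch, not verified against the crate's builder API.
let config = SGBTConfig::builder()
    .n_steps(100)
    .learning_rate(0.0125)
    .max_depth(6)
    .leaf_half_life(Some(500))
    .build()
    .expect("numeric parameters are validated at build time");
```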
§ Trait Implementations
impl Clone for SGBTConfig
fn clone(&self) -> SGBTConfig
fn clone_from(&mut self, source: &Self)