pub struct SGBTConfigBuilder { /* private fields */ }
Available on crate feature alloc only.
Builder for SGBTConfig with validation on build().
§Example
use irithyll::ensemble::config::{SGBTConfig, DriftDetectorType};
use irithyll::ensemble::variants::SGBTVariant;
let config = SGBTConfig::builder()
.n_steps(200)
.learning_rate(0.05)
.drift_detector(DriftDetectorType::Adwin { delta: 0.01 })
.variant(SGBTVariant::Skip { k: 10 })
.build()
.expect("valid config");
Implementations§
impl SGBTConfigBuilder
pub fn n_steps(self, n: usize) -> Self
Set the number of boosting steps (trees in the ensemble).
pub fn learning_rate(self, lr: f64) -> Self
Set the learning rate (shrinkage factor).
pub fn feature_subsample_rate(self, rate: f64) -> Self
Set the fraction of features to subsample per tree.
pub fn grace_period(self, gp: usize) -> Self
Set the grace period (minimum samples before evaluating splits).
pub fn drift_detector(self, dt: DriftDetectorType) -> Self
Set the drift detector type for tree replacement.
pub fn variant(self, v: SGBTVariant) -> Self
Set the SGBT computational variant.
pub fn seed(self, seed: u64) -> Self
Set the random seed for deterministic reproducibility.
Controls feature subsampling and variant skip/MI stochastic decisions. Two models with the same seed and data sequence will produce identical results.
pub fn initial_target_count(self, count: usize) -> Self
Set the number of initial targets to collect before computing the base prediction.
The model collects this many target values before initializing the base
prediction (via loss.initial_prediction). Default: 50.
pub fn leaf_half_life(self, n: usize) -> Self
Set the half-life for exponential leaf decay (in samples per leaf).
After n samples, a leaf’s accumulated statistics have half the weight
of the most recent sample. Enables continuous adaptation to concept drift.
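To make the decay rate concrete: a half-life of n corresponds to a per-sample multiplicative factor of 0.5^(1/n). A standalone numerical sketch (decay_factor is an illustrative helper, not part of this crate's API):

```rust
/// Per-sample decay factor lambda = 0.5^(1 / half_life), chosen so that
/// after `half_life` samples a contribution retains exactly half its weight.
fn decay_factor(half_life: f64) -> f64 {
    0.5_f64.powf(1.0 / half_life)
}

fn main() {
    let half_life = 200.0;
    let lambda = decay_factor(half_life);
    // Weight carried by a sample observed `half_life` steps ago.
    let w = lambda.powf(half_life);
    println!("lambda = {lambda:.6}, weight after {half_life} samples = {w:.3}");
}
```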
pub fn max_tree_samples(self, n: u64) -> Self
Set the maximum samples a single tree processes before proactive replacement.
After n samples, the tree is replaced regardless of drift detector state.
pub fn split_reeval_interval(self, n: usize) -> Self
Set the split re-evaluation interval for max-depth leaves.
Every n samples per leaf, max-depth leaves re-evaluate whether a split
would improve them. Inspired by EFDT (Manapragada et al. 2018).
pub fn feature_names(self, names: Vec<String>) -> Self
Set human-readable feature names.
Enables named feature importances and named training input.
Names must be unique; validated at build().
pub fn feature_types(self, types: Vec<FeatureType>) -> Self
Set per-feature type declarations.
Declares which features are categorical vs continuous. Categorical features use one-bin-per-category binning and Fisher optimal binary partitioning. Supports up to 64 distinct category values per categorical feature.
pub fn gradient_clip_sigma(self, sigma: f64) -> Self
Set per-leaf gradient clipping threshold (in standard deviations).
Each leaf tracks an EWMA of the gradient mean and variance. Gradients
exceeding mean ± sigma × std are clamped. This prevents outlier labels
from corrupting streaming model stability.
Typical value: 3.0 (3-sigma clipping).
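A self-contained sketch of this clipping scheme (the GradClipper type, its field names, and its update order are illustrative assumptions, not the crate's internals):

```rust
/// Illustrative EWMA-based gradient clipper: clamp each incoming gradient
/// to mean ± sigma * std using the statistics accumulated so far, then
/// fold the clipped value into the running EWMA mean and variance.
struct GradClipper {
    alpha: f64, // EWMA smoothing factor
    mean: f64,  // running gradient mean
    var: f64,   // running gradient variance
    sigma: f64, // threshold in standard deviations, e.g. 3.0
}

impl GradClipper {
    fn clip(&mut self, g: f64) -> f64 {
        let std = self.var.sqrt();
        let clipped = g.clamp(self.mean - self.sigma * std, self.mean + self.sigma * std);
        // Standard EWMA mean/variance update on the clipped gradient.
        let d = clipped - self.mean;
        self.mean += self.alpha * d;
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * d * d);
        clipped
    }
}

fn main() {
    let mut c = GradClipper { alpha: 0.05, mean: 0.0, var: 0.01, sigma: 3.0 };
    for _ in 0..200 {
        c.clip(0.1); // warm up on well-behaved gradients
    }
    // An outlier label produces a huge gradient; it is clamped to the band.
    let clipped = c.clip(50.0);
    println!("outlier 50.0 clipped to {clipped:.3}");
}
```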
pub fn monotone_constraints(self, constraints: Vec<i8>) -> Self
Set per-feature monotonic constraints.
+1 = non-decreasing, -1 = non-increasing, 0 = unconstrained.
Candidate splits violating monotonicity are rejected during tree growth.
pub fn quality_prune_alpha(self, alpha: f64) -> Self
Enable quality-based tree pruning with the given EWMA smoothing factor.
Trees whose marginal contribution drops below the threshold for
patience consecutive samples are replaced with fresh trees.
Suggested alpha: 0.01.
pub fn quality_prune_threshold(self, threshold: f64) -> Self
Set the minimum contribution threshold for quality-based pruning.
Default: 1e-6. Only relevant when quality_prune_alpha is set.
pub fn quality_prune_patience(self, patience: u64) -> Self
Set the patience (consecutive low-contribution samples) before pruning.
Default: 500. Only relevant when quality_prune_alpha is set.
pub fn error_weight_alpha(self, alpha: f64) -> Self
Enable error-weighted sample importance with the given EWMA smoothing factor.
Samples that the model predicted poorly receive higher effective weight. Suggested alpha: 0.01.
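One plausible reading of this mechanism, as a standalone sketch (the EWMA-relative weighting formula and the ErrorWeighter type below are assumptions for illustration, not the crate's documented internals):

```rust
/// Illustrative error-weighted sample importance: maintain an EWMA of
/// absolute prediction error and weight each sample by its error relative
/// to that running average. The exact formula here is an assumption.
struct ErrorWeighter {
    alpha: f64,    // EWMA smoothing factor, e.g. 0.01
    ewma_err: f64, // running mean absolute error
}

impl ErrorWeighter {
    fn weight(&mut self, abs_err: f64) -> f64 {
        self.ewma_err = (1.0 - self.alpha) * self.ewma_err + self.alpha * abs_err;
        // Poorly predicted samples (error above the average) get weight > 1.
        (abs_err / self.ewma_err.max(1e-12)).min(4.0) // capped for stability
    }
}

fn main() {
    let mut w = ErrorWeighter { alpha: 0.01, ewma_err: 1.0 };
    let easy = w.weight(0.1); // well-predicted sample -> down-weighted
    let hard = w.weight(5.0); // badly predicted sample -> up-weighted
    println!("easy = {easy:.3}, hard = {hard:.3}");
}
```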
pub fn uncertainty_modulated_lr(self, enabled: bool) -> Self
Enable σ-modulated learning rate for distributional models.
Scales the location (μ) learning rate by current_sigma / rolling_sigma_mean,
so the model adapts faster during high-uncertainty regimes and conserves
during stable periods. Only affects DistributionalSGBT.
By default uses empirical σ (EWMA of squared errors). Set
scale_mode(ScaleMode::TreeChain) for feature-conditional σ.
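The scaling rule described above is a simple ratio; as a standalone sketch (the modulated_lr function name is illustrative, not part of this crate):

```rust
/// Sketch of sigma-modulated learning rate: scale the location (mu)
/// learning rate by current_sigma / rolling_sigma_mean, so high-uncertainty
/// regimes learn faster and stable regimes learn more conservatively.
fn modulated_lr(base_lr: f64, current_sigma: f64, rolling_sigma_mean: f64) -> f64 {
    base_lr * (current_sigma / rolling_sigma_mean)
}

fn main() {
    let base = 0.05;
    // High-uncertainty regime: sigma above its rolling mean -> learn faster.
    println!("{:.4}", modulated_lr(base, 2.0, 1.0));
    // Stable regime: sigma below its rolling mean -> conserve.
    println!("{:.4}", modulated_lr(base, 0.5, 1.0));
}
```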
pub fn scale_mode(self, mode: ScaleMode) -> Self
Set the scale estimation mode for DistributionalSGBT.
pub fn empirical_sigma_alpha(self, alpha: f64) -> Self
EWMA alpha for empirical σ. Controls adaptation speed. Default 0.01.
Only used when scale_mode is Empirical.
pub fn max_leaf_output(self, max: f64) -> Self
Set the maximum absolute leaf output value.
Clamps leaf predictions to [-max, max], breaking feedback loops
that cause prediction explosions.
pub fn adaptive_leaf_bound(self, k: f64) -> Self
Set per-leaf adaptive output bound (sigma multiplier).
Each leaf tracks EWMA of its own output weight and clamps to
|output_mean| + k * output_std. Self-calibrating per-leaf.
Recommended: use with leaf_half_life for streaming scenarios.
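As a standalone sketch of the clamping rule (function and parameter names here are assumed for illustration):

```rust
/// Sketch of the per-leaf adaptive bound: clamp a leaf's raw output to
/// +/- (|output_mean| + k * output_std), per the rule described above,
/// where mean and std come from the leaf's own EWMA of past outputs.
fn adaptive_bound(output: f64, output_mean: f64, output_std: f64, k: f64) -> f64 {
    let bound = output_mean.abs() + k * output_std;
    output.clamp(-bound, bound)
}

fn main() {
    // A leaf whose outputs have averaged 0.2 with std 0.05, k = 3.0:
    // bound = |0.2| + 3.0 * 0.05 = 0.35.
    println!("{}", adaptive_bound(0.3, 0.2, 0.05, 3.0)); // within the bound
    println!("{}", adaptive_bound(1.0, 0.2, 0.05, 3.0)); // clamped to the bound
}
```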
pub fn min_hessian_sum(self, min_h: f64) -> Self
Set the minimum hessian sum for leaf output.
Fresh leaves with hess_sum < min_h return 0.0, preventing
post-replacement spikes.
pub fn huber_k(self, k: f64) -> Self
Set the Huber loss delta multiplier for DistributionalSGBT.
When set, location gradients use Huber loss with adaptive
delta = k * empirical_sigma. Standard value: 1.345 (95% Gaussian efficiency).
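The adaptive-delta rule can be sketched as follows. This is the standard Huber pseudo-residual with delta = k * sigma; the function name is illustrative and the crate's exact loss plumbing may differ:

```rust
/// Standard Huber location gradient (pseudo-residual form) with an
/// adaptive delta = k * empirical_sigma: the gradient equals the residual
/// inside the delta band (quadratic region) and is capped at +/- delta
/// outside it (linear region), limiting the pull of outlier labels.
fn huber_gradient(residual: f64, k: f64, empirical_sigma: f64) -> f64 {
    let delta = k * empirical_sigma;
    if residual.abs() <= delta {
        residual
    } else {
        delta * residual.signum()
    }
}

fn main() {
    let (k, sigma) = (1.345, 1.0); // 95% Gaussian efficiency constant
    println!("{}", huber_gradient(0.5, k, sigma));  // inlier: quadratic region
    println!("{}", huber_gradient(10.0, k, sigma)); // outlier: capped at delta
}
```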
pub fn shadow_warmup(self, warmup: usize) -> Self
Enable graduated tree handoff with the given shadow warmup samples.
Spawns an always-on shadow tree that trains alongside the active tree.
After warmup samples, the shadow begins contributing to predictions
via graduated blending. Eliminates prediction dips during tree replacement.
pub fn leaf_model_type(self, lmt: LeafModelType) -> Self
Set the leaf prediction model type.
LeafModelType::Linear is recommended for low-depth configurations
(depth 2–4) where per-leaf linear models reduce approximation error.
LeafModelType::Adaptive automatically selects between closed-form and
a trainable model per leaf, using the Hoeffding bound for promotion.
pub fn packed_refresh_interval(self, interval: u64) -> Self
Set the packed cache refresh interval for distributional models.
When non-zero, DistributionalSGBT
maintains a packed f32 cache refreshed every interval training samples.
0 (default) disables the cache.
pub fn build(self) -> Result<SGBTConfig>
Validate and build the configuration.
§Errors
Returns InvalidConfig with a structured
ConfigError if any parameter is out of its valid range.
Trait Implementations§
impl Clone for SGBTConfigBuilder
fn clone(&self) -> SGBTConfigBuilder
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.