pub struct DistributionalSGBT { /* private fields */ }
Available on crate feature alloc only.
NGBoost-style distributional streaming gradient boosted trees.
Outputs a full Gaussian predictive distribution N(μ, σ²) by maintaining two independent ensembles – one for location (mean) and one for scale (log-sigma).
§Example
use irithyll::SGBTConfig;
use irithyll::ensemble::distributional::DistributionalSGBT;
let config = SGBTConfig::builder().n_steps(10).build().unwrap();
let mut model = DistributionalSGBT::new(config);
// Train on streaming data
model.train_one(&(vec![1.0, 2.0], 3.5));
// Get full distributional prediction
let pred = model.predict(&[1.0, 2.0]);
println!("mean={}, sigma={}", pred.mu, pred.sigma);
Implementations§
impl DistributionalSGBT
pub fn new(config: SGBTConfig) -> Self
Create a new distributional SGBT with the given configuration.
When scale_mode is Empirical (default), scale trees are still allocated
but never trained — only the EWMA error tracker produces σ. When
scale_mode is TreeChain, both location and scale ensembles are active.
pub fn train_one(&mut self, sample: &impl Observation)
Train on a single observation.
pub fn predict(&self, features: &[f64]) -> GaussianPrediction
Predict the full Gaussian distribution for a feature vector.
When a packed cache is available, uses it for the location (μ) prediction via contiguous BFS-packed memory traversal. Falls back to full tree traversal if the cache is absent or produces non-finite results.
Sigma computation always uses the primary path (EWMA or scale chain) and is unaffected by the packed cache.
pub fn predict_smooth(&self, features: &[f64], bandwidth: f64) -> GaussianPrediction
Predict using sigmoid-blended soft routing for smooth interpolation.
Instead of hard left/right routing at tree split nodes, each split
uses sigmoid blending: alpha = sigmoid((threshold - feature) / bandwidth).
The result is a continuous function that varies smoothly with every
feature change.
bandwidth controls transition sharpness: smaller = sharper (closer
to hard splits), larger = smoother.
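The blending rule above can be sketched as a standalone helper (illustrative only; `sigmoid` and `soft_route` are not crate API):

```rust
/// Logistic function used for soft routing.
fn sigmoid(x: f64) -> f64 {
    1.0 / (1.0 + (-x).exp())
}

/// Blend the left/right leaf values of a single split node.
/// alpha -> 1 when feature << threshold (route left),
/// alpha -> 0 when feature >> threshold (route right).
fn soft_route(feature: f64, threshold: f64, bandwidth: f64, left: f64, right: f64) -> f64 {
    let alpha = sigmoid((threshold - feature) / bandwidth);
    alpha * left + (1.0 - alpha) * right
}
```

Exactly at the threshold the two leaf values average; shrinking the bandwidth toward zero recovers hard routing.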
pub fn predict_interpolated(&self, features: &[f64]) -> GaussianPrediction
Predict with parent-leaf linear interpolation.
Blends each leaf prediction with its parent’s preserved prediction based on sample count, preventing stale predictions from fresh leaves.
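A minimal sketch of the blending idea, assuming a weight of n / (n + k) where n is the leaf's sample count and k is a hypothetical pseudo-count (the crate's actual weighting may differ):

```rust
/// Blend a leaf's prediction with its parent's preserved prediction.
/// A fresh leaf (n = 0) falls back entirely to the parent; the leaf
/// dominates as it accumulates samples.
fn blend_with_parent(leaf: f64, parent: f64, leaf_samples: u64, k: f64) -> f64 {
    let n = leaf_samples as f64;
    let w = n / (n + k);
    w * leaf + (1.0 - w) * parent
}
```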
pub fn predict_sibling_interpolated(&self, features: &[f64]) -> GaussianPrediction
Predict with sibling-based interpolation for feature-continuous predictions.
At each split node near the threshold boundary, blends left and right subtree predictions linearly. Uses auto-calibrated bandwidths as the interpolation margin. Predictions vary continuously as features change.
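The boundary blend can be sketched as a clamped linear ramp over the bandwidth margin (illustrative, not crate internals):

```rust
/// Linearly blend left/right subtree predictions when `feature` falls
/// within `margin` of the split threshold; route hard outside it.
fn sibling_blend(feature: f64, threshold: f64, margin: f64, left: f64, right: f64) -> f64 {
    // t ramps 0 -> 1 as feature crosses (threshold - margin) .. (threshold + margin)
    let t = ((feature - (threshold - margin)) / (2.0 * margin)).clamp(0.0, 1.0);
    (1.0 - t) * left + t * right
}
```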
pub fn predict_graduated(&self, features: &[f64]) -> GaussianPrediction
Predict with graduated active-shadow blending.
Smoothly transitions between active and shadow trees during replacement.
Requires shadow_warmup to be configured.
pub fn predict_graduated_sibling_interpolated(&self, features: &[f64]) -> GaussianPrediction
Predict with graduated blending + sibling interpolation (premium path).
pub fn predict_distributional(&self, features: &[f64]) -> (f64, f64, f64)
Predict with σ-ratio diagnostic exposed.
Returns (mu, sigma, sigma_ratio) where sigma_ratio is
current_sigma / rolling_sigma_mean – the multiplier applied to the
location learning rate when uncertainty_modulated_lr
is enabled.
When σ-modulation is disabled, sigma_ratio is always 1.0.
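The modulation described here amounts to scaling the location learning rate by that ratio; a sketch of the assumed arithmetic (not crate internals):

```rust
/// Effective location learning rate under σ-modulation.
/// With modulation disabled the ratio is treated as 1.0.
fn modulated_lr(base_lr: f64, sigma: f64, rolling_mean: f64, enabled: bool) -> f64 {
    if !enabled || rolling_mean <= 0.0 {
        return base_lr;
    }
    base_lr * (sigma / rolling_mean) // learn faster while currently uncertain
}
```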
pub fn empirical_sigma(&self) -> f64
Current empirical sigma (sqrt(ewma_sq_err)).
Returns the model’s recent error magnitude. Available in both scale modes.
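The σ = sqrt(ewma_sq_err) relationship can be reproduced with a small tracker; the smoothing factor `alpha` here is an assumed parameter, not a documented crate constant:

```rust
/// EWMA tracker over squared prediction errors.
struct EwmaSigma {
    alpha: f64,       // smoothing factor in (0, 1]
    ewma_sq_err: f64, // running EWMA of squared errors
}

impl EwmaSigma {
    fn update(&mut self, error: f64) {
        self.ewma_sq_err = (1.0 - self.alpha) * self.ewma_sq_err + self.alpha * error * error;
    }
    /// Empirical sigma: sqrt of the smoothed squared error.
    fn sigma(&self) -> f64 {
        self.ewma_sq_err.sqrt()
    }
}
```

Feeding a constant error of 2.0 drives σ toward 2.0 from below.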
pub fn scale_mode(&self) -> ScaleMode
Current scale mode.
pub fn sigma_velocity(&self) -> f64
Current σ velocity – the EWMA-smoothed derivative of empirical σ.
Positive values indicate growing prediction errors (model deteriorating
or regime change). Negative values indicate improving predictions.
Only meaningful when ScaleMode::Empirical is active.
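One plausible form of this statistic, sketched for illustration (the crate's smoothing constants are not documented here): an EWMA over the first difference of σ.

```rust
/// EWMA-smoothed first difference of empirical σ.
struct SigmaVelocity {
    alpha: f64,
    prev_sigma: f64,
    velocity: f64,
}

impl SigmaVelocity {
    fn update(&mut self, sigma: f64) -> f64 {
        let delta = sigma - self.prev_sigma;
        self.prev_sigma = sigma;
        self.velocity = (1.0 - self.alpha) * self.velocity + self.alpha * delta;
        self.velocity // positive while σ trends upward
    }
}
```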
pub fn predict_mu(&self, features: &[f64]) -> f64
Predict the mean (location parameter) only.
pub fn predict_sigma(&self, features: &[f64]) -> f64
Predict the standard deviation (scale parameter) only.
pub fn predict_interval(&self, features: &[f64], confidence: f64) -> (f64, f64)
Predict a symmetric confidence interval.
confidence is the Z-score multiplier:
- 1.0 → 68% CI
- 1.96 → 95% CI
- 2.576 → 99% CI
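The interval is the usual symmetric Gaussian band μ ± z·σ; a standalone sketch of the arithmetic:

```rust
/// Symmetric confidence interval around a Gaussian prediction.
fn interval(mu: f64, sigma: f64, z: f64) -> (f64, f64) {
    (mu - z * sigma, mu + z * sigma)
}
```

For μ = 10 and σ = 2 at z = 1.96 this yields roughly (6.08, 13.92).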
pub fn predict_batch(&self, feature_matrix: &[Vec<f64>]) -> Vec<GaussianPrediction>
Batch prediction.
pub fn train_batch<O: Observation>(&mut self, samples: &[O])
Train on a batch of observations.
pub fn train_batch_with_callback<O: Observation, F: FnMut(usize)>(&mut self, samples: &[O], interval: usize, callback: F)
Train on a batch with periodic callback.
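A sketch of the expected cadence, assuming the callback fires after every `interval` samples with the running sample count (the training call itself is elided):

```rust
/// Drive a per-sample training loop, invoking `callback` every `interval` samples.
fn batch_with_callback<F: FnMut(usize)>(n_samples: usize, interval: usize, mut callback: F) {
    for i in 0..n_samples {
        // model.train_one(&samples[i]) would go here
        if interval > 0 && (i + 1) % interval == 0 {
            callback(i + 1); // report samples processed so far
        }
    }
}
```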
pub fn ensemble_grad_std(&self) -> f64
Ensemble-level gradient standard deviation.
pub fn ensemble_grad_mean(&self) -> f64
Ensemble-level gradient mean.
pub fn enable_packed_cache(&mut self, interval: u64)
Enable or reconfigure the packed inference cache at runtime.
Sets the refresh interval and immediately builds the initial cache
if the model has been initialized. Pass 0 to disable.
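For intuition, a BFS-packed layout stores every node of a tree contiguously so inference is index arithmetic over one slice instead of pointer chasing. A minimal sketch with an assumed node layout (the crate's actual packing is private):

```rust
/// One node of a BFS-packed tree. Siblings are adjacent, so the right
/// child always lives at `left + 1`; `left == 0` marks a leaf (the root
/// occupies index 0, so no child can legitimately point there).
#[derive(Clone, Copy)]
struct PackedNode {
    feature: usize,  // split feature index
    threshold: f64,  // split threshold
    left: usize,     // packed index of the left child, or 0 for a leaf
    value: f64,      // leaf value (meaningful only when left == 0)
}

/// Route a feature vector through the packed array to a leaf value.
fn traverse(nodes: &[PackedNode], features: &[f64]) -> f64 {
    let mut i = 0;
    loop {
        let n = nodes[i];
        if n.left == 0 {
            return n.value;
        }
        i = if features[n.feature] <= n.threshold { n.left } else { n.left + 1 };
    }
}
```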
pub fn has_packed_cache(&self) -> bool
Whether the packed inference cache is currently populated.
pub fn auto_bandwidths(&self) -> &[f64]
Per-feature auto-calibrated bandwidths used by predict().
pub fn n_samples_seen(&self) -> u64
Total samples trained.
pub fn total_leaves(&self) -> usize
Total leaves across all active trees (location + scale).
pub fn is_initialized(&self) -> bool
Whether base predictions have been initialized.
pub fn config(&self) -> &SGBTConfig
Access the configuration.
pub fn location_steps(&self) -> &[BoostingStep]
Access the location boosting steps (for export/inspection).
pub fn location_base(&self) -> f64
Base prediction for the location (mean) ensemble.
pub fn learning_rate(&self) -> f64
Learning rate from the model configuration.
pub fn rolling_sigma_mean(&self) -> f64
Current rolling σ mean (EWMA of predicted σ).
Returns 1.0 if the model hasn’t been initialized yet.
pub fn is_uncertainty_modulated(&self) -> bool
Whether σ-modulated learning rate is active.
pub fn diagnostics(&self) -> ModelDiagnostics
Full model diagnostics: per-tree structure, feature usage, base predictions.
The trees vector contains location trees first (indices 0..n_steps),
then scale trees (n_steps..2*n_steps).
scale_trees_active counts how many scale trees have actually split
(more than 1 leaf). If this is 0, the scale chain is effectively frozen.
pub fn predict_decomposed(&self, features: &[f64]) -> DecomposedPrediction
Per-tree contribution to the final prediction.
Returns two vectors: location contributions and scale contributions.
Each entry is learning_rate * tree_prediction – the additive
contribution of that boosting step to the final μ or log(σ).
Adding location_base to the sum of location_contributions recovers μ;
adding scale_base to the sum of scale_contributions recovers log(σ).
In Empirical scale mode, scale_base is ln(empirical_sigma) and
scale_contributions are all zero (σ is not tree-derived).
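The reconstruction identity can be checked on synthetic numbers (`reconstruct_mu` is an illustrative helper, not crate API):

```rust
/// μ = location_base + Σ (learning_rate * tree_prediction).
fn reconstruct_mu(base: f64, learning_rate: f64, tree_preds: &[f64]) -> f64 {
    base + tree_preds.iter().map(|p| learning_rate * p).sum::<f64>()
}
```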
pub fn feature_importances(&self) -> Vec<f64>
Feature importances based on accumulated split gains across all trees.
Aggregates gains from both location and scale ensembles, then normalizes to sum to 1.0. Indexed by feature. Returns an empty Vec if no splits have occurred yet.
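The normalization step reads as follows in isolation (a sketch; the gain accumulation itself happens inside the trees):

```rust
/// Normalize accumulated per-feature split gains to sum to 1.0.
/// Returns an empty Vec when no splits have contributed gain yet.
fn normalize_importances(gains: &[f64]) -> Vec<f64> {
    let total: f64 = gains.iter().sum();
    if total <= 0.0 {
        return Vec::new();
    }
    gains.iter().map(|g| g / total).collect()
}
```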
Trait Implementations§
impl Clone for DistributionalSGBT
impl Debug for DistributionalSGBT
impl StreamingLearner for DistributionalSGBT
fn predict(&self, features: &[f64]) -> f64
Returns the mean (μ) of the predicted Gaussian distribution.