irithyll 9.8.3

Streaming machine learning in Rust -- gradient boosted trees, kernel methods, linear models, and composable pipelines
# Changelog


All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [9.0.0] - 2026-03-22


### Added


- **Reservoir Computing** (`neural::reservoir`):
  - `NextGenRC` — time-delay embedding + polynomial features (degree 2–5) + RLS readout.
    No random matrices needed. Factory: `ngrc(k, s, degree)`.
  - `EchoStateNetwork` — cycle/ring reservoir topology (O(N) weights) + leaky integrator
    + RLS readout. Factory: `esn(n_reservoir, spectral_radius)`.
  - `ESNPreprocessor` — ESN as `StreamingPreprocessor` for pipeline composition.
    Factory: `esn_preprocessor(...)`.
  - `DelayBuffer` circular buffer for NG-RC time-delay embedding.
  - `HighDegreePolynomial` for degree 2–5 monomial generation.
  - `CycleReservoir` ring topology with leaky integration.
  - `Xorshift64Rng` deterministic PRNG for reservoir/weight initialization.

- **SSM/Mamba — Selective State Space Models** (`neural::ssm`):
  - `StreamingMamba` — Mamba-style selective SSM with input-dependent B, C, Delta
    + RLS readout. Factory: `mamba(d_in, n_state)`.
  - `DiagonalSSM` — non-selective S4D baseline.
  - `SelectiveSSM` — full Mamba architecture in irithyll-core (`no_std`).
  - `MambaPreprocessor` — SSM as `StreamingPreprocessor`.
    Factory: `mamba_preprocessor(...)`.

- **Spiking Neural Networks with e-prop** (`neural::snn`):
  - `SpikeNet` — Vec-based SNN with e-prop online learning, f64 interface.
  - `SpikeNetFixed` — `no_std` SNN with Q1.14 integer arithmetic throughout
    (full training in fixed-point, no floats).
  - LIF neurons, delta spike encoding, piecewise linear surrogate gradient.
  - Random feedback alignment (no weight transport).
  - `SpikePreprocessor` — SNN as `StreamingPreprocessor`.
    Factory: `spikenet(n_hidden)`.

- **Math utilities:** `softplus()` and `sigmoid()` in irithyll-core math module.

- **All architectures implement `StreamingLearner`** — composable with existing
  pipelines (`pipe()`, `normalizer()`, ensembles, evaluation).

- **432 new tests** (total: 1,997). 31 new files, ~8,600 new lines of code.
  No new external dependencies. All `no_std` compatible.

- irithyll-core: 0.5.0 → 0.6.0. irithyll: 8.4.1 → 9.0.0.
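
As an illustrative sketch of the NG-RC feature construction described above (time-delay embedding of the last `k` inputs at stride `s`, followed by polynomial monomials), not the crate's API — `delay_embed` and `poly2` are hypothetical names, and only degree 2 is shown:

```rust
/// Collect [x_t, x_{t-s}, ..., x_{t-(k-1)s}] from a history buffer
/// (most recent sample last). Returns None until enough history exists.
fn delay_embed(history: &[f64], k: usize, s: usize) -> Option<Vec<f64>> {
    if k == 0 {
        return Some(Vec::new());
    }
    let n = history.len();
    if n < (k - 1) * s + 1 {
        return None; // not enough history yet
    }
    Some((0..k).map(|i| history[n - 1 - i * s]).collect())
}

/// Linear part plus all degree-2 monomials x_i * x_j with i <= j.
fn poly2(lin: &[f64]) -> Vec<f64> {
    let mut out = lin.to_vec();
    for i in 0..lin.len() {
        for j in i..lin.len() {
            out.push(lin[i] * lin[j]);
        }
    }
    out
}

fn main() {
    let history = [0.1, 0.2, 0.3, 0.4, 0.5];
    let lin = delay_embed(&history, 3, 2).unwrap(); // [0.5, 0.3, 0.1]
    let feats = poly2(&lin); // 3 linear + 6 quadratic = 9 features
    assert_eq!(feats.len(), 9);
}
```

The resulting feature vector then feeds a linear RLS readout, which is why no random reservoir matrices are needed.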

## [8.4.0] - 2026-03-22


### Added


- **irithyll-cli v0.1.0** — command-line interface for streaming ML:
  - `irithyll train` — train from CSV with progress bar, save model to JSON
  - `irithyll predict` — batch predictions from a saved model
  - `irithyll eval` — prequential (test-then-train) evaluation with accuracy/RMSE/MAE/kappa
  - `irithyll inspect` — model summary, feature importances, config dump
  - `irithyll export` — export to packed embedded format
  - `irithyll config` — generate/validate TOML config files
  - **TUI dashboard** (`--tui` flag) — ratatui + Catppuccin Mocha theme, live training
    metrics table, braille loss curve chart, throughput display
  - Auto label-encoding for non-numeric CSV columns (e.g. "UP"/"DOWN")
  - Feature names from CSV headers propagated to model config

## [8.3.0] - 2026-03-22


### Added


- **Python bindings overhaul** (irithyll-python v4.0.0, 490 → 1,287 lines):
  - **Batch API:** `fit(X, y)`, `partial_fit(X, y)`, `predict_batch(X)`, `score(X, y)`
    for numpy 2D arrays with GIL released during computation.
  - **Streaming training:** `train_stream(iterator, callback, every_n)` from Python iterators.
  - **PrequentialEvaluator:** test-then-train evaluation returning accuracy/RMSE/MAE dicts,
    with `evaluate_streaming()` for periodic snapshots.
  - **DistributionalGBT:** Python class wrapping DistributionalSGBT — returns (μ, σ) predictions.
  - **ClassifierGBT:** Python class wrapping MulticlassSGBT — `predict()`, `predict_proba()`,
    batch `fit(X, y)` for multi-class classification.
  - **Config expansion:** 5 new hyperparameter fields (gradient_clip_sigma, max_leaf_output,
    adaptive_leaf_bound, min_hessian_sum, split_reeval_interval).
  - **Jupyter support:** `_repr_html_()` for rich notebook display.
  - **Convenience:** `StreamingGBTConfig.fit(X, y)` creates and trains a model in one call.

## [8.2.3] - 2026-03-21


### Changed


- **Track B complete: full training pipeline migrated to irithyll-core** (v0.5.0).
  Histogram binning, Hoeffding trees, SGBT ensembles, and all 10 ensemble variants
  now live in `irithyll-core` behind `#[cfg(feature = "alloc")]`. The `irithyll`
  crate becomes a thin re-export layer plus std-specific extensions (async streaming,
  Arrow/Parquet, ONNX export, preprocessing, metrics, evaluation, pipeline).
  - **Phase 3:** Histogram module (7 files, 3,236 lines) → irithyll-core
  - **Phase 4:** Tree module (8 files, 6,589 lines) → irithyll-core
  - **Phase 5:** Ensemble core (6 files, 5,164 lines) → irithyll-core
  - **Phase 6:** Ensemble variants (11 files, 8,571 lines) → irithyll-core
  - New core modules: `feature` (FeatureType), `learner` (StreamingLearner trait)
  - All float ops converted to `libm`/`crate::math` for no_std compatibility
  - serde derives gated behind `#[cfg_attr(feature = "serde", ...)]`
  - HashMap-dependent methods gated behind `#[cfg(feature = "std")]`
  - irithyll-core: 686 tests. irithyll: 879 tests. Total: 1,565.

## [8.2.2] - 2026-03-21


### Added


- **Per-leaf adaptive output bounds** -- `adaptive_leaf_bound` config option. Each leaf
  tracks EWMA of its own output weight and clamps predictions to `|mean| + k * std`.
  Synchronized with `leaf_decay_alpha` when `leaf_half_life` is set, Welford online
  otherwise. Warmup (10 samples) falls back to `max_leaf_output` if set.
  - New config field: `SGBTConfig::adaptive_leaf_bound: Option<f64>`
  - New builder method: `.adaptive_leaf_bound(k)`
  - 9 new tests covering warmup, clamping, EWMA adaptation, and backward compat.
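
A minimal sketch of the adaptive bound: a leaf tracks the running mean and std of its own outputs and clamps predictions to `|mean| + k * std`. Shown here with Welford online statistics; the EWMA variant, `leaf_decay_alpha` synchronization, and the `max_leaf_output` warmup fallback are omitted, and the type name is illustrative:

```rust
#[derive(Default)]
struct LeafBound {
    n: u64,
    mean: f64,
    m2: f64, // sum of squared deviations (Welford)
}

impl LeafBound {
    fn observe(&mut self, w: f64) {
        self.n += 1;
        let d = w - self.mean;
        self.mean += d / self.n as f64;
        self.m2 += d * (w - self.mean);
    }

    fn clamp(&self, w: f64, k: f64) -> f64 {
        if self.n < 2 {
            return w; // warmup: no bound yet
        }
        let std = (self.m2 / (self.n - 1) as f64).sqrt();
        let bound = self.mean.abs() + k * std;
        w.clamp(-bound, bound)
    }
}

fn main() {
    let mut lb = LeafBound::default();
    for w in [1.0, 1.2, 0.8, 1.1, 0.9] {
        lb.observe(w);
    }
    // a typical output passes through unchanged
    assert_eq!(lb.clamp(1.0, 2.0), 1.0);
    // an exploding output is clipped near |mean| + k * std
    assert!(lb.clamp(50.0, 2.0) < 2.0);
}
```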

## [8.2.1] - 2026-03-21


### Added


- **Airlines + Covertype benchmarks** -- prequential evaluation on 3 real datasets:
  - Electricity (45K, binary, concept drift) -- existing
  - Airlines (539K, binary, large scale flight delay prediction)
  - Covertype (581K, 7-class forest cover type, high dimensionality)
- **Multi-class prequential evaluation** using `MulticlassSGBT` with per-class confusion
  matrix, multi-class Cohen's Kappa, and macro-F1.
- **Learning curve CSV output** -- checkpoint-level accuracy/kappa/F1 written to
  `datasets/results/` for each dataset, enabling accuracy-over-time plots.
- **Memory tracking** -- peak memory estimates reported in benchmark output.
- **Dataset download script** -- `datasets/download.py` fetches Airlines and Covertype
  from standard ML repositories.
- Comparison scripts extended for all 3 datasets (River, XGBoost, LightGBM).
- `comparison/collect_results.py` generates unified multi-dataset BENCHMARKS.md.

## [8.2.0] - 2026-03-21


### Added


- **Real dataset benchmarks** -- prequential evaluation on the Electricity dataset (45K samples,
  binary classification with concept drift). Best config achieves 88.5% accuracy, Kappa 0.762.
  Includes comparison harness for River, XGBoost, and LightGBM.
- **`irithyll-core` v0.4.0: drift detection migration** -- all three drift detectors
  (Page-Hinkley, DDM, ADWIN) are now `no_std + alloc` compatible in `irithyll-core`.
  Trait, signal enum, and state types all available in the core crate.
  - `DriftSignal` enum available without `alloc` (pure `core`).
  - `DriftDetector` trait, `DriftDetectorState`, implementations behind `alloc` feature.
  - ADWIN exponential histogram fully ported (907 lines).
- **`irithyll-core` v0.4.0: error type migration** -- `ConfigError` and `IrithyllError`
  now defined in `irithyll-core` with hand-written `core::fmt::Display` impls.
  `std::error::Error` impl available behind `std` feature.
- Benchmark harness: `benches/real_dataset_bench.rs` with CSV dataset loading,
  prequential evaluation, and detailed results reporting.
- Comparison scripts: `comparison/river/bench_river.py`, `comparison/xgboost/bench_xgb.py`,
  `comparison/collect_results.py`.

### Changed


- `irithyll` now depends on `irithyll-core` with `std` feature (was `alloc` only).
- `irithyll/src/drift/` is now a thin re-export layer over `irithyll-core::drift`.
- `irithyll/src/error.rs` re-exports `ConfigError` from `irithyll-core`.

## [8.1.2] - 2026-03-20


### Added


- **Per-expert MoE configs** -- `MoEDistributionalSGBT::with_expert_configs()` allows each
  expert to use its own `SGBTConfig` (different depths, lambda, learning rates, etc.).
  Shadow respawn uses per-expert config when available.
- **Entropy gate regularization** -- `entropy_weight` field on `MoEDistributionalSGBT`
  adds entropy bonus to gate SGD, preventing collapse to a single expert.
  Gradient: `d(-H)/dz_k = p_k * (log(p_k) + 1) - mean_term`.
- **Distributional packed f32 export** -- `export_distributional_packed()` exports the
  location ensemble of `DistributionalSGBT` to irithyll-core packed binary format.
  Returns `(bytes, location_base)` for f64-precision base handling.
- **Dual-path inference** -- `packed_refresh_interval` config field enables a packed f32
  cache on `DistributionalSGBT` that refreshes every N training samples. `predict()`
  uses cache-optimal BFS traversal when available, falls back to full tree walk.
  `enable_packed_cache(interval)` for runtime activation.
- New accessors on `DistributionalSGBT`: `location_steps()`, `location_base()`,
  `learning_rate()`, `has_packed_cache()`.
- New accessors on `MoEDistributionalSGBT`: `entropy_weight()`, `expert_configs()`.
- Builder method `SGBTConfigBuilder::packed_refresh_interval()`.
- 7 new tests covering all additions.
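
The entropy-gate gradient quoted above can be checked numerically: for `p = softmax(z)` and `H = -Σ p_i ln p_i`, the gradient of `-H` with respect to logit `z_k` is `p_k * (ln p_k + 1) - p_k * Σ_i p_i (ln p_i + 1)` (the `mean_term`). A self-contained sketch, verified against a finite difference:

```rust
fn softmax(z: &[f64]) -> Vec<f64> {
    let m = z.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = z.iter().map(|&v| (v - m).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// -H(softmax(z)) = sum_i p_i ln p_i
fn neg_entropy(z: &[f64]) -> f64 {
    softmax(z).iter().map(|&p| p * p.ln()).sum()
}

/// Analytic gradient of -H w.r.t. the logits.
fn neg_entropy_grad(z: &[f64]) -> Vec<f64> {
    let p = softmax(z);
    let mean_term: f64 = p.iter().map(|&pi| pi * (pi.ln() + 1.0)).sum();
    p.iter().map(|&pk| pk * (pk.ln() + 1.0) - pk * mean_term).collect()
}

fn main() {
    let z = [0.5, -1.0, 2.0];
    let g = neg_entropy_grad(&z);
    // central finite-difference check on z_0
    let eps = 1e-6;
    let mut zp = z;
    zp[0] += eps;
    let mut zm = z;
    zm[0] -= eps;
    let fd = (neg_entropy(&zp) - neg_entropy(&zm)) / (2.0 * eps);
    assert!((g[0] - fd).abs() < 1e-6);
}
```

Subtracting this gradient in the gate SGD pushes the routing distribution toward higher entropy, which is what prevents collapse onto one expert.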

## [8.1.0] - 2026-03-18


### Added


- **Int16 quantized inference (`irithyll-core`)** -- 8-byte `PackedNodeI16` with integer-only
  tree traversal for FPU-less embedded targets.
  - `QuantizedEnsembleHeader` with "IR16" magic, per-feature quantization scales, leaf scale.
  - `QuantizedEnsembleView` -- zero-copy view with `predict()` (inline quantization, zero-alloc)
    and `predict_prequantized()` (pure integer hot loop, zero float ops).
  - `predict_tree_i16()` / `predict_tree_i16_x4()` -- branch-free integer traversal.
  - `predict_tree_i16_inline()` -- hybrid path with on-the-fly f32-to-i16 quantization.
  - 8 nodes per 64-byte cache line (vs 5 for f32). 48% more nodes in same flash.
  - On Cortex-M0+ (no FPU): integer compare is ~25x faster than software f32 compare.
- **`export_packed_i16()`** -- converts trained SGBT to int16 quantized packed binary.
  - Per-feature scale: `32767 / max(|thresholds|)` for optimal i16 range utilization.
  - Global leaf scale with i32 accumulation, single dequantization at prediction end.
  - `validate_export_i16()` for accuracy comparison against original model.
- **`math` module (`irithyll-core`)** -- libm-backed f64 math operations for no_std.
  - 20 functions: abs, sqrt, exp, ln, log2, powf, powi, sin, cos, tanh, floor, ceil, round, etc.
  - Foundation for Track B (migrating training algorithms to irithyll-core).
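
The per-feature scale rule above can be sketched in a few lines (illustrative helper names, not the `irithyll-core` API): `scale = 32767 / max|threshold|`, so the largest-magnitude threshold for a feature maps exactly onto the i16 range, and the split test on the hot path becomes a pure integer compare:

```rust
/// Per-feature quantization scale: the largest-magnitude threshold
/// maps to i16::MAX for optimal range utilization.
fn feature_scale(thresholds: &[f32]) -> f32 {
    let max_abs = thresholds.iter().fold(0.0f32, |m, t| m.max(t.abs()));
    if max_abs == 0.0 { 1.0 } else { 32767.0 / max_abs }
}

fn quantize(value: f32, scale: f32) -> i16 {
    (value * scale).round().clamp(-32768.0, 32767.0) as i16
}

fn main() {
    let thresholds = [0.25f32, -1.5, 0.75];
    let scale = feature_scale(&thresholds);
    let q: Vec<i16> = thresholds.iter().map(|&t| quantize(t, scale)).collect();
    // the largest-magnitude threshold saturates the i16 range
    assert_eq!(q[1], -32767);
    // on the hot path the split test is a pure integer compare
    let x = quantize(0.5, scale);
    assert!(x > q[0] && x < q[2]);
}
```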

### Changed


- `irithyll-core` now depends on `libm` 0.2 (pure Rust, no_std, thumbv6m compatible).
- `irithyll-core` version bumped to 0.2.0 (new public types + dependency).

## [8.0.1] - 2026-03-18


### Added


- **`MoEDistributionalSGBT`** -- Mixture of Experts over `DistributionalSGBT`
  ensembles with shadow expert competition via Hoeffding bounds.
  - K active experts + K shadow challengers, each a full `DistributionalSGBT`.
  - Learned softmax gate: `W · x + b → softmax → routing probabilities`.
  - Shadow competition uses Gaussian NLL difference with Hoeffding bound test.
  - When shadow proves statistically superior, it replaces the active expert.
  - Gate-weighted mixture prediction via law of total variance.
  - Supports soft gating (all experts see all samples) and hard top-k routing.
  - `predict()` returns `GaussianPrediction { mu, sigma }` with correct mixture
    variance accounting for both within-expert and between-expert variance.
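
The mixture moments follow directly from the law of total variance: the mixture variance is the gate-weighted within-expert variance plus the between-expert variance of the means. A minimal sketch (free function, not the crate's `predict()`):

```rust
/// Gate-weighted Gaussian mixture moments via the law of total variance.
/// weights are assumed to sum to 1 (softmax gate output).
fn mixture(mus: &[f64], sigmas: &[f64], weights: &[f64]) -> (f64, f64) {
    let mu: f64 = weights.iter().zip(mus).map(|(w, m)| w * m).sum();
    let within: f64 = weights.iter().zip(sigmas).map(|(w, s)| w * s * s).sum();
    let between: f64 = weights.iter().zip(mus).map(|(w, m)| w * (m - mu).powi(2)).sum();
    (mu, (within + between).sqrt())
}

fn main() {
    // two experts that agree: no between-expert variance
    let (mu, sigma) = mixture(&[1.0, 1.0], &[0.5, 0.5], &[0.5, 0.5]);
    assert!((mu - 1.0).abs() < 1e-12 && (sigma - 0.5).abs() < 1e-12);
    // disagreement inflates sigma beyond either expert's own
    let (_, sigma2) = mixture(&[0.0, 2.0], &[0.5, 0.5], &[0.5, 0.5]);
    assert!(sigma2 > 0.5);
}
```

Ignoring the between-expert term would understate uncertainty exactly when the experts disagree, which is when honest sigma matters most.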

## [8.0.0] - 2026-03-18


### Added


- **`irithyll-core` crate** — new `#![no_std]` zero-alloc inference engine for
  deploying trained models on embedded targets (Cortex-M0+, 32KB flash).
  - 12-byte `PackedNode` AoS format (5 nodes per cache line).
  - `EnsembleView<'a>` — zero-copy view over `&[u8]`, validated once at construction.
  - Branch-free `predict_tree()` traversal (`cmov`/`csel`, no pipeline stalls).
  - `predict_tree_x4()` interleaved batch prediction exploiting ILP.
  - f64→f32 quantization with tolerance validation.
- **`export_embedded` module** — converts trained `SGBT` to packed binary format.
  - BFS tree reindexing with contiguous u16 node indices.
  - Learning rate baked into leaf values (one less multiply per tree).
  - `validate_export()` for comparing original vs packed predictions.
- **Re-exports** — `EnsembleView`, `PackedNode`, `FormatError` available from
  root `irithyll` crate via `pub use irithyll_core::*`.
- **CI** — `thumbv6m-none-eabi` cross-compilation check for `irithyll-core`,
  workspace-wide clippy and docs.
- **Benchmark** — `packed_bench` comparing packed inference vs standard SGBT prediction.
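
The branch-free traversal idea can be sketched as follows. The node layout here is illustrative (not the 12-byte `PackedNode` format): each node carries a feature index, an f32 threshold (reused as the leaf value), and the index of its left child; the comparison result selects the child arithmetically, which compilers lower to `cmov`/`csel`:

```rust
#[derive(Clone, Copy)]
struct Node {
    feature: u16,   // u16::MAX marks a leaf
    left: u16,      // index of left child; right child is left + 1
    threshold: f32, // split threshold, or the leaf value for leaves
}

fn predict_tree(nodes: &[Node], x: &[f32]) -> f32 {
    let mut i = 0usize;
    loop {
        let n = nodes[i];
        if n.feature == u16::MAX {
            return n.threshold; // leaf value
        }
        // branch-free child select: go_right is 0 or 1
        let go_right = (x[n.feature as usize] >= n.threshold) as usize;
        i = n.left as usize + go_right;
    }
}

fn main() {
    // root splits on feature 0 at 0.5; leaves hold -1.0 and +1.0
    let nodes = [
        Node { feature: 0, left: 1, threshold: 0.5 },
        Node { feature: u16::MAX, left: 0, threshold: -1.0 },
        Node { feature: u16::MAX, left: 0, threshold: 1.0 },
    ];
    assert_eq!(predict_tree(&nodes, &[0.2]), -1.0);
    assert_eq!(predict_tree(&nodes, &[0.9]), 1.0);
}
```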

### Changed


- Workspace restructured: `irithyll-core` added as workspace member.
- `irithyll` now depends on `irithyll-core` (path dependency).

## [7.9.1] - 2026-03-17


### Fixed


- **Cascading swap bug in graduated handoff.** When a shadow tree was promoted
  to active, its total `n_samples_seen()` already exceeded `max_tree_samples`,
  causing an immediate second swap that promoted a cold stump. Added
  `samples_at_activation` tracking: time-based replacement and blend weight
  decay now use samples-since-activation, not total lifetime sample count.

## [7.9.0] - 2026-03-17


### Added


- **Graduated tree handoff** -- eliminates prediction dips during tree
  replacement in streaming GBDT ensembles (Boulevard/GOOWE-inspired).
- **`shadow_warmup`** -- new `SGBTConfig` option enabling always-on shadow
  trees. When set, each tree slot immediately spawns a shadow that trains
  alongside the active tree. After `shadow_warmup` samples, the shadow
  contributes to graduated predictions.
- **`predict_graduated()`** -- on `SGBT`, `DistributionalSGBT`, `TreeSlot`,
  and `BoostingStep`. Blends active and shadow predictions based on
  relative maturity: active weight decays linearly from 1.0 (at 80% of
  `max_tree_samples`) to 0.0 (at 120%), shadow weight ramps from 0.0
  to 1.0 over `shadow_warmup` samples. When active fully decays, shadow
  is promoted and a new shadow spawns -- zero cold-start.
- **`predict_graduated_sibling_interpolated()`** -- premium prediction path
  combining graduated handoff (no replacement dips) with sibling-based
  spatial interpolation (no step-function artifacts). The smoothest
  possible prediction surface for streaming GBDT.
- Soft replacement: when active tree's weight reaches 0, the shadow is
  promoted to active and a new shadow is spawned immediately. The drift
  detector is reset. No moment where a cold tree serves predictions.
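
The two blend weights described above reduce to a pair of clamped linear ramps. A sketch with hypothetical function names (the actual blending lives inside `predict_graduated()`):

```rust
/// Active-tree weight: 1.0 below 80% of max_tree_samples, decaying
/// linearly to 0.0 at 120%.
fn active_weight(samples: u64, max_tree_samples: u64) -> f64 {
    let lo = 0.8 * max_tree_samples as f64;
    let hi = 1.2 * max_tree_samples as f64;
    ((hi - samples as f64) / (hi - lo)).clamp(0.0, 1.0)
}

/// Shadow weight: ramps from 0.0 to 1.0 over shadow_warmup samples.
fn shadow_weight(shadow_samples: u64, shadow_warmup: u64) -> f64 {
    (shadow_samples as f64 / shadow_warmup as f64).min(1.0)
}

fn main() {
    assert!((active_weight(500, 1000) - 1.0).abs() < 1e-9);  // below 80%: full weight
    assert!((active_weight(1000, 1000) - 0.5).abs() < 1e-9); // at max: halfway through handoff
    assert!(active_weight(1300, 1000) < 1e-9);               // past 120%: shadow takes over
    assert!((shadow_weight(50, 100) - 0.5).abs() < 1e-12);
    assert_eq!(shadow_weight(250, 100), 1.0);
}
```

Because the ramps overlap, there is never a point where a cold tree carries the full prediction.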

## [7.8.2] - 2026-03-17


### Added


- **`predict_sibling_interpolated()`** -- new prediction method on `SGBT`,
  `DistributionalSGBT`, and `HoeffdingTree`. At each split node, linearly
  blends left and right subtree predictions based on the feature's distance
  from the threshold, using auto-calibrated bandwidths as the interpolation
  margin. Predictions vary continuously as features move near split
  boundaries, eliminating step-function artifacts. This complements the
  existing `predict_interpolated()` (temporal smoothing) with spatial
  smoothing in feature space.
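
At a single split, the idea is a clamped linear blend over a margin around the threshold; the real method applies this recursively down the tree with auto-calibrated per-feature bandwidths. A single-split sketch (hypothetical function, not the crate's API):

```rust
/// Blend left/right child predictions linearly by the feature's
/// distance from the threshold, within +/- margin. Outside the
/// margin the routing is hard.
fn sibling_blend(x: f64, threshold: f64, margin: f64, left: f64, right: f64) -> f64 {
    // w_right rises from 0 at (threshold - margin) to 1 at (threshold + margin)
    let w_right = ((x - (threshold - margin)) / (2.0 * margin)).clamp(0.0, 1.0);
    (1.0 - w_right) * left + w_right * right
}

fn main() {
    // far from the split: hard routing, identical to a plain tree
    assert_eq!(sibling_blend(-5.0, 0.0, 1.0, 10.0, 20.0), 10.0);
    assert_eq!(sibling_blend(5.0, 0.0, 1.0, 10.0, 20.0), 20.0);
    // at the threshold: even blend, so no step discontinuity
    assert_eq!(sibling_blend(0.0, 0.0, 1.0, 10.0, 20.0), 15.0);
}
```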

## [7.8.1] - 2026-03-17


### Added


- **`max_leaf_output`** -- new `SGBTConfig`/`TreeConfig` option clamping leaf
  predictions to `[-max, max]`. Breaks the feedback loop that causes
  prediction explosions in streaming settings.
- **`min_hessian_sum`** -- new `SGBTConfig`/`TreeConfig` option suppressing
  leaf output until hessian accumulation exceeds the threshold. Prevents
  post-replacement spikes from fresh leaves with insufficient samples.
- **`predict_interpolated()`** -- new prediction method on `SGBT`,
  `DistributionalSGBT`, and `HoeffdingTree`. Blends leaf predictions with
  parent node predictions via `alpha = hess / (hess + lambda)`, fixing
  static EWRV/OIF predictions from recently-split leaves.
- **Ensemble-level gradient statistics** -- `DistributionalSGBT` now tracks
  Welford running mean/variance of gradients across all trees. Exposed via
  `ensemble_grad_mean()`, `ensemble_grad_std()`, and in `ModelDiagnostics`.
  Enables new trees to inherit the ensemble's gradient distribution,
  fixing the cold-start flaw in `gradient_clip_sigma`.
- **`huber_k`** -- new `SGBTConfig` option enabling Huber loss for
  `DistributionalSGBT` location gradients with adaptive
  `delta = k * empirical_sigma`. Standard value 1.345 (95% Gaussian
  efficiency). Bounds gradients by construction without per-leaf clipping.
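
The Huber gradient with the adaptive delta above is just a clamp of the squared-loss gradient at `delta = k * empirical_sigma`, which bounds gradient magnitude by construction. A minimal sketch (free function, not the crate's internals):

```rust
/// Huber loss gradient for a location residual, with adaptive
/// delta = k * sigma. Inside [-delta, delta] this matches squared
/// loss; outside, the gradient saturates.
fn huber_grad(residual: f64, sigma: f64, k: f64) -> f64 {
    let delta = k * sigma;
    residual.clamp(-delta, delta)
}

fn main() {
    let k = 1.345; // standard 95%-Gaussian-efficiency constant
    // small residuals pass through unchanged (quadratic region)
    assert_eq!(huber_grad(0.5, 1.0, k), 0.5);
    // large residuals are bounded at k * sigma (linear region)
    assert_eq!(huber_grad(100.0, 1.0, k), 1.345);
    assert!((huber_grad(-100.0, 2.0, k) + 2.69).abs() < 1e-12);
}
```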

## [7.8.0] - 2026-03-16


### Added


- **Preprocessing expansion** -- 5 new streaming preprocessors for feature
  engineering in online pipelines.
- **`FeatureHasher`** -- hashing trick (Weinberger et al., 2009) for
  fixed-size dimensionality reduction. Dual-seed multiply-xorshift hash
  with sign bit preserving inner products in expectation. Implements
  `StreamingPreprocessor` (stateless).
- **`OneHotEncoder`** -- streaming one-hot encoding with online category
  discovery. Configurable `max_categories` cap per feature. Sorted
  category storage for deterministic encoding. Implements
  `StreamingPreprocessor`.
- **`TargetEncoder`** -- Bayesian-smoothed target encoding for categorical
  features. Formula: `(n * cat_mean + m * global_mean) / (n + m)`.
  Standalone (requires target, does not implement `StreamingPreprocessor`).
- **`MinMaxScaler`** -- streaming min/max tracking with configurable output
  range (default `[0, 1]`). Constant features map to range midpoint.
  Implements `StreamingPreprocessor`.
- **`PolynomialFeatures`** -- degree-2 polynomial feature generation
  (squares + pairwise interactions). Optional `interaction_only` mode
  for cross-terms without squared terms. Stateless, implements
  `StreamingPreprocessor`.
- New factory functions: `feature_hasher()`, `min_max_scaler()`, `one_hot()`,
  `polynomial_features()`, `target_encoder()`.
- New re-exports at crate root: `FeatureHasher`, `MinMaxScaler`,
  `OneHotEncoder`, `PolynomialFeatures`, `TargetEncoder`.
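
The `TargetEncoder` smoothing formula quoted above shrinks a category's observed mean toward the global mean with pseudo-count `m`, so rare categories stay near the global mean while frequent ones approach their own. A minimal sketch of the formula itself:

```rust
/// Bayesian-smoothed target encoding:
/// (n * cat_mean + m * global_mean) / (n + m)
fn target_encode(n: f64, cat_mean: f64, m: f64, global_mean: f64) -> f64 {
    (n * cat_mean + m * global_mean) / (n + m)
}

fn main() {
    // unseen category: pure global mean
    assert!((target_encode(0.0, 0.0, 10.0, 0.3) - 0.3).abs() < 1e-12);
    // n == m: exact midpoint between category and global means
    assert!((target_encode(10.0, 0.5, 10.0, 0.3) - 0.4).abs() < 1e-12);
    // heavily observed category: dominated by its own mean
    assert!((target_encode(1000.0, 0.9, 10.0, 0.3) - 0.9).abs() < 0.01);
}
```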

## [7.7.0] - 2026-03-16


### Added


- **Multi-armed bandits** (`bandits` module) -- online decision-making algorithms
  for the exploration-exploitation trade-off in streaming settings.
- **`Bandit` trait** -- unified interface for context-free bandits: `select_arm()`,
  `update(arm, reward)`, `arm_values()`, `arm_counts()`.
- **`ContextualBandit` trait** -- feature-conditioned arm selection:
  `select_arm(context)`, `update(arm, context, reward)`.
- **`EpsilonGreedy`** -- random exploration with probability epsilon (optionally
  decaying via exponential schedule). Constructors: `new()`, `with_seed()`,
  `with_decay()`.
- **`UCB1`** -- Upper Confidence Bound (Auer et al., 2002). Deterministic,
  round-robin initial exploration, `Q(a) + sqrt(2 ln(t) / N(a))` selection.
- **`UCBTuned`** -- UCB with per-arm variance estimates for tighter confidence
  bounds. Tracks sum of squared rewards for empirical variance computation.
- **`ThompsonSampling`** -- Bayesian arm selection via Beta posterior sampling.
  Marsaglia-Tsang Gamma sampler + Box-Muller normals for dependency-free Beta
  samples. Constructors: `new()`, `with_seed()`, `with_prior()`.
- **`LinUCB`** -- contextual bandit with per-arm ridge regression and
  Sherman-Morrison inverse updates (Li et al., 2010). O(d²) per update,
  O(kd²) total space for k arms with d features.
- New factory functions: `epsilon_greedy()`, `ucb1()`, `ucb_tuned()`,
  `thompson()`, `lin_ucb()`.
- New re-exports at crate root: `Bandit`, `ContextualBandit`, `EpsilonGreedy`,
  `UCB1`, `UCBTuned`, `ThompsonSampling`, `LinUCB`.
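
The UCB1 rule quoted above can be sketched in a few lines (a free function, not the crate's `Bandit` trait): after a round-robin pass over unplayed arms, pick the argmax of `Q(a) + sqrt(2 ln t / N(a))`:

```rust
/// UCB1 arm selection (Auer et al., 2002). `values` holds the
/// empirical mean reward Q(a); `counts` holds N(a).
fn ucb1_select(values: &[f64], counts: &[u64]) -> usize {
    // play each arm once before using the confidence bound
    if let Some(a) = counts.iter().position(|&c| c == 0) {
        return a;
    }
    let t: u64 = counts.iter().sum();
    let score = |a: usize| values[a] + (2.0 * (t as f64).ln() / counts[a] as f64).sqrt();
    (0..values.len())
        .max_by(|&a, &b| score(a).partial_cmp(&score(b)).unwrap())
        .unwrap()
}

fn main() {
    // an unplayed arm is always explored first
    assert_eq!(ucb1_select(&[0.9, 0.0], &[10, 0]), 1);
    // equal counts: the higher empirical mean wins
    assert_eq!(ucb1_select(&[0.2, 0.8], &[50, 50]), 1);
    // a rarely played arm can beat a better mean via its exploration bonus
    assert_eq!(ucb1_select(&[0.6, 0.5], &[1000, 2]), 1);
}
```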

## [7.6.1] - 2026-03-16


### Changed


- **Auto-bandwidth smooth routing** -- `predict()` on both `SGBT` and
  `DistributionalSGBT` now always uses sigmoid-blended soft routing with
  per-feature auto-calibrated bandwidths. Bandwidths are computed as
  `median_gap × 2.0` from the split thresholds of all trees in the ensemble.
  Features with fewer than 3 unique thresholds fall back to `range / n_bins × 2`,
  and features that are never split on use hard routing (bandwidth = ∞).
- **Removed `bandwidth` from `SGBTConfig`** -- the `bandwidth()` builder method,
  field, and validation have been removed. Auto-bandwidth is the only mode.
- Bandwidth cache refreshes automatically when tree replacements (drift or
  time-based) are detected, via a lightweight replacement counter on `TreeSlot`.

### Added


- `HoeffdingTree::collect_split_thresholds_per_feature()` -- collects all
  continuous split thresholds per feature from the tree arena.
- `HoeffdingTree::predict_smooth_auto()` -- per-feature bandwidth smooth
  prediction.
- `TreeSlot::replacements()` -- counts tree replacements for cache invalidation.
- `TreeSlot::prediction_mean()` / `prediction_std()` -- Welford online stats
  tracking per-tree prediction distribution for anomaly detection.
- `SGBT::auto_bandwidths()` / `DistributionalSGBT::auto_bandwidths()` --
  exposes the current per-feature bandwidth vector.
- `auto_bandwidths: Vec<f64>` in `ModelDiagnostics`.
- `prediction_mean` / `prediction_std` in `TreeDiagnostic`.

### Fixed


- Clippy `needless_range_loop` warning in Holt-Winters seasonal initialization.

## [7.6.0] - 2026-03-16


### Added


- **SNARIMAX** (`SNARIMAX`) -- streaming non-linear ARIMA with seasonal,
  exogenous, and autoregressive/moving-average components. Online SGD
  parameter updates with gradient clipping for stability on unnormalized data.
  Configurable AR/MA orders, seasonal period, and exogenous feature count.
  Multi-step recursive forecasting via `forecast(horizon)`.
- **Holt-Winters** (`HoltWinters`) -- streaming triple exponential smoothing
  with additive and multiplicative seasonality modes. Buffered initialization
  from the first seasonal period. Level, trend, and seasonal components
  updated incrementally per observation. Multi-step forecasting.
- **Streaming Decomposition** (`StreamingDecomposition`) -- online seasonal
  decomposition into trend, seasonal, and residual components. EWMA-based
  trend estimation with per-position seasonal factor tracking. Returns
  `DecomposedPoint` with the identity `observed = trend + seasonal + residual`.
- `Seasonality` enum (`Additive` / `Multiplicative`) for Holt-Winters.
- `DecomposedPoint` struct for decomposition output.
- `SNARIMAXCoefficients` snapshot struct for model introspection.
- New re-exports at crate root: `SNARIMAX`, `SNARIMAXConfig`,
  `SNARIMAXCoefficients`, `HoltWinters`, `HoltWintersConfig`, `Seasonality`,
  `StreamingDecomposition`, `DecompositionConfig`, `DecomposedPoint`.
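
A minimal additive Holt-Winters update step, following the standard triple exponential smoothing recurrences for level, trend, and seasonal components. This is a sketch of the technique, not the crate's `HoltWinters` type; the buffered first-period initialization and the multiplicative mode are omitted:

```rust
struct HW {
    alpha: f64, // level smoothing
    beta: f64,  // trend smoothing
    gamma: f64, // seasonal smoothing
    level: f64,
    trend: f64,
    seasonal: Vec<f64>, // one factor per position in the period
    t: usize,
}

impl HW {
    fn update(&mut self, y: f64) {
        let i = self.t % self.seasonal.len();
        let prev_level = self.level;
        self.level = self.alpha * (y - self.seasonal[i])
            + (1.0 - self.alpha) * (self.level + self.trend);
        self.trend = self.beta * (self.level - prev_level) + (1.0 - self.beta) * self.trend;
        self.seasonal[i] = self.gamma * (y - self.level) + (1.0 - self.gamma) * self.seasonal[i];
        self.t += 1;
    }

    /// h-step-ahead forecast: level + h*trend + matching seasonal factor.
    fn forecast(&self, h: usize) -> f64 {
        let i = (self.t + h - 1) % self.seasonal.len();
        self.level + h as f64 * self.trend + self.seasonal[i]
    }
}

fn main() {
    // a perfectly seasonal series stays perfectly predicted
    let mut hw = HW {
        alpha: 0.3, beta: 0.1, gamma: 0.2,
        level: 10.0, trend: 0.0,
        seasonal: vec![1.0, -1.0], t: 0,
    };
    for k in 0..100 {
        hw.update(10.0 + if k % 2 == 0 { 1.0 } else { -1.0 });
    }
    assert!((hw.forecast(1) - 11.0).abs() < 1e-6);
    assert!((hw.forecast(2) - 9.0).abs() < 1e-6);
}
```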

## [7.5.0] - 2026-03-16


### Added


- **Hoeffding Tree Classifier** (`HoeffdingTreeClassifier`) -- standalone streaming
  decision tree for classification based on the VFDT algorithm (Domingos & Hulten,
  2000). Maintains per-leaf class distributions, splits using information gain with
  Hoeffding bound. Configurable via `HoeffdingClassifierConfig` builder. Implements
  `StreamingLearner`.
- **Multinomial Naive Bayes** (`MultinomialNB`) -- streaming classifier for
  count/frequency features with Laplace smoothing. Automatic class discovery,
  predict_proba/predict_log_proba support.
- **Bernoulli Naive Bayes** (`BernoulliNB`) -- streaming classifier for binary
  features with configurable binarization threshold. Explicitly models feature
  absence unlike Multinomial NB.
- **Adaptive Random Forest** (`AdaptiveRandomForest`) -- ensemble of streaming
  learners with ADWIN drift detection and automatic tree replacement
  (Gomes et al., 2017). Poisson(lambda)-weighted bootstrap, random feature
  subspaces, generic over any `StreamingLearner` via factory closure.
- New re-exports at crate root: `HoeffdingTreeClassifier`, `MultinomialNB`,
  `BernoulliNB`, `AdaptiveRandomForest`.
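
The split decision in the VFDT-style tree rests on the Hoeffding bound: with `n` samples and confidence `1 - delta`, the observed mean of a range-`R` statistic is within `eps = sqrt(R² ln(1/delta) / (2n))` of its true mean, so a split is made once the gain gap between the two best features exceeds `eps`. A sketch of the bound (constants illustrative):

```rust
fn hoeffding_bound(range: f64, delta: f64, n: u64) -> f64 {
    (range * range * (1.0 / delta).ln() / (2.0 * n as f64)).sqrt()
}

fn main() {
    // binary classification: information gain ranges over [0, 1] bits
    let eps_small_n = hoeffding_bound(1.0, 1e-7, 100);
    let eps_large_n = hoeffding_bound(1.0, 1e-7, 10_000);
    // the bound tightens as 1/sqrt(n): 100x the data, 10x smaller eps
    assert!((eps_small_n / eps_large_n - 10.0).abs() < 1e-9);
    // a gain gap of 0.1 justifies a split at n = 10_000 ...
    assert!(0.1 > eps_large_n);
    // ... but not yet at n = 100
    assert!(0.1 < eps_small_n);
}
```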

## [7.4.0] - 2026-03-16


### Added


- **Streaming K-Means** (`StreamingKMeans`) -- mini-batch online K-Means with
  optional forgetting factor for non-stationary streams (Sculley, 2010). Lazy
  initialization from first k distinct samples. Decaying learning rate
  `eta = 1/count` per centroid. Configurable via `StreamingKMeansConfig` builder.
- **DBSTREAM** (`DBStream`) -- density-based streaming clustering with weighted
  micro-clusters and shared-density graph for macro-cluster merging
  (Hahsler & Bolanos, 2016). Exponential decay, periodic cleanup of dead MCs,
  connected-component macro-cluster extraction via DFS.
- **CluStream** (`CluStream`) -- micro/macro streaming clustering via Cluster
  Feature vectors (Aggarwal et al., 2003). Online phase maintains CF summaries
  with absorb/merge logic. Offline phase produces k macro-clusters via weighted
  K-Means on micro-cluster centers.
- `ClusterFeature` -- sufficient statistics `(n, LS, SS)` for streaming cluster
  summarization. Supports absorb, merge, center, and radius computation.
- `MicroCluster` -- weighted centroid with creation time for DBSTREAM.
- New re-exports at crate root: `StreamingKMeans`, `StreamingKMeansConfig`,
  `DBStream`, `DBStreamConfig`, `MicroCluster`, `CluStream`, `CluStreamConfig`,
  `ClusterFeature`.
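
The `(n, LS, SS)` sufficient statistics make streaming summarization cheap: absorbing a point is a few additions, the centroid is `LS/n`, the RMS radius follows from `SS/n - (LS/n)²`, and merging two CFs is component-wise addition. A 1-D sketch (the real `ClusterFeature` is per-dimension):

```rust
#[derive(Clone, Copy, Default)]
struct CF {
    n: f64,  // weight / count
    ls: f64, // linear sum
    ss: f64, // sum of squares
}

impl CF {
    fn absorb(&mut self, x: f64) {
        self.n += 1.0;
        self.ls += x;
        self.ss += x * x;
    }
    fn merge(self, other: CF) -> CF {
        CF { n: self.n + other.n, ls: self.ls + other.ls, ss: self.ss + other.ss }
    }
    fn center(&self) -> f64 {
        self.ls / self.n
    }
    fn radius(&self) -> f64 {
        (self.ss / self.n - self.center().powi(2)).max(0.0).sqrt()
    }
}

fn main() {
    let mut a = CF::default();
    for x in [1.0, 3.0] {
        a.absorb(x);
    }
    assert_eq!(a.center(), 2.0);
    assert_eq!(a.radius(), 1.0);

    let mut b = CF::default();
    b.absorb(5.0);
    assert_eq!(a.merge(b).center(), 3.0);
}
```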

## [7.3.0] - 2026-03-16


### Added


- **Prequential evaluation** -- `PrequentialEvaluator` implements the standard
  streaming ML evaluation protocol (Gama et al., 2009): test-then-train on each
  sample. Configurable warmup period and step interval. Optional rolling-window
  and EWMA metric tracking alongside cumulative `RegressionMetrics`.
- **Progressive validation** -- `ProgressiveValidator` with configurable holdout
  strategies: `HoldoutStrategy::None` (pure prequential), `Periodic { period }`
  (every N-th sample held out), `Random { holdout_fraction, seed }` (stochastic
  holdout with deterministic xorshift64 PRNG). All strategies always evaluate;
  only training is gated.
- **Cohen's Kappa** (`CohenKappa`) -- streaming classification agreement metric
  adjusted for chance. Auto-growing confusion matrix discovers new classes on
  the fly. Tie-breaking by lowest class index.
- **Kappa-M** (`KappaM`) -- compares model against a majority-class baseline
  classifier. Tracks per-class counts to determine the current majority class.
- **Kappa-T** (`KappaT`) -- compares model against a no-change (temporal)
  classifier that always predicts the previous true label.
- **Streaming AUC-ROC** (`StreamingAUC`) -- windowed approximation of AUC-ROC
  via the Wilcoxon-Mann-Whitney U statistic over a fixed-size circular buffer
  of `(score, is_positive)` pairs. O(n^2) over the window per computation.
- New re-exports at crate root: `PrequentialEvaluator`, `PrequentialConfig`,
  `ProgressiveValidator`, `HoldoutStrategy`, `CohenKappa`, `KappaM`, `KappaT`,
  `StreamingAUC`.
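
The test-then-train loop at the heart of prequential evaluation: each sample is first scored against the current model, then used to update it, so every prediction is out-of-sample. Sketched with a trivial running-mean "model" rather than the evaluator type:

```rust
/// Prequential MAE over a stream, with a running mean as the model.
/// The first sample only trains (warmup), matching the protocol.
fn prequential_mae(stream: &[f64]) -> f64 {
    let (mut n, mut mean) = (0.0f64, 0.0f64);
    let mut abs_err_sum = 0.0;
    let mut tested = 0.0;
    for &y in stream {
        if n > 0.0 {
            abs_err_sum += (y - mean).abs(); // test first...
            tested += 1.0;
        }
        n += 1.0;
        mean += (y - mean) / n; // ...then train
    }
    abs_err_sum / tested
}

fn main() {
    // constant stream: zero error after the warmup sample
    assert_eq!(prequential_mae(&[5.0, 5.0, 5.0, 5.0]), 0.0);
    // an alternating stream keeps a residual error
    assert!(prequential_mae(&[0.0, 1.0, 0.0, 1.0, 0.0]) > 0.0);
}
```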

## [7.2.0] - 2026-03-16


### Added


- **PD sigma modulation** -- proportional-derivative controller for
  uncertainty-modulated learning rate in `DistributionalSGBT`. Tracks `sigma_velocity`
  (EWMA-smoothed derivative of empirical σ) with self-calibrating derivative
  gain `k_d = |sigma_velocity| / rolling_sigma_mean`. The PD ratio
  `(sigma + k_d * sigma_velocity) / rolling_sigma_mean` anticipates regime
  changes, boosting the learning rate *before* errors fully propagate.
  New `sigma_velocity()` getter exposes the signal.
- **Smooth prediction on SGBT** -- `predict_smooth(features, bandwidth)` now
  available on `SGBT` (not just `DistributionalSGBT`), aggregating per-tree
  smooth predictions through the full boosting ensemble.
- **Config-driven smooth prediction** -- `bandwidth: Option<f64>` field on
  `SGBTConfig`. When set, `predict()` on both `SGBT` and `DistributionalSGBT`
  automatically uses sigmoid-blended soft routing with no API change required.
  Builder method: `.bandwidth(0.5)`. Validated as positive and finite.
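  The PD ratio above can be sketched in a few lines (field names and the EWMA
  update form are assumptions for illustration, not the crate's internals):

  ```rust
  // PD modulation of the learning rate: the ratio exceeds 1 when sigma is
  // high or rising, boosting the location ensemble before errors propagate.
  struct PdModulator {
      prev_sigma: f64,
      sigma_velocity: f64,    // EWMA-smoothed change in empirical sigma
      rolling_sigma_mean: f64,
      alpha: f64,             // EWMA smoothing factor
  }

  impl PdModulator {
      fn update(&mut self, sigma: f64) -> f64 {
          let dv = sigma - self.prev_sigma;
          self.sigma_velocity += self.alpha * (dv - self.sigma_velocity);
          self.rolling_sigma_mean += self.alpha * (sigma - self.rolling_sigma_mean);
          self.prev_sigma = sigma;
          // Self-calibrating derivative gain, as described above.
          let k_d = self.sigma_velocity.abs() / self.rolling_sigma_mean;
          // PD ratio: proportional term (sigma) plus derivative term.
          (sigma + k_d * self.sigma_velocity) / self.rolling_sigma_mean
      }
  }
  ```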

## [7.1.0] - 2026-03-16


### Added


- **Smooth prediction** (`predict_smooth`) -- sigmoid-blended soft routing
  through tree split nodes. Instead of hard left/right decisions, each split
  uses `alpha = sigmoid((threshold - feature) / bandwidth)` and recursively
  blends subtree predictions. Produces a continuous function with no bins,
  boundaries, or jumps. Available on `HoeffdingTree`, `TreeSlot`,
  `BoostingStep`, and `DistributionalSGBT`. Bandwidth parameter controls
  transition sharpness.
- **Leaf state accessor** (`HoeffdingTree::leaf_grad_hess`) -- read-only access
  to per-leaf gradient and hessian sums. Enables inverse-hessian confidence
  estimation: `confidence = 1.0 / (hess_sum + lambda)`.
- **Per-leaf sample counts in diagnostics** -- `TreeDiagnostic` now includes
  `leaf_sample_counts: Vec<u64>` showing data distribution across leaves.
- Internal `leaf_prediction` helper on `HoeffdingTree` reducing code duplication
  across `predict`, `predict_with_variance`, and `predict_smooth`.
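  The blending rule above can be sketched over a toy tree (the `Node` type here is
  hypothetical; the crate applies the same formula inside its own tree structures):

  ```rust
  enum Node {
      Leaf(f64),
      Split { feature: usize, threshold: f64, left: Box<Node>, right: Box<Node> },
  }

  fn sigmoid(x: f64) -> f64 { 1.0 / (1.0 + (-x).exp()) }

  fn predict_smooth(node: &Node, x: &[f64], bandwidth: f64) -> f64 {
      match node {
          Node::Leaf(w) => *w,
          Node::Split { feature, threshold, left, right } => {
              // alpha -> 1 routes fully left, alpha -> 0 fully right; in
              // between, subtree predictions are blended continuously, so
              // the output has no jumps at split boundaries.
              let alpha = sigmoid((*threshold - x[*feature]) / bandwidth);
              alpha * predict_smooth(left, x, bandwidth)
                  + (1.0 - alpha) * predict_smooth(right, x, bandwidth)
          }
      }
  }
  ```

  Smaller bandwidths sharpen the transition toward a hard split; larger ones smooth it.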

## [7.0.0] - 2026-03-15


### Added


- **Empirical σ estimation** (`ScaleMode::Empirical`, now default) -- replaces
  the scale tree chain with an EWMA of squared prediction errors:
  `σ = sqrt(ewma_sq_err)`.  Always calibrated, zero tuning, O(1) compute.
  The scale tree chain was frozen in practice (scale gradients too weak for
  Hoeffding trees to split), making σ-modulated learning a no-op.  Empirical σ
  is self-calibrating by definition: high recent errors → σ large → location LR
  scales up → model corrects faster.
- `ScaleMode` enum (`Empirical` / `TreeChain`) on `SGBTConfig` for choosing
  the σ estimation strategy.  `ScaleMode::TreeChain` preserves the full
  dual-chain NGBoost behavior for users who need feature-conditional uncertainty.
- `empirical_sigma_alpha` config parameter (default `0.01`) -- controls EWMA
  adaptation speed for empirical σ.
- `empirical_sigma()` method on `DistributionalSGBT` -- returns current σ.
- `scale_mode()` method on `DistributionalSGBT`.
- `ModelDiagnostics` now includes `empirical_sigma`, `scale_mode`,
  `scale_trees_active`, and separate `location_trees` / `scale_trees` vectors
  for independent inspection.
- `ScaleMode` re-exported at crate root.
- ASCII art header in README.
- **DistributionalSGBT diagnostics** -- three new diagnostic capabilities:
  - `ModelDiagnostics` struct with per-tree summaries (leaf count, max depth,
    samples seen, leaf weight statistics, split features) and global feature
    split counts. Access via `model.diagnostics()`.
  - `DecomposedPrediction` struct with `predict_decomposed(features)` -- returns
    per-tree additive contributions for both location and scale ensembles.
    Includes `mu()`, `log_sigma()`, `sigma()` reconstruction methods.
  - `feature_importances()` and `feature_importances_split()` -- normalized
    split-gain-weighted feature importance vectors, with optional separate
    location/scale views.
- `TreeDiagnostic`, `ModelDiagnostics`, `DecomposedPrediction` re-exported at
  crate root.
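  The empirical-sigma recurrence is small enough to state exactly (a sketch; field
  names are illustrative rather than the crate's internals):

  ```rust
  // sigma = sqrt(EWMA of squared prediction errors). High recent errors push
  // sigma up, which scales the location learning rate up via sigma modulation.
  struct EmpiricalSigma {
      ewma_sq_err: f64,
      alpha: f64, // corresponds to `empirical_sigma_alpha` (default 0.01)
  }

  impl EmpiricalSigma {
      fn observe(&mut self, prediction: f64, target: f64) {
          let err = target - prediction;
          self.ewma_sq_err += self.alpha * (err * err - self.ewma_sq_err);
      }

      fn sigma(&self) -> f64 {
          self.ewma_sq_err.sqrt()
      }
  }
  ```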

### Changed


- **BREAKING:** `DistributionalSGBT` defaults to empirical σ (`ScaleMode::Empirical`).
  Users who relied on the dual-chain behavior must explicitly set
  `scale_mode(ScaleMode::TreeChain)` in their config builder.
- `ModelDiagnostics` struct has new fields: `location_trees`, `scale_trees`,
  `empirical_sigma`, `scale_mode`, `scale_trees_active`.

## [6.5.1] - 2026-03-15


### Changed


- Made AdaGrad optional on `LinearLeafModel` via `use_adagrad: bool` field on
  `LeafModelType::Linear`. Default is `false` (plain Newton-scaled SGD).

## [6.5.0] - 2026-03-14


### Added


- **Adaptive leaf promotion** -- leaves start as constant (zero overhead) and
  auto-promote to a trainable model when the Hoeffding bound confirms it is
  statistically superior. Uses the tree's existing `delta` parameter -- no
  arbitrary thresholds. New `LeafModelType::Adaptive { promote_to }` variant.
- **Optional AdaGrad optimization for linear leaves** -- `use_adagrad: true`
  enables per-weight squared gradient accumulators, giving each feature its
  own adaptive learning rate. Default is `false` (plain Newton-scaled SGD).
- **Exponential forgetting** -- optional `decay` parameter on `Linear` and `MLP`
  leaf models. Applies exponential weight decay before each update, giving the
  model a finite memory horizon for non-stationary streams.
- **Warm-start on split** -- new `LeafModel::clone_warm()` trait method. When a
  leaf splits, child leaves inherit the parent's learned weights (resetting
  optimizer state), converging faster than starting from scratch.
- Integration tests for adaptive leaves, decay leaves, warm-start cloning.
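  The promotion test reduces to a standard Hoeffding-bound comparison. A hedged
  sketch (function names and the loss-range parameter `range` are assumptions for
  illustration):

  ```rust
  // Hoeffding bound: with probability 1 - delta, an n-sample mean is within
  // epsilon of the true mean, where R is the range of the observed quantity.
  fn hoeffding_bound(range: f64, delta: f64, n: u64) -> f64 {
      ((range * range * (1.0 / delta).ln()) / (2.0 * n as f64)).sqrt()
  }

  // Promote the trainable leaf model once its mean-loss advantage over the
  // constant leaf exceeds the bound -- no arbitrary threshold, only `delta`.
  fn should_promote(constant_mean_loss: f64, model_mean_loss: f64,
                    range: f64, delta: f64, n: u64) -> bool {
      constant_mean_loss - model_mean_loss > hoeffding_bound(range, delta, n)
  }
  ```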

### Changed


- `LeafModelType::Linear` now includes `decay: Option<f64>` and
  `use_adagrad: bool` fields (both serde-defaulted). `LeafModelType::MLP`
  includes `decay: Option<f64>`. Existing configs deserialize correctly.
- `LeafModelType::create()` now takes an additional `delta: f64` parameter
  for adaptive leaf construction.
- Tree cloning now uses `clone_warm()` instead of `clone_fresh()`, preserving
  learned leaf model weights across clone operations.
- Child leaves created during splits warm-start from the parent's model
  instead of initializing from scratch.

## [6.4.0] - 2026-03-14


### Added


- **Pluggable leaf models** -- leaves can now use trainable prediction models
  instead of constant weights. Three variants via `LeafModelType`:
  - `ClosedForm` (default) -- standard constant leaf weight, zero overhead.
  - `Linear { learning_rate }` -- per-leaf online ridge regression. Each leaf
    learns a local linear surface `w . x + b` via Newton-scaled gradient descent.
    Significantly improves accuracy for low-depth trees (depth 2-4).
  - `MLP { hidden_size, learning_rate }` -- per-leaf single-hidden-layer neural
    network with ReLU activation and backpropagation.
- `LeafModelType` re-exported at crate root for ergonomic access.
- `SGBTConfig::builder().leaf_model_type(...)` for ensemble-level configuration.
- Full integration across all ensemble variants: SGBT, DistributionalSGBT,
  ParallelSGBT.
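  For intuition, a per-leaf linear model under squared loss looks roughly like this
  (a minimal sketch; the crate's `LinearLeafModel` adds ridge regularization and
  optional AdaGrad, which are omitted here):

  ```rust
  // Per-leaf online linear surface w . x + b. For squared loss the hessian
  // is 1, so the Newton-scaled step reduces to plain SGD on the residual.
  struct LinearLeaf {
      w: Vec<f64>,
      b: f64,
      lr: f64,
  }

  impl LinearLeaf {
      fn predict(&self, x: &[f64]) -> f64 {
          self.b + self.w.iter().zip(x).map(|(wi, xi)| wi * xi).sum::<f64>()
      }

      fn train_one(&mut self, x: &[f64], y: f64) {
          let grad = self.predict(x) - y; // d(loss)/d(prediction)
          for (wi, xi) in self.w.iter_mut().zip(x) {
              *wi -= self.lr * grad * xi;
          }
          self.b -= self.lr * grad;
      }
  }
  ```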

### Changed


- Leaf model module (`tree::leaf_model`) is now always compiled -- previously
  gated behind the `neural-leaves` feature flag. No external dependencies added.
- Removed all em-dash characters from source code for consistent ASCII encoding.

## [6.3.0] - 2026-03-13


### Added


- **Kernel Recursive Least Squares (KRLS)** -- non-linear streaming regression in
  reproducing kernel Hilbert space. Implements Engel, Mannor & Meir (2004) with
  ALD (Approximate Linear Dependency) sparsification for automatic dictionary
  pruning. Configurable budget caps dictionary size; once full, new samples fall
  through to weight-only updates. Three kernel implementations: `RBFKernel`,
  `PolynomialKernel`, `LinearKernel` via the object-safe `Kernel` trait. O(N²)
  per sample where N ≤ budget.
- **CCIPCA** (Candid Covariance-free Incremental PCA) -- streaming dimensionality
  reduction via Weng, Zhang & Hwang (2003). Incrementally estimates leading
  eigenvectors of the covariance matrix in O(kd) time per sample without ever
  forming the covariance matrix. Configurable amnestic parameter for non-stationary
  streams. Implements `StreamingPreprocessor` for pipeline integration.
- **RLS confidence intervals** -- `RecursiveLeastSquares` now provides
  `prediction_variance()`, `prediction_std()`, `predict_interval(z)`, and
  `noise_variance()` for lightweight uncertainty quantification. Variance combines
  noise estimate (EWMA of squared residuals) with parameter uncertainty through
  the P matrix.
- **DistributionalSGBT serialization** -- `to_distributional_state()`,
  `from_distributional_state()` with full round-trip fidelity including drift
  detector state, rolling sigma, and alternate trees. `save_distributional_model()`
  / `load_distributional_model()` and bincode equivalents.
- Factory functions: `krls()`, `ccipca()` for ergonomic construction.
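  The `Kernel` trait shape and a KRLS-style prediction (a weighted sum of kernel
  evaluations against the dictionary) can be sketched as follows; the trait name
  matches the text, but the exact signatures are assumptions:

  ```rust
  // Object-safe kernel interface, usable behind `&dyn Kernel`.
  trait Kernel {
      fn eval(&self, a: &[f64], b: &[f64]) -> f64;
  }

  struct RbfKernel { gamma: f64 }

  impl Kernel for RbfKernel {
      fn eval(&self, a: &[f64], b: &[f64]) -> f64 {
          let sq_dist: f64 = a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum();
          (-self.gamma * sq_dist).exp()
      }
  }

  // KRLS prediction: f(x) = sum_i alpha_i * k(x_i, x) over the sparsified
  // dictionary (at most `budget` entries, hence O(N) per prediction).
  fn krls_predict(kernel: &dyn Kernel, dict: &[Vec<f64>], alpha: &[f64], x: &[f64]) -> f64 {
      dict.iter().zip(alpha).map(|(xi, ai)| ai * kernel.eval(xi, x)).sum()
  }
  ```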

### Fixed


- Broken intra-doc links for `DistributionalModelState` and missing
  `StreamingLearner` import in `Pipeline` doc example.

## [6.2.0] - 2026-03-13


### Added


- **Pipeline composition** -- `PipelineBuilder` for chaining `StreamingPreprocessor`
  steps with a terminal `StreamingLearner`. `Pipeline` itself implements
  `StreamingLearner`, enabling recursive composition and stacking. Training updates
  preprocessor statistics; prediction uses frozen transforms.
- **`StreamingPreprocessor` trait** -- object-safe trait for streaming feature
  transformers with `update_and_transform()` (training) and `transform()` (prediction).
- **`IncrementalNormalizer`** -- Welford online standardization implementing
  `StreamingPreprocessor`. Zero-mean, unit-variance with lazy feature count
  initialization and configurable variance floor.
- **`OnlineFeatureSelector`** -- EWMA importance tracking with dynamic feature
  masking after configurable warmup period.
- **`AdaptiveSGBT`** -- SGBT wrapper with pluggable learning rate schedulers.
  Five built-in schedulers: `ConstantLR`, `LinearDecayLR`, `ExponentialDecayLR`,
  `CosineAnnealingLR`, `PlateauLR`.
- Factory functions for ergonomic model construction: `sgbt()`, `linear()`,
  `rls()`, `gaussian_nb()`, `mondrian()`, `normalizer()`, `pipe()`,
  `adaptive_sgbt()`.
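  The preprocessor contract above pairs naturally with Welford standardization. A
  self-contained sketch (method names follow the text; exact signatures are assumed):

  ```rust
  // Training path updates running statistics; prediction path uses them frozen.
  trait StreamingPreprocessor {
      fn update_and_transform(&mut self, x: &[f64]) -> Vec<f64>;
      fn transform(&self, x: &[f64]) -> Vec<f64>;
  }

  // Welford online mean/variance per feature, as in `IncrementalNormalizer`.
  struct Normalizer { n: u64, mean: Vec<f64>, m2: Vec<f64> }

  impl Normalizer {
      fn new(d: usize) -> Self {
          Self { n: 0, mean: vec![0.0; d], m2: vec![0.0; d] }
      }
  }

  impl StreamingPreprocessor for Normalizer {
      fn update_and_transform(&mut self, x: &[f64]) -> Vec<f64> {
          self.n += 1;
          for i in 0..x.len() {
              let delta = x[i] - self.mean[i];
              self.mean[i] += delta / self.n as f64;
              self.m2[i] += delta * (x[i] - self.mean[i]); // Welford update
          }
          self.transform(x)
      }

      fn transform(&self, x: &[f64]) -> Vec<f64> {
          x.iter().enumerate().map(|(i, &v)| {
              let var = if self.n > 1 { self.m2[i] / (self.n - 1) as f64 } else { 0.0 };
              (v - self.mean[i]) / var.sqrt().max(1e-12) // variance floor
          }).collect()
      }
  }
  ```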

## [6.1.1] - 2026-03-13


### Added


- **σ-modulated learning rate** for `DistributionalSGBT` -- when `uncertainty_modulated_lr`
  is enabled, the location (μ) ensemble's learning rate is scaled by
  `sigma_ratio = current_sigma / rolling_sigma_mean`. The model learns μ faster during
  high-uncertainty regimes and conserves during stable periods. The scale (σ) ensemble
  always trains at the unmodulated base rate to prevent positive feedback loops.
  Rolling sigma mean uses a slow EWMA (alpha = 0.001), initialized from the standard
  deviation of the initial targets.
- `predict_distributional()` method on `DistributionalSGBT` returning `(mu, sigma, sigma_ratio)`
  for real-time monitoring of the effective learning rate.
- `rolling_sigma_mean()` and `is_uncertainty_modulated()` accessors on `DistributionalSGBT`.
- `uncertainty_modulated_lr()` builder method on `SGBTConfigBuilder`.


### Fixed


- Rustdoc warnings causing CI Documentation job failure: unresolved `StreamingLearner` link,
  private `MondrianTree` link, and redundant explicit link targets in learners module.

## [6.1.0] - 2026-03-12


### Added


- **StreamingLearner trait** -- object-safe `StreamingLearner` trait for polymorphic model
  composition. `train_one()`, `predict()`, `n_samples_seen()`, `reset()` with default
  `train()` (unit weight) and `predict_batch()`. `SGBTLearner<L>` adapter wraps any
  `SGBT<L>` into the trait. Supports `Box<dyn StreamingLearner>` for runtime-polymorphic
  stacking ensembles.
- **Mixture of Experts** -- `MoESGBT` with K specialist SGBT experts and a learned
  linear softmax gating network. Gate updated via online SGD cross-entropy toward the
  best-performing expert. Soft gating (all experts weighted) and Hard gating (top-k only).
  Methods: `predict_with_gating()`, `gating_probabilities()`, `expert_losses()`.
- **Model stacking meta-learner** -- `StackedEnsemble` combining heterogeneous
  `Box<dyn StreamingLearner>` base learners through a trainable meta-learner. Temporal
  holdout prevents leakage: base predictions collected before training bases. Optional
  feature passthrough. Itself implements `StreamingLearner` for recursive stacking.
- **Learning rate scheduling** -- `LRScheduler` trait with 5 implementations:
  `ConstantLR`, `LinearDecayLR`, `ExponentialDecayLR`, `CosineAnnealingLR`, `PlateauLR`.
  `SGBT::set_learning_rate()` method for external scheduler integration.
- **Streaming linear model** -- `StreamingLinearModel` with SGD and pluggable
  regularization: `None`, `Ridge` (L2 weight decay), `Lasso` (L1 proximal/soft-threshold),
  `ElasticNet` (combined L1+L2). Lazy feature initialization. Implements `StreamingLearner`.
- **Gaussian Naive Bayes** -- `GaussianNB` incremental classifier with per-class Welford
  mean/variance tracking. Weighted samples, automatic class discovery, sklearn-style
  `var_smoothing`. `predict_proba()` and `predict_log_proba()` for probability output.
  Implements `StreamingLearner`.
- **Mondrian Forest** -- `MondrianForest` online random forest with arena-based SoA storage.
  Feature-range-proportional random splits, configurable via `MondrianForestConfig` builder
  (n_trees, max_depth, lifetime, seed). Implements `StreamingLearner`.
- **Recursive Least Squares** -- `RecursiveLeastSquares` exact streaming OLS via
  Sherman-Morrison matrix inversion lemma. O(d²) per sample, forgetting factor for
  non-stationary environments. Implements `StreamingLearner`.
- **Streaming polynomial regression** -- `StreamingPolynomialRegression` wrapping RLS
  with online polynomial feature expansion up to arbitrary degree.
  Implements `StreamingLearner`.
- **Locally weighted regression** -- `LocallyWeightedRegression` Nadaraya-Watson kernel
  regression over a fixed-capacity circular buffer with Gaussian kernel weighting.
  Implements `StreamingLearner`.
- **Incremental normalizer** -- `IncrementalNormalizer` with Welford online mean/variance
  per feature. `update()`, `transform()`, `update_and_transform()`, `transform_in_place()`.
  Lazy feature count initialization.
- **Online feature selector** -- `OnlineFeatureSelector` with EWMA importance tracking
  and configurable `keep_fraction`. Masks low-importance features after warmup period.
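  The object-safety point above is the key design choice: any learner behind the
  trait can live in a `Box<dyn StreamingLearner>`. A stripped-down sketch (method
  set follows the text; signatures are assumed, and the learner here is a trivial
  running-mean model purely for illustration):

  ```rust
  trait StreamingLearner {
      fn train_one(&mut self, x: &[f64], y: f64);
      fn predict(&self, x: &[f64]) -> f64;
      fn n_samples_seen(&self) -> u64;
  }

  // Trivial target-mean predictor, just to show polymorphic composition.
  struct RunningMean { n: u64, mean: f64 }

  impl StreamingLearner for RunningMean {
      fn train_one(&mut self, _x: &[f64], y: f64) {
          self.n += 1;
          self.mean += (y - self.mean) / self.n as f64;
      }
      fn predict(&self, _x: &[f64]) -> f64 { self.mean }
      fn n_samples_seen(&self) -> u64 { self.n }
  }

  fn demo() -> f64 {
      // Runtime polymorphism: heterogeneous learners share one box type.
      let mut model: Box<dyn StreamingLearner> = Box::new(RunningMean { n: 0, mean: 0.0 });
      for y in [1.0, 2.0, 3.0] {
          model.train_one(&[], y);
      }
      model.predict(&[])
  }
  ```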

## [6.0.0] - 2026-03-12


### Added


- **NGBoost distributional output** -- `DistributionalSGBT` maintains two independent
  streaming tree ensembles (location + scale) to output a full Gaussian N(mu, sigma^2)
  predictive distribution. Scale parameterized in log-space for positivity. Gradients
  derived from Gaussian NLL (Duan et al., 2020). Enables per-prediction uncertainty
  quantification with `predict() -> GaussianPrediction { mu, sigma, log_sigma }` and
  `predict_interval(alpha)` for calibrated prediction intervals.
- **Error-weighted sample importance** -- streaming AdaBoost-style gradient reweighting.
  Samples with high absolute error (relative to a rolling EWMA mean) receive amplified
  gradients, forcing the ensemble to focus on hard examples. Configurable via
  `SGBTConfig::builder().error_weight_alpha(0.01)`. Weight capped at 10x to prevent
  instability.
- **Quality-based tree pruning** -- per-step EWMA tracking of |marginal contribution|.
  Trees contributing below a threshold for `patience` consecutive samples are reset,
  eliminating dead wood from extinct regimes. Configurable via `quality_prune_alpha`,
  `quality_prune_threshold`, `quality_prune_patience` in SGBTConfig.
- **Half-Space Trees anomaly detection** -- `HalfSpaceTree` implements the HS-Tree
  algorithm (Tan, Ting & Liu, 2011) for streaming anomaly detection. Random axis-aligned
  partitions with mass-based scoring, window rotation for reference/latest profiles.
  `score_and_update()` returns normalized anomaly scores in [0, 1]. Configurable via
  `HSTConfig` with n_trees, max_depth, window_size, and explicit feature ranges.
- **Serde state for v6 fields** -- `ModelState` now persists `rolling_mean_error`,
  `contribution_ewma`, and `low_contrib_count` across save/load cycles with
  `#[serde(default)]` for backward compatibility with v5 checkpoints.
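  The NLL gradients in the `(mu, log_sigma)` parameterization are worth stating
  concretely (a worked sketch of the standard Gaussian NLL derivatives, not the
  crate's internal code):

  ```rust
  // NLL = 0.5*ln(2*pi) + log_sigma + (y - mu)^2 / (2*sigma^2), with
  // sigma = exp(log_sigma) kept positive by construction.
  fn gaussian_nll_grads(y: f64, mu: f64, log_sigma: f64) -> (f64, f64) {
      let sigma2 = (2.0 * log_sigma).exp(); // sigma^2
      let grad_mu = (mu - y) / sigma2;
      // d(NLL)/d(log_sigma): positive when the error is small relative to
      // sigma (shrink sigma), negative when it is large (grow sigma).
      let grad_log_sigma = 1.0 - (y - mu) * (y - mu) / sigma2;
      (grad_mu, grad_log_sigma)
  }
  ```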

## [5.1.0] - 2026-03-12


### Added


- **Gradient clipping per leaf** -- `SGBTConfig::builder().gradient_clip_sigma(3.0)` clips
  gradient/hessian outliers beyond N standard deviations, preventing explosive leaf weights
  from noisy streams.
- **`predict_with_confidence()`** -- returns `(prediction, variance)` using per-leaf
  variance estimates. Enables lightweight uncertainty quantification without the full
  distributional model.
- **Monotonic constraints** -- `SGBTConfig::builder().monotone_constraints(vec![1, -1, 0])`
  enforces feature-level monotonicity during tree growth. +1 = increasing, -1 = decreasing,
  0 = unconstrained.
- **`train_batch_with_callback()`** -- train on a slice of samples with a user-supplied
  callback invoked every N samples for logging, metrics, checkpointing, etc.
- **`train_batch_subsampled()`** -- train on a batch with probabilistic subsampling per
  sample, useful for large-batch streaming scenarios.

## [5.0.0] - 2026-03-11


### Added


- **Categorical feature handling** -- Fisher optimal binary partitioning for categorical
  splits in Hoeffding trees. Categories are sorted by gradient/hessian ratio, then the
  existing XGBoostGain evaluator finds the best contiguous partition. Split routing uses
  a `u64` bitmask (up to 64 categories per feature). Configure via
  `SGBTConfig::builder().feature_types(vec![FeatureType::Categorical, FeatureType::Continuous])`.
  `CategoricalBinning` strategy creates one bin per observed distinct value.
- **BaggedSGBT (Oza online bagging)** -- `BaggedSGBT` wraps M independent `SGBT<L>`
  instances with Poisson(1) weighting per bag per sample. Implements the SGB(Oza)
  algorithm from Gunasekara et al. (2025) for variance reduction in streaming gradient
  boosted regression. Final prediction is the mean across all bags. Deterministic
  seeding with xorshift64 PRNG and Knuth's Poisson sampling.
- **Non-crossing multi-quantile regression** -- `QuantileRegressorSGBT` trains K
  independent `SGBT<QuantileLoss>` models (one per quantile level) with PAVA (Pool
  Adjacent Violators Algorithm) post-prediction to enforce monotonicity. O(K) isotonic
  regression guarantees `predict(tau_i) <= predict(tau_j)` for `tau_i < tau_j`.
  `predict_interval(features, lower_tau, upper_tau)` returns calibrated prediction
  intervals. Auto-sorts and validates quantile levels at construction time.
- **`FeatureType` enum** -- `FeatureType::Continuous` and `FeatureType::Categorical`
  propagated from `SGBTConfig` through `TreeConfig` to leaf state binner creation.
  Backward-compatible: `feature_types` defaults to `None` (all continuous).
- **Categorical bitmask routing in `TreeArena`** -- `categorical_mask: Vec<Option<u64>>`
  stored per node. `split_leaf_categorical()` sets bitmask; `route_to_leaf()` dispatches
  via `(mask >> cat_val) & 1`. Serializable with `#[serde(default)]` for backward compat.
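  The bitmask routing is a one-liner. A sketch (the left/right convention and helper
  names here are assumptions for illustration):

  ```rust
  // Bit `c` set in the mask means category `c` routes left.
  fn route_left(mask: u64, cat_val: u64) -> bool {
      debug_assert!(cat_val < 64); // up to 64 categories per feature
      (mask >> cat_val) & 1 == 1
  }

  // Build the mask for the chosen Fisher partition's left-hand categories.
  fn partition_mask(left_categories: &[u64]) -> u64 {
      left_categories.iter().fold(0u64, |m, &c| m | (1u64 << c))
  }
  ```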

## [4.0.0] - 2026-03-11


### Added


- **Expectile loss** -- `ExpectileLoss { tau }` for asymmetric L2 regression. Natively
  compatible with second-order boosting (well-defined positive Hessian for all tau).
  `tau > 0.5` penalizes under-prediction; `tau < 0.5` penalizes over-prediction.
- **Quantile (pinball) loss** -- `QuantileLoss { tau }` for conditional quantile
  regression. Uses hessian=1 pseudo-Huber trick for streaming tree compatibility,
  matching the approach used by LightGBM and XGBoost internally.
- **Rolling metrics** -- `RollingRegressionMetrics` and `RollingClassificationMetrics`
  with sliding window + revert pattern. Sliding Welford for O(1) R-squared over
  finite windows. MAE, MSE, RMSE, accuracy, precision, recall, F1, log loss.
- **EWMA metrics** -- `EwmaRegressionMetrics` and `EwmaClassificationMetrics` for
  exponentially weighted metric tracking. O(1) memory, no buffer needed.
- **Adaptive Conformal Intervals** -- `AdaptiveConformalInterval` implements ACI
  (Gibbs & Candes, 2021) for distribution-free prediction intervals under drift.
  Adapts miscoverage rate online to maintain target coverage probability.
- **Feature importance drift monitor** -- `ImportanceDriftMonitor` feeds per-feature
  |SHAP| values into drift detectors (PHT, ADWIN, or DDM). Detects reasoning shifts
  before accuracy drops. Configurable sample rate for amortized SHAP cost. To our
  knowledge, no existing open-source library offers this capability.
- **Bincode serialization** -- `serde-bincode` feature flag now functional with
  `to_bincode`, `from_bincode`, `save_model_bincode`, `load_model_bincode` for
  compact binary model persistence (typically 3-5x smaller than JSON).
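  The two asymmetric losses can be written out in gradient/hessian form for
  second-order boosting (a worked sketch with the usual d/d(prediction) sign
  convention; the crate's internals may scale terms differently):

  ```rust
  // Pinball (quantile) loss: piecewise-linear, hessian pinned to 1.0
  // (the pseudo-Huber trick mentioned above).
  fn quantile_grad(tau: f64, y: f64, pred: f64) -> (f64, f64) {
      let grad = if pred < y { -tau } else { 1.0 - tau };
      (grad, 1.0)
  }

  // Expectile loss: asymmetric L2, weight tau above the prediction and
  // 1 - tau below. The hessian is positive for all tau in (0, 1), so it is
  // natively compatible with second-order boosting.
  fn expectile_grad(tau: f64, y: f64, pred: f64) -> (f64, f64) {
      let w = if pred < y { tau } else { 1.0 - tau };
      (2.0 * w * (pred - y), 2.0 * w)
  }
  ```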

## [3.0.0] - 2026-03-11


### Added


- **TreeSHAP explanations** -- path-dependent TreeSHAP (Lundberg et al., 2020) for
  per-feature SHAP contributions. `SGBT::explain(features)` returns `ShapValues` with
  the invariant `base_value + sum(values) == predict(features)`. Named explanations
  via `explain_named()` when `feature_names` are configured.
- **StreamingShap** -- online running-mean |SHAP| tracker for real-time feature
  importance monitoring without storing past predictions.
- **Named features** -- `SGBTConfig::builder().feature_names(vec!["price", "volume"])`
  enables `named_feature_importances()` and `explain_named()`. Duplicate names are
  rejected at build time.
- **Multi-target regression** -- `MultiTargetSGBT` wraps T independent `SGBT<L>` models,
  one per target dimension. `train_one(features, targets)` and `predict(features) -> Vec<f64>`.
  Custom loss via `with_loss()`.
- **Drift detector state serialization** -- `DriftDetector::serialize_state()` and
  `restore_state()` preserve Page-Hinkley, ADWIN, and DDM internal state across
  save/load cycles. No more spurious drift after checkpoint restore.
- **PyO3 Python bindings** -- `irithyll-python` workspace crate providing
  `StreamingGBT`, `StreamingGBTConfig`, `ShapExplanation`, and `MultiTargetGBT` as
  Python classes. GIL-released train/predict, numpy zero-copy, JSON save/load.

### Performance


- **Vec leaf states** -- replaced `HashMap<u32, LeafState>` with `Vec<Option<LeafState>>`
  indexed by NodeId. Eliminates hashing overhead on the hot path for dense node indices.

## [2.0.0] - 2026-03-11


### Breaking Changes


- **Generic Loss parameter** -- `SGBT<L: Loss = SquaredLoss>` replaces the old boxed
  `Box<dyn Loss>` design. Loss gradient/hessian calls are now monomorphized and inlined
  by the compiler. Use `SGBT::with_loss(config, LogisticLoss)` instead of passing
  `Box::new()`. `DynSGBT = SGBT<Box<dyn Loss>>` is provided for dynamic dispatch.
  Cascades to `ParallelSGBT<L>`, `AsyncSGBT<L>`, `MulticlassSGBT<L>`, and `Predictor<L>`.
- **Observation trait** -- `train_one()` now accepts `&impl Observation` instead of
  `&Sample`. Zero-copy training via `SampleRef<'a>` (borrows `&[f64]`). Tuple impls
  for `(&[f64], f64)` and `(Vec<f64>, f64)` enable quick usage without constructing
  `Sample`. `train_one_slice()` is removed (subsumed by `Observation`).
- **Structured errors** -- `ConfigError` is now a sub-enum with `OutOfRange` and
  `Invalid` variants carrying `param`, `constraint`, and `value` fields. Replaces
  `InvalidConfig(String)`.
- **Auto LossType** -- `Loss` trait now requires `fn loss_type(&self) -> Option<LossType>`.
  `save_model()` auto-detects loss type from the model, no more manual `LossType` tag.
- **MulticlassSGBT::new() returns Result** -- no longer panics on `n_classes < 2`.

### Added


- **`SampleRef<'a>`** -- borrowed observation type for zero-allocation training.
- **`Observation` trait** -- unified interface with default `weight() -> 1.0`. Implemented
  for `Sample`, `SampleRef`, `(&[f64], f64)`, `(Vec<f64>, f64)`.
- **`Clone` for `SGBT<L>`** -- deep clone including all tree state, drift detectors, and
  leaf accumulators. Requires `L: Clone`. Also implemented for `ParallelSGBT<L>`.
- **`DriftDetector::clone_boxed()`** -- deep clone preserving internal state (unlike
  `clone_fresh()` which resets).
- **`BinnerKind` enum** -- replaces `Box<dyn BinningStrategy>` per-feature per-leaf with
  stack-allocated enum dispatch. Eliminates N_features heap allocations per new leaf.
- **`PartialEq` for `SGBTConfig`** and `DriftDetectorType`.
- **`Display` for `SGBTConfig`**, `SGBTVariant`, `DriftSignal`, `Sample`.
- **`From` conversions** -- `From<(Vec<f64>, f64)>` and `From<(&[f64], f64)>` for Sample.
- **`Debug` for `MulticlassSGBT`**.
- **Throughput benchmarks** -- `throughput_bench` and `scaling_bench` for regression
  tracking across configurations.
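  The `Observation` design above amounts to a small trait with a defaulted weight,
  implemented for tuple types. A sketch (the trait name and tuple impl follow the
  text; the exact method signatures are assumptions):

  ```rust
  // Unified observation view with a default unit weight.
  trait Observation {
      fn features(&self) -> &[f64];
      fn target(&self) -> f64;
      fn weight(&self) -> f64 { 1.0 } // default, as described above
  }

  // Borrowed-tuple impl enables zero-copy `train_one((&xs, y))`-style calls
  // without constructing a `Sample`.
  impl<'a> Observation for (&'a [f64], f64) {
      fn features(&self) -> &[f64] { self.0 }
      fn target(&self) -> f64 { self.1 }
  }
  ```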

### Performance


- **Bitset feature mask** -- O(1) membership testing via `Vec<u64>` bitset, replacing
  O(n) `Vec::contains()`. The fallback fill loop in `generate_feature_mask()` drops
  from O(n^2) to O(n). Single `u64` for <=64 features.
- **Enum BinningStrategy dispatch** -- `BinnerKind` enum eliminates heap allocation and
  vtable indirection for histogram binners.
- **Pre-allocated train_counts buffer** -- `ParallelSGBT` reuses a buffer instead of
  allocating a fresh `Vec<usize>` per `train_one()` call.
- **Monomorphized loss** -- generic `L: Loss` parameter enables the compiler to inline
  gradient/hessian computations, eliminating vtable dispatch in the hot path.
- **Measured improvement** -- 8.5% throughput gain on 100-step 20-feature training
  workloads (Criterion benchmark).

## [1.0.0] - 2026-03-11


### Added


- **Streaming Adaptation** -- EWMA leaf decay (`leaf_half_life`), time-based proactive
  tree replacement (`max_tree_samples`), and EFDT-inspired split re-evaluation at
  max-depth leaves (`split_reeval_interval`). All three are opt-in via `SGBTConfigBuilder`
  and backward-compatible with existing serialized models (`#[serde(default)]`).
- **Lazy histogram decay** -- O(1) amortized forward decay for histogram bins. Tracks a
  scalar `decay_scale` per histogram and defers the O(n_bins) materialization pass to
  split evaluation time. Mathematically exact; automatic renormalization prevents f64
  underflow at ~16K samples per leaf.
- **SIMD histogram acceleration** (`simd` feature) -- AVX2 intrinsics for histogram
  gradient/hessian summation, gated behind the `simd` feature flag.
- **Neural leaf models** (`neural-leaves` feature) -- experimental MLP-based leaf
  predictions as an alternative to closed-form constant leaves.
- **Apache Arrow integration** (`arrow` feature) -- `train_from_record_batch()`,
  `predict_from_record_batch()`, `record_batch_to_samples()` for columnar data
  interop. Hardened NaN/Inf filtering.
- **Parquet I/O** (`parquet` feature) -- `train_from_parquet()` for bulk training
  directly from Parquet files.
- **ONNX export** (`onnx` feature) -- `export_onnx()` and `save_onnx()` for
  interoperability with ONNX-compatible runtimes.
- **Parallel training** (`parallel` feature) -- `ParallelSGBT` using Rayon for
  data-parallel tree training across boosting steps.
- **Model serialization** -- `save_model()` / `load_model()` for JSON checkpoint/restore
  with `ModelState` snapshots. `LossType` tags enable loss function reconstruction.
  Optional bincode backend via `serde-bincode` feature.
- **Deterministic seeding** -- `SGBTConfig::seed` ensures reproducible results across
  feature subsampling and variant stochastic decisions.
- **Feature importance** -- `FeatureImportance` tracker accumulates split gain per
  feature across the ensemble.
- **Online classification metrics** -- `ClassificationMetrics` with incremental accuracy,
  precision, recall, F1, and log loss. `MetricSet` combines regression and classification.
- **Async streaming** -- `AsyncSGBT` with tokio channels, bounded backpressure,
  `Predictor` for concurrent read-only access, and `PredictionStream` adapter.
- **Multi-class support** -- `MulticlassSGBT` with one-vs-rest committees and softmax
  normalization.
- **Three SGBT variants** -- Standard, Skip (SGBT-SK), and MultipleIterations (SGBT-MI)
  per Gunasekara et al. (2024).
- **Three drift detectors** -- Page-Hinkley Test, ADWIN, and DDM with configurable
  parameters via `DriftDetectorType`.
- **Pluggable loss functions** -- `SquaredLoss`, `LogisticLoss`, `SoftmaxLoss`,
  `HuberLoss`, and the `Loss` trait for custom implementations.
- **Three binning strategies** -- `UniformBinning`, `QuantileBinning`, and optional
  `KMeansBinning` for histogram construction.
- **Hoeffding tree splitting** -- statistically-grounded split decisions using
  histogram-binned gradient statistics with configurable confidence (`delta`).
- **XGBoost-style regularization** -- L2 (`lambda`) and minimum gain (`gamma`)
  regularization on leaf weights and split decisions.
- **Seven examples** -- `basic_regression`, `classification`, `async_ingestion`,
  `custom_loss`, `drift_detection`, `model_checkpointing`, `streaming_metrics`.
- **Four Criterion benchmarks** -- `histogram_bench`, `tree_bench`, `ensemble_bench`,
  `parallel_bench`.
- **Property-based tests** -- proptest-based correctness verification.
- **GitHub Actions CI** -- test matrix across stable/beta/nightly Rust on Linux, macOS,
  and Windows. Includes clippy, rustfmt, doc generation, and MSRV verification.
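The lazy histogram decay entry above deserves a concrete illustration, since the
trick generalizes: multiply a shared scalar instead of touching every bin, and
materialize only when bins are read or the scalar nears underflow (a sketch;
field names are illustrative, not the crate's):

```rust
struct DecayedHistogram {
    bins: Vec<f64>,
    decay_scale: f64, // deferred multiplier shared by all bins
}

impl DecayedHistogram {
    /// O(1) amortized: decaying touches one scalar, not n_bins entries.
    fn decay(&mut self, factor: f64) {
        self.decay_scale *= factor;
        if self.decay_scale < 1e-150 {
            self.materialize(); // renormalize well before f64 underflow
        }
    }

    /// New mass enters un-decayed, so divide by the pending scale; reading
    /// back through `read()` then yields exactly the decayed value.
    fn add(&mut self, bin: usize, value: f64) {
        self.bins[bin] += value / self.decay_scale;
    }

    fn materialize(&mut self) {
        for b in &mut self.bins {
            *b *= self.decay_scale;
        }
        self.decay_scale = 1.0;
    }

    fn read(&self, bin: usize) -> f64 {
        self.bins[bin] * self.decay_scale
    }
}
```

The scheme is mathematically exact: `read()` always returns the same value a
per-bin decay loop would have produced.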

## [0.1.0] - 2026-03-10


Initial development release. Core SGBT algorithm with Hoeffding trees, histogram
binning, drift detection, and online metrics.

[7.7.0]: https://github.com/evilrat420/irithyll/compare/v7.6.1...v7.7.0
[7.6.1]: https://github.com/evilrat420/irithyll/compare/v7.6.0...v7.6.1
[7.6.0]: https://github.com/evilrat420/irithyll/compare/v7.5.0...v7.6.0
[7.5.0]: https://github.com/evilrat420/irithyll/compare/v7.4.0...v7.5.0
[7.4.0]: https://github.com/evilrat420/irithyll/compare/v7.3.0...v7.4.0
[7.3.0]: https://github.com/evilrat420/irithyll/compare/v7.2.0...v7.3.0
[7.2.0]: https://github.com/evilrat420/irithyll/compare/v7.1.0...v7.2.0
[7.1.0]: https://github.com/evilrat420/irithyll/compare/v7.0.0...v7.1.0
[7.0.0]: https://github.com/evilrat420/irithyll/compare/v6.5.1...v7.0.0
[6.5.1]: https://github.com/evilrat420/irithyll/compare/v6.5.0...v6.5.1
[6.5.0]: https://github.com/evilrat420/irithyll/compare/v6.4.0...v6.5.0
[6.4.0]: https://github.com/evilrat420/irithyll/compare/v6.3.0...v6.4.0
[6.3.0]: https://github.com/evilrat420/irithyll/compare/v6.2.0...v6.3.0
[6.2.0]: https://github.com/evilrat420/irithyll/compare/v6.1.1...v6.2.0
[6.1.1]: https://github.com/evilrat420/irithyll/compare/v6.1.0...v6.1.1
[6.1.0]: https://github.com/evilrat420/irithyll/compare/v6.0.0...v6.1.0
[6.0.0]: https://github.com/evilrat420/irithyll/compare/v5.1.0...v6.0.0
[5.1.0]: https://github.com/evilrat420/irithyll/compare/v5.0.0...v5.1.0
[5.0.0]: https://github.com/evilrat420/irithyll/compare/v4.0.0...v5.0.0
[4.0.0]: https://github.com/evilrat420/irithyll/compare/v3.0.0...v4.0.0
[3.0.0]: https://github.com/evilrat420/irithyll/compare/v2.0.0...v3.0.0
[2.0.0]: https://github.com/evilrat420/irithyll/compare/v1.0.0...v2.0.0
[1.0.0]: https://github.com/evilrat420/irithyll/compare/v0.1.0...v1.0.0
[0.1.0]: https://github.com/evilrat420/irithyll/releases/tag/v0.1.0