Imputation-like forensic baseline — answer “what would this dim have looked like if the point were normal?” by aggregating the per-dim distribution of the forest’s currently-held sample points.
Inspired by AWS’s ImputeVisitor but repurposed: instead of
imputing a NaN feature, this helper tells an SOC analyst how
far an observed point sits from the forest’s current idea of
“normal” on every dimension — the expected value under
normality plus a z-score-style delta.
§Semantics
- expected[d]: mean of dim d across every point currently held in any tree’s reservoir (the forest’s live baseline).
- stddev[d]: population standard deviation of the same set.
- observed[d]: the caller’s raw query value.
- delta[d] = observed[d] − expected[d].
- zscore[d] = delta[d] / stddev[d] (clamped to 0 when the baseline stddev is zero on a dim; a constant baseline means no meaningful z-score).
- live_points: number of unique points contributing to the baseline.
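The definitions above can be sketched in a few lines of Rust. This is a minimal illustration, not the crate's implementation: `BaselineSketch` and `baseline_sketch` are hypothetical names, the points are assumed to be already deduplicated and already in raw coordinates, and every point is assumed to have the same number of dimensions as the observed query.

```rust
/// Hypothetical stand-in for the per-dim baseline; field names follow the
/// semantics list, but this is not the crate's actual struct.
#[derive(Debug)]
pub struct BaselineSketch {
    pub expected: Vec<f64>,
    pub stddev: Vec<f64>,
    pub observed: Vec<f64>,
    pub delta: Vec<f64>,
    pub zscore: Vec<f64>,
    pub live_points: usize,
}

/// Builds the baseline from `points` (assumed unique, raw-space) and the
/// caller's `observed` query point.
pub fn baseline_sketch(points: &[Vec<f64>], observed: &[f64]) -> BaselineSketch {
    let dims = observed.len();
    let n = points.len();
    let mut expected = vec![0.0_f64; dims];
    let mut stddev = vec![0.0_f64; dims];

    if n > 0 {
        // Per-dim mean over all live points.
        for p in points {
            for d in 0..dims {
                expected[d] += p[d];
            }
        }
        for e in expected.iter_mut() {
            *e /= n as f64;
        }
        // Population standard deviation: divide by n, not n - 1.
        for p in points {
            for d in 0..dims {
                let diff = p[d] - expected[d];
                stddev[d] += diff * diff;
            }
        }
        for s in stddev.iter_mut() {
            *s = (*s / n as f64).sqrt();
        }
    }

    let delta: Vec<f64> = observed
        .iter()
        .zip(&expected)
        .map(|(o, e)| o - e)
        .collect();

    // Clamp to 0 when the baseline is constant on a dim.
    let zscore: Vec<f64> = delta
        .iter()
        .zip(&stddev)
        .map(|(dl, s)| if *s > 0.0 { dl / s } else { 0.0 })
        .collect();

    BaselineSketch {
        expected,
        stddev,
        observed: observed.to_vec(),
        delta,
        zscore,
        live_points: n,
    }
}
```

For example, with live points `[1, 5]` and `[3, 5]` and an observed point `[4, 7]`, the first dim has mean 2 and stddev 1, giving a z-score of 2; the second dim is constant in the baseline, so its z-score is clamped to 0 even though its delta is 2.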
The baseline is reported in raw-point space: feature_scales
is applied to the stored points for averaging, then inverted, so
expected / stddev / delta live in the caller’s original
coordinate system. SOC dashboards don’t need to know about the
internal scaling.
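The inversion step can be illustrated as follows. This sketch assumes that feature_scales is a per-dimension, non-zero multiplicative factor (so a scaled value is `raw * scale`), which the description above does not spell out; `unscale_stats` is a hypothetical helper, not part of the crate's API. Under that assumption the mean divides by the scale and the standard deviation divides by its absolute value.

```rust
/// Converts per-dim mean and standard deviation computed in scaled space
/// back to raw space. Assumes `scaled = raw * scale` with every scale
/// non-zero (an assumption for illustration only).
pub fn unscale_stats(
    mean_scaled: &[f64],
    stddev_scaled: &[f64],
    scales: &[f64],
) -> (Vec<f64>, Vec<f64>) {
    // The mean is linear in the scale factor, so dividing undoes it.
    let mean_raw: Vec<f64> = mean_scaled
        .iter()
        .zip(scales)
        .map(|(m, s)| m / s)
        .collect();

    // A standard deviation is never negative, so divide by |scale|.
    let stddev_raw: Vec<f64> = stddev_scaled
        .iter()
        .zip(scales)
        .map(|(sd, s)| sd / s.abs())
        .collect();

    (mean_raw, stddev_raw)
}
```

Because delta is then taken against the caller's raw observed value, it lands in the same units as the input, and the z-score is unchanged by a positive scale factor.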
Structs§
- ForensicBaseline - Per-dim forensic baseline comparing an observed point against the forest’s current live sample distribution.