Module criterion_stats::univariate::outliers::tukey
Tukey’s method
The original method uses two “fences” to classify the data. All the observations “inside” the fences are considered “normal”, and the rest are considered outliers.
The fences are computed from the quartiles of the sample, according to the following formula:
// q1, q3 are the first and third quartiles
let iqr = q3 - q1; // The interquartile range
let (f1, f2) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr); // the "fences"
let is_outlier = |x: f64| x < f1 || x > f2; // true when `x` falls outside the fences
The classifier provided here adds two extra outer fences:
let (f3, f4) = (q1 - 3.0 * iqr, q3 + 3.0 * iqr); // the outer "fences"
The extra fences add a sense of “severity” to the classification. Data points outside the outer fences are considered “severe” outliers, whereas points between the inner and outer fences are only “mild” outliers. As in the original method, everything inside the inner fences is considered “normal” data.
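The four-fence classification described above can be sketched as a standalone function. This is only an illustration under stated assumptions, not this module's API: the names `Severity`, `fences` and `classify_value` are hypothetical, the quartiles are assumed to be computed elsewhere, and a point lying exactly on a fence is treated as being inside it.

```rust
/// Hypothetical label type for this sketch (not the type exported by this module).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    LowSevere,
    LowMild,
    Normal,
    HighMild,
    HighSevere,
}

/// Returns the fences `(f3, f1, f2, f4)` for the given first and third quartiles.
fn fences(q1: f64, q3: f64) -> (f64, f64, f64, f64) {
    let iqr = q3 - q1; // the interquartile range
    (
        q1 - 3.0 * iqr, // f3: lower outer fence
        q1 - 1.5 * iqr, // f1: lower inner fence
        q3 + 1.5 * iqr, // f2: upper inner fence
        q3 + 3.0 * iqr, // f4: upper outer fence
    )
}

/// Classifies a single observation against the four fences.
/// Assumption: a value exactly on a fence counts as inside that fence.
fn classify_value(x: f64, q1: f64, q3: f64) -> Severity {
    let (f3, f1, f2, f4) = fences(q1, q3);
    if x < f3 {
        Severity::LowSevere
    } else if x < f1 {
        Severity::LowMild
    } else if x > f4 {
        Severity::HighSevere
    } else if x > f2 {
        Severity::HighMild
    } else {
        Severity::Normal
    }
}
```

For example, with `q1 = 10.0` and `q3 = 20.0` the fences are `(-20.0, -5.0, 35.0, 50.0)`, so `40.0` would be a mild high outlier and `60.0` a severe one.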
Some ASCII art for the visually oriented people:
  LOW-ish      NORMAL-ish      HIGH-ish
  x  |  +  |  o o o o o o o  |  +  |  x
     f3    f1                f2    f4
Legend:
o: "normal" data (not an outlier)
+: "mild" outlier
x: "severe" outlier
Structs
Iterator over the labeled data
A classified/labeled sample.
Enums
Labels used to classify outliers
Functions
Classifies the sample, and returns a labeled sample.