Module criterion_stats::univariate::outliers::tukey

Tukey's method

The original method uses two "fences" to classify the data. All the observations "inside" the fences are considered "normal", and the rest are considered outliers.

The fences are computed from the quartiles of the sample, according to the following formula:

// q1, q3 are the first and third quartiles
let iqr = q3 - q1;  // The interquartile range
let (f1, f2) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);  // the "fences"

// A point is an outlier when it falls outside the fences
let is_outlier = |x: f64| x < f1 || x > f2;
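
The snippet above takes the quartiles as given. As a minimal sketch of the full path from a raw sample to the fences, the following assumes quartiles are computed by linear interpolation between the nearest ranks of the sorted sample. Both `percentile` and `inner_fences` are hypothetical helpers written for this illustration and are not part of this module's API; other quartile definitions would give slightly different fences.

```rust
/// Hypothetical helper: the `p`-th percentile of an already sorted slice,
/// using linear interpolation between the two nearest ranks.
/// The interpolation rule is an assumption of this sketch.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty() && (0.0..=100.0).contains(&p));
    let rank = p / 100.0 * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

/// Hypothetical helper: returns the inner fences `(f1, f2)` of a sample.
fn inner_fences(sample: &[f64]) -> (f64, f64) {
    // Sort a copy so the caller's data is left untouched
    let mut sorted = sample.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("sample must not contain NaN"));

    let q1 = percentile(&sorted, 25.0);
    let q3 = percentile(&sorted, 75.0);
    let iqr = q3 - q1; // the interquartile range
    (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
}
```

For the sample `1.0, 2.0, ..., 9.0` this interpolation rule gives quartiles of 3 and 7, an interquartile range of 4, and therefore fences at -3 and 13.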

The classifier provided here adds two extra outer fences:

let (f3, f4) = (q1 - 3.0 * iqr, q3 + 3.0 * iqr);  // the outer "fences"

The extra fences add a sense of "severity" to the classification. Data points outside the outer fences are considered "severe" outliers, points between the inner and outer fences are considered "mild" outliers, and, as in the original method, everything inside the inner fences is considered "normal" data.

Some ASCII art for the visually oriented people:

LOW-ish                NORMAL-ish                 HIGH-ish
        x   |       +    |  o o  o    o   o o  o  |        +   |   x
            f3           f1                       f2           f4

Legend:
o: "normal" data (not an outlier)
+: "mild" outlier
x: "severe" outlier
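
The four fences in the diagram can be turned into a five-way classification. The following is a minimal, self-contained sketch of that idea. The `Severity` enum and the `classify_point` function are hypothetical names chosen for this illustration, so the `Label` enum and `classify` function listed below may differ in naming and signature. Treating a point that lies exactly on a fence as belonging to the less severe class is also an assumption of this sketch.

```rust
/// Hypothetical label set for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    LowSevere,
    LowMild,
    Normal,
    HighMild,
    HighSevere,
}

/// Classifies a single observation `x`, given the first and third
/// quartiles (`q1`, `q3`) of the sample it belongs to.
fn classify_point(x: f64, q1: f64, q3: f64) -> Severity {
    let iqr = q3 - q1; // the interquartile range
    let (f1, f2) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr); // inner fences
    let (f3, f4) = (q1 - 3.0 * iqr, q3 + 3.0 * iqr); // outer fences

    if x < f3 {
        Severity::LowSevere
    } else if x < f1 {
        Severity::LowMild
    } else if x > f4 {
        Severity::HighSevere
    } else if x > f2 {
        Severity::HighMild
    } else {
        Severity::Normal
    }
}
```

With `q1 = 3.0` and `q3 = 7.0` the fences sit at -9, -3, 13 and 19, so `classify_point(15.0, 3.0, 7.0)` yields `Severity::HighMild` and `classify_point(25.0, 3.0, 7.0)` yields `Severity::HighSevere`.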

Structs

Iter

Iterator over the labeled data

LabeledSample

A classified/labeled sample.

Enums

Label

Labels used to classify outliers

Functions

classify

Classifies the sample, and returns a labeled sample.