Module stat_tests

Collection of statistical tests

Provides a number of statistical tests that can be used to compare samples:

Kolmogorov-Smirnov test
Kuiper’s test
Anderson-Darling test

Each of these tests checks the null hypothesis that both samples are drawn from the same distribution. After the test is performed, we can compare the p-value against some error threshold, e.g. 5%:

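// `gen()` comes from the `rand::Rng` trait, which must be in scope
use rand::Rng;

// Two samples of different sizes, both drawn from the same uniform distribution
let s1: Vec<f64> = (0..100).map(|_| rand::thread_rng().gen()).collect();
let s2: Vec<f64> = (0..70).map(|_| rand::thread_rng().gen()).collect();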

let test_result = ks2_test(s1, s2)?;

// The test rejects the null hypothesis only when p_value is below the threshold
assert!(test_result.p_value() > 0.05);

In other words, the value of the test statistic falls within the range that holds 95% of the probability in its distribution under the null hypothesis that the samples are from the same distribution. If we ran the same test many times on samples that really do come from the same distribution, it would not reject the null hypothesis in about 95% of cases and would wrongly reject it in about 5% of cases, committing a so-called Type 1 error.

What we are usually more interested in, however, is the frequency of Type 2 errors. They happen when the test does not reject the null hypothesis despite the samples being drawn from different distributions. There is no general analytical formula for this frequency, since it depends on the difference between the distributions and on the sizes of the samples. The related quantity is called the power of the test: the probability of correctly rejecting a false null hypothesis, i.e. one minus the frequency of Type 2 errors. It has to be established by numerical experiments.
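Such a numerical experiment can be sketched as a simple Monte Carlo loop: draw many pairs of samples, run the test on each pair, and count how often the null hypothesis is rejected. The block below is only an illustration under stated assumptions and does not use this module's API. `XorShift`, `sketch_ks2_statistic`, `sketch_ks2_p_value` and `rejection_rate` are hypothetical stand-ins so that the example compiles without external crates, the p-value uses the asymptotic Kolmogorov series rather than whatever `ks2_test` computes internally, and the alternative hypothesis is assumed to be a uniform distribution shifted by `shift`.

```rust
/// Hypothetical stand-in for `rand`: a small xorshift64* generator.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    /// Returns a value uniformly distributed in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545F4914F6CDD1D) >> 11;
        bits as f64 / (1u64 << 53) as f64
    }
}

/// Two-sample Kolmogorov-Smirnov statistic: the largest absolute
/// difference between the two empirical CDFs. Sorts both slices in place.
fn sketch_ks2_statistic(a: &mut [f64], b: &mut [f64]) -> f64 {
    a.sort_by(|x, y| x.total_cmp(y));
    b.sort_by(|x, y| x.total_cmp(y));
    let (n, m) = (a.len() as f64, b.len() as f64);
    let (mut i, mut j, mut d) = (0usize, 0usize, 0.0_f64);
    while i < a.len() && j < b.len() {
        // Step both ECDFs past the next smallest value (handles ties).
        let x = a[i].min(b[j]);
        while i < a.len() && a[i] <= x {
            i += 1;
        }
        while j < b.len() && b[j] <= x {
            j += 1;
        }
        d = d.max((i as f64 / n - j as f64 / m).abs());
    }
    d
}

/// Asymptotic p-value, Q(l) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 l^2),
/// with l = sqrt(n*m / (n+m)) * d. An approximation for finite samples.
fn sketch_ks2_p_value(d: f64, n: usize, m: usize) -> f64 {
    let lambda = ((n * m) as f64 / (n + m) as f64).sqrt() * d;
    if lambda < 0.2 {
        // The series converges slowly here, and the p-value is ~1 anyway.
        return 1.0;
    }
    let mut sum = 0.0_f64;
    let mut sign = 1.0_f64;
    for k in 1..=100 {
        let kf = k as f64;
        sum += sign * (-2.0 * kf * kf * lambda * lambda).exp();
        sign = -sign;
    }
    (2.0 * sum).clamp(0.0, 1.0)
}

/// Fraction of `trials` in which the null hypothesis is rejected at level
/// `alpha`. With `shift == 0.0` both samples share a distribution, so the
/// result estimates the Type 1 error rate; with `shift != 0.0` it
/// estimates the power against that particular alternative.
fn rejection_rate(shift: f64, n: usize, m: usize, trials: usize, alpha: f64, seed: u64) -> f64 {
    let mut rng = XorShift(seed);
    let mut rejected = 0usize;
    for _ in 0..trials {
        let mut s1: Vec<f64> = (0..n).map(|_| rng.next_f64()).collect();
        let mut s2: Vec<f64> = (0..m).map(|_| rng.next_f64() + shift).collect();
        let d = sketch_ks2_statistic(&mut s1, &mut s2);
        if sketch_ks2_p_value(d, n, m) < alpha {
            rejected += 1;
        }
    }
    rejected as f64 / trials as f64
}
```

Calling `rejection_rate(0.0, 100, 70, 2000, 0.05, 42)` estimates the Type 1 error rate for the sample sizes used above, while `rejection_rate(0.5, 100, 70, 2000, 0.05, 42)` estimates the power against a shift of 0.5. One minus the latter is the estimated frequency of Type 2 errors. The estimate holds only for the chosen alternative and sample sizes, which is why the experiment has to be repeated for each configuration of interest.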

Structs§

Ecdf: Empirical cumulative distribution function
EcdfIterator: Iterator over ecdf
TestResult: Represents a result of a statistical test

Enums§

TestError: Error that can be raised by a statistical test

Functions§

ad2_test: Perform Anderson-Darling two sample test
ks1_test: Perform one sample Kolmogorov-Smirnov statistical test
ks2_test: Perform two sample Kolmogorov-Smirnov test
kuiper2_test: Perform two sample Kuiper’s test