anofox-statistics
A statistical hypothesis testing library for Rust, validated against R (see R/VALIDATION.md).
It provides a wide range of statistical tests commonly used in data analysis. Each test is checked against the corresponding R implementation for numerical accuracy.
Features
- Math Primitives
  - Mean, variance, standard deviation, median
  - Numerically stable mean and variance (Welford's algorithm)
  - Trimmed mean (robust to outliers)
  - Skewness and kurtosis (Fisher's definition, matching R's e1071)
- Parametric Tests
  - T-tests (Welch, Student, Paired) with all alternatives
  - Yuen's test (robust t-test using trimmed means)
  - Brown-Forsythe test (homogeneity of variances)
  - One-way ANOVA (Fisher's and Welch's)
  - Two-way ANOVA (factorial design with Type III SS)
  - Repeated measures ANOVA (with Mauchly's sphericity test and GG/HF corrections)
- Nonparametric Tests
  - Ranking with average tie handling
  - Mann-Whitney U test (Wilcoxon rank-sum)
  - Wilcoxon signed-rank test (paired)
  - Kruskal-Wallis test (k-sample)
  - Brunner-Munzel test (robust rank-based test for stochastic equality)
- Distributional Tests
  - Shapiro-Wilk normality test (Royston AS R94)
  - D'Agostino's K-squared test (omnibus normality test using skewness and kurtosis)
- Correlation Analysis
  - Pearson's product-moment correlation with CI
  - Spearman's rank correlation
  - Kendall's tau (tau-a, tau-b, tau-c variants)
  - Partial and semi-partial correlation
  - Distance correlation (detects non-linear dependence)
  - Intraclass correlation coefficient (ICC, 6 variants)
- Categorical Data Analysis
  - Chi-square test (independence and goodness-of-fit)
  - Fisher's exact test for 2x2 tables
  - G-test (log-likelihood ratio)
  - McNemar's test (standard and exact)
  - Effect sizes: Cramér's V, phi coefficient, contingency coefficient
  - Cohen's kappa (unweighted and weighted)
  - Proportion tests (one-sample and two-sample)
  - Exact binomial test
- Resampling Methods
  - Permutation engine with custom statistics
  - Permutation t-test
  - Stationary bootstrap (for dependent data)
  - Circular block bootstrap
- Modern Distribution Tests
  - Energy distance test (univariate and multivariate)
  - Maximum Mean Discrepancy (MMD) with multiple kernels (Gaussian, Linear, Polynomial, Laplacian)
- Forecast Evaluation
  - Diebold-Mariano test for comparing predictive accuracy
  - Clark-West test for nested model comparison
  - Superior Predictive Ability (SPA) test for multiple model comparison
  - MSPE-Adjusted SPA test for multiple nested models (Clark-West + bootstrap)
  - Model Confidence Set (Hansen, Lunde, & Nason, 2011)
- Equivalence Testing (TOST)
  - TOST for means: one-sample, two-sample, and paired t-tests
  - TOST for correlations (Pearson and Spearman)
  - TOST for proportions (one-sample and two-sample)
  - Wilcoxon TOST (non-parametric equivalence)
  - Bootstrap TOST (resampling-based)
  - Yuen TOST (robust trimmed means)
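The feature list mentions a numerically stable mean and variance based on Welford's algorithm. The following is a minimal, std-only sketch of that technique for readers unfamiliar with it. The `Welford` type and its methods are hypothetical names chosen for illustration and are not taken from this crate's API.

```rust
/// Running mean/variance accumulator (Welford's online algorithm).
/// Illustrative only: the names here are hypothetical, not the crate's API.
#[derive(Debug, Default)]
struct Welford {
    n: u64,
    mean: f64,
    m2: f64, // running sum of squared deviations from the current mean
}

impl Welford {
    fn push(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        // The second factor uses the updated mean; this avoids the
        // cancellation seen in the naive sum-of-squares formula.
        self.m2 += delta * (x - self.mean);
    }

    /// Unbiased sample variance (n - 1 denominator); None for n < 2.
    fn sample_variance(&self) -> Option<f64> {
        if self.n < 2 {
            None
        } else {
            Some(self.m2 / (self.n - 1) as f64)
        }
    }
}

fn main() {
    let mut acc = Welford::default();
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0] {
        acc.push(x);
    }
    println!("mean = {}", acc.mean);
    println!("sample variance = {:?}", acc.sample_variance());
}
```

Because the accumulator processes one value at a time, the same approach works for streaming data where the full sample is never held in memory.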
Installation
Add to your Cargo.toml:
[dependencies]
anofox-statistics = "0.2"
Examples
The library includes runnable examples demonstrating each major feature.
Quick Start
T-Tests
use ;
let group1 = vec!;
let group2 = vec!;
// Welch t-test (unequal variances), mu=0.0 tests if mean difference equals zero
let result = t_test
.expect;
println!;
println!;
println!;
// Student t-test (equal variances assumed)
let result = t_test?;
// Paired t-test
let result = t_test?;
// Test if mean difference equals 0.5 (non-zero null hypothesis)
let result = t_test?;
// T-test with 95% confidence interval
let result = t_test?;
if let Some = result.conf_int
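For readers who want to see what the Welch variant computes, here is a std-only sketch of the test statistic and the Welch-Satterthwaite degrees of freedom. The helpers `mean_var` and `welch_t` are hypothetical and do not reflect this crate's signatures. The p-value step is left out because it needs a Student t CDF, which the standard library does not provide.

```rust
/// Sample mean and unbiased variance (n - 1 denominator).
fn mean_var(x: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch t statistic and Welch-Satterthwaite degrees of freedom for
/// H0: mean(x) - mean(y) = mu. Hypothetical helper, not the crate's API.
/// Assumes each sample has at least two observations.
fn welch_t(x: &[f64], y: &[f64], mu: f64) -> (f64, f64) {
    let (mx, vx) = mean_var(x);
    let (my, vy) = mean_var(y);
    let (nx, ny) = (x.len() as f64, y.len() as f64);
    // Squared standard errors of each sample mean.
    let (sx, sy) = (vx / nx, vy / ny);
    let t = (mx - my - mu) / (sx + sy).sqrt();
    let df = (sx + sy).powi(2) / (sx * sx / (nx - 1.0) + sy * sy / (ny - 1.0));
    (t, df)
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let y = [2.0, 4.0, 6.0, 8.0, 10.0];
    let (t, df) = welch_t(&x, &y, 0.0);
    println!("t = {:.4}, df = {:.4}", t, df);
}
```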
Yuen's Robust T-Test
use ;
// 20% trimmed means (robust to outliers)
let result = yuen_test?;
println!;
println!;
Brown-Forsythe Test
use brown_forsythe;
let groups = vec!;
let result = brown_forsythe?;
println!;
println!;
One-Way ANOVA
use ;
let group1 = vec!;
let group2 = vec!;
let group3 = vec!;
let groups: = vec!;
// Fisher's ANOVA (assumes equal variances)
let result = one_way_anova?;
println!;
println!;
println!;
// Welch's ANOVA (robust to unequal variances)
let result = one_way_anova?;
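As background for Fisher's variant, the F statistic is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it with the standard library only. `anova_f` is a hypothetical helper, not a function of this crate, and it stops before the p-value, which requires an F distribution.

```rust
/// Fisher's one-way ANOVA F statistic with its degrees of freedom.
/// Returns (F, df_between, df_within). Hypothetical helper for illustration.
/// Assumes at least two non-empty groups and some within-group variation.
fn anova_f(groups: &[Vec<f64>]) -> (f64, f64, f64) {
    let n_total: usize = groups.iter().map(|g| g.len()).sum();
    let grand_mean = groups.iter().flatten().sum::<f64>() / n_total as f64;

    let mut ss_between = 0.0;
    let mut ss_within = 0.0;
    for g in groups {
        let m = g.iter().sum::<f64>() / g.len() as f64;
        ss_between += g.len() as f64 * (m - grand_mean).powi(2);
        ss_within += g.iter().map(|v| (v - m).powi(2)).sum::<f64>();
    }

    let df_b = (groups.len() - 1) as f64;
    let df_w = (n_total - groups.len()) as f64;
    ((ss_between / df_b) / (ss_within / df_w), df_b, df_w)
}

fn main() {
    let groups = vec![
        vec![1.0, 2.0, 3.0],
        vec![2.0, 3.0, 4.0],
        vec![3.0, 4.0, 5.0],
    ];
    let (f, df_b, df_w) = anova_f(&groups);
    println!("F({}, {}) = {}", df_b, df_w, f);
}
```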
Two-Way ANOVA
use two_way_anova;
// Values with factor level arrays (long format)
let values = vec!;
let factor_a = vec!; // 2 levels
let factor_b = vec!; // 2 levels
let result = two_way_anova?;
println!;
println!;
println!;
Repeated Measures ANOVA
use repeated_measures_anova;
// Matrix format: rows = subjects, columns = conditions
let subject1 = vec!;
let subject2 = vec!;
let subject3 = vec!;
let data: = vec!;
let result = repeated_measures_anova?; // compute sphericity
println!;
println!;
println!;
// Sphericity test (Mauchly's W) - only for k >= 3 conditions
if let Some = &result.sphericity
// Greenhouse-Geisser corrected p-value
if let Some = &result.greenhouse_geisser
Nonparametric Tests
use ;
// Ranking
let data = vec!;
let ranks = rank?;
// Mann-Whitney U test (two-sided, no continuity correction, normal approximation)
let result = mann_whitney_u?;
// With exact p-values (for small samples without ties)
let result = mann_whitney_u?;
// With confidence interval (Hodges-Lehmann estimate)
let result = mann_whitney_u?;
if let Some = result.conf_int
// Test if location shift equals 0.5 (non-zero null hypothesis)
let result = mann_whitney_u?;
// Wilcoxon signed-rank test (paired)
let result = wilcoxon_signed_rank?;
// Wilcoxon with non-zero null hypothesis (test if median difference equals 0.5)
let result = wilcoxon_signed_rank?;
// Kruskal-Wallis test
let result = kruskal_wallis?;
// Brunner-Munzel test (robust alternative to Mann-Whitney)
let result = brunner_munzel?;
println!;
// Brunner-Munzel with 95% confidence interval
let result = brunner_munzel?;
if let Some = result.conf_int
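The rank-based tests above all build on ranking with average tie handling, where tied observations share the mean of the ranks they would otherwise occupy. A minimal std-only sketch of that rule follows. `average_ranks` is a hypothetical name, and the sketch assumes the input contains no NaN values.

```rust
/// 1-based ranks in which tied values receive the average of the ranks
/// they span. Hypothetical helper, not the crate's API.
fn average_ranks(data: &[f64]) -> Vec<f64> {
    let mut order: Vec<usize> = (0..data.len()).collect();
    // Assumes no NaN: partial_cmp returns None for NaN and unwrap would panic.
    order.sort_by(|&a, &b| data[a].partial_cmp(&data[b]).unwrap());

    let mut ranks = vec![0.0; data.len()];
    let mut i = 0;
    while i < order.len() {
        // Extend j to the end of the run of values equal to data[order[i]].
        let mut j = i;
        while j + 1 < order.len() && data[order[j + 1]] == data[order[i]] {
            j += 1;
        }
        // Sorted positions i..=j are tied; their 1-based ranks average to this.
        let avg = (i + j) as f64 / 2.0 + 1.0;
        for &idx in &order[i..=j] {
            ranks[idx] = avg;
        }
        i = j + 1;
    }
    ranks
}

fn main() {
    let ranks = average_ranks(&[10.0, 20.0, 20.0, 30.0]);
    println!("{:?}", ranks);
}
```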
Normality Tests
use ;
let data = vec!;
// Shapiro-Wilk test
let result = shapiro_wilk?;
println!;
println!;
// D'Agostino's K-squared test (omnibus test using skewness and kurtosis)
let result = dagostino_k_squared?;
println!;
println!;
Resampling Methods
use ;
// Permutation t-test
let result = permutation_t_test?;
println!;
// Stationary bootstrap for time series
let bootstrap = new?;
let samples: = bootstrap.take.collect;
// Circular block bootstrap
let bootstrap = new?;
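To illustrate the idea behind the circular block bootstrap, the sketch below draws one resample by copying fixed-length blocks that wrap around the end of the series. Everything in it is an assumption made for illustration: `circular_block_resample` is a hypothetical function, and the tiny `XorShift64` generator only stands in for a real RNG such as the `rand` crate so that the example has no dependencies.

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// A stand-in for a real RNG; the seed must be non-zero, and the modulo
/// step introduces a small bias that a production RNG would avoid.
struct XorShift64(u64);

impl XorShift64 {
    fn next_usize(&mut self, bound: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % bound as u64) as usize
    }
}

/// One circular block bootstrap resample of the same length as `data`.
/// Each block starts at a random index and takes `block_len` consecutive
/// observations, wrapping from the end of the series back to the start.
fn circular_block_resample(data: &[f64], block_len: usize, rng: &mut XorShift64) -> Vec<f64> {
    assert!(block_len > 0, "block length must be positive");
    let n = data.len();
    let mut out = Vec::with_capacity(n);
    while out.len() < n {
        let start = rng.next_usize(n);
        for k in 0..block_len {
            if out.len() == n {
                break; // truncate the final block to keep the length at n
            }
            out.push(data[(start + k) % n]);
        }
    }
    out
}

fn main() {
    let data: Vec<f64> = (0..10).map(|i| i as f64).collect();
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    let sample = circular_block_resample(&data, 3, &mut rng);
    println!("{:?}", sample);
}
```

Keeping neighbouring observations together inside each block is what preserves short-range dependence in the resampled series.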
Modern Distribution Tests
use ;
// Energy distance test
let result = energy_distance_test?;
println!;
println!;
// Maximum Mean Discrepancy with Gaussian kernel
let result = mmd_test?;
println!;
println!;
// MMD with automatic bandwidth selection
let result = mmd_test?;
Correlation Analysis
use ;
let x = vec!;
let y = vec!;
// Pearson correlation with 95% CI
let result = pearson?;
println!;
// Spearman rank correlation
let result = spearman?;
println!;
// Kendall's tau-b (default, matches R)
let result = kendall?;
println!;
// Partial correlation (controlling for z)
let z = vec!;
let result = partial_cor?;
println!;
// Distance correlation (detects non-linear dependence)
let result = distance_cor?;
println!;
// ICC for inter-rater reliability
let ratings = vec!;
let result = icc?;
println!;
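As a reference for what the tau-b variant measures, here is a direct O(n^2) sketch of the pairwise definition, with the tie correction in the denominator. `kendall_tau_b` is a hypothetical helper written for this illustration, not the crate's function, and it assumes at least two observations with some variation in both variables.

```rust
/// Kendall's tau-b from the pairwise definition:
/// (C - D) / sqrt((n0 - n1) * (n0 - n2)), where n0 is the number of pairs,
/// n1 the pairs tied in x, and n2 the pairs tied in y.
/// Hypothetical helper; assumes x.len() >= 2.
fn kendall_tau_b(x: &[f64], y: &[f64]) -> f64 {
    assert_eq!(x.len(), y.len());
    let n = x.len();
    let (mut concordant, mut discordant) = (0.0, 0.0);
    let (mut ties_x, mut ties_y) = (0.0, 0.0);
    for i in 0..n {
        for j in (i + 1)..n {
            let dx = x[i] - x[j];
            let dy = y[i] - y[j];
            if dx == 0.0 {
                ties_x += 1.0;
            }
            if dy == 0.0 {
                ties_y += 1.0;
            }
            let s = dx * dy;
            if s > 0.0 {
                concordant += 1.0;
            } else if s < 0.0 {
                discordant += 1.0;
            }
        }
    }
    let n0 = (n * (n - 1)) as f64 / 2.0;
    (concordant - discordant) / ((n0 - ties_x) * (n0 - ties_y)).sqrt()
}

fn main() {
    let x = [1.0, 2.0, 2.0, 3.0];
    let y = [1.0, 2.0, 3.0, 4.0];
    println!("tau-b = {:.6}", kendall_tau_b(&x, &y));
}
```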
Categorical Data Analysis
use ;
// Chi-square test of independence
let observed = vec!;
let result = chisq_test?;
println!;
// Chi-square goodness-of-fit (test if die is fair)
let rolls = vec!;
let result = chisq_goodness_of_fit?;
println!;
// Fisher's exact test for 2x2 tables
let table = ;
let result = fisher_exact?;
println!;
// McNemar's test for paired data
let before_after = ;
let result = mcnemar_test?;
println!;
// Effect sizes
let result = cramers_v?;
println!;
let result = phi_coefficient?;
println!;
// Cohen's kappa for inter-rater agreement
let confusion = vec!;
let result = cohen_kappa?;
println!;
// Exact binomial test
let result = binom_test?;
println!;
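To show how the chi-square statistic and Cramér's V relate, the sketch below computes both for a table of counts using only the standard library. `chi_square` and `cramers_v_sketch` are hypothetical names and are not this crate's functions. The sketch applies no continuity correction, whereas R's chisq.test() applies Yates' correction to 2x2 tables by default, so the two can differ on such tables.

```rust
/// Pearson chi-square statistic for an r x c table of counts
/// (no continuity correction). Assumes all row and column sums are positive.
fn chi_square(table: &[Vec<f64>]) -> f64 {
    let row_sums: Vec<f64> = table.iter().map(|r| r.iter().sum::<f64>()).collect();
    let cols = table[0].len();
    let col_sums: Vec<f64> = (0..cols)
        .map(|j| table.iter().map(|r| r[j]).sum::<f64>())
        .collect();
    let n: f64 = row_sums.iter().sum();

    let mut chi2 = 0.0;
    for (i, row) in table.iter().enumerate() {
        for (j, &obs) in row.iter().enumerate() {
            // Expected count under independence.
            let expected = row_sums[i] * col_sums[j] / n;
            chi2 += (obs - expected).powi(2) / expected;
        }
    }
    chi2
}

/// Cramér's V = sqrt(chi2 / (n * (min(r, c) - 1))). Hypothetical helper.
fn cramers_v_sketch(table: &[Vec<f64>]) -> f64 {
    let n: f64 = table.iter().flatten().sum();
    let k = table.len().min(table[0].len()) as f64 - 1.0;
    (chi_square(table) / (n * k)).sqrt()
}

fn main() {
    let table = vec![vec![10.0, 0.0], vec![0.0, 10.0]];
    println!("chi2 = {}", chi_square(&table));
    println!("V = {}", cramers_v_sketch(&table));
}
```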
Forecast Evaluation
use ;
// Forecast errors from two competing models
let errors_model1 = vec!;
let errors_model2 = vec!;
// Diebold-Mariano test (two-sided)
let result = diebold_mariano?;
println!;
println!;
// Clark-West test for nested models (e.g., AR(1) vs AR(2))
let restricted_errors = vec!; // Benchmark/restricted model
let unrestricted_errors = vec!; // Alternative/unrestricted model
let result = clark_west?;
println!;
println!;
// Superior Predictive Ability test (compare benchmark vs multiple models)
let benchmark_losses = vec!;
let model_losses = vec!;
let result = spa_test?;
println!;
println!;
// MSPE-Adjusted SPA for multiple nested models
// Combines Clark-West adjustment with bootstrap for multiple testing
let benchmark_errors = vec!;
let nested_model_errors = vec!;
let result = mspe_adjusted_spa?;
println!;
println!;
// Model Confidence Set - identify the set of best models
let losses = vec!;
let result = model_confidence_set?;
println!;
println!;
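For orientation, the core of the Diebold-Mariano test is a t-like statistic on the loss differential between two forecast error series. The sketch below is a deliberately simplified version under stated assumptions: squared-error loss, one-step-ahead forecasts (so no autocovariance terms), and no small-sample correction. `dm_statistic` is a hypothetical helper and may not match the options this crate or R's dm.test() use.

```rust
/// Simplified Diebold-Mariano statistic: squared-error loss, horizon h = 1,
/// no small-sample correction. Hypothetical helper, not the crate's API.
/// Assumes the loss differential is not constant.
fn dm_statistic(e1: &[f64], e2: &[f64]) -> f64 {
    assert_eq!(e1.len(), e2.len());
    let n = e1.len() as f64;
    // Loss differential d_t = e1_t^2 - e2_t^2.
    let d: Vec<f64> = e1.iter().zip(e2).map(|(a, b)| a * a - b * b).collect();
    let d_bar = d.iter().sum::<f64>() / n;
    // Lag-0 autocovariance of d (divides by n, not n - 1).
    let gamma0 = d.iter().map(|v| (v - d_bar).powi(2)).sum::<f64>() / n;
    d_bar / (gamma0 / n).sqrt()
}

fn main() {
    let errors_a = [1.0, 2.0, 1.0, 2.0];
    let errors_b = [1.0, 1.0, 1.0, 1.0];
    println!("DM = {}", dm_statistic(&errors_a, &errors_b));
}
```

Under the null of equal predictive accuracy the statistic is compared with a standard normal distribution. A positive value here means the first model has the larger average squared error.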
Equivalence Testing (TOST)
TOST (Two One-Sided Tests) tests whether an effect is small enough to be considered practically equivalent to zero, rather than just testing if it differs from zero.
use ;
let group1 = vec!;
let group2 = vec!;
// Two-sample TOST: test if mean difference is within ±0.5
let bounds = Symmetric ;
let result = tost_t_test_two_sample?;
println!;
println!;
println!;
// Using Cohen's d effect size bounds (±0.5 SD)
let bounds = CohenD ;
let result = tost_t_test_two_sample?;
// Correlation TOST: test if correlation is equivalent to zero
let x = vec!;
let y = vec!; // Near-zero correlation
let bounds = Symmetric ;
let result = tost_correlation?;
// Robust TOST using trimmed means (resistant to outliers)
let data_with_outlier = vec!; // Outlier
let normal_data = vec!;
let bounds = Symmetric ;
let result = tost_yuen?; // 20% trim
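The decision rule behind TOST can be sketched without any library: compute one t statistic against each equivalence bound and declare equivalence only if both one-sided tests reject. The code below does this for a Welch two-sample comparison. `tost_welch_stats` and `TostStats` are hypothetical names, not this crate's types, and the critical value is passed in by the caller because the standard library has no Student t distribution.

```rust
/// Sample mean and unbiased variance (n - 1 denominator).
fn mean_var(x: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// The two one-sided Welch t statistics and their degrees of freedom.
/// Hypothetical type for illustration only.
struct TostStats {
    t_lower: f64, // tests H0: diff <= low
    t_upper: f64, // tests H0: diff >= high
    df: f64,
}

fn tost_welch_stats(x: &[f64], y: &[f64], low: f64, high: f64) -> TostStats {
    let (mx, vx) = mean_var(x);
    let (my, vy) = mean_var(y);
    let (sx, sy) = (vx / x.len() as f64, vy / y.len() as f64);
    let se = (sx + sy).sqrt();
    let diff = mx - my;
    let df = (sx + sy).powi(2)
        / (sx * sx / (x.len() as f64 - 1.0) + sy * sy / (y.len() as f64 - 1.0));
    TostStats { t_lower: (diff - low) / se, t_upper: (diff - high) / se, df }
}

impl TostStats {
    /// Equivalence requires both rejections: the difference is significantly
    /// above `low` and significantly below `high`. `t_crit` is the upper-alpha
    /// quantile of Student's t with `self.df` degrees of freedom.
    fn equivalent(&self, t_crit: f64) -> bool {
        self.t_lower > t_crit && self.t_upper < -t_crit
    }
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let s = tost_welch_stats(&x, &x, -0.5, 0.5);
    println!("t_lower = {}, t_upper = {}, df = {}", s.t_lower, s.t_upper, s.df);
    // 1.8595 is the upper 5% quantile of Student's t with 8 degrees of freedom.
    println!("equivalent within ±0.5: {}", s.equivalent(1.8595));
}
```

Note that two identical small samples still fail to show equivalence within ±0.5 here: TOST needs enough precision to rule out differences beyond the bounds, not just a small observed difference.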
Validation
This library is developed using Test-Driven Development (TDD) with R as the oracle (ground truth). All implementations are validated against R's statistical functions:
| Rust Function | R Equivalent | Package |
|---|---|---|
| t_test() | t.test() | stats |
| yuen_test() | yuen() | WRS2 |
| brown_forsythe() | leveneTest(center=median) | car |
| one_way_anova() | oneway.test(), aov() | stats |
| two_way_anova() | Anova(type="III") | car |
| repeated_measures_anova() | ezANOVA() | ez |
| mann_whitney_u(), wilcoxon_signed_rank() | wilcox.test() | stats |
| kruskal_wallis() | kruskal.test() | stats |
| brunner_munzel() | brunner.munzel.test() | lawstat |
| shapiro_wilk() | shapiro.test() | stats |
| dagostino_k_squared() | agostino.test(), anscombe.test() | moments |
| skewness(), kurtosis() | skewness(), kurtosis() | e1071 |
| diebold_mariano() | dm.test() | forecast |
| pearson(), spearman() | cor.test() | stats |
| kendall() | cor.test(method="kendall") | stats |
| partial_cor(), semi_partial_cor() | pcor.test(), spcor.test() | ppcor |
| distance_cor() | dcor() | energy |
| icc() | ICC() | psych |
| chisq_test() | chisq.test() | stats |
| chisq_goodness_of_fit() | chisq.test(p=...) | stats |
| fisher_exact() | fisher.test() | stats |
| g_test() | GTest() | DescTools |
| mcnemar_test() | mcnemar.test() | stats |
| cramers_v() | CramerV() | DescTools |
| phi_coefficient() | phi() | psych |
| cohen_kappa() | cohen.kappa() | psych |
| binom_test() | binom.test() | stats |
| prop_test_one(), prop_test_two() | prop.test() | stats |
| tost_t_test_*() | TOSTone(), TOSTtwo(), TOSTpaired() | TOSTER |
| tost_correlation() | TOSTr() | TOSTER |
| tost_prop_*() | TOSTtwo.prop() | TOSTER |
| tost_wilcoxon_*() | wilcox_TOST() | TOSTER |
| tost_bootstrap() | boot_t_TOST() | TOSTER |
| tost_yuen() | yuen.TOST() | WRS2 |
All 303 test cases ensure numerical agreement with R within appropriate tolerances (typically 1e-10, with documented exceptions for algorithm-dependent tests like Shapiro-Wilk).
For complete transparency on the validation process, see R/VALIDATION.md, which documents:
- All 76 reference data files and their R generation code
- Tolerance rationale for each test category
- Step-by-step reproduction instructions
- R package dependencies
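To make the idea of agreement "within appropriate tolerances" concrete, a comparison helper along the following lines is one common way to check a computed value against a reference. This is only a sketch under assumptions: `approx_eq` is a hypothetical function, and the project's actual comparison logic and per-test tolerances are the ones documented in R/VALIDATION.md.

```rust
/// True when `actual` matches `expected` within `tol`, either as an
/// absolute difference or relative to the magnitude of `expected`.
/// Hypothetical helper; not taken from this crate's test suite.
fn approx_eq(actual: f64, expected: f64, tol: f64) -> bool {
    let diff = (actual - expected).abs();
    diff <= tol || diff <= tol * expected.abs()
}

fn main() {
    // Floating-point rounding stays well inside a 1e-10 tolerance.
    println!("{}", approx_eq(0.1 + 0.2, 0.3, 1e-10));
    // A genuine discrepancy in the fourth decimal place does not.
    println!("{}", approx_eq(1.0001, 1.0, 1e-10));
}
```

The relative branch matters for large statistics, where an absolute tolerance of 1e-10 would be tighter than double precision can deliver.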
Dependencies
- statrs - Statistical distributions
- thiserror - Error handling
- rand - Random number generation for resampling
Attribution
This library incorporates Rust implementations of algorithms from several open-source projects. See THIRD_PARTY_NOTICES.md for complete attribution and license information.
- statrs (MIT) - Statistical distributions
- rand (MIT/Apache-2.0) - Random number generation
- R Statistical Computing - Algorithm validation and methodology
License
MIT License