# inferust
Statistical modeling for Rust — a statsmodels-inspired library.
inferust fills the gap between Python's statsmodels / scipy.stats and the Rust ecosystem. It gives you regression summaries, hypothesis tests, descriptive stats, and correlation matrices with the same depth of output you'd expect from Python — p-values, confidence intervals, AIC/BIC, significance stars, and all.
## Features
| Module | What you get | Python equivalent |
|---|---|---|
| `regression::Ols` / `Wls` / `Gls` / `Fgls` | OLS, weighted least squares, GLS with known covariance, and AR(1) feasible GLS with fast/stable solvers, robust/HAC SEs, confidence intervals, influence diagnostics, residual diagnostics, Durbin-Watson, Jarque-Bera, condition numbers, t/z stats, p-values, R², adj-R², F-stat, AIC, BIC | `statsmodels.OLS().fit()`, `statsmodels.WLS().fit()`, `statsmodels.GLS().fit()`, `statsmodels.GLSAR()` |
| `regression::RollingOls` / `RecursiveOls` | Rolling-window coefficient paths and recursive OLS with CUSUM stability diagnostics | `statsmodels.regression.rolling.RollingOLS`, `statsmodels.regression.recursive_ls.RecursiveLS` basics |
| `hypothesis::ttest` | One-sample, two-sample Welch, paired t-tests with 95% CI | `scipy.stats.ttest_*` |
| `hypothesis::chisq` | Goodness-of-fit and independence (contingency table) | `scipy.stats.chisquare`, `chi2_contingency` |
| `hypothesis::anova` | One-way ANOVA table (SS, MS, F, p) | `scipy.stats.f_oneway` |
| `descriptive::Summary` | mean, std, variance, min/max, quartiles, skewness, excess kurtosis | `pd.Series.describe()` |
| `data::DataFrame` | named numeric columns and formula-based OLS/WLS/logistic/Poisson fitting | `statsmodels.formula.api` basics |
| `glm::Logistic` / `Poisson` | binary logistic and Poisson count regression with MLE estimates, Wald inference, covariance, residual diagnostics, likelihood-ratio tests, prediction intervals, classification metrics, and post-estimation helpers | `statsmodels.Logit().fit()`, `statsmodels.GLM(..., Poisson()).fit()` |
| `discrete` | Probit, negative binomial, multinomial logit starters | `statsmodels.discrete` basics |
| `glm_family` | generic Gaussian/Binomial/Poisson GLM dispatch | `statsmodels.GLM` basics |
| `time_series` | AR, ARIMA, SARIMA/SARIMAX, VAR, VECM, VARMAX starters plus ACF, PACF, Ljung-Box, ADF, and KPSS diagnostics | `statsmodels.tsa` basics |
| `graphics` | dependency-light SVG line, scatter, residual, and ACF plots | `statsmodels.graphics` basics |
| `diagnostics` | VIF, Breusch-Pagan, White, RESET diagnostics | `statsmodels.stats.diagnostic`, `outliers_influence` basics |
| `evaluation` | regression/classification metrics, bootstrap mean intervals | common model-evaluation workflow |
| `robust` | Huber robust linear regression | `statsmodels.RLM` basics |
| `gee` | independence-working-correlation GEE starters | `statsmodels.GEE` basics |
| `mixed` | random-intercept mixed linear model starter | `statsmodels.MixedLM` basics |
| `correlation` | Pearson, Spearman, full correlation matrix | `df.corr()` |
## Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
inferust = "0.1"
```
## Quick start
### OLS Regression

```rust
use inferust::regression::Ols;

// Feature rows are [hours_studied, prior_gpa]; the data itself is elided here.
let x = vec![/* ... */];
let y = vec![/* ... */];

let result = Ols::new(x, y)
    .with_feature_names(&["hours_studied", "prior_gpa"])
    .fit()
    .unwrap();

result.print_summary();
```
Output:

```text
═══════════════════════════════════════════════════════════════════
                      OLS Regression Results
═══════════════════════════════════════════════════════════════════
Dep. variable: y              Observations : 4
R²            : 0.998102      Adj. R²      : 0.994305
F-statistic   : 262.7732      F p-value    : 0.039405
AIC           : 14.7316       BIC          : 12.0167
───────────────────────────────────────────────────────────────────
Variable           Coef      Std Err        t         P>|t|
───────────────────────────────────────────────────────────────────
const         -5.654762     5.033740    -1.1234     0.460565
hours_studied  4.130952     0.177951    23.2141     0.027430 *
prior_gpa      8.166667     1.490421     5.4793     0.115581
───────────────────────────────────────────────────────────────────
Significance codes: *** p<0.001  ** p<0.01  * p<0.05  . p<0.1
═══════════════════════════════════════════════════════════════════
```
The printed OLS/WLS summary also includes statsmodels-style residual diagnostics
out of the box: Durbin-Watson, Jarque-Bera with Prob(JB), residual skewness,
kurtosis, and the design-matrix condition number.
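For intuition, the Durbin-Watson statistic in that block is the ratio of squared successive residual differences to the residual sum of squares; values near 2 suggest no first-order autocorrelation. A standalone sketch of the formula (not inferust's internal code):

```rust
/// Durbin-Watson statistic: sum of squared successive residual
/// differences divided by the residual sum of squares.
pub fn durbin_watson(residuals: &[f64]) -> f64 {
    let num: f64 = residuals
        .windows(2)
        .map(|w| (w[1] - w[0]).powi(2))
        .sum();
    let den: f64 = residuals.iter().map(|e| e * e).sum();
    num / den
}

fn main() {
    // Alternating residuals are strongly negatively autocorrelated,
    // which pushes the statistic toward its upper bound of 4.
    let dw = durbin_watson(&[1.0, -1.0, 1.0, -1.0]);
    println!("DW = {dw}"); // 3.0 for this series
}
```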
### Formula-based fitting

```rust
use inferust::data::DataFrame;

// Column names and values are elided; each with_column adds a named numeric column.
let frame = DataFrame::new()
    .with_column(/* name, values */).unwrap()
    .with_column(/* name, values */).unwrap()
    .with_column(/* name, values */).unwrap();

let result = frame.ols(/* e.g. "y ~ x1 + x2" */).unwrap();
```
Formula support includes numeric `response ~ x1 + x2` terms, plus treatment dummy expansion for numeric-coded categorical columns via `design_matrices_with_categorical` and `ols_with_categorical`. Intercepts are handled by the model builders.
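For intuition, treatment coding turns each non-baseline level of a categorical column into an indicator column. A self-contained sketch of the idea, separate from what `design_matrices_with_categorical` does internally:

```rust
/// Expand a numeric-coded categorical column into treatment dummies,
/// dropping the smallest level as the baseline. Purely illustrative.
pub fn treatment_dummies(codes: &[i64]) -> Vec<Vec<f64>> {
    let mut levels: Vec<i64> = codes.to_vec();
    levels.sort_unstable();
    levels.dedup();
    // One indicator column per non-baseline level.
    codes
        .iter()
        .map(|&c| {
            levels[1..]
                .iter()
                .map(|&lvl| if c == lvl { 1.0 } else { 0.0 })
                .collect()
        })
        .collect()
}

fn main() {
    // Levels {0, 1, 2}; level 0 is the baseline, so two dummy columns.
    let rows = treatment_dummies(&[0, 1, 2, 1]);
    println!("{rows:?}"); // [[0,0],[1,0],[0,1],[1,0]]
}
```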
### Weighted least squares

```rust
use inferust::regression::Wls;

// One weight per observation; x and y as in the OLS example.
let weights = vec![/* ... */];

let result = Wls::new(x, y, weights)
    .with_feature_names(/* ... */)
    .fit()
    .unwrap();

result.print_summary();
```
### GLS and rolling regression

```rust
use inferust::regression::{Fgls, RollingOls};

let fgls = Fgls::new(/* x, y */)
    .with_feature_names(/* ... */)
    .fit()
    .unwrap();

let rolling = RollingOls::new(/* x, y, window */).fit().unwrap();
let slopes = rolling.slopes();
```
### Logistic regression

```rust
use inferust::glm::Logistic;

let result = Logistic::new(/* x, y with 0/1 labels */)
    .with_feature_names(/* ... */)
    .fit()
    .unwrap();

let probabilities = result.predict_proba();
let intervals = result.confidence_intervals().unwrap();
let odds_ratios = result.odds_ratios();
let marginal_effects = result.average_marginal_effects();
let marginal_effect_table = result.average_marginal_effects_summary().unwrap();
let residuals = result.residuals();
let metrics = result.classification_metrics().unwrap();
let lr_test = result.likelihood_ratio_test().unwrap();
```
You can also use `DataFrame::logistic("clicked ~ visits + age")` for formula-based fitting. Logistic results expose fitted probabilities, covariance estimates, response/Pearson/deviance residuals, likelihood-ratio tests, classification metrics, and post-estimation helpers designed to mirror common `statsmodels.Logit` workflows.
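Those helpers follow directly from the logit link: predicted probabilities are the logistic function of the linear predictor, and each odds ratio is `exp` of a coefficient. A standalone sketch of those two transforms (not inferust's API):

```rust
/// Logistic (sigmoid) function mapping a linear predictor to a probability.
pub fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Odds ratios are the exponentiated coefficients: a one-unit increase
/// in a predictor multiplies the odds by exp(beta).
pub fn odds_ratios(coefficients: &[f64]) -> Vec<f64> {
    coefficients.iter().map(|b| b.exp()).collect()
}

fn main() {
    // A zero linear predictor corresponds to probability 0.5.
    println!("p  = {}", sigmoid(0.0));
    // A coefficient of ln(2) doubles the odds.
    println!("or = {:?}", odds_ratios(&[2.0_f64.ln()]));
}
```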
### Poisson regression

```rust
use inferust::glm::Poisson;

let result = Poisson::new(/* x, y with non-negative counts */)
    .with_feature_names(/* ... */)
    .fit()
    .unwrap();

let expected_counts = result.predict();
let intervals = result.confidence_intervals().unwrap();
let mean_intervals = result.fitted_mean_intervals().unwrap();
let residuals = result.residuals();
let incidence_rate_ratios = result.incidence_rate_ratios();
let lr_test = result.likelihood_ratio_test().unwrap();
```
Poisson results include covariance estimates, fitted values, response/Pearson/deviance residuals, log-likelihood, null log-likelihood, pseudo-R², deviance, null deviance, Pearson chi-square, AIC, BIC, likelihood-ratio tests, and response-scale mean intervals. `DataFrame::poisson("count ~ exposure + age")` provides formula-based fitting.
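Under the log link these post-estimation quantities are simple transforms: the fitted mean is `exp` of the linear predictor, and each incidence rate ratio is `exp` of a coefficient. A standalone sketch (not inferust's internals):

```rust
/// Fitted Poisson mean under the log link: mu = exp(x · beta).
pub fn poisson_mean(x: &[f64], beta: &[f64]) -> f64 {
    let eta: f64 = x.iter().zip(beta).map(|(xi, bi)| xi * bi).sum();
    eta.exp()
}

/// Incidence rate ratio for one coefficient: a one-unit increase in the
/// predictor multiplies the expected count by exp(beta).
pub fn incidence_rate_ratio(beta: f64) -> f64 {
    beta.exp()
}

fn main() {
    // Intercept ln(2) and slope ln(3) with x = [1, 1]:
    // mu = exp(ln 2 + ln 3) = 6 expected events.
    let mu = poisson_mean(&[1.0, 1.0], &[2.0_f64.ln(), 3.0_f64.ln()]);
    println!("mu  = {mu}");
    println!("irr = {}", incidence_rate_ratio(3.0_f64.ln()));
}
```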
### Hypothesis tests

```rust
use inferust::hypothesis::{anova, chisq, ttest};

// Paired t-test
let before = vec![/* ... */];
let after = vec![/* ... */];
ttest::paired(&before, &after).unwrap().print();

// Two-sample Welch t-test
ttest::two_sample(/* &a, &b */).unwrap().print();

// One-way ANOVA
anova::one_way(/* &groups */).unwrap().print();

// Chi-squared goodness-of-fit
chisq::goodness_of_fit(/* &observed, &expected */).unwrap().print();

// Chi-squared test of independence
chisq::independence(/* &contingency_table */).unwrap().print();
```
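For reference, the Welch statistic and its Satterthwaite degrees of freedom come straight from the two sample means and variances; a minimal standalone sketch, independent of inferust's `ttest` module:

```rust
/// Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom.
/// Returns (t, df); the p-value would then come from the t distribution.
pub fn welch_t(a: &[f64], b: &[f64]) -> (f64, f64) {
    let mean = |s: &[f64]| s.iter().sum::<f64>() / s.len() as f64;
    // Unbiased sample variance (divide by n - 1).
    let var = |s: &[f64]| {
        let m = mean(s);
        s.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (s.len() - 1) as f64
    };
    let (na, nb) = (a.len() as f64, b.len() as f64);
    let (va, vb) = (var(a) / na, var(b) / nb);
    let t = (mean(a) - mean(b)) / (va + vb).sqrt();
    let df = (va + vb).powi(2) / (va * va / (na - 1.0) + vb * vb / (nb - 1.0));
    (t, df)
}

fn main() {
    // Two shifted samples with equal variance.
    let (t, df) = welch_t(&[1.0, 2.0, 3.0, 4.0, 5.0], &[2.0, 3.0, 4.0, 5.0, 6.0]);
    println!("t = {t}, df = {df}"); // t = -1, df = 8 for these samples
}
```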
### Descriptive statistics

```rust
use inferust::descriptive::Summary;

let data = vec![/* six observations */];
Summary::new(&data).unwrap().print();
// ─────────────────────────────
// n        : 6
// mean     : 6.400000
// std      : 2.282176
// min      : 3.600000
// 25%      : 4.575000
// 50%      : 6.150000
// 75%      : 8.250000
// max      : 9.300000
// skewness : -0.058732
// kurtosis : -1.504070
// ─────────────────────────────
```
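The 25%/50%/75% rows interpolate between order statistics; assuming the pandas/numpy-style "linear" convention, the rule looks like this (a sketch, not inferust's code):

```rust
/// Linearly interpolated percentile of already-sorted data
/// (the "linear" convention that pandas and numpy use by default).
pub fn percentile(sorted: &[f64], q: f64) -> f64 {
    // Fractional position within the order statistics.
    let pos = q * (sorted.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    let frac = pos - lo as f64;
    sorted[lo] + frac * (sorted[hi] - sorted[lo])
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0];
    // Position 0.25 * 3 = 0.75 lands between 1.0 and 2.0.
    println!("25% = {}", percentile(&data, 0.25)); // 1.75
    println!("50% = {}", percentile(&data, 0.50)); // 2.5
}
```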
### Correlation

```rust
use inferust::correlation::{correlation_matrix, pearson, print_correlation_matrix, spearman};

// x and y are equal-length numeric samples (elided).
let r = pearson(/* &x, &y */).unwrap();
let rs = spearman(/* &x, &y */).unwrap();
let matrix = correlation_matrix(/* &columns */).unwrap();
print_correlation_matrix(/* &matrix, &names */);
```
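Under the hood, Pearson's r is the centered cross-product scaled by both standard deviations; a self-contained sketch of the computation (not inferust's implementation):

```rust
/// Pearson correlation coefficient between two equal-length samples.
pub fn pearson_r(x: &[f64], y: &[f64]) -> f64 {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    // Centered cross-product and the two centered norms.
    let cov: f64 = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum();
    let sx: f64 = x.iter().map(|a| (a - mx).powi(2)).sum::<f64>().sqrt();
    let sy: f64 = y.iter().map(|b| (b - my).powi(2)).sum::<f64>().sqrt();
    cov / (sx * sy)
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let y = [2.0, 1.0, 4.0, 3.0, 5.0];
    println!("r = {}", pearson_r(&x, &y)); // 0.8 for these samples
}
```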
### Time series and graphics

```rust
// Type and function names follow the module layout in the features table.
use inferust::time_series::{acf, Sarima};
use inferust::graphics::acf_plot_svg;

let sarima = Sarima::new(/* series, order, seasonal order */).fit().unwrap();
let forecast = sarima.forecast(/* horizon */).unwrap();

let acf_values = acf(/* &series, n_lags */).unwrap();
let svg = acf_plot_svg(/* &acf_values */).unwrap();
```
### OLS builder options

```rust
use inferust::regression::{Ols, OlsCovariance, OlsSolver};

let result = Ols::new(x, y)                  // intercept on by default
    .with_feature_names(/* ... */)           // label columns
    .with_solver(OlsSolver::Cholesky)        // default fast path
    .with_covariance(OlsCovariance::Hc1)     // robust standard errors
    .fit()
    .unwrap();

let intervals = result.confidence_intervals().unwrap();
let influence = result.influence();
let diagnostics = result.diagnostics().unwrap();
let cooks_distance = influence.cooks_distance;
let durbin_watson = diagnostics.durbin_watson;

Ols::new(x, y)
    .stable()  // SVD solver for tougher designs
    .robust()  // shorthand for HC1 covariance
    .fit()
    .unwrap();
```
`OlsResult` also exposes `.predict(&x)` for out-of-sample predictions and all raw fields (`coefficients`, `residuals`, `r_squared`, `p_values`, etc.) for programmatic use.
## Solver strategy
inferust defaults to a Cholesky solve of the normal system for full-rank, well-conditioned OLS problems. This avoids the extra work of forming a full inverse for coefficient estimation and is the fastest path for typical dense data.
For tougher or poorly conditioned designs, call `.stable()` or `.with_solver(OlsSolver::Svd)` to use the SVD path. For heteroskedasticity-consistent inference, use `.with_covariance(...)` with `OlsCovariance::Hc0` through `Hc3`, or the `.robust()` HC1 shorthand. The test suite includes statsmodels-derived reference values for coefficients, classical and robust standard errors, t/z statistics, p-values, confidence intervals, leverage, internally studentized residuals, Cook's distance, DFFITS, Durbin-Watson, Jarque-Bera, residual skew/kurtosis, condition number, R², F-statistics, AIC, and BIC.
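To make the default path concrete, here is a toy version of the idea for simple linear regression: form the 2×2 normal system XᵀX b = Xᵀy and solve it with a hand-rolled Cholesky factorization plus two triangular solves. This is a sketch of the strategy, not inferust's solver:

```rust
/// Fit y = b0 + b1 * x by solving the 2x2 normal equations
///   [ n    Σx  ] [b0]   [ Σy  ]
///   [ Σx   Σx² ] [b1] = [ Σxy ]
/// via a Cholesky factorization A = L·Lᵀ, avoiding an explicit inverse.
pub fn ols_via_cholesky(x: &[f64], y: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let sx: f64 = x.iter().sum();
    let sxx: f64 = x.iter().map(|v| v * v).sum();
    let sy: f64 = y.iter().sum();
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();

    // Cholesky of [[n, sx], [sx, sxx]]: L = [[l11, 0], [l21, l22]].
    let l11 = n.sqrt();
    let l21 = sx / l11;
    let l22 = (sxx - l21 * l21).sqrt();

    // Forward solve L z = [sy, sxy], then back solve Lᵀ b = z.
    let z0 = sy / l11;
    let z1 = (sxy - l21 * z0) / l22;
    let b1 = z1 / l22;
    let b0 = (z0 - l21 * b1) / l11;
    (b0, b1)
}

fn main() {
    // y = 1 + 2x exactly, so the fit recovers intercept 1 and slope 2.
    let (b0, b1) = ols_via_cholesky(&[0.0, 1.0, 2.0, 3.0], &[1.0, 3.0, 5.0, 7.0]);
    println!("intercept = {b0}, slope = {b1}");
}
```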
## Changelog

Release history is tracked in `CHANGELOG.md`, with an `Unreleased` section reserved for the next version before publication.
## Benchmarks

The repository includes reproducible OLS benchmark scripts for comparing inferust with Python statsmodels on deterministic synthetic data. Build and run the Rust benchmark in release mode, then run the Python comparison after installing numpy, scipy, and statsmodels; additional examples are included in the repository.
On the current local benchmark machine, the 10,000 row × 8 feature case measured approximately:
| Engine | Solver | Median fit time |
|---|---|---|
| inferust | Cholesky | 0.769 ms |
| inferust | SVD | 2.474 ms |
| statsmodels | default OLS | 2.492 ms |
Benchmark results vary by machine and BLAS/LAPACK configuration, so treat these as a local smoke test rather than a universal claim. The checksum printed by each script is useful for confirming both implementations fit equivalent data.
## Error handling

All fallible functions return `inferust::Result<T>` (an alias for `Result<T, InferustError>`):

```rust
use inferust::InferustError;

// The error value (an InferustError) explains why fitting failed.
match Ols::new(x, y).fit() {
    Ok(result) => result.print_summary(),
    Err(e) => eprintln!("fit failed: {e}"),
}
```
## Dependencies

| Crate | Purpose |
|---|---|
| `nalgebra` | Matrix operations for OLS normal equations — no LAPACK required |
| `statrs` | Student's t, F, and χ² distributions for p-values and confidence intervals |
| `thiserror` | Ergonomic error types |
## Roadmap

Several items from the original roadmap have since shipped (logistic regression, weighted least squares, Durbin-Watson and Breusch-Pagan diagnostics, and ARIMA/ACF/PACF time-series tools). Still planned:

- Ridge / Lasso regularization
- Tukey HSD post-hoc test (after ANOVA)
Contributions welcome — open an issue or PR!
## License
MIT — see LICENSE.