linreg-core
A lightweight, dependency-free Ordinary Least Squares (OLS) linear regression library written in Rust. Compiles to WebAssembly for browser use or runs as a native Rust crate.
Key design principle: all linear algebra and statistical distribution functions are implemented from scratch, with no external math libraries required. This keeps binary sizes small and makes the crate highly portable.
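To give a sense of what dependency-free linear algebra involves, here is a minimal sketch of a least-squares solve using a QR factorization built with modified Gram-Schmidt. It is an illustration only, not the crate's actual code: the function names `dot` and `least_squares_qr`, the row-major `Vec<Vec<f64>>` layout, and the `1e-12` rank tolerance are all assumptions made for this example.

```rust
/// Dot product of two equal-length slices.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(u, v)| u * v).sum()
}

/// Least-squares solution of X b ≈ y via modified Gram-Schmidt QR.
/// `x` holds n rows of p values each (include a column of 1.0 for an intercept).
/// Returns None if the shapes are inconsistent or X is numerically rank-deficient.
/// Hypothetical helper for illustration; not part of the linreg-core API.
pub fn least_squares_qr(x: &[Vec<f64>], y: &[f64]) -> Option<Vec<f64>> {
    let n = x.len();
    let p = x.first()?.len();
    if p == 0 || n < p || y.len() != n || x.iter().any(|row| row.len() != p) {
        return None;
    }

    // Q starts as the columns of X and is orthonormalized in place.
    let mut q: Vec<Vec<f64>> = (0..p)
        .map(|j| x.iter().map(|row| row[j]).collect())
        .collect();
    // R is the p x p upper-triangular factor.
    let mut r = vec![vec![0.0_f64; p]; p];

    for j in 0..p {
        let original_norm = dot(&q[j], &q[j]).sqrt();
        let (done, rest) = q.split_at_mut(j);
        let qj = &mut rest[0];
        // Remove the components along the already-orthonormal columns.
        for (i, qi) in done.iter().enumerate() {
            let rij = dot(qi, &qj[..]);
            r[i][j] = rij;
            for (a, b) in qj.iter_mut().zip(qi) {
                *a -= rij * b;
            }
        }
        let norm = dot(&qj[..], &qj[..]).sqrt();
        // Assumed tolerance: treat the column as linearly dependent if almost
        // nothing is left after orthogonalization.
        if norm <= 1e-12 * original_norm {
            return None;
        }
        r[j][j] = norm;
        for a in qj.iter_mut() {
            *a /= norm;
        }
    }

    // Solve R b = Q^T y by back-substitution.
    let c: Vec<f64> = q.iter().map(|qj| dot(qj, y)).collect();
    let mut b = vec![0.0_f64; p];
    for j in (0..p).rev() {
        let tail: f64 = (j + 1..p).map(|k| r[j][k] * b[k]).sum();
        b[j] = (c[j] - tail) / r[j][j];
    }
    Some(b)
}
```

Calling `least_squares_qr` with rows `[1.0, x_i]` fits an intercept and a slope. Returning `None` for a rank-deficient design is one possible choice; a production implementation would more likely use Householder reflections and column pivoting.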
Features
- OLS Regression: Coefficients, standard errors, t-statistics, p-values, confidence intervals
- Model Statistics: R-squared, Adjusted R-squared, F-statistic, F-test p-value
- Linearity Tests:
  - Rainbow Test
  - Harvey-Collier Test
- Heteroscedasticity Tests:
  - Breusch-Pagan Test (studentized/Koenker variant)
  - White Test
- Normality Tests:
  - Jarque-Bera Test
  - Shapiro-Wilk Test (Royston's algorithm, n <= 5000)
  - Anderson-Darling Test
- Autocorrelation:
  - Durbin-Watson Test
- Influential Observations:
  - Cook's Distance
- Multicollinearity:
  - Variance Inflation Factor (VIF)
- Residual Analysis: Standardized residuals, leverage (hat matrix diagonal)
- Dual Target: Browser (WASM) and server (native Rust)
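
Two of the diagnostics listed above reduce to short formulas on the residuals, which makes them handy for illustration. The sketch below computes the Durbin-Watson statistic (sum of squared successive differences divided by the sum of squared residuals) and the Jarque-Bera statistic (n/6 · (S² + (K − 3)²/4), with skewness S and kurtosis K from 1/n central moments). The function names and `Option` return types are assumptions for this example and are not taken from the crate; p-values are not computed here.

```rust
/// Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2).
/// Hypothetical helper for illustration; not part of the linreg-core API.
pub fn durbin_watson(residuals: &[f64]) -> Option<f64> {
    let sum_sq: f64 = residuals.iter().map(|e| e * e).sum();
    if residuals.len() < 2 || sum_sq == 0.0 {
        return None;
    }
    let diff_sq: f64 = residuals
        .windows(2)
        .map(|w| (w[1] - w[0]).powi(2))
        .sum();
    Some(diff_sq / sum_sq)
}

/// Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), where S and K are the
/// sample skewness and kurtosis computed from central moments with a 1/n divisor.
/// Hypothetical helper for illustration; not part of the linreg-core API.
pub fn jarque_bera(residuals: &[f64]) -> Option<f64> {
    if residuals.len() < 2 {
        return None;
    }
    let n = residuals.len() as f64;
    let mean = residuals.iter().sum::<f64>() / n;
    // k-th central moment of the residuals.
    let moment = |k: i32| residuals.iter().map(|e| (e - mean).powi(k)).sum::<f64>() / n;
    let m2 = moment(2);
    if m2 == 0.0 {
        return None;
    }
    let skewness = moment(3) / m2.powf(1.5);
    let kurtosis = moment(4) / (m2 * m2);
    Some(n / 6.0 * (skewness * skewness + (kurtosis - 3.0).powi(2) / 4.0))
}
```

A Durbin-Watson value near 2 indicates little first-order autocorrelation, while values toward 0 or 4 suggest positive or negative autocorrelation respectively.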
Quick Start
Native Rust
Add to your Cargo.toml:
[dependencies]
linreg-core = { version = "0.1", default-features = false }
Use in your code:
use ols_regression;
WebAssembly (Browser)
Build with wasm-pack:
Use in JavaScript:
import init from './pkg/linreg_core.js';
;
Diagnostic Tests Example
use ols_regression;
use ;
Feature Flags
| Feature | Default | Description |
|---|---|---|
| wasm | Yes | Enables WASM bindings and browser support |
| validation | No | Includes test data for validation tests |
For native Rust without WASM overhead:
linreg-core = { version = "0.1", default-features = false }
Validation
Results are validated against R (lmtest, car, skedastic, nortest) and Python (statsmodels, scipy). See the verification/ directory for test scripts and reference outputs.
Running Tests
# Unit tests
# All tests including doctests
Implementation Notes
Numerical Precision
- QR decomposition used throughout for numerical stability
- Anderson-Darling uses the Abramowitz & Stegun 7.1.26 approximation for the normal CDF (differs from R's Cephes by ~1e-6)
- Shapiro-Wilk implements Royston's 1995 algorithm matching R's implementation
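
For readers unfamiliar with Abramowitz & Stegun 7.1.26, it is a five-term rational approximation to the error function with a published absolute error bound of about 1.5e-7. The sketch below shows how a normal CDF can be built on top of it. The function names `erf_as_7_1_26` and `normal_cdf` are assumptions for this example and may not match the crate's internals.

```rust
/// Error function via Abramowitz & Stegun formula 7.1.26.
/// Hypothetical helper for illustration; not part of the linreg-core API.
pub fn erf_as_7_1_26(x: f64) -> f64 {
    // Published constants for formula 7.1.26.
    const P: f64 = 0.3275911;
    const A1: f64 = 0.254829592;
    const A2: f64 = -0.284496736;
    const A3: f64 = 1.421413741;
    const A4: f64 = -1.453152027;
    const A5: f64 = 1.061405429;

    // The formula is stated for x >= 0; erf is odd, so restore the sign at the end.
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + P * x);
    // Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    let poly = ((((A5 * t + A4) * t + A3) * t + A2) * t + A1) * t;
    sign * (1.0 - poly * (-x * x).exp())
}

/// Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
pub fn normal_cdf(z: f64) -> f64 {
    0.5 * (1.0 + erf_as_7_1_26(z / std::f64::consts::SQRT_2))
}
```

Because the approximation error is absolute rather than relative, results are least accurate in relative terms far out in the tails, which is worth keeping in mind when comparing tail-sensitive statistics against other software.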
Known Limitations
- Harvey-Collier test may fail on high-VIF datasets (VIF > 5) due to numerical instability in recursive residuals
- Shapiro-Wilk limited to n <= 5000 (matching R's limitation)
- White test may differ from R on collinear datasets due to numerical precision in near-singular matrices
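
To make the VIF > 5 threshold above more concrete: for a model with an intercept and exactly two predictors, the VIF of either predictor reduces to 1 / (1 − r²), where r is their Pearson correlation, so VIF > 5 corresponds to |r| above roughly 0.894. The sketch below covers only this two-predictor case; the function name `vif_two_predictors` is an assumption for this example, and the general case requires regressing each predictor on all the others.

```rust
/// VIF for a model with an intercept and exactly two predictors: 1 / (1 - r^2),
/// where r is the Pearson correlation between the predictors.
/// Returns None for mismatched lengths, fewer than two rows, or a constant predictor.
/// Hypothetical helper for illustration; not part of the linreg-core API.
pub fn vif_two_predictors(x1: &[f64], x2: &[f64]) -> Option<f64> {
    let n = x1.len();
    if n < 2 || x2.len() != n {
        return None;
    }
    let mean = |v: &[f64]| v.iter().sum::<f64>() / n as f64;
    let (m1, m2) = (mean(x1), mean(x2));

    // Centered sums of squares and cross-products.
    let (mut s11, mut s22, mut s12) = (0.0, 0.0, 0.0);
    for (a, b) in x1.iter().zip(x2) {
        let (d1, d2) = (a - m1, b - m2);
        s11 += d1 * d1;
        s22 += d2 * d2;
        s12 += d1 * d2;
    }
    if s11 == 0.0 || s22 == 0.0 {
        return None;
    }

    // Squared correlation, clamped so rounding cannot push it above 1.
    let r2 = ((s12 * s12) / (s11 * s22)).min(1.0);
    // Perfect collinearity (r2 == 1) yields positive infinity.
    Some(1.0 / (1.0 - r2))
}
```

Uncorrelated predictors give a VIF of exactly 1, and the value grows without bound as the predictors approach perfect collinearity, which is the regime where near-singular matrices start to cause the precision issues described above.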
License
Dual-licensed under MIT or Apache-2.0.