linreg-core
A lightweight, self-contained linear regression library written in Rust. Compiles to WebAssembly for browser use or runs as a native Rust crate.
Key design principle: All linear algebra and statistical distribution functions are implemented from scratch — no external math libraries required. This keeps binary sizes small and makes the crate highly portable.
Features
Regression Methods
- OLS Regression: Coefficients, standard errors, t-statistics, p-values, confidence intervals
- Ridge Regression: L2-regularized regression with optional standardization
- Lasso Regression: L1-regularized regression via coordinate descent
- Lambda Path Generation: Create regularization paths for cross-validation
Model Statistics
- R-squared, Adjusted R-squared, F-statistic, F-test p-value
- Residuals, fitted values, leverage (hat matrix diagonal)
- Mean Squared Error (MSE)
- Variance Inflation Factor (VIF) for multicollinearity detection
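To make the goodness-of-fit statistics listed above concrete, here is a minimal sketch of R-squared and adjusted R-squared computed from observed and fitted values. The function names `r_squared` and `adjusted_r_squared` are hypothetical and chosen for illustration; they are not the crate's API.

```rust
/// R-squared = 1 - SSE / SST, where SSE is the residual sum of squares
/// and SST is the total sum of squares about the mean of y.
/// (Hypothetical helper for illustration, not part of linreg-core.)
fn r_squared(y: &[f64], fitted: &[f64]) -> f64 {
    assert_eq!(y.len(), fitted.len(), "y and fitted must have equal length");
    let n = y.len() as f64;
    let mean = y.iter().sum::<f64>() / n;
    let sse: f64 = y.iter().zip(fitted).map(|(yi, fi)| (yi - fi).powi(2)).sum();
    let sst: f64 = y.iter().map(|yi| (yi - mean).powi(2)).sum();
    1.0 - sse / sst
}

/// Adjusted R-squared for n observations and p predictors
/// (intercept not counted in p).
fn adjusted_r_squared(r2: f64, n: usize, p: usize) -> f64 {
    1.0 - (1.0 - r2) * (n as f64 - 1.0) / (n as f64 - p as f64 - 1.0)
}
```

The adjustment penalizes additional predictors, so adjusted R-squared never exceeds R-squared when p ≥ 1.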
Diagnostic Tests
| Category | Tests |
|---|---|
| Linearity | Rainbow Test, Harvey-Collier Test |
| Heteroscedasticity | Breusch-Pagan (Koenker variant), White Test (R & Python methods) |
| Normality | Jarque-Bera, Shapiro-Wilk (n ≤ 5000), Anderson-Darling |
| Autocorrelation | Durbin-Watson |
| Influence | Cook's Distance |
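As an illustration of one entry in the table above, the Durbin-Watson statistic is the sum of squared successive residual differences divided by the residual sum of squares. The sketch below uses a hypothetical `durbin_watson` helper operating on a plain slice of residuals; it does not reproduce the crate's function signature or its result type.

```rust
/// Durbin-Watson statistic: sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
/// Values near 2 suggest no first-order autocorrelation, values near 0
/// suggest positive autocorrelation, values near 4 suggest negative.
/// (Hypothetical helper for illustration, not part of linreg-core.)
fn durbin_watson(residuals: &[f64]) -> f64 {
    let num: f64 = residuals
        .windows(2)
        .map(|w| (w[1] - w[0]).powi(2))
        .sum();
    let den: f64 = residuals.iter().map(|e| e * e).sum();
    num / den
}
```

Residuals that alternate in sign push the statistic toward 4, while residuals that stay constant give 0.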
Dual Target
- Browser (WASM) and server (native Rust)
- Optional domain restriction for WASM builds
Quick Start
Native Rust
Add to your Cargo.toml:
[dependencies]
linreg-core = { version = "0.2", default-features = false }
OLS Regression
use ols_regression;
Ridge Regression
use ;
use Matrix;
Lasso Regression
use ;
use Matrix;
WebAssembly (Browser)
Build with wasm-pack:
OLS in JavaScript
import init from './pkg/linreg_core.js';
;
Ridge Regression in JavaScript
const result = JSON.;
console.log;
Lasso Regression in JavaScript
const result = JSON.;
console.log;
console.log;
Lambda Path Generation
const path = JSON.;
console.log;
console.log;
Diagnostic Tests
Native Rust
use ;
WebAssembly
All diagnostic tests are available in WASM:
// Rainbow test
const rainbow = JSON.;
// Harvey-Collier test
const hc = JSON.;
// Breusch-Pagan test
const bp = JSON.;
// White test (method selection)
const white = JSON.;
// Jarque-Bera test
const jb = JSON.;
// Durbin-Watson test
const dw = JSON.;
// Shapiro-Wilk test
const sw = JSON.;
// Anderson-Darling test
const ad = JSON.;
// Cook's Distance
const cd = JSON.;
Statistical Utilities (WASM)
// Student's t CDF: P(T <= t)
const tCDF = ;
// Critical t-value for two-tailed test
const tCrit = ;
// Normal inverse CDF (probit)
const zScore = ;
Feature Flags
| Feature | Default | Description |
|---|---|---|
| wasm | Yes | Enables WASM bindings and browser support |
| validation | No | Includes test data for validation tests |
For native Rust without WASM overhead:
linreg-core = { version = "0.2", default-features = false }
Regularization Path
Generate a sequence of lambda values for regularization path analysis:
use ;
use Matrix;
// Assume x is your standardized design matrix and y is centered
let x = new;
let y = vec!;
let options = LambdaPathOptions ;
let lambdas = make_lambda_path;
// Use each lambda for cross-validation or plotting regularization paths
for &lambda in lambdas.iter
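For readers who want to see what a regularization path looks like numerically, the following is a minimal sketch of the glmnet-style convention: start at the smallest lambda that zeroes every Lasso coefficient and decrease on a log scale. The helpers `lambda_max` and `log_spaced_path`, and the `min_ratio` parameter, are assumptions made for illustration; the crate's own `make_lambda_path` and `LambdaPathOptions` may differ in detail.

```rust
/// Smallest lambda that sets all Lasso coefficients to zero, assuming
/// standardized columns and centered y: max_j |x_j . y| / n.
/// `columns[j]` holds predictor j. (Hypothetical helper.)
fn lambda_max(columns: &[Vec<f64>], y: &[f64]) -> f64 {
    let n = y.len() as f64;
    columns
        .iter()
        .map(|col| col.iter().zip(y).map(|(a, b)| a * b).sum::<f64>().abs() / n)
        .fold(0.0, f64::max)
}

/// `n_lambda` values decreasing geometrically from `lambda_max`
/// down to `lambda_max * min_ratio`. (Hypothetical helper.)
fn log_spaced_path(lambda_max: f64, min_ratio: f64, n_lambda: usize) -> Vec<f64> {
    assert!(n_lambda >= 2 && lambda_max > 0.0 && min_ratio > 0.0 && min_ratio < 1.0);
    // Constant step in log space between consecutive lambdas.
    let step = min_ratio.ln() / (n_lambda as f64 - 1.0);
    (0..n_lambda)
        .map(|k| lambda_max * (step * k as f64).exp())
        .collect()
}
```

Log spacing concentrates values near zero, where coefficient estimates tend to change most rapidly as the penalty is relaxed.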
Domain Security (WASM)
Optional domain restriction via build-time environment variable:
LINREG_DOMAIN_RESTRICT=example.com,mysite.com
When NOT set (default), all domains are allowed. When set, only the specified domains can use the WASM module.
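The allow/deny behavior described above can be sketched as a small check against a comma-separated list. The function `is_host_allowed` is hypothetical, and the matching rule (exact, case-insensitive hostname match) is an assumption; the crate's actual check is not documented here and may treat subdomains or ports differently.

```rust
/// Returns true when `host` may use the module.
/// `restrict` models the LINREG_DOMAIN_RESTRICT value: `None` means the
/// variable was not set, so every host is allowed.
/// (Hypothetical helper; exact-match semantics are an assumption.)
fn is_host_allowed(restrict: Option<&str>, host: &str) -> bool {
    match restrict {
        None => true,
        Some(list) => list
            .split(',')
            .map(str::trim)
            .filter(|d| !d.is_empty())
            .any(|d| d.eq_ignore_ascii_case(host)),
    }
}
```

In a real build, the restriction string could be captured at compile time with the standard `option_env!("LINREG_DOMAIN_RESTRICT")` macro, which yields an `Option<&'static str>` of the same shape as the `restrict` argument.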
Validation
Results are validated against R (lmtest, car, skedastic, nortest, glmnet) and Python (statsmodels, scipy, sklearn). See the verification/ directory for test scripts and reference outputs.
Running Tests
# Unit tests
# WASM tests
# All tests including doctests
Implementation Notes
Regularization
The Ridge and Lasso implementations follow the glmnet formulation:
minimize (1/(2n)) * Σ(yᵢ - β₀ - xᵢᵀβ)² + λ * [(1 - α) * ||β||₂² / 2 + α * ||β||₁]
- Ridge (α = 0): Closed-form solution of the form (X'X + λI)⁻¹X'y (taking the 1/(2n) scaling above literally, the penalty enters the normal equations as nλ)
- Lasso (α = 1): Coordinate descent algorithm
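The coordinate descent step for the Lasso case can be sketched as follows. This is a textbook cyclic coordinate descent with soft-thresholding for the objective above at α = 1, written under simplifying assumptions (no intercept, centered y, a fixed number of sweeps instead of a convergence test). The names `soft_threshold` and `lasso_cd` are hypothetical and this is not the crate's implementation.

```rust
/// Soft-thresholding operator: S(z, g) = sign(z) * max(|z| - g, 0).
fn soft_threshold(z: f64, gamma: f64) -> f64 {
    if z > gamma {
        z - gamma
    } else if z < -gamma {
        z + gamma
    } else {
        0.0
    }
}

/// Cyclic coordinate descent for (1/(2n)) * ||y - Xb||^2 + lambda * ||b||_1.
/// `columns[j]` holds predictor j; y is assumed centered, no intercept.
/// (Hypothetical sketch, not part of linreg-core.)
fn lasso_cd(columns: &[Vec<f64>], y: &[f64], lambda: f64, sweeps: usize) -> Vec<f64> {
    let n = y.len() as f64;
    let mut beta = vec![0.0; columns.len()];
    let mut resid = y.to_vec(); // residual y - Xb; equals y while b = 0
    for _ in 0..sweeps {
        for (j, col) in columns.iter().enumerate() {
            let norm = col.iter().map(|v| v * v).sum::<f64>() / n;
            if norm == 0.0 {
                continue; // an all-zero column carries no information
            }
            // Covariance of column j with the partial residual
            // (current residual plus column j's own contribution).
            let rho = col.iter().zip(&resid).map(|(x, r)| x * r).sum::<f64>() / n
                + norm * beta[j];
            let updated = soft_threshold(rho, lambda) / norm;
            // Keep the residual in sync with the changed coefficient.
            let delta = updated - beta[j];
            for (r, x) in resid.iter_mut().zip(col) {
                *r -= delta * x;
            }
            beta[j] = updated;
        }
    }
    beta
}
```

Updating the residual incrementally keeps each coordinate update at O(n) cost, which is the main reason coordinate descent scales well along a lambda path.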
Numerical Precision
- QR decomposition used throughout for numerical stability
- Anderson-Darling uses Abramowitz & Stegun 7.1.26 for normal CDF (differs from R's Cephes by ~1e-6)
- Shapiro-Wilk implements Royston's 1995 algorithm matching R's implementation
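For reference, the Abramowitz & Stegun 7.1.26 approximation mentioned above is a five-term rational approximation to the error function with absolute error of at most about 1.5e-7. The sketch below shows how a normal CDF can be built on it; the function names are hypothetical and the crate's internal code may be organized differently.

```rust
/// Error function via Abramowitz & Stegun formula 7.1.26.
/// (Hypothetical helper name; coefficients are those of the published formula.)
fn erf_as_7_1_26(x: f64) -> f64 {
    // The formula is stated for x >= 0; odd symmetry covers x < 0.
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    // Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    let poly = t
        * (0.254829592
            + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

/// Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
fn normal_cdf(z: f64) -> f64 {
    0.5 * (1.0 + erf_as_7_1_26(z / std::f64::consts::SQRT_2))
}
```

The approximation error in `erf` carries through to the CDF at roughly half its size, which is consistent with small discrepancies against higher-precision implementations.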
Known Limitations
- Harvey-Collier test may fail on high-VIF datasets (VIF > 5) due to numerical instability in recursive residuals
- Shapiro-Wilk limited to n ≤ 5000 (matching R's limitation)
- White test may differ from R on collinear datasets due to numerical precision in near-singular matrices
Disclaimer
This library is under active development and has not reached 1.0 stability. While outputs are validated against R and Python implementations, do not use this library for critical applications (medical, financial, safety-critical systems) without independent verification. See the LICENSE for full terms. The software is provided "as is" without warranty of any kind.
License
Dual-licensed under MIT or Apache-2.0.