§linreg-core
A lightweight, self-contained linear regression library in pure Rust.
No external math dependencies. All linear algebra (matrices, QR decomposition) and statistical functions (distributions, hypothesis tests) are implemented from scratch. Compiles to WebAssembly for browser use, exposes Python bindings via PyO3, or runs as a native Rust crate.
§What This Does
- OLS Regression — Ordinary Least Squares with numerically stable QR decomposition
- Regularized Regression — Ridge, Lasso, and Elastic Net via coordinate descent
- WLS Regression — Weighted Least Squares for heteroscedastic data
- LOESS — Non-parametric locally weighted smoothing
- K-Fold Cross Validation — Model evaluation for all regression types
- Prediction Intervals — Point and interval predictions for all model types
- Diagnostic Tests — 14 statistical tests for validating regression assumptions
- Feature Importance — Standardized coefficients, SHAP, permutation importance, VIF ranking
- Model Serialization — Save/load trained models to JSON
- WASM Support — Same API works in browsers via WebAssembly
- Python Bindings — PyO3 bindings available via pip install linreg-core
§Quick Start
§Native Rust
Add to Cargo.toml (no WASM overhead):
[dependencies]
linreg-core = { version = "0.8", default-features = false }

use linreg_core::core::ols_regression;
let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
let x1 = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let x2 = vec![2.0, 4.0, 5.0, 4.0, 3.0];
let names = vec!["Intercept".into(), "Temp".into(), "Pressure".into()];
let result = ols_regression(&y, &[x1, x2], &names)?;
println!("R²: {}", result.r_squared);
println!("F-statistic: {}", result.f_statistic);
println!("AIC: {}", result.aic);

§WebAssembly (JavaScript)
[dependencies]
linreg-core = "0.8"

Build with wasm-pack build --target web, then use in JavaScript:
import init, { ols_regression } from './linreg_core.js';
await init();
const result = JSON.parse(ols_regression(
JSON.stringify([2.5, 3.7, 4.2, 5.1, 6.3]),
JSON.stringify([[1,2,3,4,5], [2,4,5,4,3]]),
JSON.stringify(["Intercept", "X1", "X2"])
));
console.log("R²:", result.r_squared);

§Regularized Regression
use linreg_core::regularized::{ridge_fit, RidgeFitOptions, lasso_fit, LassoFitOptions};
use linreg_core::linalg::Matrix;
// Placeholder design matrix: 100 observations × 3 predictors (all zeros here;
// substitute real data).
let x = Matrix::new(100, 3, vec![0.0; 300]);
let y = vec![0.0; 100];
// Ridge regression (L2 penalty - shrinks coefficients, handles multicollinearity)
let ridge_result = ridge_fit(&x, &y, &RidgeFitOptions {
lambda: 1.0,
intercept: true,
standardize: true,
..Default::default()
})?;
// Lasso regression (L1 penalty — automatic variable selection by zeroing coefficients)
let lasso_result = lasso_fit(&x, &y, &LassoFitOptions {
lambda: 0.1,
intercept: true,
standardize: true,
..Default::default()
})?;

§WLS and LOESS
use linreg_core::weighted_regression::wls_regression;
use linreg_core::loess::{loess_fit, LoessOptions};
// Weighted Least Squares — down-weight high-variance observations
let weights = vec![1.0, 2.0, 1.0, 2.0, 1.0];
let wls = wls_regression(
&[2.5, 3.7, 4.2, 5.1, 6.3],
&[vec![1.0, 2.0, 3.0, 4.0, 5.0]],
&weights,
)?;
// LOESS — non-parametric smoothing (single predictor)
let x = vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![1.0, 2.1, 3.9, 8.2, 16.5, 32.1];
let loess = loess_fit(&y, &[x], &LoessOptions::default())?;

§K-Fold Cross Validation
use linreg_core::cross_validation::{kfold_cv_ols, KFoldOptions};
let y = vec![2.5, 3.7, 4.2, 5.1, 6.3, 7.0, 7.5, 8.1];
let x1 = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
let names = vec!["Intercept".to_string(), "X1".to_string()];
let cv = kfold_cv_ols(&y, &[x1], &names, &KFoldOptions {
n_folds: 5,
shuffle: true,
seed: Some(42),
})?;
println!("CV RMSE: {:.4} ± {:.4}", cv.mean_rmse, cv.std_rmse);
println!("CV R²: {:.4} ± {:.4}", cv.mean_r_squared, cv.std_r_squared);

§Diagnostic Tests
After fitting a model, validate its assumptions:
| Test | Tests For | Use When |
|---|---|---|
| diagnostics::rainbow_test | Linearity | Checking if relationships are linear |
| diagnostics::harvey_collier_test | Functional form | Suspecting model misspecification |
| diagnostics::reset_test | Specification error | Detecting omitted variables or wrong functional form |
| diagnostics::breusch_pagan_test | Heteroscedasticity | Variance changes with predictors |
| diagnostics::white_test | Heteroscedasticity | More general than Breusch-Pagan |
| diagnostics::shapiro_wilk_test | Normality | Small to moderate samples (n ≤ 5000) |
| diagnostics::jarque_bera_test | Normality | Large samples, skewness/kurtosis |
| diagnostics::anderson_darling_test | Normality | Tail-sensitive, any sample size |
| diagnostics::durbin_watson_test | Autocorrelation | Time series or ordered data |
| diagnostics::breusch_godfrey_test | Higher-order autocorrelation | Detecting serial correlation at multiple lags |
| diagnostics::cooks_distance_test | Influential points | Identifying high-impact observations |
| diagnostics::dfbetas_test | Coefficient influence | Which observations drive each coefficient |
| diagnostics::dffits_test | Fitted value influence | Influence of each observation on its own prediction |
| diagnostics::vif_test | Multicollinearity | Detecting highly correlated predictors |
use linreg_core::diagnostics::{rainbow_test, breusch_pagan_test, RainbowMethod};
// Rainbow test for linearity
let rainbow = rainbow_test(&y, &[x1.clone(), x2.clone()], 0.5, RainbowMethod::R)?;
if rainbow.r_result.as_ref().map_or(false, |r| r.p_value < 0.05) {
println!("Warning: relationship may be non-linear");
}
// Breusch-Pagan test for heteroscedasticity
let bp = breusch_pagan_test(&y, &[x1, x2])?;
if bp.p_value < 0.05 {
println!("Warning: residuals have non-constant variance");
}

§Feature Flags
| Flag | Default | Description |
|---|---|---|
| wasm | Yes | Enables WASM bindings and browser support |
| python | No | Enables Python bindings via PyO3 (built with maturin) |
| validation | No | Includes test data for validation tests |
For native-only builds (smaller binary, no WASM deps):
linreg-core = { version = "0.8", default-features = false }

§Why This Library?
- Zero dependencies — No nalgebra, no statrs, no ndarray. Pure Rust.
- Validated — Outputs match R’s lm(), glmnet, and Python’s statsmodels
- WASM-ready — Same code runs natively and in browsers
- Python-ready — PyO3 bindings expose the full API to Python
- Permissive license — MIT OR Apache-2.0
§Module Structure
- core — OLS regression, coefficients, residuals, VIF, AIC/BIC
- regularized — Ridge, Lasso, Elastic Net, regularization paths
- polynomial — Polynomial regression of any degree with centering/standardization
- weighted_regression — Weighted Least Squares (WLS)
- loess — Locally weighted scatterplot smoothing
- cross_validation — K-Fold Cross Validation for all regression types
- prediction_intervals — Prediction and confidence intervals for all model types
- feature_importance — Standardized coefficients, SHAP, permutation importance, VIF ranking
- diagnostics — 14 statistical tests (linearity, heteroscedasticity, normality, autocorrelation, influence)
- serialization — Model save/load to JSON (native Rust)
- stats — Descriptive statistics utilities
- distributions — Statistical distributions (t, F, χ², normal, beta, gamma)
- linalg — Matrix operations, QR decomposition, linear system solver
- error — Error types and Result alias
§Disclaimer
This library is under active development and has not reached 1.0 stability. While outputs are validated against R and Python implementations, do not use this library for critical applications (medical, financial, safety-critical systems) without independent verification. See the LICENSE for full terms. The software is provided “as is” without warranty of any kind.
Re-exports§
pub use core::aic;
pub use core::aic_python;
pub use core::bic;
pub use core::bic_python;
pub use core::log_likelihood;
pub use core::RegressionOutput;
pub use core::VifResult;
pub use prediction_intervals::compute_from_fit;
pub use prediction_intervals::elastic_net_prediction_intervals;
pub use prediction_intervals::lasso_prediction_intervals;
pub use prediction_intervals::prediction_intervals;
pub use prediction_intervals::ridge_prediction_intervals;
pub use prediction_intervals::PredictionIntervalOutput;
pub use diagnostics::BGTestType;
pub use diagnostics::BreuschGodfreyResult;
pub use diagnostics::CooksDistanceResult;
pub use diagnostics::DiagnosticTestResult;
pub use diagnostics::RainbowMethod;
pub use diagnostics::RainbowSingleResult;
pub use diagnostics::RainbowTestOutput;
pub use diagnostics::ResetType;
pub use diagnostics::WhiteMethod;
pub use diagnostics::WhiteSingleResult;
pub use diagnostics::WhiteTestOutput;
pub use cross_validation::CVResult;
pub use cross_validation::FoldResult;
pub use cross_validation::KFoldOptions;
pub use cross_validation::kfold_cv_elastic_net;
pub use cross_validation::kfold_cv_lasso;
pub use cross_validation::kfold_cv_ols;
pub use cross_validation::kfold_cv_ridge;
pub use loess::loess_fit;
pub use loess::LoessFit;
pub use loess::LoessOptions;
pub use polynomial::polynomial_regression;
pub use polynomial::predict as polynomial_predict;
pub use polynomial::PolynomialFit;
pub use polynomial::PolynomialOptions;
pub use weighted_regression::wls_regression;
pub use weighted_regression::WlsFit;
pub use feature_importance::PermutationImportanceOptions;
pub use feature_importance::PermutationImportanceOutput;
pub use feature_importance::ShapOutput;
pub use feature_importance::StandardizedCoefficientsOutput;
pub use feature_importance::VifRankingOutput;
pub use feature_importance::permutation_importance_elastic_net;
pub use feature_importance::permutation_importance_lasso;
pub use feature_importance::permutation_importance_loess;
pub use feature_importance::permutation_importance_ols;
pub use feature_importance::permutation_importance_ols_named;
pub use feature_importance::permutation_importance_ridge;
pub use feature_importance::shap_values_elastic_net;
pub use feature_importance::shap_values_lasso;
pub use feature_importance::shap_values_linear;
pub use feature_importance::shap_values_linear_named;
pub use feature_importance::shap_values_polynomial;
pub use feature_importance::shap_values_ridge;
pub use feature_importance::standardized_coefficients;
pub use feature_importance::standardized_coefficients_named;
pub use feature_importance::vif_ranking;
pub use diagnostics::rainbow_test as rainbow_test_core;
pub use diagnostics::white_test as white_test_core;
pub use error::error_json;
pub use error::error_to_json;
pub use error::Error;
pub use error::Result;
pub use stats::correlation;
pub use stats::max;
pub use stats::mean;
pub use stats::median;
pub use stats::min;
pub use stats::mode;
pub use stats::quantile;
pub use stats::range;
pub use stats::stddev;
pub use stats::sum;
pub use stats::variance;
pub use stats::FiveNumberSummary;
pub use stats::ModeResult;
Modules§
- core — Core OLS regression implementation.
- cross_validation — K-Fold Cross Validation for linear regression models.
- diagnostics — Statistical diagnostic tests for linear regression assumptions.
- distributions — Custom statistical special functions and distribution utilities (CDF/SF/quantiles), primarily to avoid pulling in statrs for regression diagnostics.
- error — Error types for the linear regression library.
- feature_importance — Feature importance metrics for regression models.
- linalg — Minimal linear algebra module to replace the nalgebra dependency.
- loess — LOESS (Locally Estimated Scatterplot Smoothing).
- polynomial — Polynomial regression.
- prediction_intervals — Prediction intervals module.
- regularized — Ridge and Lasso regression (glmnet-compatible implementations).
- serialization — Model serialization module for saving and loading regression models.
- stats — Basic statistical utility functions.
- wasm — WASM-specific bindings for linreg-core.
- weighted_regression — Weighted regression methods.
Macros§
- impl_serialization — Macro to generate ModelSave and ModelLoad implementations for a model type.