
Crate linreg_core


§linreg-core

A lightweight, self-contained linear regression library in pure Rust.

No external math dependencies. All linear algebra (matrices, QR decomposition) and statistical functions (distributions, hypothesis tests) are implemented from scratch. Compiles to WebAssembly for browser use, exposes Python bindings via PyO3, or runs as a native Rust crate.

Live Demo →

§What This Does

  • OLS Regression — Ordinary Least Squares with numerically stable QR decomposition
  • Regularized Regression — Ridge, Lasso, and Elastic Net via coordinate descent
  • WLS Regression — Weighted Least Squares for heteroscedastic data
  • LOESS — Non-parametric locally weighted smoothing
  • K-Fold Cross Validation — Model evaluation for all regression types
  • Prediction Intervals — Point and interval predictions for all model types
  • Diagnostic Tests — 14 statistical tests for validating regression assumptions
  • Feature Importance — Standardized coefficients, SHAP, permutation importance, VIF ranking
  • Model Serialization — Save/load trained models to JSON
  • WASM Support — Same API works in browsers via WebAssembly
  • Python Bindings — PyO3 bindings available via pip install linreg-core

§Quick Start

§Native Rust

Add to Cargo.toml (no WASM overhead):

[dependencies]
linreg-core = { version = "0.8", default-features = false }
Then, in your code:

use linreg_core::core::ols_regression;

let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
let x1 = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let x2 = vec![2.0, 4.0, 5.0, 4.0, 3.0];
let names = vec!["Intercept".into(), "Temp".into(), "Pressure".into()];

let result = ols_regression(&y, &[x1, x2], &names)?;
println!("R²: {}", result.r_squared);
println!("F-statistic: {}", result.f_statistic);
println!("AIC: {}", result.aic);
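
The statistics reported above follow the standard OLS definitions. As a self-contained illustration of what an OLS fit computes (a from-scratch sketch, not linreg-core's API), here is simple regression with R² derived from the residual and total sums of squares:

```rust
// Minimal single-predictor OLS from first principles (illustrative only).
fn simple_ols(x: &[f64], y: &[f64]) -> (f64, f64, f64) {
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = y.iter().sum::<f64>() / n;
    // Slope = S_xy / S_xx, intercept from the means.
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| (a - mx) * (b - my)).sum();
    let sxx: f64 = x.iter().map(|a| (a - mx).powi(2)).sum();
    let slope = sxy / sxx;
    let intercept = my - slope * mx;
    // R² = 1 - SS_res / SS_tot
    let ss_res: f64 = x.iter().zip(y)
        .map(|(a, b)| (b - (intercept + slope * a)).powi(2)).sum();
    let ss_tot: f64 = y.iter().map(|b| (b - my).powi(2)).sum();
    (intercept, slope, 1.0 - ss_res / ss_tot)
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let y = [2.5, 3.7, 4.2, 5.1, 6.3];
    let (b0, b1, r2) = simple_ols(&x, &y);
    println!("intercept={b0:.3} slope={b1:.3} R²={r2:.3}");
}
```

The crate's ols_regression generalizes this to multiple predictors via QR decomposition rather than these normal-equation sums, which is where the numerical stability comes from.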

§WebAssembly (JavaScript)

Add to Cargo.toml (default features, which include WASM support):

[dependencies]
linreg-core = "0.8"

Build with wasm-pack build --target web, then use in JavaScript:

import init, { ols_regression } from './linreg_core.js';
await init();

const result = JSON.parse(ols_regression(
    JSON.stringify([2.5, 3.7, 4.2, 5.1, 6.3]),
    JSON.stringify([[1,2,3,4,5], [2,4,5,4,3]]),
    JSON.stringify(["Intercept", "X1", "X2"])
));
console.log("R²:", result.r_squared);

§Regularized Regression

use linreg_core::regularized::{ridge_fit, RidgeFitOptions, lasso_fit, LassoFitOptions};
use linreg_core::linalg::Matrix;

let x = Matrix::new(100, 3, vec![0.0; 300]);
let y = vec![0.0; 100];

// Ridge regression (L2 penalty - shrinks coefficients, handles multicollinearity)
let ridge_result = ridge_fit(&x, &y, &RidgeFitOptions {
    lambda: 1.0,
    intercept: true,
    standardize: true,
    ..Default::default()
})?;

// Lasso regression (L1 penalty — automatic variable selection by zeroing coefficients)
let lasso_result = lasso_fit(&x, &y, &LassoFitOptions {
    lambda: 0.1,
    intercept: true,
    standardize: true,
    ..Default::default()
})?;
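
To see what the lambda penalty does, consider ridge on a single centered predictor, where the estimator has a closed form. This is a from-scratch sketch under a glmnet-style objective, not the crate's coordinate-descent implementation:

```rust
// Ridge slope for one centered predictor, minimizing
// (1/2n)·Σ(yᵢ - b·xᵢ)² + (λ/2)·b²  (a glmnet-style objective; an assumption here).
// Setting the derivative to zero gives b = Σxy / (Σx² + n·λ).
fn ridge_slope(x: &[f64], y: &[f64], lambda: f64) -> f64 {
    let n = x.len() as f64;
    let sxy: f64 = x.iter().zip(y).map(|(a, b)| a * b).sum();
    let sxx: f64 = x.iter().map(|a| a * a).sum();
    sxy / (sxx + n * lambda)
}

fn main() {
    let x = [-2.0, -1.0, 0.0, 1.0, 2.0]; // already centered
    let y = [-1.8, -0.9, 0.1, 0.8, 1.8];
    // Larger λ shrinks the slope toward zero.
    for lambda in [0.0, 0.5, 2.0] {
        println!("λ={lambda}: slope={:.4}", ridge_slope(&x, &y, lambda));
    }
}
```

Lasso replaces the λb² term with λ|b|, which has no closed form in general; that is why the crate fits it by coordinate descent, and why lasso can zero coefficients out entirely while ridge only shrinks them.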

§WLS and LOESS

use linreg_core::weighted_regression::wls_regression;
use linreg_core::loess::{loess_fit, LoessOptions};

// Weighted Least Squares — down-weight high-variance observations
let weights = vec![1.0, 2.0, 1.0, 2.0, 1.0];
let wls = wls_regression(
    &[2.5, 3.7, 4.2, 5.1, 6.3],
    &[vec![1.0, 2.0, 3.0, 4.0, 5.0]],
    &weights,
)?;

// LOESS — non-parametric smoothing (single predictor)
let x = vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![1.0, 2.1, 3.9, 8.2, 16.5, 32.1];
let loess = loess_fit(&y, &[x], &LoessOptions::default())?;
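
For intuition, single-predictor WLS reduces to the OLS formulas with weighted means and sums. A self-contained sketch of the estimator (again illustrative, not the crate's internals):

```rust
// Weighted least squares for one predictor: minimize Σ wᵢ(yᵢ - b0 - b1·xᵢ)².
// The closed form is the OLS formula with weighted means substituted in.
fn wls_simple(x: &[f64], y: &[f64], w: &[f64]) -> (f64, f64) {
    let sw: f64 = w.iter().sum();
    let mx = x.iter().zip(w).map(|(a, wi)| a * wi).sum::<f64>() / sw;
    let my = y.iter().zip(w).map(|(b, wi)| b * wi).sum::<f64>() / sw;
    let sxy: f64 = x.iter().zip(y).zip(w)
        .map(|((a, b), wi)| wi * (a - mx) * (b - my)).sum();
    let sxx: f64 = x.iter().zip(w).map(|(a, wi)| wi * (a - mx).powi(2)).sum();
    let b1 = sxy / sxx;
    (my - b1 * mx, b1)
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0, 5.0];
    let y = [2.5, 3.7, 4.2, 5.1, 6.3];
    // Weights from the example above; with all-equal weights this is exactly OLS.
    let (b0, b1) = wls_simple(&x, &y, &[1.0, 2.0, 1.0, 2.0, 1.0]);
    println!("intercept={b0:.3} slope={b1:.3}");
}
```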

§K-Fold Cross Validation

use linreg_core::cross_validation::{kfold_cv_ols, KFoldOptions};

let y = vec![2.5, 3.7, 4.2, 5.1, 6.3, 7.0, 7.5, 8.1];
let x1 = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
let names = vec!["Intercept".to_string(), "X1".to_string()];

let cv = kfold_cv_ols(&y, &[x1], &names, &KFoldOptions {
    n_folds: 5,
    shuffle: true,
    seed: Some(42),
})?;
println!("CV RMSE: {:.4} ± {:.4}", cv.mean_rmse, cv.std_rmse);
println!("CV R²:   {:.4} ± {:.4}", cv.mean_r_squared, cv.std_r_squared);
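
The mechanics behind k-fold splitting are simple to sketch. The round-robin assignment below is an assumption for illustration (the crate's shuffle/seed options will order indices differently), but it shows the invariant: every observation lands in exactly one test fold, and fold sizes differ by at most one:

```rust
// Sketch of k-fold partitioning: each index is assigned to one fold,
// and each fold serves once as the held-out test set.
fn kfold_indices(n: usize, k: usize) -> Vec<Vec<usize>> {
    let mut folds = vec![Vec::new(); k];
    for i in 0..n {
        folds[i % k].push(i); // round-robin keeps fold sizes within 1 of each other
    }
    folds
}

fn main() {
    for (f, idx) in kfold_indices(8, 5).iter().enumerate() {
        println!("fold {f}: test indices {idx:?}");
    }
}
```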

§Diagnostic Tests

After fitting a model, validate its assumptions:

Test                               | Tests For                    | Use When
-----------------------------------|------------------------------|---------------------------------------------------
diagnostics::rainbow_test          | Linearity                    | Checking if relationships are linear
diagnostics::harvey_collier_test   | Functional form              | Suspecting model misspecification
diagnostics::reset_test            | Specification error          | Detecting omitted variables or wrong functional form
diagnostics::breusch_pagan_test    | Heteroscedasticity           | Variance changes with predictors
diagnostics::white_test            | Heteroscedasticity           | More general than Breusch-Pagan
diagnostics::shapiro_wilk_test     | Normality                    | Small to moderate samples (n ≤ 5000)
diagnostics::jarque_bera_test      | Normality                    | Large samples, skewness/kurtosis
diagnostics::anderson_darling_test | Normality                    | Tail-sensitive, any sample size
diagnostics::durbin_watson_test    | Autocorrelation              | Time series or ordered data
diagnostics::breusch_godfrey_test  | Higher-order autocorrelation | Detecting serial correlation at multiple lags
diagnostics::cooks_distance_test   | Influential points           | Identifying high-impact observations
diagnostics::dfbetas_test          | Coefficient influence        | Which observations drive each coefficient
diagnostics::dffits_test           | Fitted value influence       | Influence of each observation on its own prediction
diagnostics::vif_test              | Multicollinearity            | Detecting highly correlated predictors

use linreg_core::diagnostics::{rainbow_test, breusch_pagan_test, RainbowMethod};

// Rainbow test for linearity
let rainbow = rainbow_test(&y, &[x1.clone(), x2.clone()], 0.5, RainbowMethod::R)?;
if rainbow.r_result.as_ref().map_or(false, |r| r.p_value < 0.05) {
    println!("Warning: relationship may be non-linear");
}

// Breusch-Pagan test for heteroscedasticity
let bp = breusch_pagan_test(&y, &[x1, x2])?;
if bp.p_value < 0.05 {
    println!("Warning: residuals have non-constant variance");
}
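
The Durbin-Watson statistic from the table follows the standard definition and is easy to sketch from scratch (illustrative; the crate's durbin_watson_test also reports significance, which this does not):

```rust
// Durbin–Watson statistic on residuals: d = Σₜ(eₜ - eₜ₋₁)² / Σₜ eₜ².
// d ≈ 2 suggests no first-order autocorrelation; d → 0 positive, d → 4 negative.
fn durbin_watson(e: &[f64]) -> f64 {
    let num: f64 = e.windows(2).map(|w| (w[1] - w[0]).powi(2)).sum();
    let den: f64 = e.iter().map(|v| v * v).sum();
    num / den
}

fn main() {
    let trending = [1.0, 0.8, 0.6, -0.2, -0.8, -1.0]; // slowly drifting residuals
    let alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]; // sign-flipping residuals
    println!("trending:    d = {:.3}", durbin_watson(&trending)); // near 0
    println!("alternating: d = {:.3}", durbin_watson(&alternating)); // near 4
}
```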

§Feature Flags

Flag       | Default | Description
-----------|---------|----------------------------------------------------
wasm       | Yes     | Enables WASM bindings and browser support
python     | No      | Enables Python bindings via PyO3 (built with maturin)
validation | No      | Includes test data for validation tests

For native-only builds (smaller binary, no WASM deps):

linreg-core = { version = "0.8", default-features = false }

§Why This Library?

  • Zero dependencies — No nalgebra, no statrs, no ndarray. Pure Rust.
  • Validated — Outputs match R’s lm(), glmnet, and Python’s statsmodels
  • WASM-ready — Same code runs natively and in browsers
  • Python-ready — PyO3 bindings expose the full API to Python
  • Permissive license — MIT OR Apache-2.0

§Module Structure

  • core — OLS regression, coefficients, residuals, VIF, AIC/BIC
  • regularized — Ridge, Lasso, Elastic Net, regularization paths
  • polynomial — Polynomial regression of any degree with centering/standardization
  • weighted_regression — Weighted Least Squares (WLS)
  • loess — Locally weighted scatterplot smoothing
  • cross_validation — K-Fold Cross Validation for all regression types
  • prediction_intervals — Prediction and confidence intervals for all model types
  • feature_importance — Standardized coefficients, SHAP, permutation importance, VIF ranking
  • diagnostics — 14 statistical tests (linearity, heteroscedasticity, normality, autocorrelation, influence)
  • serialization — Model save/load to JSON (native Rust)
  • stats — Descriptive statistics utilities
  • distributions — Statistical distributions (t, F, χ², normal, beta, gamma)
  • linalg — Matrix operations, QR decomposition, linear system solver
  • error — Error types and Result alias

§Disclaimer

This library is under active development and has not reached 1.0 stability. While outputs are validated against R and Python implementations, do not use this library for critical applications (medical, financial, safety-critical systems) without independent verification. See the LICENSE for full terms. The software is provided “as is” without warranty of any kind.

Re-exports§

pub use core::aic;
pub use core::aic_python;
pub use core::bic;
pub use core::bic_python;
pub use core::log_likelihood;
pub use core::RegressionOutput;
pub use core::VifResult;
pub use prediction_intervals::compute_from_fit;
pub use prediction_intervals::elastic_net_prediction_intervals;
pub use prediction_intervals::lasso_prediction_intervals;
pub use prediction_intervals::prediction_intervals;
pub use prediction_intervals::ridge_prediction_intervals;
pub use prediction_intervals::PredictionIntervalOutput;
pub use diagnostics::BGTestType;
pub use diagnostics::BreuschGodfreyResult;
pub use diagnostics::CooksDistanceResult;
pub use diagnostics::DiagnosticTestResult;
pub use diagnostics::RainbowMethod;
pub use diagnostics::RainbowSingleResult;
pub use diagnostics::RainbowTestOutput;
pub use diagnostics::ResetType;
pub use diagnostics::WhiteMethod;
pub use diagnostics::WhiteSingleResult;
pub use diagnostics::WhiteTestOutput;
pub use cross_validation::CVResult;
pub use cross_validation::FoldResult;
pub use cross_validation::KFoldOptions;
pub use cross_validation::kfold_cv_elastic_net;
pub use cross_validation::kfold_cv_lasso;
pub use cross_validation::kfold_cv_ols;
pub use cross_validation::kfold_cv_ridge;
pub use loess::loess_fit;
pub use loess::LoessFit;
pub use loess::LoessOptions;
pub use polynomial::polynomial_regression;
pub use polynomial::predict as polynomial_predict;
pub use polynomial::PolynomialFit;
pub use polynomial::PolynomialOptions;
pub use weighted_regression::wls_regression;
pub use weighted_regression::WlsFit;
pub use feature_importance::PermutationImportanceOptions;
pub use feature_importance::PermutationImportanceOutput;
pub use feature_importance::ShapOutput;
pub use feature_importance::StandardizedCoefficientsOutput;
pub use feature_importance::VifRankingOutput;
pub use feature_importance::permutation_importance_elastic_net;
pub use feature_importance::permutation_importance_lasso;
pub use feature_importance::permutation_importance_loess;
pub use feature_importance::permutation_importance_ols;
pub use feature_importance::permutation_importance_ols_named;
pub use feature_importance::permutation_importance_ridge;
pub use feature_importance::shap_values_elastic_net;
pub use feature_importance::shap_values_lasso;
pub use feature_importance::shap_values_linear;
pub use feature_importance::shap_values_linear_named;
pub use feature_importance::shap_values_polynomial;
pub use feature_importance::shap_values_ridge;
pub use feature_importance::standardized_coefficients;
pub use feature_importance::standardized_coefficients_named;
pub use feature_importance::vif_ranking;
pub use diagnostics::rainbow_test as rainbow_test_core;
pub use diagnostics::white_test as white_test_core;
pub use error::error_json;
pub use error::error_to_json;
pub use error::Error;
pub use error::Result;
pub use stats::correlation;
pub use stats::max;
pub use stats::mean;
pub use stats::median;
pub use stats::min;
pub use stats::mode;
pub use stats::quantile;
pub use stats::range;
pub use stats::stddev;
pub use stats::sum;
pub use stats::variance;
pub use stats::FiveNumberSummary;
pub use stats::ModeResult;

Modules§

core
Core OLS regression implementation.
cross_validation
K-Fold Cross Validation for linear regression models.
diagnostics
Statistical diagnostic tests for linear regression assumptions.
distributions
Custom statistical special functions and distribution utilities (CDF/SF/quantiles), primarily to avoid pulling in statrs for regression diagnostics.
error
Error types for the linear regression library.
feature_importance
Feature importance metrics for regression models.
linalg
Minimal Linear Algebra module to replace nalgebra dependency.
loess
LOESS (Locally Estimated Scatterplot Smoothing)
polynomial
Polynomial Regression
prediction_intervals
Prediction Intervals Module
regularized
Ridge and Lasso regression (glmnet-compatible implementations).
serialization
Model serialization module for saving and loading regression models.
stats
Basic statistical utility functions.
wasm
WASM-specific bindings for linreg-core
weighted_regression
Weighted regression methods

Macros§

impl_serialization
Macro to generate ModelSave and ModelLoad implementations for a model type.