linreg-core 0.6.0

# linreg-core


[![CI](https://github.com/jesse-anderson/linreg-core/actions/workflows/ci.yml/badge.svg)](https://github.com/jesse-anderson/linreg-core/actions/workflows/ci.yml)
[![Coverage](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/jesse-anderson/linreg-core/main/.github/coverage-badge.json)](https://github.com/jesse-anderson/linreg-core/actions/workflows/ci.yml)
[![License](https://img.shields.io/badge/license-MIT%20OR%20Apache--2.0-blue)](LICENSE-MIT)
[![Crates.io](https://img.shields.io/crates/v/linreg-core?color=orange)](https://crates.io/crates/linreg-core)
[![npm](https://img.shields.io/npm/v/linreg-core?color=red)](https://www.npmjs.com/package/linreg-core)
[![PyPI](https://img.shields.io/pypi/v/linreg-core)](https://pypi.org/project/linreg-core/)
[![docs.rs](https://img.shields.io/badge/docs.rs-linreg__core-green)](https://docs.rs/linreg-core)


A lightweight, self-contained linear regression library written in Rust. Compiles to WebAssembly for browser use, Python bindings via PyO3, or runs as a native Rust crate.

**Key design principle:** All linear algebra and statistical distribution functions are implemented from scratch — no external math libraries required. This keeps binary sizes small and makes the crate highly portable.

---

## Table of Contents


| Section | Description |
|---------|-------------|
| [Features](#features) | Regression methods, model statistics, diagnostic tests |
| [Rust Usage](#rust-usage) | Native Rust crate usage |
| [WebAssembly Usage](#webassembly-usage) | Browser/JavaScript usage |
| [Python Usage](#python-usage) | Python bindings via PyO3 |
| [Feature Flags](#feature-flags) | Build configuration options |
| [Validation](#validation) | Testing and verification |
| [Implementation Notes](#implementation-notes) | Technical details |

---

## Features


### Regression Methods

- **OLS Regression:** Coefficients, standard errors, t-statistics, p-values, confidence intervals, model selection criteria (AIC, BIC, log-likelihood)
- **Ridge Regression:** L2-regularized regression with optional standardization, effective degrees of freedom, model selection criteria
- **Lasso Regression:** L1-regularized regression via coordinate descent with automatic variable selection, convergence tracking, model selection criteria
- **Elastic Net:** Combined L1 + L2 regularization for variable selection with multicollinearity handling, active set convergence, model selection criteria
- **LOESS:** Locally estimated scatterplot smoothing for non-parametric curve fitting with configurable span, polynomial degree, and robust fitting
- **WLS (Weighted Least Squares):** Regression with observation weights for heteroscedastic data, includes confidence intervals
- **Lambda Path Generation:** Create regularization paths for cross-validation

### Model Statistics

- **Fit Metrics:** R-squared, Adjusted R-squared, F-statistic, F-test p-value
- **Error Metrics:** Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE)
- **Model Selection:** Log-likelihood, AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion)
- **Residuals:** Raw residuals, standardized residuals, fitted values, leverage (hat matrix diagonal)
- **Multicollinearity:** Variance Inflation Factor (VIF) for each predictor

### Diagnostic Tests

| Category | Tests |
|----------|-------|
| **Linearity** | Rainbow Test, Harvey-Collier Test, RESET Test |
| **Heteroscedasticity** | Breusch-Pagan (Koenker variant), White Test (R & Python methods) |
| **Normality** | Jarque-Bera, Shapiro-Wilk (n ≤ 5000), Anderson-Darling |
| **Autocorrelation** | Durbin-Watson, Breusch-Godfrey (higher-order) |
| **Multicollinearity** | Variance Inflation Factor (VIF) |
| **Influence** | Cook's Distance, DFBETAS, DFFITS |

---

## Rust Usage


Add to your `Cargo.toml`:

```toml
[dependencies]
linreg-core = { version = "0.6", default-features = false }
```

### OLS Regression (Rust)


```rust
use linreg_core::core::ols_regression;

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
    let x = vec![vec![1.0, 2.0, 3.0, 4.0, 5.0]];
    let names = vec!["Intercept".to_string(), "X1".to_string()];

    let result = ols_regression(&y, &x, &names)?;

    println!("Coefficients: {:?}", result.coefficients);
    println!("R-squared: {:.4}", result.r_squared);
    println!("F-statistic: {:.4}", result.f_statistic);
    println!("Log-likelihood: {:.4}", result.log_likelihood);
    println!("AIC: {:.4}", result.aic);
    println!("BIC: {:.4}", result.bic);

    Ok(())
}
```

### Ridge Regression (Rust)


```rust,no_run
use linreg_core::regularized::{ridge_fit, RidgeFitOptions};
use linreg_core::linalg::Matrix;

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
    let x = Matrix::new(5, 2, vec![
        1.0, 1.0,  // row 0: intercept, x1
        1.0, 2.0,  // row 1
        1.0, 3.0,  // row 2
        1.0, 4.0,  // row 3
        1.0, 5.0,  // row 4
    ]);

    let options = RidgeFitOptions {
        lambda: 1.0,
        standardize: true,
        intercept: true,
    };

    let result = ridge_fit(&x, &y, &options)?;
    println!("Intercept: {}", result.intercept);
    println!("Coefficients: {:?}", result.coefficients);
    println!("R-squared: {:.4}", result.r_squared);
    println!("Effective degrees of freedom: {:.2}", result.effective_df);
    println!("AIC: {:.4}", result.aic);
    println!("BIC: {:.4}", result.bic);

    Ok(())
}
```

### Lasso Regression (Rust)


```rust,no_run
use linreg_core::regularized::{lasso_fit, LassoFitOptions};
use linreg_core::linalg::Matrix;

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
    let x = Matrix::new(5, 3, vec![
        1.0, 1.0, 0.5,
        1.0, 2.0, 1.0,
        1.0, 3.0, 1.5,
        1.0, 4.0, 2.0,
        1.0, 5.0, 2.5,
    ]);

    let options = LassoFitOptions {
        lambda: 0.1,
        standardize: true,
        intercept: true,
        ..Default::default()
    };

    let result = lasso_fit(&x, &y, &options)?;
    println!("Intercept: {}", result.intercept);
    println!("Coefficients: {:?}", result.coefficients);
    println!("Non-zero coefficients: {}", result.n_nonzero);
    println!("AIC: {:.4}", result.aic);
    println!("BIC: {:.4}", result.bic);

    Ok(())
}
```

### Elastic Net Regression (Rust)


```rust,no_run
use linreg_core::regularized::{elastic_net_fit, ElasticNetOptions};
use linreg_core::linalg::Matrix;

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![2.5, 3.7, 4.2, 5.1, 6.3];
    let x = Matrix::new(5, 3, vec![
        1.0, 1.0, 0.5,
        1.0, 2.0, 1.0,
        1.0, 3.0, 1.5,
        1.0, 4.0, 2.0,
        1.0, 5.0, 2.5,
    ]);

    let options = ElasticNetOptions {
        lambda: 0.1,
        alpha: 0.5,   // 0 = Ridge, 1 = Lasso, 0.5 = balanced
        standardize: true,
        intercept: true,
        ..Default::default()
    };

    let result = elastic_net_fit(&x, &y, &options)?;
    println!("Intercept: {}", result.intercept);
    println!("Coefficients: {:?}", result.coefficients);
    println!("Non-zero coefficients: {}", result.n_nonzero);
    println!("AIC: {:.4}", result.aic);
    println!("BIC: {:.4}", result.bic);

    Ok(())
}
```

### Diagnostic Tests (Rust)


```rust
use linreg_core::diagnostics::{
    breusch_pagan_test, durbin_watson_test, jarque_bera_test,
    shapiro_wilk_test, rainbow_test, harvey_collier_test,
    white_test, anderson_darling_test, breusch_godfrey_test,
    cooks_distance_test, dfbetas_test, dffits_test, vif_test,
    reset_test, BGTestType, RainbowMethod, ResetType, WhiteMethod
};

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![/* your data */];
    let x = vec![vec![/* predictor 1 */], vec![/* predictor 2 */]];

    // Heteroscedasticity tests
    let bp = breusch_pagan_test(&y, &x)?;
    println!("Breusch-Pagan: LM={:.4}, p={:.4}", bp.statistic, bp.p_value);

    let white = white_test(&y, &x, WhiteMethod::R)?;
    println!("White: statistic={:.4}, p={:.4}", white.statistic, white.p_value);

    // Autocorrelation tests
    let dw = durbin_watson_test(&y, &x)?;
    println!("Durbin-Watson: {:.4}", dw.statistic);

    let bg = breusch_godfrey_test(&y, &x, 2, BGTestType::Chisq)?;
    println!("Breusch-Godfrey (order 2): statistic={:.4}, p={:.4}", bg.statistic, bg.p_value);

    // Normality tests
    let jb = jarque_bera_test(&y, &x)?;
    println!("Jarque-Bera: JB={:.4}, p={:.4}", jb.statistic, jb.p_value);

    let sw = shapiro_wilk_test(&y, &x)?;
    println!("Shapiro-Wilk: W={:.4}, p={:.4}", sw.statistic, sw.p_value);

    let ad = anderson_darling_test(&y, &x)?;
    println!("Anderson-Darling: A={:.4}, p={:.4}", ad.statistic, ad.p_value);

    // Linearity tests
    let rainbow = rainbow_test(&y, &x, 0.5, RainbowMethod::R)?;
    println!("Rainbow: F={:.4}, p={:.4}",
        rainbow.r_result.as_ref().unwrap().statistic,
        rainbow.r_result.as_ref().unwrap().p_value);

    let hc = harvey_collier_test(&y, &x)?;
    println!("Harvey-Collier: t={:.4}, p={:.4}", hc.statistic, hc.p_value);

    let reset = reset_test(&y, &x, &[2, 3], ResetType::Fitted)?;
    println!("RESET: F={:.4}, p={:.4}", reset.f_statistic, reset.p_value);

    // Influence diagnostics
    let cd = cooks_distance_test(&y, &x)?;
    println!("Cook's Distance: {} influential points", cd.influential_4_over_n.len());

    let dfbetas = dfbetas_test(&y, &x)?;
    println!("DFBETAS: {} influential observations", dfbetas.influential_observations.len());

    let dffits = dffits_test(&y, &x)?;
    println!("DFFITS: {} influential observations", dffits.influential_observations.len());

    // Multicollinearity
    let vif = vif_test(&y, &x)?;
    println!("VIF: {:?}", vif.vif_values);

    Ok(())
}
```

### WLS Regression (Rust)


```rust,no_run
use linreg_core::weighted_regression::wls_regression;

fn main() -> Result<(), linreg_core::Error> {
    let y = vec![2.0, 4.0, 6.0, 8.0, 10.0];
    let x1 = vec![1.0, 2.0, 3.0, 4.0, 5.0];

    // Equal weights = OLS
    let weights = vec![1.0, 1.0, 1.0, 1.0, 1.0];

    let fit = wls_regression(&y, &[x1], &weights)?;

    println!("Intercept: {} (SE: {}, t: {}, p: {})",
        fit.coefficients[0],
        fit.standard_errors[0],
        fit.t_statistics[0],
        fit.p_values[0]
    );
    println!("F-statistic: {} (p: {})", fit.f_statistic, fit.f_p_value);
    println!("R-squared: {:.4}", fit.r_squared);

    // Access confidence intervals
    for (i, (&coef, &lower, &upper)) in fit.coefficients.iter()
        .zip(fit.conf_int_lower.iter())
        .zip(fit.conf_int_upper.iter())
        .enumerate()
    {
        println!("Coefficient {}: [{}, {}]", i, lower, upper);
    }

    Ok(())
}
```

### LOESS Regression (Rust)


```rust,no_run
use linreg_core::loess::{loess_fit, LoessOptions};

fn main() -> Result<(), linreg_core::Error> {
    // Single predictor only
    let x = vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
    let y = vec![1.0, 3.5, 4.8, 6.2, 8.5, 11.0, 13.2, 14.8, 17.5, 19.0, 22.0];

    // Default options: span=0.75, degree=2, robust iterations=0
    let options = LoessOptions::default();

    let result = loess_fit(&y, &[x], &options)?;

    println!("Fitted values: {:?}", result.fitted_values);
    println!("Residuals: {:?}", result.residuals);

    Ok(())
}
```

**Custom LOESS options:**

```rust,no_run
use linreg_core::loess::{loess_fit, LoessOptions, LoessSurface};

let options = LoessOptions {
    span: 0.5,              // Smoothing parameter (0-1, smaller = less smooth)
    degree: 1,              // Polynomial degree (0=constant, 1=linear, 2=quadratic)
    surface: LoessSurface::Direct,  // Note: only "direct" is currently supported; "interpolate" is planned
    robust_iterations: 3,   // Number of robust fitting iterations (0 = disabled)
};

let result = loess_fit(&y, &[x], &options)?;
```

### Lambda Path Generation (Rust)


```rust,no_run
use linreg_core::regularized::{make_lambda_path, LambdaPathOptions};
use linreg_core::linalg::Matrix;

let x = Matrix::new(100, 5, vec![0.0; 500]);
let y = vec![0.0; 100];

let options = LambdaPathOptions {
    nlambda: 100,
    lambda_min_ratio: Some(0.01),
    alpha: 1.0,  // Lasso
    ..Default::default()
};

let lambdas = make_lambda_path(&x, &y, &options, None, Some(0));

for &lambda in lambdas.iter() {
    // Fit model with this lambda
}
```

---

## WebAssembly Usage


Build with wasm-pack:

```bash
wasm-pack build --release --target web
```

### OLS Regression (WASM)


```javascript
import init, { ols_regression } from './pkg/linreg_core.js';

async function run() {
    await init();

    const y = [1, 2, 3, 4, 5];
    const x = [[1, 2, 3, 4, 5]];
    const names = ["Intercept", "X1"];

    const resultJson = ols_regression(
        JSON.stringify(y),
        JSON.stringify(x),
        JSON.stringify(names)
    );

    const result = JSON.parse(resultJson);
    console.log("Coefficients:", result.coefficients);
    console.log("R-squared:", result.r_squared);
    console.log("Log-likelihood:", result.log_likelihood);
    console.log("AIC:", result.aic);
    console.log("BIC:", result.bic);
}

run();
```

### Ridge Regression (WASM)


```javascript
const result = JSON.parse(ridge_regression(
    JSON.stringify(y),
    JSON.stringify(x),
    JSON.stringify(["Intercept", "X1", "X2"]),
    1.0,      // lambda
    true      // standardize
));

console.log("Coefficients:", result.coefficients);
console.log("R-squared:", result.r_squared);
console.log("Effective degrees of freedom:", result.effective_df);
console.log("AIC:", result.aic);
console.log("BIC:", result.bic);
```

### Lasso Regression (WASM)


```javascript
const result = JSON.parse(lasso_regression(
    JSON.stringify(y),
    JSON.stringify(x),
    JSON.stringify(["Intercept", "X1", "X2"]),
    0.1,      // lambda
    true,     // standardize
    100000,   // max_iter
    1e-7      // tol
));

console.log("Coefficients:", result.coefficients);
console.log("Non-zero coefficients:", result.n_nonzero);
console.log("AIC:", result.aic);
console.log("BIC:", result.bic);
```

### Elastic Net Regression (WASM)


```javascript
const result = JSON.parse(elastic_net_regression(
    JSON.stringify(y),
    JSON.stringify(x),
    JSON.stringify(["Intercept", "X1", "X2"]),
    0.1,      // lambda
    0.5,      // alpha (0 = Ridge, 1 = Lasso, 0.5 = balanced)
    true,     // standardize
    100000,   // max_iter
    1e-7      // tol
));

console.log("Coefficients:", result.coefficients);
console.log("Non-zero coefficients:", result.n_nonzero);
console.log("AIC:", result.aic);
console.log("BIC:", result.bic);
```

### Lambda Path Generation (WASM)


```javascript
const path = JSON.parse(make_lambda_path(
    JSON.stringify(y),
    JSON.stringify(x),
    100,              // n_lambda
    0.01              // lambda_min_ratio (as fraction of lambda_max)
));

console.log("Lambda sequence:", path.lambda_path);
console.log("Lambda max:", path.lambda_max);
```

### WLS Regression (WASM)


```javascript
const result = JSON.parse(wls_regression(
    JSON.stringify([2, 4, 6, 8, 10]),
    JSON.stringify([[1, 2, 3, 4, 5]]),
    JSON.stringify([1, 1, 1, 1, 1])  // weights (equal weights = OLS)
));

console.log("Coefficients:", result.coefficients);
console.log("Standard errors:", result.standard_errors);
console.log("P-values:", result.p_values);
console.log("R-squared:", result.r_squared);
console.log("F-statistic:", result.f_statistic);
console.log("Confidence intervals (lower):", result.conf_int_lower);
console.log("Confidence intervals (upper):", result.conf_int_upper);
```

### LOESS Regression (WASM)


```javascript
const result = JSON.parse(loess_fit(
    JSON.stringify(y),
    JSON.stringify(x[0]),    // Single predictor only (flattened array)
    0.5,      // span (smoothing parameter: 0-1)
    1,        // degree (0=constant, 1=linear, 2=quadratic)
    "direct", // surface method ("direct" only; "interpolate" is planned)
    0         // robust iterations (0=disabled, >0=number of iterations)
));

console.log("Fitted values:", result.fitted_values);
console.log("Residuals:", result.residuals);
```

### Diagnostic Tests (WASM)


```javascript
// Rainbow test
const rainbow = JSON.parse(rainbow_test(
    JSON.stringify(y),
    JSON.stringify(x),
    0.5,      // fraction
    "r"       // method: "r", "python", or "both"
));

// Harvey-Collier test
const hc = JSON.parse(harvey_collier_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Breusch-Pagan test
const bp = JSON.parse(breusch_pagan_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// White test (method selection: "r", "python", or "both")
const white = JSON.parse(white_test(
    JSON.stringify(y),
    JSON.stringify(x),
    "r"
));

// White test - R-specific method
const whiteR = JSON.parse(r_white_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// White test - Python-specific method
const whitePy = JSON.parse(python_white_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Jarque-Bera test
const jb = JSON.parse(jarque_bera_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Durbin-Watson test
const dw = JSON.parse(durbin_watson_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Shapiro-Wilk test
const sw = JSON.parse(shapiro_wilk_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Anderson-Darling test
const ad = JSON.parse(anderson_darling_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// Cook's Distance
const cd = JSON.parse(cooks_distance_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// DFBETAS (influence on coefficients)
const dfbetas = JSON.parse(dfbetas_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// DFFITS (influence on fitted values)
const dffits = JSON.parse(dffits_test(
    JSON.stringify(y),
    JSON.stringify(x)
));

// VIF test (multicollinearity)
const vif = JSON.parse(vif_test(
    JSON.stringify(y),
    JSON.stringify(x)
));
console.log("VIF values:", vif.vif_values);

// RESET test (functional form)
const reset = JSON.parse(reset_test(
    JSON.stringify(y),
    JSON.stringify(x),
    JSON.stringify([2, 3]),  // powers
    "fitted"                  // type: "fitted", "regressor", or "princomp"
));

// Breusch-Godfrey test (higher-order autocorrelation)
const bg = JSON.parse(breusch_godfrey_test(
    JSON.stringify(y),
    JSON.stringify(x),
    1,        // order
    "chisq"   // test_type: "chisq" or "f"
));
```

### Statistical Utilities (WASM)


```javascript
// Student's t CDF: P(T <= t)
const tCDF = get_t_cdf(1.96, 20);

// Critical t-value for two-tailed test
const tCrit = get_t_critical(0.05, 20);

// Normal inverse CDF (probit)
const zScore = get_normal_inverse(0.975);

// Descriptive statistics (all return JSON strings)
const mean = JSON.parse(stats_mean(JSON.stringify([1, 2, 3, 4, 5])));
const variance = JSON.parse(stats_variance(JSON.stringify([1, 2, 3, 4, 5])));
const stddev = JSON.parse(stats_stddev(JSON.stringify([1, 2, 3, 4, 5])));
const median = JSON.parse(stats_median(JSON.stringify([1, 2, 3, 4, 5])));
const quantile = JSON.parse(stats_quantile(JSON.stringify([1, 2, 3, 4, 5]), 0.5));
const correlation = JSON.parse(stats_correlation(
    JSON.stringify([1, 2, 3, 4, 5]),
    JSON.stringify([2, 4, 6, 8, 10])
));
```

### CSV Parsing (WASM)


```javascript
const csv = parse_csv(csvContent);
const parsed = JSON.parse(csv);
console.log("Headers:", parsed.headers);
console.log("Numeric columns:", parsed.numeric_columns);
```

### Helper Functions (WASM)


```javascript
const version = get_version();  // e.g., "0.5.0"
const msg = test();             // "Rust WASM is working!"
```

### Domain Security (WASM)


Optional domain restriction via build-time environment variable:

```bash
LINREG_DOMAIN_RESTRICT=example.com,mysite.com wasm-pack build --release --target web
```

When NOT set (default), all domains are allowed.

---

## Python Usage


Install from PyPI:

```bash
pip install linreg-core
```

### Quick Start (Python)


The recommended way to use `linreg-core` in Python is with native types (lists or numpy arrays):

```python
import linreg_core

# Works with Python lists

y = [1, 2, 3, 4, 5]
x = [[1, 2, 3, 4, 5]]
names = ["Intercept", "X1"]

result = linreg_core.ols_regression(y, x, names)

# Access attributes directly

print(f"R²: {result.r_squared}")
print(f"Coefficients: {result.coefficients}")
print(f"F-statistic: {result.f_statistic}")

# Get a formatted summary

print(result.summary())
```

**With NumPy arrays:**

```python
import numpy as np
import linreg_core

y = np.array([1, 2, 3, 4, 5])
x = np.array([[1, 2, 3, 4, 5]])

result = linreg_core.ols_regression(y, x, ["Intercept", "X1"])
print(result.summary())
```

**Result objects** provide:
- Direct attribute access (`result.r_squared`, `result.coefficients`, `result.aic`, `result.bic`, `result.log_likelihood`)
- `summary()` method for formatted output
- `to_dict()` method for JSON serialization

### OLS Regression (Python)


```python
import linreg_core

y = [1, 2, 3, 4, 5]
x = [[1, 2, 3, 4, 5]]
names = ["Intercept", "X1"]

result = linreg_core.ols_regression(y, x, names)
print(f"Coefficients: {result.coefficients}")
print(f"R-squared: {result.r_squared}")
print(f"F-statistic: {result.f_statistic}")
print(f"Log-likelihood: {result.log_likelihood}")
print(f"AIC: {result.aic}")
print(f"BIC: {result.bic}")
```

### Ridge Regression (Python)


```python
result = linreg_core.ridge_regression(
    y, x, ["Intercept", "X1"],
    1.0,      # lambda
    True      # standardize
)
print(f"Intercept: {result.intercept}")
print(f"Coefficients: {result.coefficients}")
print(f"Effective degrees of freedom: {result.effective_df:.2f}")
print(f"AIC: {result.aic}")
print(f"BIC: {result.bic}")
```

### Lasso Regression (Python)


```python
result = linreg_core.lasso_regression(
    y, x, ["Intercept", "X1"],
    0.1,      # lambda
    True,     # standardize
    100000,   # max_iter
    1e-7      # tol
)
print(f"Intercept: {result.intercept}")
print(f"Coefficients: {result.coefficients}")
print(f"Non-zero: {result.n_nonzero}")
print(f"Converged: {result.converged}")
print(f"AIC: {result.aic}")
print(f"BIC: {result.bic}")
```

### Elastic Net Regression (Python)


```python
result = linreg_core.elastic_net_regression(
    y, x, ["Intercept", "X1"],
    0.1,      # lambda
    0.5,      # alpha (0 = Ridge, 1 = Lasso, 0.5 = balanced)
    True,     # standardize
    100000,   # max_iter
    1e-7      # tol
)
print(f"Intercept: {result.intercept}")
print(f"Coefficients: {result.coefficients}")
print(f"Non-zero: {result.n_nonzero}")
print(f"AIC: {result.aic}")
print(f"BIC: {result.bic}")
```

### LOESS Regression (Python)


```python
result = linreg_core.loess_fit(
    y,           # Single predictor only
    [0.5],       # span (smoothing parameter: 0-1)
    2,           # degree (0=constant, 1=linear, 2=quadratic)
    "direct",    # surface ("direct" only; "interpolate" is planned)
    0            # robust iterations (0=disabled, >0=number of iterations)
)
print(f"Fitted values: {result.fitted_values}")
print(f"Residuals: {result.residuals}")
```

### Lambda Path Generation (Python)


```python
path = linreg_core.make_lambda_path(
    y, x,
    100,              # n_lambda
    0.01              # lambda_min_ratio
)
print(f"Lambda max: {path.lambda_max}")
print(f"Lambda min: {path.lambda_min}")
print(f"Number: {path.n_lambda}")
```

### Diagnostic Tests (Python)


```python
# Breusch-Pagan test (heteroscedasticity)

bp = linreg_core.breusch_pagan_test(y, x)
print(f"Statistic: {bp.statistic}, p-value: {bp.p_value}")

# Harvey-Collier test (linearity)

hc = linreg_core.harvey_collier_test(y, x)

# Rainbow test (linearity) - supports "r", "python", or "both" methods

rainbow = linreg_core.rainbow_test(y, x, 0.5, "r")

# White test - choose method: "r", "python", or "both"

white = linreg_core.white_test(y, x, "r")
# Or use specific method functions

white_r = linreg_core.r_white_test(y, x)
white_py = linreg_core.python_white_test(y, x)

# Jarque-Bera test (normality)

jb = linreg_core.jarque_bera_test(y, x)

# Durbin-Watson test (autocorrelation)

dw = linreg_core.durbin_watson_test(y, x)
print(f"DW statistic: {dw.statistic}")

# Shapiro-Wilk test (normality)

sw = linreg_core.shapiro_wilk_test(y, x)

# Anderson-Darling test (normality)

ad = linreg_core.anderson_darling_test(y, x)

# Cook's Distance (influential observations)

cd = linreg_core.cooks_distance_test(y, x)
print(f"Influential points: {cd.influential_4_over_n}")

# DFBETAS (influence on each coefficient)

dfbetas = linreg_core.dfbetas_test(y, x)
print(f"Threshold: {dfbetas.threshold}")
print(f"Influential obs: {dfbetas.influential_observations}")

# DFFITS (influence on fitted values)

dffits = linreg_core.dffits_test(y, x)
print(f"Threshold: {dffits.threshold}")
print(f"Influential obs: {dffits.influential_observations}")

# RESET test (model specification)

reset = linreg_core.reset_test(y, x, [2, 3], "fitted")

# Breusch-Godfrey test (higher-order autocorrelation)

bg = linreg_core.breusch_godfrey_test(y, x, 1, "chisq")
```

### Statistical Utilities (Python)


```python
# Student's t CDF

t_cdf = linreg_core.get_t_cdf(1.96, 20)

# Critical t-value (two-tailed)

t_crit = linreg_core.get_t_critical(0.05, 20)

# Normal inverse CDF (probit)

z_score = linreg_core.get_normal_inverse(0.975)

# Library version

version = linreg_core.get_version()
```

### Descriptive Statistics (Python)


```python
import numpy as np

# All return float directly (no parsing needed)

mean = linreg_core.stats_mean([1, 2, 3, 4, 5])
variance = linreg_core.stats_variance([1, 2, 3, 4, 5])
stddev = linreg_core.stats_stddev([1, 2, 3, 4, 5])
median = linreg_core.stats_median([1, 2, 3, 4, 5])
quantile = linreg_core.stats_quantile([1, 2, 3, 4, 5], 0.5)
correlation = linreg_core.stats_correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])

# Works with numpy arrays too

mean = linreg_core.stats_mean(np.array([1, 2, 3, 4, 5]))
```

### CSV Parsing (Python)


```python
csv_content = '''name,value,category
Alice,42.5,A
Bob,17.3,B
Charlie,99.9,A'''

result = linreg_core.parse_csv(csv_content)
print(f"Headers: {result.headers}")
print(f"Numeric columns: {result.numeric_columns}")
print(f"Data rows: {result.n_rows}")
```

---

## Feature Flags


| Feature | Default | Description |
|---------|---------|-------------|
| `wasm` | Yes | Enables WASM bindings and browser support |
| `python` | No | Enables Python bindings via PyO3 |
| `validation` | No | Includes test data for validation tests |

For native Rust without WASM overhead:

```toml
linreg-core = { version = "0.6", default-features = false }
```

For Python bindings (built with maturin):

```bash
pip install linreg-core
```

---

## Validation


Results are validated against R (`lmtest`, `car`, `skedastic`, `nortest`, `glmnet`) and Python (`statsmodels`, `scipy`, `sklearn`). See the `verification/` directory for test scripts and reference outputs.

### Running Tests


```bash
# Unit tests

cargo test

# WASM tests

wasm-pack test --node

# All tests including doctests

cargo test --all-features
```

---

## Implementation Notes


### Regularization


The Ridge and Lasso implementations follow the glmnet formulation:

```
minimize (1/(2n)) * Σ(yᵢ - β₀ - xᵢᵀβ)² + λ * [(1 - α) * ||β||₂² / 2 + α * ||β||₁]
```

- **Ridge** (α = 0): Closed-form solution with (X'X + λI)⁻¹X'y
- **Lasso** (α = 1): Coordinate descent algorithm

### Numerical Precision


- QR decomposition used throughout for numerical stability
- Anderson-Darling uses Abramowitz & Stegun 7.1.26 for normal CDF (differs from R's Cephes by ~1e-6)
- Shapiro-Wilk implements Royston's 1995 algorithm matching R's implementation

### Known Limitations


- Harvey-Collier test may fail on high-VIF datasets (VIF > 5) due to numerical instability in recursive residuals
- Shapiro-Wilk limited to n <= 5000 (matching R's limitation)
- White test may differ from R on collinear datasets due to numerical precision in near-singular matrices

---

## Disclaimer


This library is under active development and has not reached 1.0 stability. While outputs are validated against R and Python implementations, **do not use this library for critical applications** (medical, financial, safety-critical systems) without independent verification. See the [LICENSE](LICENSE-MIT) for full terms. The software is provided "as is" without warranty of any kind.

---

## Benchmarks


<details>
<summary><strong>Click to expand v0.6.0 benchmark results</strong></summary>

Benchmark results run on Windows with `cargo bench --no-default-features`. Times are median values.

### Core Regression Benchmarks


| Benchmark | Size (n × p) | Time | Throughput |
|-----------|--------------|------|------------|
| OLS Regression | 10 × 2 | 12.46 µs | 802.71 Kelem/s |
| OLS Regression | 50 × 3 | 53.72 µs | 930.69 Kelem/s |
| OLS Regression | 100 × 5 | 211.09 µs | 473.73 Kelem/s |
| OLS Regression | 500 × 10 | 7.46 ms | 67.04 Kelem/s |
| OLS Regression | 1000 × 20 | 47.81 ms | 20.91 Kelem/s |
| OLS Regression | 5000 × 50 | 2.86 s | 1.75 Kelem/s |
| Ridge Regression | 50 × 3 | 9.61 µs | 5.20 Melem/s |
| Ridge Regression | 100 × 5 | 70.41 µs | 1.42 Melem/s |
| Ridge Regression | 500 × 10 | 842.37 µs | 593.56 Kelem/s |
| Ridge Regression | 1000 × 20 | 1.38 ms | 724.71 Kelem/s |
| Ridge Regression | 5000 × 50 | 10.25 ms | 487.78 Kelem/s |
| Lasso Regression | 50 × 3 | 258.82 µs | 193.18 Kelem/s |
| Lasso Regression | 100 × 5 | 247.89 µs | 403.41 Kelem/s |
| Lasso Regression | 500 × 10 | 3.58 ms | 139.86 Kelem/s |
| Lasso Regression | 1000 × 20 | 1.54 ms | 651.28 Kelem/s |
| Lasso Regression | 5000 × 50 | 12.52 ms | 399.50 Kelem/s |
| Elastic Net Regression | 50 × 3 | 46.15 µs | 1.08 Melem/s |
| Elastic Net Regression | 100 × 5 | 358.07 µs | 279.27 Kelem/s |
| Elastic Net Regression | 500 × 10 | 1.61 ms | 310.18 Kelem/s |
| Elastic Net Regression | 1000 × 20 | 1.60 ms | 623.66 Kelem/s |
| Elastic Net Regression | 5000 × 50 | 12.57 ms | 397.77 Kelem/s |
| WLS Regression | 50 × 3 | 32.92 µs | 1.52 Melem/s |
| WLS Regression | 100 × 5 | 155.30 µs | 643.93 Kelem/s |
| WLS Regression | 500 × 10 | 6.63 ms | 75.37 Kelem/s |
| WLS Regression | 1000 × 20 | 42.68 ms | 23.43 Kelem/s |
| WLS Regression | 5000 × 50 | 2.64 s | 1.89 Kelem/s |
| LOESS Fit | 50 × 1 | 132.83 µs | 376.42 Kelem/s |
| LOESS Fit | 100 × 1 | 1.16 ms | 86.00 Kelem/s |
| LOESS Fit | 500 × 1 | 28.42 ms | 17.59 Kelem/s |
| LOESS Fit | 1000 × 1 | 113.00 ms | 8.85 Kelem/s |
| LOESS Fit | 100 × 2 | 7.10 ms | 14.09 Kelem/s |
| LOESS Fit | 500 × 2 | 1.05 s | 476.19 elem/s |

### Lambda Path & Elastic Net Path Benchmarks


| Benchmark | Size (n × p) | Time | Throughput |
|-----------|--------------|------|------------|
| Elastic Net Path | 100 × 5 | 198.60 ms | 503.52 elem/s |
| Elastic Net Path | 500 × 10 | 69.46 ms | 7.20 Kelem/s |
| Elastic Net Path | 1000 × 20 | 39.08 ms | 25.59 Kelem/s |
| Make Lambda Path | 100 × 5 | 1.09 µs | 91.58 Melem/s |
| Make Lambda Path | 500 × 10 | 8.10 µs | 61.70 Melem/s |
| Make Lambda Path | 1000 × 20 | 29.96 µs | 33.37 Melem/s |
| Make Lambda Path | 5000 × 50 | 424.18 µs | 11.79 Melem/s |

### Diagnostic Test Benchmarks


| Benchmark | Size (n × p) | Time |
|-----------|--------------|------|
| Rainbow Test | 50 × 3 | 40.34 µs |
| Rainbow Test | 100 × 5 | 187.94 µs |
| Rainbow Test | 500 × 10 | 8.63 ms |
| Rainbow Test | 1000 × 20 | 60.09 ms |
| Rainbow Test | 5000 × 50 | 3.45 s |
| Harvey-Collier Test | 50 × 1 | 15.26 µs |
| Harvey-Collier Test | 100 × 1 | 30.32 µs |
| Harvey-Collier Test | 500 × 1 | 138.44 µs |
| Harvey-Collier Test | 1000 × 1 | 298.33 µs |
| Breusch-Pagan Test | 50 × 3 | 58.07 µs |
| Breusch-Pagan Test | 100 × 5 | 296.74 µs |
| Breusch-Pagan Test | 500 × 10 | 13.79 ms |
| Breusch-Pagan Test | 1000 × 20 | 96.49 ms |
| Breusch-Pagan Test | 5000 × 50 | 5.56 s |
| White Test | 50 × 3 | 14.31 µs |
| White Test | 100 × 5 | 44.25 µs |
| White Test | 500 × 10 | 669.40 µs |
| White Test | 1000 × 20 | 4.89 ms |
| Jarque-Bera Test | 50 × 3 | 30.13 µs |
| Jarque-Bera Test | 100 × 5 | 149.29 µs |
| Jarque-Bera Test | 500 × 10 | 6.64 ms |
| Jarque-Bera Test | 1000 × 20 | 47.89 ms |
| Jarque-Bera Test | 5000 × 50 | 2.75 s |
| Durbin-Watson Test | 50 × 3 | 31.80 µs |
| Durbin-Watson Test | 100 × 5 | 152.56 µs |
| Durbin-Watson Test | 500 × 10 | 6.87 ms |
| Durbin-Watson Test | 1000 × 20 | 48.65 ms |
| Durbin-Watson Test | 5000 × 50 | 2.76 s |
| Breusch-Godfrey Test | 50 × 3 | 71.73 µs |
| Breusch-Godfrey Test | 100 × 5 | 348.94 µs |
| Breusch-Godfrey Test | 500 × 10 | 14.77 ms |
| Breusch-Godfrey Test | 1000 × 20 | 100.08 ms |
| Breusch-Godfrey Test | 5000 × 50 | 5.64 s |
| Shapiro-Wilk Test | 10 × 2 | 2.04 µs |
| Shapiro-Wilk Test | 50 × 3 | 4.87 µs |
| Shapiro-Wilk Test | 100 × 5 | 10.67 µs |
| Shapiro-Wilk Test | 500 × 10 | 110.02 µs |
| Shapiro-Wilk Test | 1000 × 20 | 635.13 µs |
| Shapiro-Wilk Test | 5000 × 50 | 17.53 ms |
| Anderson-Darling Test | 50 × 3 | 34.02 µs |
| Anderson-Darling Test | 100 × 5 | 162.28 µs |
| Anderson-Darling Test | 500 × 10 | 6.95 ms |
| Anderson-Darling Test | 1000 × 20 | 48.15 ms |
| Anderson-Darling Test | 5000 × 50 | 2.78 s |
| Cook's Distance Test | 50 × 3 | 64.52 µs |
| Cook's Distance Test | 100 × 5 | 297.69 µs |
| Cook's Distance Test | 500 × 10 | 12.73 ms |
| Cook's Distance Test | 1000 × 20 | 94.02 ms |
| Cook's Distance Test | 5000 × 50 | 5.31 s |
| DFBETAS Test | 50 × 3 | 46.34 µs |
| DFBETAS Test | 100 × 5 | 185.52 µs |
| DFBETAS Test | 500 × 10 | 7.04 ms |
| DFBETAS Test | 1000 × 20 | 49.68 ms |
| DFFITS Test | 50 × 3 | 33.56 µs |
| DFFITS Test | 100 × 5 | 157.62 µs |
| DFFITS Test | 500 × 10 | 6.82 ms |
| DFFITS Test | 1000 × 20 | 48.35 ms |
| VIF Test | 50 × 3 | 5.36 µs |
| VIF Test | 100 × 5 | 12.68 µs |
| VIF Test | 500 × 10 | 128.04 µs |
| VIF Test | 1000 × 20 | 807.30 µs |
| VIF Test | 5000 × 50 | 26.33 ms |
| RESET Test | 50 × 3 | 77.85 µs |
| RESET Test | 100 × 5 | 359.12 µs |
| RESET Test | 500 × 10 | 14.40 ms |
| RESET Test | 1000 × 20 | 100.52 ms |
| RESET Test | 5000 × 50 | 5.67 s |
| Full Diagnostics | 100 × 5 | 2.75 ms |
| Full Diagnostics | 500 × 10 | 104.01 ms |
| Full Diagnostics | 1000 × 20 | 740.52 ms |

### Linear Algebra Benchmarks


| Benchmark | Size | Time |
|-----------|------|------|
| Matrix Transpose | 10 × 10 | 209.50 ns |
| Matrix Transpose | 50 × 50 | 3.67 µs |
| Matrix Transpose | 100 × 100 | 14.92 µs |
| Matrix Transpose | 500 × 500 | 924.23 µs |
| Matrix Transpose | 1000 × 1000 | 5.56 ms |
| Matrix Multiply (matmul) | 10 × 10 × 10 | 1.54 µs |
| Matrix Multiply (matmul) | 50 × 50 × 50 | 144.15 µs |
| Matrix Multiply (matmul) | 100 × 100 × 100 | 1.39 ms |
| Matrix Multiply (matmul) | 200 × 200 × 200 | 11.90 ms |
| Matrix Multiply (matmul) | 1000 × 100 × 100 | 13.94 ms |
| QR Decomposition | 10 × 5 | 1.41 µs |
| QR Decomposition | 50 × 10 | 14.81 µs |
| QR Decomposition | 100 × 20 | 57.61 µs |
| QR Decomposition | 500 × 50 | 2.19 ms |
| QR Decomposition | 1000 × 100 | 19.20 ms |
| QR Decomposition | 5000 × 100 | 1.48 s |
| QR Decomposition | 10000 × 100 | 8.09 s |
| QR Decomposition | 1000 × 500 | 84.48 ms |
| SVD | 10 × 5 | 150.36 µs |
| SVD | 50 × 10 | 505.41 µs |
| SVD | 100 × 20 | 2.80 ms |
| SVD | 500 × 50 | 60.00 ms |
| SVD | 1000 × 100 | 513.35 ms |
| Matrix Invert | 5 × 5 | 877.32 ns |
| Matrix Invert | 10 × 10 | 2.48 µs |
| Matrix Invert | 20 × 20 | 5.46 µs |
| Matrix Invert | 50 × 50 | 31.94 µs |
| Matrix Invert | 100 × 100 | 141.38 µs |
| Matrix Invert | 200 × 200 | 647.03 µs |

### Pressure Benchmarks (Large Datasets)


| Benchmark | Size (n) | Time |
|-----------|----------|------|
| Pressure (OLS + all diagnostics) | 10000 | 11.28 s |

</details>

---

## License


Dual-licensed under [MIT](LICENSE-MIT) or [Apache-2.0](LICENSE-APACHE).