Module preprocess

Data preprocessing for regularized regression.

This module provides standardization utilities that reproduce glmnet's behavior:

  • Predictors are centered and scaled (if enabled)
  • The intercept column is not penalized, so it’s handled specially
  • Coefficients can be unstandardized back to the original scale
  • Observation weights are supported for weighted regression

§Standardization Convention

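The scaling factor is sqrt(sum(x²) / n), computed on the centered predictor x. Dividing by it gives each standardized column unit variance under the 1/n convention (matching the glmnet paper), not the 1/(n-1) sample-variance convention.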
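
The convention above can be sketched for a single column as follows. This is a minimal illustration, not this module's implementation: `standardize_column_sketch` is a hypothetical helper, it assumes a non-empty column, and the guard for constant columns is an assumption rather than documented behavior.

```rust
/// Hypothetical helper (not part of this module): centers a column and
/// scales it to unit variance under the 1/n convention.
/// Returns (standardized column, mean, scale). Assumes `x` is non-empty.
fn standardize_column_sketch(x: &[f64]) -> (Vec<f64>, f64, f64) {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    // sqrt(sum(centered^2) / n): the population (1/n) standard deviation.
    let scale = (x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n).sqrt();
    // Assumption: leave constant columns unscaled to avoid dividing by zero.
    let divisor = if scale > 0.0 { scale } else { 1.0 };
    let z: Vec<f64> = x.iter().map(|v| (v - mean) / divisor).collect();
    (z, mean, scale)
}
```

With this scaling, the mean of the squared standardized values is 1, which is what "unit variance under the 1/n convention" means.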

§Weighted Standardization

When weights are provided, they are first normalized to sum to 1: weights_normalized = w / sum(w). Weighted means and variances are then computed with the normalized weights, so equal weights reproduce the unweighted 1/n results.
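
A minimal sketch of the weighted computation, under the assumption that the weighted variance is the normalized-weight average of squared deviations from the weighted mean. `weighted_mean_sd_sketch` is a hypothetical name and is not part of this module; it also assumes the weights have a positive sum.

```rust
/// Hypothetical helper (not part of this module): weighted mean and
/// weighted standard deviation of a column, using weights normalized to sum to 1.
fn weighted_mean_sd_sketch(x: &[f64], w: &[f64]) -> (f64, f64) {
    assert_eq!(x.len(), w.len(), "x and w must have the same length");
    let total: f64 = w.iter().sum();
    // weights_normalized = w / sum(w)
    let wn: Vec<f64> = w.iter().map(|wi| wi / total).collect();
    // Weighted mean: sum_i wn_i * x_i
    let mean: f64 = x.iter().zip(&wn).map(|(xi, wi)| wi * xi).sum();
    // Weighted variance: sum_i wn_i * (x_i - mean)^2
    let var: f64 = x
        .iter()
        .zip(&wn)
        .map(|(xi, wi)| wi * (xi - mean).powi(2))
        .sum();
    (mean, var.sqrt())
}
```

Because the weights are normalized first, only their relative sizes matter: scaling every weight by the same constant leaves the result unchanged.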

Structs§

StandardizationInfo
Information stored during standardization, used to unstandardize coefficients.
StandardizeOptions
Options for standardization.

Functions§

predict
Computes predictions using unstandardized coefficients.
standardize_xy
Standardizes X and y for regularized regression (glmnet-compatible).
unstandardize_coefficients
Unstandardizes coefficients from the standardized space back to original scale.
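
To illustrate what mapping coefficients back to the original scale involves, here is a hedged sketch. It assumes the model was fit on predictors transformed as (x_j - mean_j) / scale_j and on a centered (but not rescaled) response; the actual signatures and fields of `unstandardize_coefficients`, `predict`, and `StandardizationInfo` are not shown on this page, so the function names and parameters below are hypothetical.

```rust
/// Hypothetical sketch (not this module's API): converts coefficients fit on
/// standardized predictors back to the original scale.
/// Assumes y was centered only, so the intercept absorbs all the shifts.
fn unstandardize_sketch(
    beta_std: &[f64],
    x_mean: &[f64],
    x_scale: &[f64],
    y_mean: f64,
) -> (f64, Vec<f64>) {
    // beta_j = beta_std_j / scale_j
    let beta: Vec<f64> = beta_std.iter().zip(x_scale).map(|(b, s)| b / s).collect();
    // intercept = y_mean - sum_j beta_j * mean_j
    let shift: f64 = beta.iter().zip(x_mean).map(|(b, m)| b * m).sum();
    (y_mean - shift, beta)
}

/// Hypothetical sketch: prediction for one row of original-scale predictors.
fn predict_row_sketch(intercept: f64, beta: &[f64], x: &[f64]) -> f64 {
    intercept + beta.iter().zip(x).map(|(b, xi)| b * xi).sum::<f64>()
}
```

Under these assumptions, predicting at the column means returns the mean of y, which is a quick sanity check on the intercept.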