Module regression

Local regression fitting for LOWESS smoothing.

This module implements the local weighted least-squares regression used by the higher-level LOWESS pipeline. It provides a single-point fitter and supporting utilities (weight computation, normalization, fallback policies, and simple diagnostics) designed for robust production use.

Global expectations

  • Inputs x and y are numeric and aligned; many helpers assume x is sorted ascending. Debug-only assertions validate sorting but production code returns safe fallbacks for degenerate inputs.
  • All numeric tolerances are conservative to avoid panics; callers may perform stricter validation upstream for performance or determinism.

Primary parameters and flags

  • x: &[T] — sorted (recommended) independent variable values.
  • y: &[T] — dependent variable values aligned with x.
  • idx: usize — index of the target point to fit (0..n-1).
  • left/right: usize — inclusive window boundaries defining the local neighborhood used for the fit. These are clamped to [0, n-1] by helpers.
  • use_robustness: bool — when true, per-observation robustness weights (from IRLS) are multiplied with kernel weights before normalization.
  • robustness_weights: &[T] — per-observation multiplicative weights from a previous robustness pass. If not used, pass a slice of ones.
  • weights: &mut [T] — scratch buffer for computed (unnormalized) weights; must be length n (or at least cover the positions accessed). The buffer is normalized in-place prior to regression.
  • weight_fn: WeightFunction — kernel used to compute distance-based weights (Tricube, Epanechnikov, Gaussian, etc.). Bounded kernels support a fast short-circuit for |u| >= 1.
  • zero_weight_fallback: ZeroWeightFallback — policy applied when the local sum of computed weights is zero. Options:
    • UseLocalMean — return the (unweighted) mean over [left..=right].
    • ReturnOriginal — return y[idx].
    • ReturnNone — propagate failure (caller decides).
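The three fallback policies above could be applied roughly as follows. This is a minimal sketch, not the module's actual code: `Fallback` and `apply_fallback` are hypothetical names standing in for ZeroWeightFallback and its handling, and the data is fixed to f64 instead of the generic T.

```rust
/// Hypothetical stand-in for the module's `ZeroWeightFallback` enum.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Fallback {
    UseLocalMean,
    ReturnOriginal,
    ReturnNone,
}

/// Value to report when the local weight sum is zero.
/// Assumes `left <= right < y.len()` and `idx < y.len()`.
fn apply_fallback(
    y: &[f64],
    idx: usize,
    left: usize,
    right: usize,
    policy: Fallback,
) -> Option<f64> {
    match policy {
        // Unweighted mean over the inclusive window [left..=right].
        Fallback::UseLocalMean => {
            let window = &y[left..=right];
            Some(window.iter().sum::<f64>() / window.len() as f64)
        }
        // Leave the observation unsmoothed.
        Fallback::ReturnOriginal => Some(y[idx]),
        // Propagate the failure so the caller decides.
        Fallback::ReturnNone => None,
    }
}
```

Returning `Option<f64>` keeps the strict policy explicit at the call site: only ReturnNone yields `None`.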

WeightParams specifics

  • x_current: T — the x location being fitted.
  • bandwidth: T — effective local half-width used for normalized distance u. Must be > 0 for a full regression; zero triggers constant-average fallback.
  • h1: T — a tiny fraction of bandwidth below which kernel weight is forced to 1. This avoids numerical cancellation for extremely close points.
  • h9: T — slightly less than bandwidth (e.g. 0.999*h) used to truncate the effective neighbor scan and determine the rightmost point to include.
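To illustrate how h1 and h9 interact with the kernel, here is a hedged sketch of a left-to-right weight scan using a tricube kernel. `Params`, `tricube`, and `kernel_weights` are hypothetical names, the types are fixed to f64, and robustness weights are omitted; the real compute_weights signature may differ.

```rust
/// Hypothetical mirror of `WeightParams`, fixed to f64 for illustration.
struct Params {
    x_current: f64,
    bandwidth: f64,
    h1: f64,
    h9: f64,
}

/// Tricube kernel on the normalized distance u, for 0 <= u < 1.
fn tricube(u: f64) -> f64 {
    let t = 1.0 - u * u * u;
    t * t * t
}

/// Fill `weights[left..]` with unnormalized kernel weights, scanning rightwards
/// over sorted `x`. Returns (total weight, rightmost index given a weight).
fn kernel_weights(x: &[f64], left: usize, p: &Params, weights: &mut [f64]) -> (f64, usize) {
    let mut total = 0.0;
    let mut rightmost = left;
    for j in left..x.len() {
        let dist = (x[j] - p.x_current).abs();
        if dist <= p.h9 {
            // Points closer than h1 get weight 1 to avoid numerical cancellation.
            let w = if dist <= p.h1 { 1.0 } else { tricube(dist / p.bandwidth) };
            weights[j] = w;
            total += w;
            rightmost = j;
        } else if x[j] > p.x_current {
            // x is sorted, so every later point is even farther away: stop.
            break;
        } else {
            // Left of the target and outside h9: contributes nothing.
            weights[j] = 0.0;
        }
    }
    (total, rightmost)
}
```

With x = [0, 1, 2, 3, 4], x_current = 2, and bandwidth = 2 (so h9 = 1.998), only indices 1 through 3 receive nonzero weights, and the scan stops at index 4 without evaluating the kernel.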

Functions and behaviors

  • fit_point(ctx): primary entry point. Computes kernel weights (multiplied by robustness weights when use_robustness is set), normalizes them, and runs a weighted linear least-squares fit evaluated at x_current. If the weights sum to zero, the configured zero_weight_fallback is applied. If the weighted x-variance is too small, the fitter falls back to the weighted mean.
  • compute_weights(…): fast, streaming weight computation that scans from left to right, short-circuits outside h9, applies h1 fast-path, and multiplies by robustness weights when requested. Returns the (unnormalized) total weight for the scanned region.
  • find_rightmost_point(…): returns the largest index within h9 of x_current.
  • normalize_weights(…): in-place normalization over [left..=right]. Debug builds assert sum > 0; release builds expect callers to handle a zero sum.
  • weighted_least_squares(…): numerically stable degree-1 WLS. Falls back to the weighted average when the denominator is below a conservatively chosen tolerance (absolute and bandwidth-scaled relative terms).
  • compute_weighted_average(…): assumes weights already normalized and returns ∑ wᵢ vᵢ over [left..=right].
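The normalization and degree-1 fit described above can be sketched as follows. The function names `normalize` and `wls_at` are hypothetical, the types are fixed to f64, and the tolerance constants are illustrative assumptions; the module's actual thresholds are not stated here.

```rust
/// Normalize `w[left..=right]` in place. Returns false (leaving `w` untouched)
/// when the sum is not positive, so the caller can apply its fallback policy.
fn normalize(w: &mut [f64], left: usize, right: usize) -> bool {
    let sum: f64 = w[left..=right].iter().sum();
    if sum <= 0.0 {
        return false;
    }
    for wi in &mut w[left..=right] {
        *wi /= sum;
    }
    true
}

/// Degree-1 weighted least squares evaluated at `x0`.
/// Assumes `w` is already normalized over [left..=right].
fn wls_at(
    x: &[f64],
    y: &[f64],
    w: &[f64],
    left: usize,
    right: usize,
    x0: f64,
    bandwidth: f64,
) -> f64 {
    let r = left..=right;
    // Weighted means (the weights sum to 1).
    let xbar: f64 = r.clone().map(|i| w[i] * x[i]).sum();
    let ybar: f64 = r.clone().map(|i| w[i] * y[i]).sum();
    // Centered sums keep the slope computation well conditioned.
    let sxx: f64 = r.clone().map(|i| w[i] * (x[i] - xbar).powi(2)).sum();
    let sxy: f64 = r.clone().map(|i| w[i] * (x[i] - xbar) * (y[i] - ybar)).sum();
    // Assumed tolerance: an absolute term plus a bandwidth-scaled term.
    let tol = 1e-12 + 1e-6 * bandwidth * bandwidth;
    if sxx <= tol {
        // Weighted x-variance too small: fall back to the weighted mean.
        return ybar;
    }
    ybar + (sxy / sxx) * (x0 - xbar)
}
```

Centering on the weighted means before forming the slope is one common way to limit cancellation error; it also makes the degenerate case (all x equal within the window) fall out as sxx = 0.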

Debug & determinism

  • Debug-only asserts check sorted x and buffer lengths; they do not change release behavior. Median/selection helpers used elsewhere prefer select_nth_unstable for performance (linear-time, not stable ordering).
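For context on the selection approach mentioned above, this is a minimal sketch of a median computed with the standard library's `select_nth_unstable_by`. The helper name `median_in_place` is hypothetical and is not one of this crate's functions; it only shows why the result is deterministic even though the element order afterwards is not.

```rust
/// Median via partial selection. Reorders `v` in place; returns None if empty.
fn median_in_place(v: &mut [f64]) -> Option<f64> {
    if v.is_empty() {
        return None;
    }
    let n = v.len();
    let mid = n / 2;
    // After this call, everything in `lower` is <= `*m`, in unspecified order.
    let (lower, m, _) = v.select_nth_unstable_by(mid, |a, b| a.total_cmp(b));
    let hi = *m;
    if n % 2 == 1 {
        Some(hi)
    } else {
        // For even n the other middle element is the maximum of `lower`.
        let lo = lower.iter().copied().fold(f64::NEG_INFINITY, f64::max);
        Some((lo + hi) / 2.0)
    }
}
```

The selected value is the same on every run; only the arrangement of the remaining elements may vary, so callers should not rely on the buffer's order afterwards.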

Production recommendations

  • Pre-sort and deduplicate x/y upstream for reproducible window semantics.
  • Provide a pre-allocated weights buffer to avoid repeated allocations.
  • Choose an appropriate ZeroWeightFallback policy for your deployment: UseLocalMean for graceful smoothing, ReturnNone for strict failure modes.
  • Use delta-driven interpolation and robust weights at the builder layer to control performance vs. robustness trade-offs for large datasets.
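As one possible way to follow the pre-sort and deduplicate recommendation, the sketch below sorts (x, y) pairs by x and keeps the first y for each repeated x. The helper `sort_and_dedup` is hypothetical, and keeping the first occurrence is an assumption; averaging duplicate y values may suit some datasets better.

```rust
/// Sort paired observations by x and drop repeated x values,
/// keeping the first y seen for each x (an assumed policy).
fn sort_and_dedup(x: &[f64], y: &[f64]) -> (Vec<f64>, Vec<f64>) {
    assert_eq!(x.len(), y.len(), "x and y must be aligned");
    let mut pairs: Vec<(f64, f64)> = x.iter().copied().zip(y.iter().copied()).collect();
    // Stable sort: pairs with equal x keep their input order.
    pairs.sort_by(|a, b| a.0.total_cmp(&b.0));
    // Remove each pair whose x equals that of the pair kept just before it.
    pairs.dedup_by(|a, b| a.0 == b.0);
    pairs.into_iter().unzip()
}
```

Using `total_cmp` gives a total order over f64, so the sort does not panic on NaN; rejecting NaN inputs upstream is still advisable.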

Structs§

FitContext
Context for fitting a single point in LOWESS.
WeightParams
Parameters for weight computation.

Enums§

ZeroWeightFallback
Behavior to use when the computed total weight for a fit is zero.

Functions§

compute_effective_df
Trace of the hat matrix – effective degrees of freedom.
compute_leverage
Leverage of a single observation (diagonal of the hat matrix).
compute_residual_variance
Residual variance estimate (weighted).
compute_weighted_average
Simple weighted average (∑ wᵢ yᵢ) – assumes weights already normalized.
compute_weights
Compute kernel (and optional robustness) weights.
find_rightmost_point
Find the rightmost point whose distance ≤ h9.
fit_point
Fit a single point using local weighted regression.
locally_constant_fit
Locally constant (degree-0) regression – just a weighted average.
normalize_weights
Normalize weights in-place over [left, right].
weighted_least_squares
Weighted linear regression evaluated at x_current.
weighted_polynomial_fit
Higher-degree polynomial fit (currently only degree 1 is implemented).