Module regression


Local regression fitting for LOWESS smoothing.

This module implements the local weighted least-squares regression used by the higher-level LOWESS pipeline. It provides a single-point fitter and supporting utilities (weight computation, normalization, fallback policies, and simple diagnostics) designed for robust production use.

Global expectations

Inputs x and y are numeric and aligned (same length, with y[i] observed at x[i]); many helpers assume x is sorted ascending. Debug-only assertions validate sorting, but release builds skip those checks and instead return safe fallbacks for degenerate inputs.
All numeric tolerances are conservative to avoid panics; callers may perform stricter validation upstream for performance or determinism.

Primary parameters and flags

x: &[T] — sorted (recommended) independent variable values.
y: &[T] — dependent variable values aligned with x.
idx: usize — index of the target point to fit (0..n-1).
left/right: usize — inclusive window boundaries defining the local neighborhood used for the fit. These are clamped to [0, n-1] by helpers.
use_robustness: bool — when true, per-observation robustness weights (from IRLS) are multiplied with kernel weights before normalization.
robustness_weights: &[T] — per-observation multiplicative weights from a previous robustness pass. If not used, pass a slice of ones.
weights: &mut [T] — scratch buffer for computed (unnormalized) weights; must be length n (or at least cover the positions accessed). The buffer is normalized in-place prior to regression.
weight_fn: WeightFunction — kernel used to compute distance-based weights (Tricube, Epanechnikov, Gaussian, etc.). Bounded kernels support a fast short-circuit for |u| >= 1.
zero_weight_fallback: ZeroWeightFallback — policy applied when the local sum of computed weights is zero. Options:
- UseLocalMean — return the (unweighted) mean over [left..=right].
- ReturnOriginal — return y[idx].
- ReturnNone — propagate failure (caller decides).
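The three fallback policies can be pictured as one small dispatch. The sketch below is illustrative only: it redeclares a stand-in enum with the documented variant names, fixes T to f64, and uses a hypothetical helper named apply_zero_weight_fallback that is not part of this module.

```rust
/// Stand-in for the documented policy enum (redeclared here for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ZeroWeightFallback {
    UseLocalMean,
    ReturnOriginal,
    ReturnNone,
}

/// Hypothetical helper: resolve a fit whose local weights sum to zero.
/// Assumes `left <= right < y.len()` and `idx < y.len()`.
pub fn apply_zero_weight_fallback(
    policy: ZeroWeightFallback,
    y: &[f64],
    idx: usize,
    left: usize,
    right: usize,
) -> Option<f64> {
    match policy {
        // Unweighted mean over the inclusive window [left..=right].
        ZeroWeightFallback::UseLocalMean => {
            let window = &y[left..=right];
            Some(window.iter().sum::<f64>() / window.len() as f64)
        }
        // Leave the observation unsmoothed.
        ZeroWeightFallback::ReturnOriginal => Some(y[idx]),
        // Propagate the failure; the caller decides what to do.
        ZeroWeightFallback::ReturnNone => None,
    }
}
```

Returning Option<f64> keeps the ReturnNone case explicit at the call site instead of encoding failure as NaN.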

WeightParams specifics

x_current: T — the x location being fitted.
bandwidth: T — effective local half-width used for normalized distance u. Must be > 0 for a full regression; zero triggers constant-average fallback.
h1: T — a tiny fraction of bandwidth below which kernel weight is forced to 1. This avoids numerical cancellation for extremely close points.
h9: T — slightly less than bandwidth (e.g. 0.999*h) used to truncate the effective neighbor scan and determine the rightmost point to include.
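To make the roles of bandwidth, h1, and h9 concrete, here is a minimal sketch of a tricube weight for one observation. It is not the module's implementation: the struct is a stand-in with T fixed to f64, the constructor and tricube_weight are hypothetical, and h1 = 0.001 * bandwidth is an assumed value (only the 0.999 factor for h9 comes from the text above).

```rust
/// Stand-in for the documented WeightParams, with T fixed to f64.
pub struct WeightParams {
    pub x_current: f64,
    pub bandwidth: f64,
    pub h1: f64,
    pub h9: f64,
}

impl WeightParams {
    /// Hypothetical constructor. The 0.001 factor for h1 is an assumption;
    /// the 0.999 factor for h9 follows the example given in the text.
    pub fn new(x_current: f64, bandwidth: f64) -> Self {
        Self {
            x_current,
            bandwidth,
            h1: 0.001 * bandwidth,
            h9: 0.999 * bandwidth,
        }
    }
}

/// Tricube weight (1 - |u|^3)^3 with u = |x - x_current| / bandwidth.
pub fn tricube_weight(x: f64, p: &WeightParams) -> f64 {
    let dist = (x - p.x_current).abs();
    if p.bandwidth <= 0.0 || dist > p.h9 {
        // Outside the effective window, or a degenerate bandwidth that the
        // caller is expected to route to the constant-average fallback.
        return 0.0;
    }
    if dist <= p.h1 {
        // Very close points: force the weight to 1.
        return 1.0;
    }
    let u = dist / p.bandwidth;
    let t = 1.0 - u * u * u;
    t * t * t
}
```

With x_current = 5.0 and bandwidth = 2.0, a point at x = 6.0 has u = 0.5 and receives weight 0.875 cubed, while a point at x = 7.0 lies beyond h9 and receives zero.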

Functions and behaviors

fit_point(ctx): primary entry. Computes kernel weights (multiplied by robustness weights when use_robustness is true), normalizes them, and runs a weighted linear least-squares fit evaluated at x_current. If the weights sum to zero, the configured fallback is used. If the weighted x-variance is too small, the fitter falls back to the weighted mean.
compute_weights(…): fast, streaming weight computation that scans from left to right, short-circuits outside h9, applies h1 fast-path, and multiplies by robustness weights when requested. Returns the (unnormalized) total weight for the scanned region.
find_rightmost_point(…): returns the largest index within h9 of x_current.
normalize_weights(…): in-place normalization over [left..=right]. Debug builds assert sum > 0; production code expects callers to handle zero-sum.
weighted_least_squares(…): numerically stable WLS for degree 1. Falls back to the weighted average when the denominator (the weighted variance of x) is below a conservatively chosen tolerance that combines an absolute term and a bandwidth-scaled relative term.
compute_weighted_average(…): assumes weights already normalized and returns ∑ wᵢ vᵢ over [left..=right].
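The normalize-then-fit sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: normalize_window and wls_at are hypothetical names, T is fixed to f64, and the tolerance formula is a placeholder for the module's actual absolute and bandwidth-scaled terms.

```rust
/// Hypothetical helper: normalize weights in place over [left..=right].
/// Returns false (leaving the buffer untouched) when the sum is not positive,
/// so the caller can apply its zero-weight fallback.
pub fn normalize_window(weights: &mut [f64], left: usize, right: usize) -> bool {
    let sum: f64 = weights[left..=right].iter().sum();
    if !sum.is_finite() || sum <= 0.0 {
        return false;
    }
    for w in &mut weights[left..=right] {
        *w /= sum;
    }
    true
}

/// Hypothetical helper: degree-1 weighted fit evaluated at `x_current`.
/// Assumes weights over [left..=right] already sum to 1.
pub fn wls_at(
    x: &[f64],
    y: &[f64],
    weights: &[f64],
    left: usize,
    right: usize,
    x_current: f64,
    bandwidth: f64,
) -> f64 {
    // Weighted means; centering on them improves numerical stability.
    let (mut x_mean, mut y_mean) = (0.0, 0.0);
    for i in left..=right {
        x_mean += weights[i] * x[i];
        y_mean += weights[i] * y[i];
    }
    // Weighted variance of x and weighted covariance of x and y.
    let (mut var, mut cov) = (0.0, 0.0);
    for i in left..=right {
        let dx = x[i] - x_mean;
        var += weights[i] * dx * dx;
        cov += weights[i] * dx * (y[i] - y_mean);
    }
    // Assumed tolerance: an absolute floor plus a bandwidth-scaled term.
    let tol = f64::EPSILON.max(1e-12 * bandwidth * bandwidth);
    if var <= tol {
        return y_mean; // degenerate x spread: fall back to the weighted mean
    }
    y_mean + (cov / var) * (x_current - x_mean)
}
```

For data on the line y = 2x + 1 the fit reproduces the line exactly, and when all x values in the window coincide the result is the weighted mean of y.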

Debug & determinism

Debug-only asserts check sorted x and buffer lengths; they do not change release behavior. Median/selection helpers used elsewhere prefer select_nth_unstable for performance (linear-time, not stable ordering).
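As a rough illustration of the selection-based approach mentioned above, a median can be taken with the standard library's select_nth_unstable_by without fully sorting. The helper name median_in_place is hypothetical, and this sketch is not the code used elsewhere in the crate.

```rust
/// Hypothetical helper: median of a scratch buffer, reordering it in place.
/// Returns None for an empty slice.
pub fn median_in_place(values: &mut [f64]) -> Option<f64> {
    let n = values.len();
    if n == 0 {
        return None;
    }
    let mid = n / 2;
    // After selection, `pivot` is the element a full sort would place at
    // index `mid`, and every element of `lower` compares <= the pivot.
    let (lower, pivot, _) = values.select_nth_unstable_by(mid, |a, b| a.total_cmp(b));
    let upper_mid = *pivot;
    if n % 2 == 1 {
        Some(upper_mid)
    } else {
        // Even length: the lower middle is the maximum of the left partition,
        // which is non-empty because mid >= 1 here.
        let lower_mid = lower.iter().copied().fold(f64::NEG_INFINITY, f64::max);
        Some(0.5 * (lower_mid + upper_mid))
    }
}
```

Using f64::total_cmp gives a total order even when NaN is present, which keeps the result deterministic for a given input buffer; the buffer itself is left partially reordered.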

Production recommendations

Pre-sort and deduplicate x/y upstream for reproducible window semantics.
Provide a pre-allocated weights buffer to avoid repeated allocations.
Choose an appropriate ZeroWeightFallback policy for your deployment: UseLocalMean for graceful smoothing, ReturnNone for strict failure modes.
Use delta-driven interpolation and robust weights at the builder layer to control performance vs. robustness trade-offs for large datasets.

Structs§

FitContext: Context for fitting a single point in LOWESS.
WeightParams: Parameters for weight computation.

Enums§

ZeroWeightFallback: Behavior to use when the computed total weight for a fit is zero.

Functions§

compute_effective_df: Trace of the hat matrix – effective degrees of freedom.
compute_leverage: Leverage of a single observation (diagonal of the hat matrix).
compute_residual_variance: Residual variance estimate (weighted).
compute_weighted_average: Simple weighted average (∑ wᵢ yᵢ) – assumes weights already normalized.
compute_weights: Compute kernel (and optional robustness) weights.
find_rightmost_point: Find the rightmost point whose distance ≤ h9.
fit_point: Fit a single point using local weighted regression.
locally_constant_fit: Locally constant (degree-0) regression – just a weighted average.
normalize_weights: Normalize weights in-place over [left, right].
weighted_least_squares: Weighted linear regression evaluated at x_current.
weighted_polynomial_fit: Higher-degree polynomial fit (currently only degree 1 is implemented).
