Module utils


Utility functions for LOWESS smoothing.

This module provides lightweight, production-safe helpers used throughout the LOWESS pipeline. The functions are intentionally defensive: they validate inputs, document preconditions, and return safe values on degenerate inputs rather than panicking in release builds.

Global expectations

  • Callers should pre-clean data (remove NaNs and infinities) and sort x whenever windowing behavior depends on monotonic order. Several helpers check correctness with debug-only assertions but return safe outputs in production paths.
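As an illustration of the sorting expectation, here is a minimal sketch of ordering paired data by x. The name `sort_pairs_by_x` and the concrete `f64` type are assumptions made for this example; the module's own `sort_by_x` may use a different signature or a generic float type.

```rust
/// Hypothetical helper (not the module's API): returns copies of `x` and `y`
/// reordered so that `x` is ascending.
fn sort_pairs_by_x(x: &[f64], y: &[f64]) -> (Vec<f64>, Vec<f64>) {
    // Only pair up as many points as both slices provide, so a length
    // mismatch cannot cause an out-of-bounds panic in this sketch.
    let n = x.len().min(y.len());
    let mut order: Vec<usize> = (0..n).collect();

    // `sort_by` is a stable sort, so points with equal x keep their original
    // relative order. `total_cmp` defines a total order on f64, which keeps
    // the comparison well-defined even if a NaN slips through.
    order.sort_by(|&a, &b| x[a].total_cmp(&x[b]));

    let xs: Vec<f64> = order.iter().map(|&i| x[i]).collect();
    let ys: Vec<f64> = order.iter().map(|&i| y[i]).collect();
    (xs, ys)
}
```

Sorting an index permutation rather than the values themselves keeps x and y aligned and leaves the caller's input untouched.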

Key parameters and their semantics

  • x: slice of independent variable values. Many helpers assume x is sorted ascending; if not, use sort_by_x before window-based ops.
  • y: slice of dependent variable values; must be aligned with x.
  • frac: smoothing fraction in (0, 1]. Used to compute window sizes. validate_inputs will reject non-finite or out-of-range values.
  • window_size (usize): discrete neighborhood size computed from frac and n via calculate_window_size. Always at least 2 and at most n.
  • delta: interpolation distance threshold (Option<T> or T). When delta is None, calculate_delta returns a conservative default (~1% of the x-range). A zero or negative delta disables the skip/fast-path logic.
  • left/right/current/idx: integer indices describing the local window boundaries; many window helpers return clamped values in [0, n-1].
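The relationship between frac, n, and window_size described above can be sketched as follows. The function name `window_size_sketch` is hypothetical, and rounding the fractional span up is an assumption; the module's `calculate_window_size` may round differently. Only the bounds (at least 2, at most n) come from the description above.

```rust
/// Hypothetical sketch: convert a smoothing fraction into a discrete window size.
fn window_size_sketch(n: usize, frac: f64) -> usize {
    // Assumption: round the fractional span up to the next integer.
    // A float-to-usize cast saturates in Rust, so NaN or negative
    // products become 0 instead of causing undefined behavior.
    let raw = (frac * n as f64).ceil() as usize;

    // Clamp to at least 2 and at most n. Written as max-then-min (rather
    // than `clamp`) so that n < 2 yields n instead of panicking.
    raw.max(2).min(n)
}
```

For example, under these assumptions `window_size_sketch(10, 0.5)` gives 5, while a very small fraction such as 0.05 is raised to the minimum window of 2.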

Important functions (brief)

  • validate_inputs(x, y, frac): checks lengths, finiteness, minimum points, and fraction bounds. Returns a Result for early failure in callers.
  • validate_confidence_level(level): ensures that level ∈ (0, 1) for interval code.
  • validate_delta(delta): ensures delta ≥ 0 and finite.
  • sort_by_x: stable mapping of x/y to sorted order (returns new Vecs).
  • calculate_delta(delta, x_sorted): resolves optional delta to a numeric value (defaults to 1% of range for None).
  • calculate_window_size(n, frac): converts fractional span to integer window size with safe min/max clamping.
  • initialize_window / update_window: establish and slide local windows while keeping them valid for regression computations.
  • interpolate_gap: linear interpolation between two fitted points; used when delta indicates points can be filled in rather than re-fit.
  • skip_close_points: delta-driven fast-path to skip/refill sequences of points close to the last fitted x. Also handles identical-x ties by copying the last fitted value for stability.
  • normalize_weights / normalize_all_weights: numeric-safe normalization with uniform fallback when total weight is (near) zero.
  • compute_range / find_rightmost_point: range and threshold helpers used by windowing and delta logic.
  • compute_weighted_average / is_effectively_zero: small numeric utilities used widely across fitting and diagnostics code.
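Two of the helpers listed above are small enough to sketch directly: gap interpolation and weight normalization with a uniform fallback. The function names, the `f64` type, the argument layout, and the near-zero threshold (`f64::EPSILON`) are all assumptions for illustration and are not taken from the module's actual signatures.

```rust
/// Hypothetical sketch: fill `ys[left + 1..right]` by linear interpolation
/// between the already-fitted values `ys[left]` and `ys[right]`.
fn interpolate_gap_sketch(x: &[f64], ys: &mut [f64], left: usize, right: usize) {
    // Nothing to fill, or indices out of range: leave `ys` unchanged.
    if right <= left + 1 || right >= x.len() || right >= ys.len() {
        return;
    }
    let span = x[right] - x[left];
    for i in (left + 1)..right {
        ys[i] = if span > 0.0 {
            // Fractional position of x[i] between the two fitted points.
            let t = (x[i] - x[left]) / span;
            ys[left] + t * (ys[right] - ys[left])
        } else {
            // Identical x values: copy the last fitted value for stability.
            ys[left]
        };
    }
}

/// Hypothetical sketch: normalize weights in place so they sum to 1,
/// falling back to uniform weights when the total is (near) zero.
fn normalize_weights_sketch(weights: &mut [f64]) {
    if weights.is_empty() {
        return;
    }
    let total: f64 = weights.iter().sum();
    // Assumed threshold: treat totals within f64::EPSILON of zero,
    // or non-finite totals, as degenerate.
    if !total.is_finite() || total.abs() <= f64::EPSILON {
        let uniform = 1.0 / weights.len() as f64;
        weights.iter_mut().for_each(|w| *w = uniform);
    } else {
        weights.iter_mut().for_each(|w| *w /= total);
    }
}
```

Both functions return without modifying their input on degenerate arguments, mirroring the "safe values rather than panics" approach described for this module.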

Production recommendations

  • Validate and deduplicate inputs upstream for large datasets. The helpers here validate and handle edge cases, but upstream cleaning improves determinism and performance.
  • Use delta (default ~1% of range) for dense inputs to accelerate fitting.
  • Choose frac in (0, 1]; cross-validation at the builder layer is recommended for data-driven selection.
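To make the delta recommendation concrete, the following sketch resolves an optional delta to a number, using 1% of the x-range as the default. The name `resolve_delta_sketch` and the `f64` type are assumptions; mapping non-positive or non-finite values to 0.0 (fast path disabled) is also an assumption of this example and is not confirmed behavior of `calculate_delta`.

```rust
/// Hypothetical sketch: resolve an optional delta against sorted x values.
fn resolve_delta_sketch(delta: Option<f64>, x_sorted: &[f64]) -> f64 {
    match delta {
        // A usable, caller-supplied threshold is taken as-is.
        Some(d) if d.is_finite() && d > 0.0 => d,
        // Assumption: zero, negative, or non-finite values disable the
        // skip/fast-path logic, represented here as 0.0.
        Some(_) => 0.0,
        None => {
            // With x sorted ascending, the range is last minus first.
            let range = match (x_sorted.first(), x_sorted.last()) {
                (Some(lo), Some(hi)) => hi - lo,
                _ => 0.0, // empty input: no range, so no fast path
            };
            0.01 * range
        }
    }
}
```

With x spanning 0 to 100 and no explicit delta, this sketch yields a threshold of about 1.0, so fitted points closer together than that would be candidates for interpolation rather than a full re-fit.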

Functions

  • calculate_delta: Calculate delta parameter for interpolation optimization.
  • calculate_window_size: Calculate window size from fraction and number of points.
  • compute_range: Compute the range of x values.
  • compute_weighted_average: Compute weighted average over a range.
  • find_rightmost_point: Find index of rightmost point within a distance threshold.
  • initialize_window: Initialize window boundaries for a given point.
  • interpolate_gap: Linearly interpolate smoothed values between fitted points.
  • is_effectively_zero: Check if a value is effectively zero.
  • is_sorted: Check if x values are sorted in ascending order.
  • normalize_all_weights: Normalize entire weight array to sum to 1.
  • normalize_weights: Normalize weights to sum to 1 over a specified range.
  • skip_close_points: Skip points that are within delta distance of the last fitted point.
  • sort_by_x: Sort data by x values and return sorted (x, y) pairs.
  • update_window: Update window boundaries to maintain optimal window around current point.
  • validate_confidence_level: Validate confidence level parameter.
  • validate_delta: Validate delta parameter.
  • validate_inputs: Validate input arrays for LOWESS smoothing.
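
The checks attributed to validate_inputs (matching lengths, finiteness, minimum points, and fraction bounds) might look roughly like the sketch below. The function name, the `String` error type, and the minimum of 2 points are assumptions for illustration; the module's real error type and thresholds may differ.

```rust
/// Hypothetical sketch of input validation for LOWESS smoothing.
fn validate_inputs_sketch(x: &[f64], y: &[f64], frac: f64) -> Result<(), String> {
    // x and y must be aligned point-for-point.
    if x.len() != y.len() {
        return Err(format!("length mismatch: x has {}, y has {}", x.len(), y.len()));
    }
    // Assumption: at least 2 points are needed to fit a local line.
    if x.len() < 2 {
        return Err(format!("need at least 2 points, got {}", x.len()));
    }
    // Reject NaN and infinite values in either slice.
    if x.iter().chain(y.iter()).any(|v| !v.is_finite()) {
        return Err("inputs contain NaN or infinite values".to_string());
    }
    // frac must be finite and lie in (0, 1].
    if !frac.is_finite() || frac <= 0.0 || frac > 1.0 {
        return Err(format!("frac must be in (0, 1], got {frac}"));
    }
    Ok(())
}
```

Returning a Result lets callers fail early with `?` before any windowing or fitting work begins.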