Crate lowess


LOWESS (Locally Weighted Scatterplot Smoothing) for Rust.

This crate provides a fast, robust, production-oriented LOWESS implementation. It is intended for analysis pipelines, batch jobs, and services where determinism, safety, observability, and configurable performance trade-offs are required.

Key capabilities

  • Robust smoothing using iteratively reweighted least squares (IRLS).
  • Multiple kernel choices: Tricube (default), Epanechnikov, Gaussian, Uniform, Quartic, Cosine, Triangle.
  • Per-point standard errors, confidence intervals (mean) and prediction intervals (new observations).
  • Automatic fraction selection via cross-validation (simple RMSE, k-fold, LOOCV) and optional parallel CV.
  • Delta-based interpolation fast-path for dense inputs to reduce compute.
  • Memory-efficient variants for datasets too large to fit in memory via the streaming/online/chunked backends.
  • Optional parallel execution (feature = "parallel") via Rayon.
  • Optional ndarray convenience adapters (feature = "ndarray").
  • no_std-compatible with alloc for embedded or constrained environments.
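As a rough illustration of the default tricube kernel listed above, independent of this crate's internal API, the weight for a normalized distance u can be sketched as:

```rust
// Illustrative tricube kernel: w(u) = (1 - |u|^3)^3 for |u| < 1, else 0.
// Standalone sketch, not this crate's internal implementation.
fn tricube(u: f64) -> f64 {
    let a = u.abs();
    if a >= 1.0 {
        0.0
    } else {
        let t = 1.0 - a * a * a;
        t * t * t
    }
}

fn main() {
    assert_eq!(tricube(0.0), 1.0); // full weight at the target point
    assert_eq!(tricube(1.0), 0.0); // zero weight at the window edge
    println!("tricube(0.5) = {}", tricube(0.5));
}
```

Points near the center of the local window dominate the fit; points at the edge contribute nothing.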

Concepts and parameters (summary)

  • x, y: aligned input slices of the independent and dependent variables.
    • Caller responsibility: remove NaNs/infs and prefer pre-sorting x for reproducible window semantics. The builder also offers helpers to sort.
  • fraction (span): smoothing fraction ∈ (0, 1]. Controls local window size.
    • Typical default: 0.67. Smaller fractions produce less smoothing.
  • iterations / niter: robustness IRLS iterations (usize). 0 disables IRLS.
    • Typical values: 0 (fast), 2–5 (robust). Auto-convergence can stop early.
  • delta: interpolation distance threshold (T or Option<T>).
    • delta <= 0 disables interpolation (fit every point).
    • None resolves to a conservative default (≈1% of x-range).
    • Use delta on dense inputs to interpolate between anchor fits and save time.
  • weight_function: kernel choice. Tricube recommended for general use.
  • interval_level and interval_type: compute confidence and/or prediction intervals at the specified probability (e.g. 0.95).
  • cv_fractions and cv_method: candidate fractions and CV strategy for automatic selection. Returns cv_scores on success.
  • auto_convergence and max_iterations: tolerance and cap for stopping IRLS early based on maximum change in fitted values.
  • compute_diagnostics / compute_residuals / compute_robustness_weights: booleans controlling what additional outputs are produced.
  • parallel feature: enable multithreaded CV and fitting for large n.
  • zero_weight_fallback: policy for neighborhoods with zero kernel weight:
    • UseLocalMean, ReturnOriginal, or ReturnNone (propagate failure).
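The robustness iterations above follow the classic Cleveland reweighting scheme. As a standalone sketch (not this crate's internal code), one IRLS step downweights points by a bisquare function of their residuals, scaled by six times the median absolute residual:

```rust
// Standalone sketch of one IRLS robustness-reweighting step (bisquare),
// following the classic Cleveland scheme; not this crate's internal code.
fn bisquare_weights(residuals: &[f64]) -> Vec<f64> {
    let mut abs: Vec<f64> = residuals.iter().map(|r| r.abs()).collect();
    abs.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let m = abs.len();
    let median = if m % 2 == 1 {
        abs[m / 2]
    } else {
        0.5 * (abs[m / 2 - 1] + abs[m / 2])
    };
    let s = 6.0 * median;
    residuals
        .iter()
        .map(|r| {
            if s <= 0.0 {
                1.0 // degenerate scale: fall back to uniform weights
            } else {
                let u = r / s;
                if u.abs() >= 1.0 { 0.0 } else { (1.0 - u * u).powi(2) }
            }
        })
        .collect()
}

fn main() {
    // A large residual (outlier) receives a much smaller weight.
    let w = bisquare_weights(&[0.1, -0.2, 0.15, 5.0]);
    assert!(w[3] < w[0]);
    println!("{:?}", w);
}
```

Each IRLS iteration refits with these weights, so outliers progressively lose influence on the smoothed curve.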

Outputs (LowessResult)

  • x: sorted independent variable values (builder sorts inputs).
  • y: smoothed values aligned with x.
  • standard_errors: per-point SE when requested.
  • confidence_lower/upper, prediction_lower/upper: optional interval bounds.
  • residuals: optional residual vector.
  • robustness_weights: optional final IRLS weights.
  • diagnostics: optional struct with RMSE, MAE, R², AIC, AICc, effective df.
  • iterations_used, fraction_used, cv_scores: metadata for monitoring/telemetry.

Error handling

  • Returns Result<T, LowessError> with explicit variants for common failures: EmptyInput, MismatchedInputs, InvalidFraction, InvalidDelta, InvalidNumericValue, TooFewPoints, InvalidConfidenceLevel.
  • Functions are defensive: degenerate situations return safe defaults rather than panicking in release builds. Debug assertions exist for development.
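As a shape-only illustration of exhaustive handling (the local enum below merely mirrors the documented variants; the real type is lowess::LowessError):

```rust
// Shape-only sketch: a local enum mirroring the documented LowessError
// variants, to illustrate exhaustive matching. The real type lives in
// this crate; descriptions here are illustrative.
#[derive(Debug)]
enum LowessError {
    EmptyInput,
    MismatchedInputs,
    InvalidFraction,
    InvalidDelta,
    InvalidNumericValue,
    TooFewPoints,
    InvalidConfidenceLevel,
}

fn describe(e: &LowessError) -> &'static str {
    match e {
        LowessError::EmptyInput => "x/y were empty",
        LowessError::MismatchedInputs => "x and y lengths differ",
        LowessError::InvalidFraction => "fraction outside (0, 1]",
        LowessError::InvalidDelta => "delta was not a valid distance",
        LowessError::InvalidNumericValue => "NaN or infinite input value",
        LowessError::TooFewPoints => "not enough points for the window",
        LowessError::InvalidConfidenceLevel => "interval level outside (0, 1)",
    }
}

fn main() {
    assert_eq!(describe(&LowessError::EmptyInput), "x/y were empty");
}
```

Exhaustive matching lets the compiler flag any variant a caller forgot to handle.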

Determinism & numeric safety

  • Sorting, stable default choices, and avoidance of global mutable state provide deterministic outputs for a fixed configuration and inputs.
  • Numerics: conservative fallbacks for near-zero scales, uniform fallback when all kernel weights evaluate to zero, and clamped tuned-scales avoid divide-by-zero issues.

Performance & operational guidance

  • For large datasets, enable “parallel” and pre-allocate buffers to reduce allocation overhead across repeated calls.
  • Use delta for dense inputs to reduce per-point regression costs.
  • Use cross-validation sparingly on very large candidate grids; prefer coarse-to-fine search or parallel CV when available.
  • Monitor diagnostics (RMSE, effective sample size, count_downweighted) in production to detect pathological fits.
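The delta fast-path amounts to fitting only "anchor" points and linearly interpolating between them. Conceptually (standalone sketch, not this crate's implementation):

```rust
// Conceptual sketch of the delta fast-path: given fitted values at two
// anchor x-positions, points within delta of an anchor are linearly
// interpolated rather than refit. Standalone illustration only.
fn interpolate(x0: f64, y0: f64, x1: f64, y1: f64, x: f64) -> f64 {
    if x1 == x0 {
        0.5 * (y0 + y1) // coincident anchors: average the two fits
    } else {
        y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    }
}

fn main() {
    // Anchors fitted at x = 0 and x = 10; x = 2.5 is interpolated.
    let y = interpolate(0.0, 1.0, 10.0, 5.0, 2.5);
    assert_eq!(y, 2.0);
    println!("interpolated y = {y}");
}
```

For dense, smooth inputs this replaces many local regressions with cheap arithmetic, which is why delta matters for performance.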

Examples

  • Basic smoothing
use lowess::Lowess;
let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![2.0, 4.1, 5.9, 8.2, 9.8];
let result = Lowess::new().fraction(0.5).iterations(3).fit(&x, &y).unwrap();
println!("Smoothed y: {:?}", result.y);
  • With 95% confidence intervals and diagnostics
use lowess::Lowess;
let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![2.0, 4.1, 5.9, 8.2, 9.8];
let result = Lowess::new()
    .fraction(0.5)
    .with_confidence_intervals(0.95)
    .with_all_diagnostics()
    .fit(&x, &y)
    .unwrap();
println!("RMSE: {:?}", result.diagnostics.map(|d| d.rmse));
  • Cross-validated fraction selection (parallel-enabled for large n)
use lowess::Lowess;
let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![2.0, 4.1, 5.9, 8.2, 9.8];
let candidate = vec![0.2, 0.3, 0.5, 0.7];
let result = Lowess::new()
    .cross_validate(&candidate)
    .fit(&x, &y)
    .unwrap();
println!("Selected fraction: {}", result.fraction_used);
  • Streaming / online / chunked processing (use ProcessingMode)
use lowess::{Lowess, ProcessingMode};
// example data
let x = (0..50).map(|i| i as f64).collect::<Vec<_>>();
let y = x.iter().map(|v| 2.0 * v + 1.0).collect::<Vec<_>>();

// Obtain a processing-mode variant. Avoid calling `.build()` on the
// wrapped mode-specific builder inside doctests because some backends
// validate inter-dependent defaults (e.g. overlap < chunk_size) which
// can cause doctest failures.
let variant = Lowess::new()
    .fraction(0.5)
    .iterations(1)
    .for_mode(ProcessingMode::Streaming)
    .chunk_size(10);

match variant {
    // Batch contains the standard Lowess builder — call `.fit(...)` directly.
    lowess::builder::ProcessingVariant::Batch(batch_builder) => {
        let result = batch_builder.fit(&x, &y).expect("fit");
        println!("Batch result length: {}", result.x.len());
    }
    // Streaming contains the streaming-mode builder — in real code call
    // `stream_builder.build()?` to obtain the processor and use its methods.
    lowess::builder::ProcessingVariant::Streaming(_stream_builder) => {
        // mode-specific builder is available here; avoid calling `.build()` in doctests.
    }
    _ => {
        // Online / Chunked variants follow the same pattern.
    }
}
  • Auto-convergence example
use lowess::Lowess;
let x = vec![1.0, 2.0, 3.0, 4.0];
let y = vec![2.0, 4.1, 5.9, 8.2];
let r = Lowess::new().auto_converge(1e-4).max_iterations(20).fit(&x, &y).unwrap();
if let Some(iters) = r.iterations_used { println!("Converged after {}", iters); }

API tips and best practices

  • Pre-clean inputs (remove NaNs/infs) and sort x for deterministic windowing.
  • Choose sensible defaults in the builder for production: use delta for dense data, modest iterations (2–3) for robustness, and enable diagnostics in scheduled batch jobs.
  • When using parallel execution, benchmark recommended_chunk_size() and pre-allocate per-call buffers for throughput-sensitive workloads.
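The pre-cleaning step recommended above can be sketched in plain Rust (this helper is illustrative, not part of this crate's API):

```rust
// Standalone sketch of pre-cleaning: drop non-finite pairs and sort by x,
// as recommended above; not part of this crate's API.
fn preclean(x: &[f64], y: &[f64]) -> (Vec<f64>, Vec<f64>) {
    let mut pairs: Vec<(f64, f64)> = x
        .iter()
        .zip(y)
        .filter(|(a, b)| a.is_finite() && b.is_finite())
        .map(|(a, b)| (*a, *b))
        .collect();
    // Non-finite values were filtered out, so partial_cmp cannot fail here.
    pairs.sort_by(|p, q| p.0.partial_cmp(&q.0).unwrap());
    pairs.into_iter().unzip()
}

fn main() {
    let (xs, ys) = preclean(&[3.0, f64::NAN, 1.0, 2.0], &[9.0, 0.0, 1.0, 4.0]);
    assert_eq!(xs, vec![1.0, 2.0, 3.0]);
    assert_eq!(ys, vec![1.0, 4.0, 9.0]);
    println!("{xs:?} {ys:?}");
}
```

Sorting pairs jointly keeps x and y aligned, which the fitting functions assume.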

See module-level documentation (builder, core, regression, kernel, confidence, robustness, utils, parallel) for function-level argument descriptions, return conventions, and panic/assert behavior.

Re-exports

pub use builder::Diagnostics;
pub use builder::IntervalType;
pub use builder::LowessBuilder as Lowess;
pub use builder::LowessResult;
pub use builder::ProcessingMode;
pub use builder::ProcessingVariant;
pub use kernel::WeightFunction;
pub use kernel::WeightFunctionInfo;

Modules

builder
LOWESS Builder Pattern
confidence
Confidence intervals, prediction intervals and standard error computation for LOWESS
core
Core LOWESS algorithm implementation.
kernel
Kernel weight functions for LOWESS smoothing.
regression
Local regression fitting for LOWESS smoothing.
robustness
Robustness weighting for outlier-resistant LOWESS smoothing.
streaming
Streaming and online LOWESS for very large datasets.
utils
Utility functions for LOWESS smoothing.

Enums

LowessError
LOWESS error types.

Functions

lowess
Perform LOWESS smoothing with default parameters.
lowess_robust
Perform robust LOWESS smoothing (5 iterations).
lowess_with_fraction
Perform LOWESS smoothing with custom fraction.

Type Aliases

Result
Result type for LOWESS operations.