fastLowess
High-performance parallel LOWESS (Locally Weighted Scatterplot Smoothing) for Rust: a high-level wrapper around the lowess crate that adds rayon-based parallelism and seamless ndarray integration.
[!IMPORTANT] For a minimal, single-threaded, and no_std version, use base lowess.
Features
- Parallel by Default: Multi-core regression fits via rayon; the benchmarks in this README show speedups of up to 3914× over statsmodels on large datasets.
- ndarray Integration: Native support for Array1<T> and ArrayView1<T>.
- Robust Statistics: MAD-based scale estimation and IRLS with Bisquare, Huber, or Talwar weighting.
- Uncertainty Quantification: Point-wise standard errors, confidence intervals, and prediction intervals.
- Optimized Performance: Delta optimization for skipping dense regions and streaming/online modes.
- Parameter Selection: Built-in cross-validation for automatic smoothing fraction selection.
Robustness Advantages
Built on the same core as lowess, this implementation is more robust than statsmodels due to:
MAD-Based Scale Estimation
We use the Median Absolute Deviation (MAD) for scale estimation, which attains the maximum possible breakdown point of 50%:
s = median(|r_i - median(r)|)
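The formula above can be sketched in plain Rust as follows. This is a minimal illustration, not the crate's internal code; the helper names `median` and `mad` are hypothetical.

```rust
/// Median of a slice. Sorts a copy; panics on empty input or NaN values.
fn median(values: &[f64]) -> f64 {
    assert!(!values.is_empty(), "median of empty slice");
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).expect("NaN in input"));
    let mid = v.len() / 2;
    if v.len() % 2 == 0 {
        (v[mid - 1] + v[mid]) / 2.0
    } else {
        v[mid]
    }
}

/// s = median(|r_i - median(r)|)
fn mad(residuals: &[f64]) -> f64 {
    let center = median(residuals);
    let deviations: Vec<f64> = residuals.iter().map(|r| (r - center).abs()).collect();
    median(&deviations)
}
```

For residuals `[1, 2, 3, 4, 100]` the MAD is 1: the single outlier does not move the estimate, whereas a standard deviation would be dominated by it.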
Boundary Padding
We apply boundary policies (Extend, Reflect, Zero) at dataset edges to maintain symmetric local neighborhoods, preventing the edge bias common in other implementations.
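To make the three policy names concrete, here is a sketch of padding a one-dimensional sequence of values. It assumes Extend repeats the edge value, Reflect mirrors the data about the edge without repeating it, and Zero pads with zeros. The enum and `pad_values` are hypothetical stand-ins, and how the crate itself pads x and y is not specified here.

```rust
/// Hypothetical enum mirroring the policy names in the text.
#[derive(Clone, Copy)]
enum BoundaryPolicy {
    Extend,
    Reflect,
    Zero,
}

/// Pad `values` with `pad` extra entries on each side.
/// Requires more data points than padding so that Reflect stays in bounds.
fn pad_values(values: &[f64], pad: usize, policy: BoundaryPolicy) -> Vec<f64> {
    let n = values.len();
    assert!(n > pad, "need more data points than padding");
    let mut out = Vec::with_capacity(n + 2 * pad);

    // Left side: walk from the farthest padded slot toward the first real point.
    for i in (1..=pad).rev() {
        out.push(match policy {
            BoundaryPolicy::Extend => values[0],
            BoundaryPolicy::Reflect => values[i],
            BoundaryPolicy::Zero => 0.0,
        });
    }

    out.extend_from_slice(values);

    // Right side: walk outward from the last real point.
    for i in 1..=pad {
        out.push(match policy {
            BoundaryPolicy::Extend => values[n - 1],
            BoundaryPolicy::Reflect => values[n - 1 - i],
            BoundaryPolicy::Zero => 0.0,
        });
    }
    out
}
```

With padding in place, a point at the edge has neighbors on both sides, so its local neighborhood is symmetric rather than one-sided.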
Gaussian Consistency Factor
For interval estimates, the residual scale is the MAD rescaled so that it is consistent with the standard deviation under Gaussian noise:
sigma = 1.4826 * MAD
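The constant follows from the normal distribution. For zero-mean Gaussian residuals with standard deviation $\sigma$, the median of $|r|$ sits at the 75th percentile of the distribution, which gives:

```latex
\mathrm{MAD} = \sigma \, \Phi^{-1}(3/4) \approx 0.6745 \, \sigma
\quad \Longrightarrow \quad
\hat{\sigma} = \frac{\mathrm{MAD}}{\Phi^{-1}(3/4)} \approx 1.4826 \cdot \mathrm{MAD}
```

Here $\Phi^{-1}$ is the standard normal quantile function.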
Performance Advantages
Benchmarked against Python's statsmodels, fastLowess is up to 3914× faster. Median speedups per category range from 5× (Delta) to 819× (Scalability), and even at 100k points the parallel implementation finishes in under 12 ms.
Summary
| Category | Matched | Median Speedup | Mean Speedup |
|---|---|---|---|
| Scalability | 5 | 819× | 1482× |
| Pathological | 4 | 503× | 476× |
| Iterations | 6 | 491× | 496× |
| Fraction | 6 | 464× | 447× |
| Financial | 4 | 351× | 418× |
| Scientific | 4 | 345× | 404× |
| Genomic | 4 | 22× | 26× |
| Delta | 4 | 5× | 6.8× |
Top 10 Performance Wins
| Benchmark | statsmodels | fastLowess | Speedup |
|---|---|---|---|
| scale_100000 | 43.727s | 11.2ms | 3914× |
| scale_50000 | 11.160s | 5.74ms | 1946× |
| financial_10000 | 497.1ms | 0.59ms | 839× |
| scientific_10000 | 777.2ms | 0.93ms | 835× |
| scale_10000 | 663.1ms | 0.81ms | 819× |
| clustered | 267.8ms | 0.48ms | 554× |
| scale_5000 | 229.9ms | 0.42ms | 554× |
| fraction_0.1 | 227.9ms | 0.42ms | 542× |
| fraction_0.05 | 197.2ms | 0.37ms | 536× |
| financial_5000 | 170.9ms | 0.32ms | 536× |
Check Benchmarks for detailed comparisons.
Installation
Add this to your Cargo.toml:
[dependencies]
fastLowess = "0.2"
Quick Start
use *;
use ndarray::Array1;
Builder Methods
use *;
new
// Smoothing span (0, 1]
.fraction
// Robustness iterations
.iterations
// Interpolation threshold
.delta
// Kernel selection
.weight_function
// Robustness method
.robustness_method
// Zero-weight fallback behavior
.zero_weight_fallback
// Boundary handling (for edge effects)
.boundary_policy
// Confidence intervals
.confidence_intervals
// Prediction intervals
.prediction_intervals
// Diagnostics
.return_diagnostics
.return_residuals
.return_robustness_weights
// Cross-validation (for parameter selection)
.cross_validate
// Convergence
.auto_converge
.max_iterations
// Execution mode
.adapter
// Parallelism
.parallel
// Build the model
.build()?;
Result Structure
[!TIP] Using with ndarray: While the result struct uses Vec<T> for maximum compatibility, you can convert any field to an Array1 using Array1::from_vec(result.y).
Streaming Processing
For datasets that don't fit in memory:
use *;
let mut processor = new
.fraction
.iterations
.adapter
.parallel(true) // Enable parallel chunk processing
.chunk_size
.overlap
.build()?;
// Process data in chunks
for chunk in data_chunks
// Finalize processing
let final_result = processor.finalize()?;
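The chunk size and overlap settings interact as follows: each chunk after the first starts `chunk_size - overlap` points after the previous one, so neighboring chunks share `overlap` points. The sketch below only computes the index ranges. `chunk_ranges` is a hypothetical helper, and how the crate merges fitted values in the shared regions is not covered here.

```rust
/// Half-open index ranges [start, end) that cover `n` points with chunks of
/// at most `chunk_size` points, where consecutive chunks share `overlap` points.
fn chunk_ranges(n: usize, chunk_size: usize, overlap: usize) -> Vec<(usize, usize)> {
    assert!(chunk_size > overlap, "chunk_size must exceed overlap");
    let step = chunk_size - overlap;
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < n {
        let end = (start + chunk_size).min(n);
        ranges.push((start, end));
        if end == n {
            break; // the last chunk reached the end of the data
        }
        start += step;
    }
    ranges
}
```

For 10 points with `chunk_size = 4` and `overlap = 1`, this yields `(0, 4)`, `(3, 7)`, `(6, 10)`. A larger overlap gives each chunk more context at its edges at the cost of processing more points twice.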
Online Processing
For real-time data streams:
use *;
let mut processor = new
.fraction
.iterations
.adapter
.parallel(false) // Sequential for lowest per-point latency
.window_capacity
.build()?;
// Process points as they arrive
for in data_stream
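Conceptually, a window capacity bounds how many recent points an online smoother keeps. The following sketch shows only that buffering idea with a standard-library `VecDeque`. `SlidingWindow` is a hypothetical type for illustration and is not the crate's online adapter.

```rust
use std::collections::VecDeque;

/// Fixed-capacity buffer of the most recent (x, y) points.
struct SlidingWindow {
    capacity: usize,
    points: VecDeque<(f64, f64)>,
}

impl SlidingWindow {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            capacity,
            points: VecDeque::with_capacity(capacity),
        }
    }

    /// Add a point. When the window is full, the oldest point is evicted
    /// and returned; otherwise `None` is returned.
    fn push(&mut self, x: f64, y: f64) -> Option<(f64, f64)> {
        let evicted = if self.points.len() == self.capacity {
            self.points.pop_front()
        } else {
            None
        };
        self.points.push_back((x, y));
        evicted
    }

    /// Number of points currently held.
    fn len(&self) -> usize {
        self.points.len()
    }
}
```

Because each new point touches only the window, per-point cost depends on the capacity rather than on the total length of the stream.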
Parameter Selection Guide
Fraction (Smoothing Span)
- 0.1-0.3: Local, captures rapid changes (wiggly)
- 0.4-0.6: Balanced, general-purpose
- 0.7-1.0: Global, smooth trends only
- Default: 0.67 (2/3, Cleveland's choice)
- Use CV when uncertain
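To get a feel for these ranges, it helps to translate a fraction into the number of neighbors used in each local fit. The sketch below assumes one common convention, `ceil(fraction * n)` clamped to at least 2 points; implementations differ on the rounding, and `window_size` is a hypothetical helper rather than part of the crate.

```rust
/// Number of nearest neighbors used for each local fit.
/// Assumption: ceil(fraction * n), clamped to [min(2, n), n].
fn window_size(n: usize, fraction: f64) -> usize {
    assert!(fraction > 0.0 && fraction <= 1.0, "fraction must be in (0, 1]");
    let k = (fraction * n as f64).ceil() as usize;
    k.clamp(2.min(n), n)
}
```

With 1000 points, a fraction of 0.25 fits each local line to 250 neighbors, while a fraction of 1.0 uses every point in every fit.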
Robustness Iterations
- 0: Clean data, speed critical
- 1-2: Light contamination
- 3: Default, good balance (recommended)
- 4-5: Heavy outliers
- >5: Diminishing returns
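Each robustness iteration recomputes per-point weights from the current residuals and refits. A sketch of one reweighting step with the bisquare function is shown below. It follows Cleveland (1979) in scaling residuals by 6 times a robust scale estimate; whether the crate uses the same tuning constant is an assumption, and the function names are hypothetical.

```rust
/// Bisquare function: (1 - u^2)^2 for |u| < 1, and 0 otherwise.
fn bisquare_weight(u: f64) -> f64 {
    let a = u.abs();
    if a < 1.0 {
        let t = 1.0 - a * a;
        t * t
    } else {
        0.0
    }
}

/// One reweighting step: weight_i = B(r_i / (6 * scale)).
/// `scale` is a robust scale estimate of the residuals (for example a MAD).
/// If the scale is not positive, every point keeps full weight.
fn robustness_weights(residuals: &[f64], scale: f64) -> Vec<f64> {
    if scale <= 0.0 {
        return vec![1.0; residuals.len()];
    }
    residuals
        .iter()
        .map(|r| bisquare_weight(r / (6.0 * scale)))
        .collect()
}
```

Points with residuals beyond 6 times the scale get weight 0 and drop out of the next fit, which is why a few iterations are usually enough to suppress outliers.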
Kernel Function
- Tricube (default): Best all-around, smooth, efficient
- Epanechnikov: Theoretically optimal MSE
- Gaussian: Very smooth, no compact support
- Uniform: Fastest, least smooth (moving average)
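For reference, the four kernels can be written as functions of the scaled distance `u` (distance to the query point divided by the neighborhood radius). This is a sketch using the textbook definitions without normalization constants, since only relative weights matter in a local fit; the crate's exact scaling, especially the Gaussian bandwidth, may differ.

```rust
/// Tricube: (1 - |u|^3)^3 on |u| < 1, zero outside.
fn tricube(u: f64) -> f64 {
    let a = u.abs();
    if a < 1.0 {
        (1.0 - a.powi(3)).powi(3)
    } else {
        0.0
    }
}

/// Epanechnikov (unnormalized): 1 - u^2 on |u| < 1, zero outside.
fn epanechnikov(u: f64) -> f64 {
    if u.abs() < 1.0 {
        1.0 - u * u
    } else {
        0.0
    }
}

/// Gaussian: exp(-u^2 / 2); positive everywhere, so no compact support.
fn gaussian(u: f64) -> f64 {
    (-0.5 * u * u).exp()
}

/// Uniform: constant weight on |u| < 1, zero outside.
fn uniform(u: f64) -> f64 {
    if u.abs() < 1.0 {
        1.0
    } else {
        0.0
    }
}
```

Tricube and Epanechnikov both taper to zero at the edge of the neighborhood, while Uniform weights all neighbors equally, which is what makes it behave like a moving average.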
Delta Optimization
- None: Small datasets (n < 1000)
- 0.01 × range(x): Good starting point for dense data
- Manual tuning: Adjust based on data density
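The starting-point rule above can be sketched as a small helper. The 1000-point threshold and the 0.01 factor come from this guide; `suggested_delta` is a hypothetical function, not a crate API.

```rust
/// Suggest a delta for dense data: 0.01 * range(x).
/// Returns `None` for small datasets (n < 1000), where delta is not needed.
fn suggested_delta(x: &[f64]) -> Option<f64> {
    if x.len() < 1000 {
        return None;
    }
    let min = x.iter().copied().fold(f64::INFINITY, f64::min);
    let max = x.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some(0.01 * (max - min))
}
```

For x values spanning 0 to 1999, this suggests a delta of about 19.99, meaning points closer together than that are interpolated rather than fitted individually.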
Examples
Check the examples directory for advanced usage.
MSRV
Rust 1.85.0 or later (2024 Edition).
Validation
Validated against:
- Python (statsmodels): Passed on 44 distinct test scenarios.
- Original Paper: Reproduces Cleveland (1979) results.
Check Validation for more information. Small variations in results are expected due to differences in scale estimation and padding.
Related Work
Contributing
Contributions are welcome! Please see the CONTRIBUTING.md file for more information.
License
Dual-licensed under AGPL-3.0 (Open Source) or Commercial License.
Contact <thisisamirv@gmail.com> for commercial inquiries.
References
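- Cleveland, W.S. (1979). "Robust Locally Weighted Regression and Smoothing Scatterplots". Journal of the American Statistical Association, 74(368), 829-836.
- Cleveland, W.S. (1981). "LOWESS: A Program for Smoothing Scatterplots by Robust Locally Weighted Regression". The American Statistician, 35(1), 54.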