# lowess
High-performance LOWESS (Locally Weighted Scatterplot Smoothing) for Rust — 40-500× faster than Python's statsmodels with robust statistics, confidence intervals, and parallel execution.
## Why This Crate?
- ⚡ Blazingly Fast: 40-500× faster than statsmodels, sub-millisecond smoothing for 1000 points
- 🎯 Production-Ready: Comprehensive error handling, numerical stability, extensive testing
- 📊 Feature-Rich: Confidence/prediction intervals, multiple kernels, cross-validation
- 🚀 Scalable: Parallel execution, streaming mode, delta optimization
- 🔬 Scientific: Validated against R and Python implementations
- 🛠️ Flexible: `no_std` support, ndarray integration, multiple robustness methods
## Quick Start

```rust
use lowess::Lowess;

// Example data (values illustrative)
let x = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
let y = vec![1.2, 1.9, 3.1, 3.9, 5.2, 5.8];

// Basic smoothing
let result = Lowess::new()
    .fraction(0.5)
    .fit(&x, &y)
    .unwrap();

println!("{result:?}");
```
## Installation

```toml
[dependencies]
lowess = "0.2"

# With optional features
lowess = { version = "0.2", features = ["parallel", "ndarray"] }
```
## Features at a Glance
| Feature | Description | Use Case |
|---|---|---|
| Robust Smoothing | IRLS with Bisquare/Huber/Talwar weights | Outlier-contaminated data |
| Confidence Intervals | Point-wise standard errors & bounds | Uncertainty quantification |
| Cross-Validation | Auto-select optimal fraction | Unknown smoothing parameter |
| Multiple Kernels | Tricube, Epanechnikov, Gaussian, etc. | Different smoothness profiles |
| Parallel Execution | Multi-threaded via Rayon | Large datasets (n > 1000) |
| Streaming Mode | Constant memory usage | Very large datasets |
| Delta Optimization | Skip dense regions | 10× speedup on dense data |
## Common Use Cases

### 1. Robust Smoothing (Handle Outliers)

```rust
let result = Lowess::new()
    .fraction(0.5)               // span value illustrative
    .iterations(3)               // robust iterations
    .with_robustness_weights()   // return outlier weights
    .fit(&x, &y)?;

// Check which points were downweighted
if let Some(weights) = result.robustness_weights {
    for (i, w) in weights.iter().enumerate() {
        if *w < 0.5 {
            println!("point {i} downweighted: {w:.2}");
        }
    }
}
```
### 2. Uncertainty Quantification

```rust
let result = Lowess::new()
    .fraction(0.5)                // span value illustrative
    .with_confidence_intervals()
    .with_prediction_intervals()
    .fit(&x, &y)?;

// Plot confidence bands
for i in 0..x.len() {
    // per-point lower/upper bounds are carried on `result`
}
```
### 3. Automatic Parameter Selection

```rust
// Let cross-validation find the optimal smoothing fraction
let result = Lowess::new()
    .cross_validate()
    .fit(&x, &y)?;

// Report the selected fraction and its CV score
println!("{result:?}");
```
### 4. Large Dataset Optimization

```rust
// Enable all performance optimizations
let result = Lowess::new()
    .fraction(0.3)   // span value illustrative
    .delta_auto()    // skip dense regions
    .parallel()      // multi-threaded (requires "parallel" feature)
    .fit(&x, &y)?;
```
### 5. Production Monitoring

```rust
let result = Lowess::new()
    .fraction(0.5)   // span value illustrative
    .iterations(3)
    .with_diagnostics()
    .fit(&x, &y)?;

if let Some(diag) = result.diagnostics {
    // inspect residuals, convergence, and fit quality
}
```
## Performance Benchmarks
Comparison against Python's statsmodels on typical workloads:
| Dataset Size | statsmodels | Rust (sequential) | Rust (parallel) | Sequential Speedup | Parallel Speedup |
|---|---|---|---|---|---|
| 100 points | 2.4 ms | 0.09 ms | 0.10 ms | 27× | 24× |
| 1,000 points | 32.5 ms | 0.80 ms | 0.81 ms | 41× | 40× |
| 5,000 points | 332 ms | 4.1 ms | 4.1 ms | 81× | 81× |
| 10,000 points | 1,073 ms | 8.2 ms | 8.2 ms | 131× | 245× |
### Performance Summary
- Sequential mode: 35-48× faster on average across all test scenarios
- Parallel mode: 51-76× faster on average, with 1.5-2× additional speedup from parallelization
- Pathological cases (clustered data, extreme outliers): 260-525× faster
- Small fractions (0.1 span): 80-114× faster due to localized computation
- Robustness iterations: 38-77× faster with consistent scaling across iteration counts
### When Parallelization Helps Most
Parallel execution shows the greatest gains on:
- Large datasets (>10,000 points): Up to 245× vs 131× sequential
- Multiple robustness iterations: 70-77× speedup vs statsmodels
- Small span values: 114× speedup for fraction=0.1
- Cross-validation: Linear scaling with available CPU cores
For datasets <1,000 points, sequential mode is typically sufficient as parallelization overhead outweighs benefits.
Benchmarks conducted on dual Intel Xeon Platinum 8562Y+ (64 cores total, 2×32 cores @ 4.1 GHz) running Red Hat Enterprise Linux 8.10. See validation/ directory for detailed methodology and reproducible test scripts.
## API Overview

### Builder Methods

```rust
Lowess::new()
    // Core parameters
    .fraction(0.67)      // smoothing span (0, 1], default: 0.67
    .iterations(0)       // robustness iterations, default: 0
    .delta(0.01)         // interpolation threshold (value illustrative)
    .delta_auto()        // auto-calculate delta
    // Kernel selection
    .weight_function(/* Tricube (default), Epanechnikov, Gaussian, ... */)
    // Robustness method
    .robustness_method(/* Bisquare (default), Huber, Talwar */)
    // Intervals & diagnostics
    .with_confidence_intervals()
    .with_prediction_intervals()
    .with_both_intervals()
    .with_diagnostics()
    .with_robustness_weights()
    // Parameter selection
    .cross_validate()
    .cross_validate_kfold()
    .cross_validate_loocv()
    // Convergence
    .auto_converge()
    .max_iterations(10)  // value illustrative
    // Performance
    .parallel()          // requires "parallel" feature
    .fit(&x, &y)?
```
### Result Structure

The returned result carries the smoothed values plus whatever extras were requested: confidence/prediction bounds, robustness weights, and diagnostics.
## Advanced Features
### Streaming Processing

For datasets too large to fit in memory:

```rust
// Type and method names below are illustrative; see the crate docs
// for the exact streaming API.
use lowess::{Lowess, StreamingLowess};

let config = Lowess::new().fraction(0.3).iterations(2);
let mut streaming = StreamingLowess::new(config, 10_000); // chunk size
for chunk in data_chunks {
    streaming.process_chunk(chunk)?;
}
```
### Online/Incremental Updates

Real-time smoothing with a sliding window:

```rust
// Type and method names below are illustrative; see the crate docs
// for the exact online API.
use lowess::{Lowess, OnlineLowess};

let config = Lowess::new().fraction(0.3);
let mut online = OnlineLowess::new(config, 100); // window size
for (xi, yi) in x.iter().zip(y.iter()) {
    let smoothed = online.update(*xi, *yi)?;
}
```
### ndarray Integration

```rust
use lowess::Lowess;
use ndarray::Array1;

// Example data (values illustrative)
let x: Array1<f64> = Array1::linspace(0.0, 10.0, 100);
let y: Array1<f64> = x.mapv(f64::sin) + 0.1; // offset illustrative

let result = Lowess::new()
    .fraction(0.3)
    .fit(&x, &y)?;

// Convert back to ndarray (field name illustrative)
let smoothed = Array1::from(result.y);
```
## Parameter Selection Guide

### Fraction (Smoothing Span)
- 0.1-0.3: Local, captures rapid changes (wiggly)
- 0.4-0.6: Balanced, general-purpose
- 0.7-1.0: Global, smooth trends only
- Default: 0.67 (2/3, Cleveland's choice)
- Use CV when uncertain
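For intuition, the fraction maps directly to a neighborhood size: each local fit uses roughly `fraction × n` of the nearest points. A standalone sketch of that relationship, under one common convention (this is not the crate's internal code):

```rust
/// Approximate number of points in each local regression window
/// for a given smoothing fraction: k = ceil(f * n), clamped to
/// at least 2 points so a line can still be fit.
fn window_size(fraction: f64, n: usize) -> usize {
    ((fraction * n as f64).ceil() as usize).clamp(2, n)
}
```

With 1,000 points, a fraction of 0.5 draws on 500 neighbors per fit, while 0.1 drops to roughly 100, which is why small spans track rapid changes.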
### Robustness Iterations
- 0: Clean data, speed critical
- 1-2: Light contamination
- 3: Default, good balance (recommended)
- 4-5: Heavy outliers
- >5: Diminishing returns
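Each iteration reweights points by Cleveland's bisquare function: residuals are scaled by six times their median absolute value, then mapped through `(1 − u²)²` inside the unit interval and to zero outside it, so gross outliers stop influencing the next pass. A standalone sketch of the weight function (not the crate's internal code):

```rust
/// Bisquare robustness weight for residual `r` at scale `s`
/// (Cleveland 1979 uses s = 6 * median(|residuals|)).
/// Small residuals keep weight near 1; residuals at or beyond
/// `s` are dropped entirely on the next iteration.
fn bisquare(r: f64, s: f64) -> f64 {
    let u = (r / s).abs();
    if u >= 1.0 {
        0.0
    } else {
        let t = 1.0 - u * u;
        t * t
    }
}
```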
### Kernel Function
- Tricube (default): Best all-around, smooth, efficient
- Epanechnikov: Theoretically optimal MSE
- Gaussian: Very smooth, no compact support
- Uniform: Fastest, least smooth (moving average)
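These kernels differ only in how weight decays with the scaled distance u on [−1, 1]. A standalone sketch of the standard definitions (the crate's enum variants may be named differently):

```rust
/// Tricube: (1 - |u|^3)^3 on [-1, 1] — the classic LOWESS default.
fn tricube(u: f64) -> f64 {
    let a = u.abs();
    if a >= 1.0 { 0.0 } else { (1.0 - a * a * a).powi(3) }
}

/// Epanechnikov: 0.75 * (1 - u^2) on [-1, 1] — MSE-optimal in theory.
fn epanechnikov(u: f64) -> f64 {
    if u.abs() >= 1.0 { 0.0 } else { 0.75 * (1.0 - u * u) }
}

/// Gaussian: smooth everywhere, but with no compact support, so
/// every point contributes a little to every local fit.
fn gaussian(u: f64) -> f64 {
    (-0.5 * u * u).exp() / (2.0 * std::f64::consts::PI).sqrt()
}

/// Uniform: constant weight inside the window — a moving average.
fn uniform(u: f64) -> f64 {
    if u.abs() >= 1.0 { 0.0 } else { 0.5 }
}
```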
### Delta Optimization
- None: Small datasets (n < 1000)
- Auto: Let the algorithm decide (recommended)
- Manual: ~0.01 × range(x) for dense data
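The manual rule of thumb is simple to compute; a sketch (the helper name is ours, not a crate API):

```rust
/// Manual delta per the ~0.01 × range(x) rule of thumb: points
/// within `delta` of the last exactly-fit point are filled in by
/// interpolation instead of a fresh local regression.
fn manual_delta(x: &[f64]) -> f64 {
    let min = x.iter().copied().fold(f64::INFINITY, f64::min);
    let max = x.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    0.01 * (max - min)
}
```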
## Error Handling

```rust
// Error type name illustrative; see the crate docs.
use lowess::{Lowess, LowessError};

match Lowess::new().fit(&x, &y) {
    Ok(result) => { /* use the smoothed result */ }
    Err(e) => eprintln!("LOWESS failed: {e}"),
}
```
## Feature Flags

- `std` (default): Standard library support
- `parallel`: Enable Rayon-based parallelization (adds `rayon` dependency)
- `ndarray`: Enable ndarray integration (adds `ndarray` dependency)
- `full`: Enable all optional features

```toml
# Minimal (no_std with alloc)
lowess = { version = "0.2", default-features = false }

# All features
lowess = { version = "0.2", features = ["full"] }
```
## Validation
This implementation has been extensively validated against:
- R's stats::lowess: Numerical agreement to machine precision
- Python's statsmodels: Validated on 44 test scenarios
- Cleveland's original paper: Reproduces published examples
See validation/ directory for cross-language comparison scripts.
## MSRV (Minimum Supported Rust Version)
Rust 1.85.0 or later.
## Contributing
Contributions welcome! See CONTRIBUTING.md for:
- Bug reports and feature requests
- Pull request guidelines
- Development workflow
- Testing requirements
## License
MIT License - see LICENSE file.
## References

Original papers:

- Cleveland, W.S. (1979). "Robust Locally Weighted Regression and Smoothing Scatterplots". *Journal of the American Statistical Association*, 74(368): 829-836. DOI: 10.2307/2286407
- Cleveland, W.S. (1981). "LOWESS: A Program for Smoothing Scatterplots by Robust Locally Weighted Regression". *The American Statistician*, 35(1): 54.
Related implementations:
## Citation
## Author
Amir Valizadeh
📧 thisisamirv@gmail.com
🔗 GitHub
Keywords: LOWESS, LOESS, local regression, nonparametric regression, smoothing, robust statistics, time series, bioinformatics, genomics, signal processing