# fastLoess
[](https://crates.io/crates/fastLoess)
[](https://docs.rs/fastLoess)
[](LICENSE-MIT)
[](https://www.rust-lang.org)
**High-performance parallel LOESS (Locally Estimated Scatterplot Smoothing) for Rust** — A high-level wrapper around the [`loess-rs`](https://github.com/thisisamirv/loess-rs) crate that adds rayon-based parallelism and seamless ndarray integration.
> [!IMPORTANT]
> For a minimal, single-threaded, and `no_std` version, use base [`loess-rs`](https://github.com/thisisamirv/loess-rs).
## How LOESS works
LOESS creates smooth curves through scattered data using local weighted neighborhoods:

## LOESS vs. LOWESS
| **Polynomial Degree** | Linear, Quadratic, Cubic, Quartic | Linear (Degree 1) |
| **Dimensions** | Multivariate (n-D support) | Univariate (1-D only) |
| **Flexibility** | High (Distance metrics) | Standard |
| **Complexity** | Higher (Matrix inversion) | Lower (Weighted average/slope) |
LOESS can fit higher-degree polynomials for more complex data:

LOESS can also handle multivariate data (n-D), while LOWESS is limited to univariate data (1-D):

> [!TIP]
> **Note:** For a simple, lightweight, and fast **LOWESS** implementation, use [`lowess`](https://github.com/thisisamirv/lowess) crate.
## Features
- **Robust Statistics**: IRLS with Bisquare, Huber, or Talwar weighting for outlier handling.
- **Multidimensional Smoothing**: Support for n-D data with customizable distance metrics (Euclidean, Manhattan, etc.).
- **Flexible Fitting**: Linear, Quadratic, Cubic, and Quartic local polynomials.
- **Uncertainty Quantification**: Point-wise standard errors, confidence intervals, and prediction intervals.
- **Optimized Performance**: Interpolation surface with Tensor Product Hermite interpolation and streaming/online modes for large or real-time datasets.
- **Parameter Selection**: Built-in cross-validation for automatic smoothing fraction selection.
- **Flexibility**: Multiple weight kernels (Tricube, Epanechnikov, etc.) and `no_std` support (requires `alloc`).
- **Validated**: Numerical twin of R's `stats::loess` with exact match (< 1e-12 diff).
## Performance
Benchmarked against R's `stats::loess`. The latest benchmarks comparing **Serial** vs **Parallel (Rayon)** execution modes show that the parallel implementation correctly leverages multiple cores to provide additional speedups, particularly for computationally heavier tasks (high dimensions, larger datasets).
Overall, Rust implementations achieve **3x to 54x** speedups over R.
### Comparison: R vs Rust (Serial) vs Rust (Parallel)
The table below shows the execution time and speedup relative to R.
| **Dimensions** | | | |
| 1d_linear | 4.18ms | 7.2x | 8.1x |
| 2d_linear | 13.24ms | 6.5x | 10.1x |
| 3d_linear | 28.37ms | 7.9x | 13.6x |
| **Pathological** | | | |
| clustered | 19.70ms | 15.7x | 21.5x |
| constant_y | 13.61ms | 13.6x | 17.5x |
| extreme_outliers | 23.55ms | 10.3x | 11.7x |
| high_noise | 34.96ms | 19.9x | 28.0x |
| **Polynomial Degree** | | | |
| degree_constant | 8.50ms | 10.0x | 13.5x |
| degree_linear | 13.47ms | 16.2x | 21.4x |
| degree_quadratic | 19.07ms | 23.3x | 29.7x |
| **Scalability** | | | |
| scale_1000 | 1.09ms | 4.3x | 3.7x |
| scale_5000 | 8.63ms | 7.2x | 8.2x |
| scale_10000 | 28.68ms | 10.4x | 14.5x |
| **Real-world Scenarios** | | | |
| financial_1000 | 1.11ms | 4.8x | 4.7x |
| financial_5000 | 8.28ms | 7.6x | 9.2x |
| genomic_5000 | 8.27ms | 6.7x | 7.5x |
| scientific_5000 | 11.23ms | 6.8x | 10.1x |
| **Parameter Sensitivity** | | | |
| fraction_0.67 | 44.96ms | 54.0x | 54.1x |
| iterations_10 | 23.31ms | 10.9x | 11.8x |
*Note: "Rust (Parallel)" corresponds to the optimized CPU backend using Rayon.*
### Key Takeaways
1. **Parallel Wins on Load**: For computationally intensive tasks (e.g., `3d_linear`, `high_noise`, `scientific_5000`, `scale_10000`), the parallel backend provides significant additional speedup over the serial implementation (e.g., **13.6x vs 7.9x** for 3D data).
2. **Overhead on Small Data**: For very small or fast tasks (e.g., `scale_1000`, `financial_1000`), the serial implementation is comparable or slightly faster, indicating that thread management overhead is visible but minimal (often < 0.05ms difference).
3. **Consistent Superiority**: Both Rust implementations consistently outperform R, usually by an order of magnitude.
### Recommendation
- **Default to Parallel**: The overhead for small datasets is negligible (microseconds), while the gains for larger or more complex datasets are substantial (doubling the speedup factor in some cases).
- **Use Serial for Tiny Batches**: If processing millions of independent tiny datasets (< 1000 points) where calling `fit()` repeatedly, the serial backend might save thread pool overhead.
Check [Benchmarks](https://github.com/thisisamirv/fastLoess/tree/bench/benchmarks) for detailed results and reproducible benchmarking code.
## Robustness Advantages
This implementation includes several robustness features beyond R's `loess`:
### MAD-Based Scale Estimation
Uses **MAD-based scale estimation** for robustness weight calculations:
```text
MAD is a **breakdown-point-optimal** estimator—it remains valid even when up to 50% of data are outliers, compared to the median of absolute residuals used by some other implementations.
Median Absolute Residual (MAR), which is the default Cleveland's choice, is also available through the `scaling_method` parameter.
### Configurable Boundary Policies
R's `loess` uses asymmetric windows at data boundaries, which can introduce edge bias. This implementation offers configurable **boundary policies** to mitigate this:
- **Extend** (default): Pad with constant values for symmetric windows
- **Reflect**: Mirror data at boundaries (best for periodic data)
- **Zero**: Pad with zeros (signal processing applications)
- **NoBoundary**: Original R behavior (no padding)
### Boundary Degree Fallback
When using `Interpolation` mode with higher polynomial degrees (Quadratic, Cubic), vertices outside the tight data bounds can produce unstable extrapolation. This implementation offers a configurable **boundary degree fallback**:
- **`true`** (default): Reduce to Linear fits at boundary vertices (more stable)
- **`false`**: Use full requested degree everywhere (matches R exactly)
## Validation
The Rust `fastLoess` crate is a **numerical twin** of R's `loess` implementation:
| **Accuracy** | ✅ EXACT MATCH | Max diff < 1e-12 across all scenarios |
| **Consistency** | ✅ PERFECT | 20/20 scenarios pass with strict tolerance |
| **Robustness** | ✅ VERIFIED | Robust smoothing matches R exactly |
Check [Validation](https://github.com/thisisamirv/fastLoess/tree/bench/validation) for detailed scenario results.
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
fastLoess = "0.1"
```
For `no_std` environments:
```toml
[dependencies]
fastLoess = { version = "0.1", default-features = false }
```
## Quick Start
```rust
use fastLoess::prelude::*;
fn main() -> Result<(), LoessError> {
let x = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let y = vec![2.0, 4.1, 5.9, 8.2, 9.8];
// Build and fit model
let result = Loess::new()
.fraction(0.5) // Use 50% of data for each local fit
.iterations(3) // 3 robustness iterations
.adapter(Batch)
.build()?
.fit(&x, &y)?;
println!("{}", result);
Ok(())
}
```
```text
Summary:
Data points: 5
Fraction: 0.5
Smoothed Data:
X Y_smooth
--------------------
1.00 2.00000
2.00 4.10000
3.00 5.90000
4.00 8.20000
5.00 9.80000
```
## Builder Methods
All builder parameters have sensible defaults. You only need to specify what you want to change.
```rust
use fastLoess::prelude::*;
Loess::new()
// Smoothing span (0, 1] - default: 0.67
.fraction(0.5)
// Polynomial degree - default: Linear
.degree(Quadratic)
// Number of dimensions - default: 1
.dimensions(2)
// Distance metric - default: Euclidean
.distance_metric(Manhattan)
// Robustness iterations - default: 3
.iterations(5)
// Kernel selection - default: Tricube
.weight_function(Epanechnikov)
// Robustness method - default: Bisquare
.robustness_method(Huber)
// Boundary handling - default: Extend
.boundary_policy(Reflect)
// Boundary degree fallback - default: true
.boundary_degree_fallback(true)
// Confidence intervals (Batch only)
.confidence_intervals(0.95)
// Prediction intervals (Batch only)
.prediction_intervals(0.95)
// Include diagnostics
.return_diagnostics()
.return_residuals()
.return_robustness_weights()
// Cross-validation (Batch only)
.cross_validate(KFold(5, &[0.3, 0.5, 0.7]).seed(123))
// Auto-convergence
.auto_converge(1e-4)
// Interpolation settings
.surface_mode(Interpolation)
// Interpolation cell size - default: 0.2
.cell(0.2)
// Execution mode
.adapter(Batch)
// Parallelism
.parallel(true)
// Build the model
.build()?;
```
## Result Structure
```rust
pub struct LoessResult<T> {
/// Sorted x values (independent variable)
pub x: Vec<T>,
/// Smoothed y values (dependent variable)
pub y: Vec<T>,
/// Point-wise standard errors of the fit
pub standard_errors: Option<Vec<T>>,
/// Confidence interval bounds (if computed)
pub confidence_lower: Option<Vec<T>>,
pub confidence_upper: Option<Vec<T>>,
/// Prediction interval bounds (if computed)
pub prediction_lower: Option<Vec<T>>,
pub prediction_upper: Option<Vec<T>>,
/// Residuals (y - fit)
pub residuals: Option<Vec<T>>,
/// Final robustness weights from outlier downweighting
pub robustness_weights: Option<Vec<T>>,
/// Detailed fit diagnostics (RMSE, R^2, Effective DF, etc.)
pub diagnostics: Option<Diagnostics<T>>,
/// Number of robustness iterations actually performed
pub iterations_used: Option<usize>,
/// Smoothing fraction used (optimal if selected via CV)
pub fraction_used: T,
/// RMSE scores for each fraction tested during CV
pub cv_scores: Option<Vec<T>>,
}
```
> [!TIP]
> **Using with ndarray:** While the result struct uses `Vec<T>` for maximum compatibility, you can effortlessly convert any field to an `Array1` using `Array1::from_vec(result.y)`.
## Streaming Processing
For datasets that don't fit in memory:
```rust
let mut processor = Loess::new()
.fraction(0.3)
.iterations(2)
.adapter(Streaming)
.chunk_size(1000)
.overlap(100)
.build()?;
// Process data in chunks
let result1 = processor.process_chunk(&chunk1_x, &chunk1_y)?;
let result2 = processor.process_chunk(&chunk2_x, &chunk2_y)?;
// Finalize to get remaining buffered data
let final_result = processor.finalize()?;
```
## Online Processing
For real-time data streams:
```rust
let mut processor = Loess::new()
.fraction(0.2)
.iterations(1)
.adapter(Online)
.window_capacity(100)
.build()?;
// Process points as they arrive
for i in 1..=10 {
let x = i as f64;
let y = 2.0 * x + 1.0;
if let Some(output) = processor.add_point(&[x], y)? {
println!("Smoothed: {:.2}", output.smoothed);
}
}
```
## Parameter Selection Guide
### Fraction (Smoothing Span)
- **0.1-0.3**: Fine detail, may be noisy
- **0.3-0.5**: Moderate smoothing (good for most cases)
- **0.5-0.7**: Heavy smoothing, emphasizes trends
- **0.7-1.0**: Very smooth, may over-smooth
- **Default: 0.67** (Cleveland's choice)
### Robustness Iterations
- **0**: No robustness (fastest, sensitive to outliers)
- **1-3**: Light to moderate robustness (recommended)
- **4-6**: Strong robustness (for contaminated data)
- **7+**: Diminishing returns
### Polynomial Degree
- **Constant**: Local weighted mean (smoothing only)
- **Linear** (default): Standard LOESS, good bias-variance balance
- **Quadratic**: Better for peaks/valleys, higher variance
- **Cubic/Quartic**: Specialized high-order fitting
### Kernel Function
- **Tricube** (default): Best all-around, Cleveland's original choice
- **Epanechnikov**: Theoretically optimal MSE
- **Gaussian**: Maximum smoothness, no compact support
- **Uniform**: Fastest, least smooth (moving average)
### Boundary Policy
- **Extend** (default): Pad with constant values
- **Reflect**: Mirror data at boundaries (for periodic/symmetric data)
- **Zero**: Pad with zeros (signal processing)
- **NoBoundary**: Original Cleveland behavior
> **Note:** For nD data, `Extend` defaults to `NoBoundary` to preserve regression accuracy.
## Examples
```bash
cargo run --example batch_smoothing
cargo run --example online_smoothing
cargo run --example streaming_smoothing
```
## MSRV
Rust **1.85.0** or later (2024 Edition).
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
## License
Licensed under either of
- Apache License, Version 2.0
([LICENSE-APACHE](LICENSE-APACHE) or <http://www.apache.org/licenses/LICENSE-2.0>)
- MIT license
([LICENSE-MIT](LICENSE-MIT) or <http://opensource.org/licenses/MIT>)
at your option.
## References
- Cleveland, W.S. (1979). "Robust Locally Weighted Regression and Smoothing Scatterplots". *Journal of the American Statistical Association*.
- Cleveland, W.S. & Devlin, S.J. (1988). "Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting". *Journal of the American Statistical Association*.