anofox_forecast/lib.rs
//! # anofox-forecast
//!
//! Time series forecasting library for Rust.
//!
//! Provides 35+ forecasting models including ARIMA, ETS, Theta,
//! and baseline methods, along with seasonality decomposition (STL/MSTL),
//! changepoint detection, and outlier detection.
//!
//! For comprehensive periodicity detection, see the
//! [fdars](https://crates.io/crates/fdars-core) crate.
//!
//! # Architecture Decisions
//!
//! ## Cross-Validation Split (`ts_cv_split`)
//!
//! Time series cross-validation with data leakage prevention is implemented in the
//! [forecast-extension](https://github.com/DataZooDE/forecast-extension) DuckDB extension
//! rather than in this crate. This section documents the rationale.
//!
//! ### Why CV Split is Not Part of `TimeSeries`
//!
//! The [`TimeSeries`](crate::core::TimeSeries) struct represents a single time series with
//! its values, timestamps, and metadata. Cross-validation splitting was considered as a
//! method on `TimeSeries` but was intentionally kept separate for these reasons:
//!
//! 1. **Cross-series coordination**: Fold generation is a global operation across multiple
//!    series, not a per-series operation. CV requires consistent fold boundaries across all
//!    series in a dataset.
//!
//! 2. **External feature handling**: Unknown future features like `stockout` flags or
//!    `segment_id` changes are external columns that don't belong in the series data model.
//!    These require schema-aware handling at the data layer.
//!
//! 3. **Data manipulation efficiency**: DuckDB's vectorized execution is more efficient for
//!    the bulk data operations (filtering, joining, filling) that CV split requires.
//!
//! 4. **Schema flexibility**: SQL macros can handle arbitrary column schemas without
//!    requiring Rust to know the schema at compile time.
//!
//! ### Component Distribution
//!
//! | Component | Location | Rationale |
//! |-----------|----------|-----------|
//! | Fold generation | DuckDB extension | Cross-series coordination, global operation |
//! | Train/test assignment | SQL/DuckDB | Simple comparison, vectorized execution |
//! | Unknown feature filling | Rust UDF via DuckDB | Per-series state tracking |
//! | Orchestration | SQL macro | Flexible, schema-agnostic |
//!
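//! The two per-row steps in the table above can be sketched in plain Rust. This is an
//! illustration only, not the extension's implementation: the `Row` layout, the `Split`
//! enum, the function names, and the fill strategy (carry the last training value forward,
//! defaulting to `false`) are all assumptions made for this example.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// One row of a long-format dataset (hypothetical layout).
//! #[derive(Debug, Clone, PartialEq)]
//! pub struct Row {
//!     pub series_id: u32,
//!     /// Position on the time axis shared by all series.
//!     pub t: usize,
//!     /// A feature whose future values are unknown at forecast time.
//!     pub stockout: bool,
//! }
//!
//! #[derive(Debug, Clone, Copy, PartialEq, Eq)]
//! pub enum Split {
//!     Train,
//!     Test,
//! }
//!
//! /// Train/test assignment: a single comparison against the fold cutoff.
//! /// `train_end` is exclusive, so rows with `t < train_end` are training rows.
//! pub fn assign(t: usize, train_end: usize) -> Split {
//!     if t < train_end {
//!         Split::Train
//!     } else {
//!         Split::Test
//!     }
//! }
//!
//! /// Unknown-feature filling: overwrite each test row's `stockout` with the last
//! /// training value seen for the same series (assumed strategy), so the observed
//! /// future value never reaches the model. Rows must be ordered by `t` within
//! /// each series.
//! pub fn fill_unknown(rows: &mut [Row], train_end: usize) {
//!     // Per-series state: the most recent training value of the feature.
//!     let mut last_known: HashMap<u32, bool> = HashMap::new();
//!     for row in rows.iter_mut() {
//!         match assign(row.t, train_end) {
//!             Split::Train => {
//!                 last_known.insert(row.series_id, row.stockout);
//!             }
//!             Split::Test => {
//!                 row.stockout = last_known.get(&row.series_id).copied().unwrap_or(false);
//!             }
//!         }
//!     }
//! }
//! ```
//!
//! The assignment step is stateless, which is why it maps well onto a vectorized SQL
//! comparison, while the filling step needs a small amount of state per series.
//!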
//! ### Using CV Functionality
//!
//! For time series cross-validation with data leakage prevention, use the `ts_cv_split`
//! function from the [forecast-extension](https://github.com/DataZooDE/forecast-extension):
//!
//! ```sql
//! -- Example: Generate CV folds with unknown feature handling
//! SELECT * FROM ts_cv_split(
//!     my_data,
//!     n_splits := 3,
//!     horizon := 7,
//!     unknown_features := ['stockout', 'segment_id']
//! );
//! ```
//!
//! See [forecast-extension#54](https://github.com/DataZooDE/forecast-extension/issues/54)
//! for implementation details.
//!
//! ### Future Considerations
//!
//! If per-series CV semantics become necessary in Rust (e.g., for standalone use without
//! DuckDB), the fold generation logic could be extracted:
//!
//! ```rust,ignore
//! pub struct CvFoldGenerator {
//!     n_splits: usize,
//!     horizon: usize,
//!     gap: usize,
//! }
//!
//! impl CvFoldGenerator {
//!     pub fn folds(&self, series_len: usize) -> Vec<usize> {
//!         // Returns training end indices for each fold
//!         todo!()
//!     }
//! }
//! ```
//!
//! This would allow fold generation to be shared while keeping data manipulation
//! in the appropriate layer (SQL for multi-series datasets, Rust for single-series use).
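//!
//! One way the body of `folds` might be filled in is sketched here. It assumes an
//! expanding training window with test windows anchored at the end of the series, and it
//! returns exclusive training end indices, oldest fold first. Neither assumption is
//! confirmed by the extension; folds that would leave no training data are skipped.
//!
//! ```rust
//! #[derive(Debug, Clone, Copy)]
//! pub struct CvFoldGenerator {
//!     pub n_splits: usize,
//!     pub horizon: usize,
//!     pub gap: usize,
//! }
//!
//! impl CvFoldGenerator {
//!     /// Exclusive training end index for each fold, oldest fold first.
//!     pub fn folds(&self, series_len: usize) -> Vec<usize> {
//!         (0..self.n_splits)
//!             .filter_map(|k| {
//!                 // Observations that must remain after fold k's training window:
//!                 // its own test window, every later test window, and the gap.
//!                 let reserved = (self.n_splits - k)
//!                     .checked_mul(self.horizon)?
//!                     .checked_add(self.gap)?;
//!                 // Series too short for this fold: skip it instead of underflowing.
//!                 let train_end = series_len.checked_sub(reserved)?;
//!                 (train_end > 0).then_some(train_end)
//!             })
//!             .collect()
//!     }
//! }
//! ```
//!
//! Under these assumptions, `n_splits = 3`, `horizon = 7`, `gap = 0` on a series of
//! length 100 gives training windows `[0, 79)`, `[0, 86)`, and `[0, 93)`, each followed
//! by a 7-step test window.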

// Allow some clippy warnings for cleaner code in specific cases
#![allow(clippy::upper_case_acronyms)]
#![allow(clippy::too_many_arguments)]
#![allow(clippy::type_complexity)]
#![allow(clippy::needless_range_loop)]
#![allow(clippy::manual_memcpy)]
#![allow(clippy::manual_is_multiple_of)] // is_multiple_of is unstable on WASM

// Prevent use of parallel feature on WASM targets (rayon requires OS threads)
#[cfg(all(feature = "parallel", target_arch = "wasm32"))]
compile_error!(
    "The 'parallel' feature is not supported on WASM targets. Build without --features parallel"
);

pub mod changepoint;
pub mod core;
pub mod detection;
pub mod error;
pub mod features;
pub mod models;
#[cfg(feature = "postprocess")]
pub mod postprocess;
pub mod seasonality;
pub mod simd;
pub mod transform;
pub mod utils;
pub mod validation;

pub use error::{ForecastError, Result};

pub mod prelude {
    pub use crate::core::{Forecast, TimeSeries};
    pub use crate::error::{ForecastError, Result};
    pub use crate::models::Forecaster;
    pub use crate::utils::{calculate_metrics, quantile_normal, AccuracyMetrics};
}