anofox_forecast/
lib.rs

//! # anofox-forecast
//!
//! Time series forecasting library for Rust.
//!
//! Provides 35+ forecasting models including ARIMA, ETS, Theta,
//! and baseline methods, along with seasonality decomposition (STL/MSTL),
//! changepoint detection, and outlier detection.
//!
//! For comprehensive periodicity detection, see the
//! [fdars](https://crates.io/crates/fdars-core) crate.
//!
//! # Architecture Decisions
//!
//! ## Cross-Validation Split (`ts_cv_split`)
//!
//! Time series cross-validation with data leakage prevention is implemented in the
//! [forecast-extension](https://github.com/DataZooDE/forecast-extension) DuckDB extension
//! rather than in this crate. This section documents the rationale.
//!
//! ### Why CV Split is Not Part of `TimeSeries`
//!
//! The [`TimeSeries`](crate::core::TimeSeries) struct represents a single time series with
//! its values, timestamps, and metadata. Cross-validation splitting was considered as a
//! method on `TimeSeries` but was intentionally kept separate for these reasons:
//!
//! 1. **Cross-series coordination**: CV requires consistent fold boundaries across all
//!    series in a dataset, so fold generation is a global operation rather than a
//!    per-series one.
//!
//! 2. **External feature handling**: Features whose future values are unknown at forecast
//!    time, such as `stockout` flags or `segment_id` changes, are external columns that
//!    don't belong in the series data model. They require schema-aware handling at the
//!    data layer.
//!
//! 3. **Data manipulation efficiency**: DuckDB's vectorized execution is more efficient for
//!    the bulk data operations (filtering, joining, filling) that CV split requires.
//!
//! 4. **Schema flexibility**: SQL macros can handle arbitrary column schemas without
//!    requiring Rust to know the schema at compile time.
//!
//! ### Component Distribution
//!
//! | Component | Location | Rationale |
//! |-----------|----------|-----------|
//! | Fold generation | DuckDB extension | Cross-series coordination, global operation |
//! | Train/test assignment | SQL/DuckDB | Simple comparison, vectorized execution |
//! | Unknown feature filling | Rust UDF via DuckDB | Per-series state tracking |
//! | Orchestration | SQL macro | Flexible, schema-agnostic |
//!
//! ### Using CV Functionality
//!
//! For time series cross-validation with data leakage prevention, use the `ts_cv_split`
//! function from the [forecast-extension](https://github.com/DataZooDE/forecast-extension):
//!
//! ```sql
//! -- Example: Generate CV folds with unknown feature handling
//! SELECT * FROM ts_cv_split(
//!     my_data,
//!     n_splits := 3,
//!     horizon := 7,
//!     unknown_features := ['stockout', 'segment_id']
//! );
//! ```
//!
//! See [forecast-extension#54](https://github.com/DataZooDE/forecast-extension/issues/54)
//! for implementation details.
//!
//! ### Future Considerations
//!
//! If per-series CV semantics become necessary in Rust (e.g., for standalone use without
//! DuckDB), the fold generation logic could be extracted into a small type such as:
//!
//! ```rust,ignore
//! pub struct CvFoldGenerator {
//!     /// Number of folds to generate.
//!     n_splits: usize,
//!     /// Number of observations in each test window.
//!     horizon: usize,
//!     /// Observations skipped between the end of training and the start of testing.
//!     gap: usize,
//! }
//!
//! impl CvFoldGenerator {
//!     /// Returns the training end index for each fold.
//!     pub fn folds(&self, series_len: usize) -> Vec<usize> {
//!         todo!("fold boundary computation is not implemented in this crate")
//!     }
//! }
//! ```
//!
//! This would allow fold generation to be shared while keeping data manipulation
//! in the appropriate layer (SQL for multi-series datasets, Rust for single-series use).
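//!
//! As a rough illustration of what such an extraction might look like, the sketch below
//! fills in `folds` under assumptions that are not taken from this crate or from the
//! extension: training windows expand from the start of the series, the last fold's test
//! window ends at the final observation, each earlier fold steps back by `horizon`, and
//! `gap` observations are skipped between training and test data. Folds that would leave
//! no training data are dropped. The `test_range` helper is hypothetical and only shows
//! how a test window could be derived from a training end index.
//!
//! ```rust
//! use std::ops::Range;
//!
//! /// Hypothetical fold generator; field meanings are assumptions of this sketch.
//! pub struct CvFoldGenerator {
//!     pub n_splits: usize,
//!     pub horizon: usize,
//!     pub gap: usize,
//! }
//!
//! impl CvFoldGenerator {
//!     /// Exclusive training end index for each fold, oldest fold first.
//!     pub fn folds(&self, series_len: usize) -> Vec<usize> {
//!         (0..self.n_splits)
//!             .rev()
//!             .filter_map(|k| {
//!                 // Fold k (k = 0 is the most recent) reserves (k + 1) test
//!                 // windows plus the gap at the tail of the series.
//!                 let reserved = (k + 1).checked_mul(self.horizon)?.checked_add(self.gap)?;
//!                 // Skip folds that would have an empty training window.
//!                 series_len.checked_sub(reserved).filter(|&end| end > 0)
//!             })
//!             .collect()
//!     }
//!
//!     /// Test window for a fold, given its exclusive training end index.
//!     pub fn test_range(&self, train_end: usize) -> Range<usize> {
//!         let start = train_end + self.gap;
//!         start..start + self.horizon
//!     }
//! }
//!
//! fn main() {
//!     let cv = CvFoldGenerator { n_splits: 3, horizon: 7, gap: 1 };
//!     let ends = cv.folds(30);
//!     assert_eq!(ends, vec![8, 15, 22]);
//!     // The most recent fold trains on 0..22, skips index 22, and tests on 23..30.
//!     assert_eq!(cv.test_range(22), 23..30);
//! }
//! ```
//!
//! Returning only training end indices keeps the generator independent of how the data
//! is stored; whether the real fold logic uses expanding or sliding windows is a design
//! choice this sketch does not settle.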

// Allow some clippy warnings for cleaner code in specific cases
#![allow(clippy::upper_case_acronyms)]
#![allow(clippy::too_many_arguments)]
#![allow(clippy::type_complexity)]
#![allow(clippy::needless_range_loop)]
#![allow(clippy::manual_memcpy)]
#![allow(clippy::manual_is_multiple_of)] // is_multiple_of is unstable on WASM

// Prevent use of parallel feature on WASM targets (rayon requires OS threads)
#[cfg(all(feature = "parallel", target_arch = "wasm32"))]
compile_error!(
    "The 'parallel' feature is not supported on WASM targets. Build without --features parallel"
);

pub mod changepoint;
pub mod core;
pub mod detection;
pub mod error;
pub mod features;
pub mod models;
#[cfg(feature = "postprocess")]
pub mod postprocess;
pub mod seasonality;
pub mod simd;
pub mod transform;
pub mod utils;
pub mod validation;

pub use error::{ForecastError, Result};

pub mod prelude {
    pub use crate::core::{Forecast, TimeSeries};
    pub use crate::error::{ForecastError, Result};
    pub use crate::models::Forecaster;
    pub use crate::utils::{calculate_metrics, quantile_normal, AccuracyMetrics};
}