# oxits
[![CI](https://github.com/sipemu/oxits-rs/actions/workflows/ci.yml/badge.svg)](https://github.com/sipemu/oxits-rs/actions/workflows/ci.yml)
[![crates.io](https://img.shields.io/crates/v/oxits.svg)](https://crates.io/crates/oxits)
[![docs.rs](https://docs.rs/oxits/badge.svg)](https://docs.rs/oxits)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
A high-performance time series classification and transformation library for Rust, validated against [pyts](https://github.com/johannfaouzi/pyts).
## Features
### Preprocessing
- **StandardScaler** — zero-mean, unit-variance normalization
- **MinMaxScaler** — scale to arbitrary range
- **MaxAbsScaler** — scale by maximum absolute value
- **RobustScaler** — median/IQR-based scaling
- **KBinsDiscretizer** — binning with normal, uniform, and quantile strategies
- **PowerTransform** — Box-Cox and Yeo-Johnson transforms
- **QuantileTransform** — uniform or normal output distribution
- **Imputer** — fill NaN values (nearest, previous, next, linear)
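At its core, StandardScaler applies per-series z-normalization. A minimal sketch independent of the crate's API (assuming the population standard deviation, i.e. dividing by `n`):

```rust
/// Z-normalize one series: subtract the mean, divide by the
/// population standard deviation. Sketch only; constant series
/// (std = 0) are not handled here.
fn z_normalize(x: &[f64]) -> Vec<f64> {
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    x.iter().map(|v| (v - mean) / std).collect()
}
```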
### Approximation
- **PAA** — Piecewise Aggregate Approximation
- **SAX** — Symbolic Aggregate Approximation
- **DFT** — Discrete Fourier Transform coefficients
- **SFA** — Symbolic Fourier Approximation (DFT → MCB discretization, ANOVA feature selection)
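PAA, the building block under SAX and SFA, reduces a series of length `n` to `w` segment means. A minimal sketch independent of the crate's API, assuming `n` divides evenly by `w` (the library handles the general case):

```rust
/// Piecewise Aggregate Approximation: split the series into `w`
/// equal segments and replace each segment by its mean.
fn paa(x: &[f64], w: usize) -> Vec<f64> {
    let seg = x.len() / w; // assumes x.len() % w == 0
    x.chunks(seg)
        .map(|c| c.iter().sum::<f64>() / c.len() as f64)
        .collect()
}
```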
### Metrics
- **DTW** — Dynamic Time Warping (classic, Sakoe-Chiba band, Itakura parallelogram, multiscale, fast)
- **Lower bounds** — LB_Kim, LB_Keogh, LB_Improved, LB_Yi
- **BOSS metric** — histogram intersection distance
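The classic DTW variant is a full O(n·m) dynamic program over an accumulated-cost matrix. A crate-independent sketch, using absolute difference as the local cost (an assumption; the library's cost function may differ):

```rust
/// Classic DTW: d[i][j] = cost(i, j) + min of the three
/// predecessor cells (insertion, deletion, match).
fn dtw_naive(a: &[f64], b: &[f64]) -> f64 {
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![f64::INFINITY; m + 1]; n + 1];
    d[0][0] = 0.0;
    for i in 1..=n {
        for j in 1..=m {
            let cost = (a[i - 1] - b[j - 1]).abs();
            d[i][j] = cost + d[i - 1][j].min(d[i][j - 1]).min(d[i - 1][j - 1]);
        }
    }
    d[n][m]
}
```

The banded variants (Sakoe-Chiba, Itakura) restrict which `(i, j)` cells this loop visits, trading exactness for speed.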
### Image Transforms
- **GASF** — Gramian Angular Summation Field
- **GADF** — Gramian Angular Difference Field
- **MTF** — Markov Transition Field
- **RecurrencePlot** — recurrence plot with time-delay embedding
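GASF and GADF rescale the series to [-1, 1], map each value to an angle φ = arccos(x), and form the Gramian G[i][j] = cos(φᵢ ± φⱼ). A minimal, crate-independent sketch of the summation variant:

```rust
/// Gramian Angular Summation Field. Sketch only: a constant
/// series (max == min) would divide by zero here.
fn gasf(x: &[f64]) -> Vec<Vec<f64>> {
    let (min, max) = x.iter().fold((f64::INFINITY, f64::NEG_INFINITY),
        |(lo, hi), &v| (lo.min(v), hi.max(v)));
    // Rescale to [-1, 1], then take the polar angle.
    let phi: Vec<f64> = x.iter()
        .map(|&v| (2.0 * (v - min) / (max - min) - 1.0).clamp(-1.0, 1.0).acos())
        .collect();
    phi.iter()
        .map(|&pi| phi.iter().map(|&pj| (pi + pj).cos()).collect())
        .collect()
}
```

The difference variant (GADF) uses `sin(φᵢ − φⱼ)` in place of `cos(φᵢ + φⱼ)`.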
### Decomposition
- **SSA** — Singular Spectrum Analysis with automatic trend/seasonal/residual grouping
### Transformation
- **BOSS** — Bag of SFA Symbols
- **ROCKET** — Random Convolutional Kernel Transform
- **BagOfPatterns** — sliding-window SAX bag of words with TF-IDF
- **ShapeletTransform** — shapelet-based feature extraction
- **WEASEL** — Word ExtrAction for time SEries cLassification
### Classification
- **KNN** — k-nearest neighbors with pluggable distance metrics
- **BOSSVS** — BOSS in Vector Space (TF-IDF cosine similarity)
- **SAXVSM** — SAX-VSM classifier
- **TimeSeriesForest** — interval-based random forest
- **TSBF** — Time Series Bag of Features
- **LearningShapelets** — gradient-descent shapelet learning
### Multivariate
- **JointRecurrencePlot** — joint recurrence plots for multivariate time series
- **Multivariate wrapper** — apply univariate transforms/classifiers per channel
### Datasets
- **UCR Archive** — fetch datasets from the UCR Time Series Archive
- **Synthetic generators** — Cylinder-Bell-Funnel (CBF) dataset
- **Built-in** — bundled synthetic GunPoint and Coffee datasets
### Infrastructure
- **Parallel computation** — feature-gated Rayon parallelism across all modules
- **Core traits** — `Transformer`, `FittableTransformer`, `Classifier`, `DistanceMetric`
- **SIMD autovectorization** — AVX2 on x86-64 via `target-cpu=native`
## Performance
See [PERFORMANCE.md](PERFORMANCE.md) for detailed benchmark tables against pyts across all algorithms.
Benchmarked on an Intel i9-13900H (speedup vs. pyts; 25th-percentile time over 51 runs after 5 warmup runs):
| Algorithm | Speedup | | Algorithm | Speedup |
|---|---|---|---|---|
| StandardScaler | **12.3x** | | GASF | **3.1x** |
| MinMaxScaler | **7.0x** | | MTF | **6.0x** |
| KBinsDiscretizer | **10.7x** | | RecurrencePlot | **3.7x** |
| SAX | **7.9x** | | SSA | **10.5x** |
| DFT | **7.4x** | | BOSS | **2.9x** |
| DTW fast | **34.4x** | | ROCKET | **10.4x** |
| KNN | **16.8x** | | ShapeletTransform | **131.2x** |
| BOSSVS | **3.3x** | | TimeSeriesForest | **4.8x** |
| **Geometric mean** | **5.0x** | | **Median** | **3.7x** |
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
oxits = "0.1"
```
## Quick Start
### Stateless Transform (StandardScaler)
```rust
use oxits::preprocessing::scaler::{StandardScaler, StandardScalerConfig};
use oxits::Transformer;
let config = StandardScalerConfig::new();
let x = vec![vec![1.0, 2.0, 3.0, 4.0, 5.0]];
let scaled = StandardScaler::transform(&config, &x);
```
### Stateful Transform (SFA)
```rust
use oxits::approximation::sfa::{Sfa, SfaConfig};
use oxits::FittableTransformer;
let config = SfaConfig { n_coefs: Some(4), n_bins: 4, ..SfaConfig::new() };
let x = vec![
vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
vec![7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
];
let fitted = Sfa::fit(&config, &x, None);
let result = Sfa::transform(&fitted, &x);
```
### Classification (BOSSVS)
```rust
use oxits::classification::bossvs::{Bossvs, BossvsConfig};
use oxits::Classifier;
let config = BossvsConfig::new(4);
let x_train = vec![
vec![0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
vec![7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
];
let y_train = vec!["A".to_string(), "B".to_string()];
let fitted = Bossvs::fit(&config, &x_train, &y_train);
let predictions = Bossvs::predict(&fitted, &x_train);
```
### Distance Metrics (DTW)
```rust
use oxits::metrics::dtw::dtw_classic;
let a = vec![1.0, 2.0, 3.0, 4.0, 5.0];
let b = vec![1.0, 2.5, 3.5, 4.0, 5.0];
let distance = dtw_classic(&a, &b);
```
### Image Transform (GASF)
```rust
use oxits::image::gaf::{Gaf, GafConfig};
use oxits::{GafMethod, Transformer};
let config = GafConfig { method: GafMethod::Summation, image_size: None };
let x = vec![vec![0.0, 1.0, 2.0, 3.0, 4.0]];
let images = Gaf::transform(&config, &x);
// images[0] is a 5x5 Gramian Angular Summation Field
```
## Cargo Features
| Feature | Default | Description |
|---|---|---|
| `parallel` | yes | Parallel computation via Rayon |
| `decomposition` | no | SSA with nalgebra SVD |
| `datasets` | no | UCR Archive fetching via ureq |
| `validation` | no | Golden data tests via serde |
```bash
# Default (parallel)
cargo build --release
# All features
cargo build --release --all-features
# No parallelism
cargo build --release --no-default-features
```
## Building
```bash
cargo build --release
cargo test --all-features
```
For best performance, ensure `.cargo/config.toml` targets your CPU:
```toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "target-cpu=native"]
```
## Validation
All modules are validated against pyts via golden integration tests. Each test loads reference data generated by pyts and compares outputs element-wise to a tolerance of 1e-6:
```bash
cargo test --all-features # run all tests (229 total)
cargo test --test golden_image # image module golden tests
cargo test --test golden_metrics # DTW golden tests
```
To regenerate golden data:
```bash
cd test_harness
pip install pyts numpy scikit-learn
python generate_golden_data.py
```
## Dependencies
- [realfft](https://crates.io/crates/realfft) — FFT for DFT, SFA, and SSA periodograms
- [rayon](https://crates.io/crates/rayon) — parallel computation (optional)
- [rand](https://crates.io/crates/rand) / [rand_distr](https://crates.io/crates/rand_distr) — random kernels for ROCKET
- [nalgebra](https://crates.io/crates/nalgebra) — SVD for SSA decomposition (optional)
- [ureq](https://crates.io/crates/ureq) — HTTP client for UCR Archive (optional)
## License
MIT License — see [LICENSE](LICENSE).