anomalyzer-ts
A fast, lightweight, probabilistic anomaly detection library for streaming time-series data.
Inspired by Etsy's Skyline.
Features
- Streaming: push values one at a time; no batch required
- Multiple statistical tests: magnitude, CDF, diff, rank, KS
- Configurable windows and methods: tune sensitivity, window sizes, and which tests to run
- Dynamic weighting: strong signals are amplified automatically
- Optional async support: non-blocking via AsyncAnomalyzer (requires --features async)
- Optional persistence: WAL + snapshot durability via PersistentAnomalyzer (requires --features persist)
- No heavy dependencies: just ndarray and rand in the default build
Installation
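# Default (sync only)
anomalyzer-ts = "0.1"
# With async support
anomalyzer-ts = { version = "0.1", features = ["async"] }
# With persistence
anomalyzer-ts = { version = "0.1", features = ["persist"] }
# All features
anomalyzer-ts = { version = "0.1", features = ["async", "persist"] }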
Or via cargo:
Quick Start
Sync
use ;
let conf = AnomalyzerConf ;
let mut detector = new.unwrap;
let prob = detector.push;
println!;
Async
Requires --features async. AsyncAnomalyzer wraps Anomalyzer behind a tokio::sync::Mutex
and offloads CPU-bound work to spawn_blocking so it never blocks the async executor.
use ;
async
Persistent
Requires --features persist. PersistentAnomalyzer pairs the detector with a WAL + snapshot
persistence manager. History is recovered automatically on restart — no manual save/load needed.
use ;
Persistence
How It Works
PersistentAnomalyzer uses a WAL (write-ahead log) + snapshot strategy, similar to Redis AOF+RDB or RocksDB:
<dir>/
anomalyzer.snap — latest compacted snapshot (bincode)
anomalyzer.snap.tmp — atomic write staging file
anomalyzer.wal — append-only log since last snapshot
On every push:
- The new value is appended to the WAL and flushed with fsync before returning.
- After snapshot_interval pushes (default: 1 000), a new snapshot is written atomically and the WAL is truncated.
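The per-push path can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the crate's implementation: the `WalWriter` type, its method names, and the fixed 8-byte little-endian record layout are all hypothetical, and the pending counter here starts at zero on open rather than being rebuilt from an existing log.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical append-only log writer (not the crate's actual type).
pub struct WalWriter {
    path: PathBuf,
    file: File,
    pending: usize,
    snapshot_interval: usize,
}

impl WalWriter {
    /// Open (or create) the log in append mode.
    pub fn open(path: &Path, snapshot_interval: usize) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { path: path.to_path_buf(), file, pending: 0, snapshot_interval })
    }

    /// Append one value and fsync before returning.
    /// Returns true once `snapshot_interval` pushes have accumulated,
    /// signalling that the caller should write a snapshot and truncate.
    pub fn append(&mut self, value: f64) -> io::Result<bool> {
        self.file.write_all(&value.to_le_bytes())?;
        self.file.sync_all()?; // durability point: the record is on disk
        self.pending += 1;
        Ok(self.pending >= self.snapshot_interval)
    }

    /// Empty the log after a snapshot has been written.
    pub fn truncate(&mut self) -> io::Result<()> {
        // Re-open with truncate(true) so the file is emptied on every platform.
        let file = OpenOptions::new()
            .write(true)
            .create(true)
            .truncate(true)
            .open(&self.path)?;
        file.sync_all()?;
        self.file = file;
        self.pending = 0;
        Ok(())
    }
}
```

Calling `sync_all` on every append trades throughput for the guarantee that a returned push is already on disk.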
On startup (open):
- The latest snapshot is loaded.
- WAL entries written after the snapshot are replayed on top.
- Torn/corrupt WAL tails (e.g. from a crash mid-write) are silently skipped.
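The recovery steps above can be sketched as a pure function over bytes. The record layout (an 8-byte little-endian f64 followed by a 4-byte FNV-1a checksum) and the names `encode_record` and `recover` are assumptions made for this example; the crate's actual on-disk format is not described here.

```rust
/// Hypothetical record: 8-byte little-endian f64 + 4-byte FNV-1a checksum.
const RECORD_LEN: usize = 12;

/// 32-bit FNV-1a hash, used here as a simple integrity check.
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in bytes {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

/// Encode one value as a checksummed record.
pub fn encode_record(value: f64) -> [u8; RECORD_LEN] {
    let payload = value.to_le_bytes();
    let mut rec = [0u8; RECORD_LEN];
    rec[..8].copy_from_slice(&payload);
    rec[8..].copy_from_slice(&fnv1a(&payload).to_le_bytes());
    rec
}

/// Start from the snapshot's values, then replay complete, valid WAL records in order.
/// Replay stops at the first short or checksum-failing record; everything after
/// it is treated as a torn tail and ignored.
pub fn recover(snapshot: &[f64], wal: &[u8]) -> Vec<f64> {
    let mut history = snapshot.to_vec();
    for rec in wal.chunks(RECORD_LEN) {
        if rec.len() < RECORD_LEN {
            break; // partial record from a crash mid-write
        }
        let mut payload = [0u8; 8];
        payload.copy_from_slice(&rec[..8]);
        let mut sum = [0u8; 4];
        sum.copy_from_slice(&rec[8..]);
        if fnv1a(&payload) != u32::from_le_bytes(sum) {
            break; // corrupt record
        }
        history.push(f64::from_le_bytes(payload));
    }
    history
}
```

Stopping at the first bad record, rather than skipping it and continuing, keeps the replayed history a strict prefix of what was written.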
Durability Guarantees
- Each push is durable before it returns, so a crash after push will not lose that value.
- Snapshots are written to .snap.tmp then atomically renamed, so a crash mid-snapshot never corrupts the previous good snapshot.
- A corrupt or truncated WAL tail is tolerated; only complete records are replayed.
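The write-then-rename pattern behind the snapshot guarantee might look like the sketch below. The file names come from the directory layout above; the `write_snapshot` function and its signature are hypothetical, and the snapshot payload is treated as opaque bytes rather than bincode.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `<dir>/anomalyzer.snap` via a staging file and a rename.
pub fn write_snapshot(dir: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = dir.join("anomalyzer.snap.tmp");
    let dst = dir.join("anomalyzer.snap");

    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?; // staged data reaches disk before it becomes visible

    // The rename swaps the new snapshot in as a single step; a crash before
    // this line leaves the previous anomalyzer.snap untouched.
    fs::rename(&tmp, &dst)?;
    Ok(())
}
```

A stricter variant would also fsync the parent directory after the rename so that the new directory entry itself is durable; that step is omitted here for brevity.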
Compaction / Snapshot Interval
// Compact every 500 pushes instead of the default 1 000.
detector.set_snapshot_interval(500);
// Force an immediate snapshot + WAL truncation (e.g. on clean shutdown).
detector.flush()?;
Diagnostics
// Bytes currently accumulated in the WAL.
let wal_bytes = detector.wal_size_bytes()?;
// WAL entries pending since the last snapshot.
let pending = detector.pending_wal_entries();
Examples
Basic usage
Pushes a handful of values into a sync detector and prints the anomaly probability for each.
Simulated CPU monitoring with spike detection
Simulates 100 samples of normal CPU load followed by a spike sequence. Demonstrates fence and highrank methods alongside magnitude and cdf for threshold-aware detection.
Async basic
The async equivalent of basic. Shows normal values, a high spike, and a dip — flags anything above 0.7 probability.
Async multi-sensor
Runs three independent AsyncAnomalyzer detectors concurrently via tokio::join!, each with different configuration:
- temperature — magnitude + highrank, subtle baseline then a sharp spike
- pressure — magnitude + cdf, gradual climb over time
- vibration — ks + diff, noisy-but-bounded baseline then a burst
Async streaming pipeline
A realistic producer/consumer pipeline over a tokio::sync::mpsc channel. A producer task emits timestamped metrics; a detector loop feeds them into AsyncAnomalyzer and prints ✅ ok / ⚠️ warn / 🚨 ALERT status. Maps directly to production patterns such as Kafka consumers, WebSocket streams, or metrics scrape loops.
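The producer/consumer shape can be sketched without any async runtime. The version below swaps in `std::sync::mpsc` and a thread in place of the tokio channel and task, and replaces the detector with a caller-supplied `score` closure. The `Status` enum, the `classify` cut-offs (0.7 and 0.4), and `run_pipeline` are assumptions made for illustration only.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status {
    Ok,
    Warn,
    Alert,
}

/// Map an anomaly probability to a status. Both cut-offs are illustrative.
pub fn classify(prob: f64) -> Status {
    if prob > 0.7 {
        Status::Alert
    } else if prob > 0.4 {
        Status::Warn
    } else {
        Status::Ok
    }
}

/// A producer thread emits (timestamp, value) pairs; the consumer loop scores
/// each value with `score` (a stand-in for a detector's push) and classifies it.
pub fn run_pipeline<F>(samples: Vec<(u64, f64)>, mut score: F) -> Vec<(u64, Status)>
where
    F: FnMut(f64) -> f64,
{
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for sample in samples {
            if tx.send(sample).is_err() {
                break; // receiver went away
            }
        }
        // `tx` is dropped here, which closes the channel and ends the consumer loop.
    });

    let mut out = Vec::new();
    for (ts, value) in rx {
        out.push((ts, classify(score(value))));
    }
    producer.join().expect("producer thread panicked");
    out
}
```

Keeping the detector on the consumer side means only one thread ever touches its state, so no lock is needed in this variant.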
Persistent basic
Demonstrates WAL + snapshot persistence across two simulated "process runs": the first run seeds a baseline and pushes a spike, the second restores history from disk and confirms the spike is still detected.
Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| active_size | usize | 1 | Number of most-recent values treated as the active window |
| n_seasons | usize | 4 | Number of seasons (reference window = n_seasons × active_size) |
| sensitivity | f64 | 0.1 | Minimum magnitude change to flag; returns 0.0 if below threshold |
| upper_bound | f64 | 100.0 | Upper fence boundary |
| lower_bound | f64 | NA | Lower fence boundary (NA = disabled) |
| perm_count | usize | 500 | Permutations for rank/diff/KS tests |
| methods | Vec<String> | ["magnitude","cdf"] | Detection methods to use (see below) |
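To make the window arithmetic concrete, here is a hypothetical mirror of the table as a plain struct. `ConfSketch`, `reference_size`, and `split` are names invented for this example, not the crate's API. Two further assumptions: NA is represented as NaN, and the reference window sits immediately before the active window in the history.

```rust
/// Hypothetical mirror of the configuration table; the crate's real struct may differ.
#[derive(Debug, Clone)]
pub struct ConfSketch {
    pub active_size: usize,
    pub n_seasons: usize,
    pub sensitivity: f64,
    pub upper_bound: f64,
    pub lower_bound: f64, // NaN stands in for "NA" (disabled) in this sketch
    pub perm_count: usize,
    pub methods: Vec<String>,
}

impl Default for ConfSketch {
    /// Defaults copied from the table above.
    fn default() -> Self {
        Self {
            active_size: 1,
            n_seasons: 4,
            sensitivity: 0.1,
            upper_bound: 100.0,
            lower_bound: f64::NAN,
            perm_count: 500,
            methods: vec!["magnitude".to_string(), "cdf".to_string()],
        }
    }
}

impl ConfSketch {
    /// Reference window length: n_seasons × active_size.
    pub fn reference_size(&self) -> usize {
        self.n_seasons * self.active_size
    }

    /// Split a history (oldest first) into (reference, active) slices.
    /// Shorter histories yield shorter, possibly empty, windows.
    pub fn split<'a>(&self, history: &'a [f64]) -> (&'a [f64], &'a [f64]) {
        let active_start = history.len().saturating_sub(self.active_size);
        let ref_start = active_start.saturating_sub(self.reference_size());
        (&history[ref_start..active_start], &history[active_start..])
    }
}
```

With `active_size` set to 2 and the default `n_seasons` of 4, a 12-value history splits into an 8-value reference window and a 2-value active window; the two oldest values fall outside both.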
Detection Methods
| Method | Description |
|---|---|
| magnitude | Relative mean shift between reference and active windows |
| fence | How far the active mean sits outside configured bounds |
| cdf | CDF-based test on first-difference volatility |
| diff | Permutation rank test on absolute differences |
| highrank | Permutation test: active values ranked higher than reference |
| lowrank | Permutation test: active values ranked lower than reference |
| ks | Bootstrap Kolmogorov-Smirnov distribution shift test |
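As an illustration of the simplest method, the sketch below computes a relative mean shift and applies the sensitivity threshold from the configuration table. It is one plausible reading of "relative mean shift", not the crate's code: the function name, the clamp to 1.0, and the handling of a zero reference mean are all assumptions.

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Relative shift of the active mean against the reference mean, clamped to [0, 1].
/// Returns 0.0 when the shift is below `sensitivity` or either window is empty.
pub fn magnitude(reference: &[f64], active: &[f64], sensitivity: f64) -> f64 {
    if reference.is_empty() || active.is_empty() {
        return 0.0;
    }
    let r = mean(reference);
    let a = mean(active);
    if r == 0.0 {
        // Assumed convention: any move away from a zero baseline is maximal.
        return if a == 0.0 { 0.0 } else { 1.0 };
    }
    let shift = ((a - r) / r).abs().min(1.0);
    if shift < sensitivity {
        0.0
    } else {
        shift
    }
}
```

For a reference window averaging 10.0 and an active value of 15.0, the shift is 0.5; an active value of 10.5 gives 0.05, which falls below the default sensitivity of 0.1 and is reported as 0.0.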
Feature Flags
| Feature | Enables | Extra dependencies |
|---|---|---|
| (none) | Anomalyzer (sync) | none |
| async | AsyncAnomalyzer | tokio |
| persist | PersistentAnomalyzer | serde, bincode |
License
Apache-2.0