Anomaly Grid
█████╗ ███╗ ██╗ ██████╗ ███╗ ███╗ █████╗ ██╗ ██╗ ██╗
██╔══██╗████╗ ██║██╔═══██╗████╗ ████║██╔══██╗██║ ╚██╗ ██╔╝
███████║██╔██╗ ██║██║ ██║██╔████╔██║███████║██║ ╚████╔╝
██╔══██║██║╚██╗██║██║ ██║██║╚██╔╝██║██╔══██║██║ ╚██╔╝
██║ ██║██║ ╚████║╚██████╔╝██║ ╚═╝ ██║██║ ██║███████╗██║
╚═╝ ╚═╝╚═╝ ╚═══╝ ╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚══════╝╚═╝
[ANOMALY-GRID v0.4.0] - SEQUENCE ANOMALY DETECTION ENGINE
A Rust library implementing variable-order Markov chains for sequence anomaly detection in finite alphabets.
For a Python wrapper of this library, see my other repository: https://github.com/Abimael10/anomaly-grid-py
Quick Start
[dependencies]
anomaly-grid = "0.4.0"
use anomaly_grid::*;
What This Library Does
- Variable-Order Markov Models: Builds contexts of length 1 to max_order from training sequences with hierarchical context selection
- Adaptive Context Selection: Uses longest available context with sufficient data, falls back to shorter contexts automatically
- Information-Theoretic Scoring: Shannon entropy and KL divergence calculations with lazy computation and caching
- Memory-Optimized Storage: String interning, trie-based context storage with prefix sharing, and SmallVec for efficient small collections
- Parallel Batch Processing: Processes multiple sequences concurrently using Rayon for improved throughput
- Comprehensive Testing: Extensive unit, integration, domain, and performance validation with mathematical correctness verification
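The first three points above can be illustrated with a small, self-contained sketch. It is not the anomaly-grid API: `ToyContextModel`, `min_count`, `select_context`, and `entropy_bits` are hypothetical names, and the "sufficient data" rule is assumed to be a simple minimum observation count. The sketch counts next-symbol occurrences for every context of length 1 to `max_order`, picks the longest context with enough observations, and computes the Shannon entropy of that context's next-symbol distribution.

```rust
use std::collections::HashMap;

/// Hypothetical toy model for illustration only; not the anomaly-grid API.
pub struct ToyContextModel {
    max_order: usize,
    min_count: u32, // assumed meaning of "sufficient data"
    counts: HashMap<Vec<String>, HashMap<String, u32>>,
}

impl ToyContextModel {
    pub fn new(max_order: usize, min_count: u32) -> Self {
        Self { max_order, min_count, counts: HashMap::new() }
    }

    /// Record, for each position, the next symbol under every context
    /// of length 1..=max_order that precedes it.
    pub fn train(&mut self, seq: &[&str]) {
        for i in 1..seq.len() {
            for order in 1..=self.max_order.min(i) {
                let ctx: Vec<String> =
                    seq[i - order..i].iter().map(|s| s.to_string()).collect();
                *self
                    .counts
                    .entry(ctx)
                    .or_default()
                    .entry(seq[i].to_string())
                    .or_insert(0) += 1;
            }
        }
    }

    /// Return (order used, next-symbol counts) for the longest suffix of
    /// `history` observed at least `min_count` times, or None if no
    /// context qualifies.
    pub fn select_context(&self, history: &[&str]) -> Option<(usize, &HashMap<String, u32>)> {
        let longest = self.max_order.min(history.len());
        for order in (1..=longest).rev() {
            let ctx: Vec<String> = history[history.len() - order..]
                .iter()
                .map(|s| s.to_string())
                .collect();
            if let Some(next) = self.counts.get(&ctx) {
                if next.values().sum::<u32>() >= self.min_count {
                    return Some((order, next));
                }
            }
        }
        None
    }
}

/// Shannon entropy (in bits) of the empirical distribution given by `counts`.
pub fn entropy_bits(counts: &HashMap<String, u32>) -> f64 {
    let total: u32 = counts.values().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .values()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}
```

Trained on `a b c a b d a b c` with `max_order = 2` and `min_count = 3`, the history `a b` is served by its order-2 context, while `c a` falls back to the order-1 context `a` because `c a` was seen only once. A low-entropy context makes an unexpected next symbol more informative, which is the intuition behind entropy-based scoring.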
Configuration
let config = default
.with_max_order? // Higher order = more memory, better accuracy
.with_smoothing_alpha? // Lower = more sensitive to training data
.with_weights? // Likelihood vs information weight
.with_memory_limit; // 100MB memory limit
let detector = with_config?;
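To make the smoothing comment above concrete, here is a minimal sketch of additive (Lidstone) smoothing. Whether the library's smoothing alpha follows exactly this formula is an assumption, and `smoothed_probability` is a hypothetical helper, not part of the crate. It shows why a lower alpha keeps estimates closer to the raw training frequencies.

```rust
/// Additive (Lidstone) smoothing, assumed here for illustration:
/// P(symbol | context) = (count + alpha) / (total + alpha * alphabet_size).
/// Expects alpha > 0 and alphabet_size > 0.
pub fn smoothed_probability(count: u32, total: u32, alphabet_size: usize, alpha: f64) -> f64 {
    (count as f64 + alpha) / (total as f64 + alpha * alphabet_size as f64)
}

/// Surprisal in bits: rarer events score higher.
pub fn surprisal_bits(probability: f64) -> f64 {
    -probability.log2()
}
```

For a symbol never seen after a context observed 10 times over a 4-symbol alphabet, `alpha = 1.0` gives 1/14 while `alpha = 0.1` gives 0.1/10.4, so the smaller alpha treats the unseen symbol as far more surprising.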
Use Cases
Excellent Fit
- Software Development Workflows: Git command sequences, CI/CD pipeline analysis, code review patterns
- Database Query Optimization: SQL operation sequences, transaction pattern analysis, N+1 query detection
- Network Protocol Analysis: TCP/HTTP/TLS state transitions, protocol compliance verification, traffic flow analysis
- System Administration: CLI command sequences, automation pattern detection, user proficiency analysis
- Creative Pattern Analysis: Musical composition analysis, artistic workflow patterns, style classification
- Security Monitoring: Login sequences, access patterns, behavioral anomaly detection
- IoT and Sensor Networks: Device state transitions, sensor reading patterns, equipment health monitoring
Good Fit
- Business Process Mining: Workflow step sequences, process compliance, bottleneck identification
- User Experience Analysis: Click sequences, navigation patterns, conversion funnel analysis
- Manufacturing Quality Control: Production step sequences, assembly line monitoring, defect pattern detection
- Financial Transaction Analysis: Payment sequences, fraud pattern detection, risk assessment
- Healthcare Workflow Analysis: Treatment sequences, care pathway optimization, protocol adherence
Requires Preprocessing
- Natural Language Processing: Tokenize to categorical sequences (POS tags, named entities, semantic categories)
- Time Series Data: Discretize continuous values into categorical states or trend patterns
- High-Resolution Sensor Data: Aggregate into categorical states or pattern classifications
- Large Vocabularies: Apply dimensionality reduction or clustering to create manageable alphabets
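As a rough illustration of the time-series preprocessing described above, the sketch below turns continuous readings into categorical symbols in two ways: fixed threshold bins and simple trend labels. The function names, labels, and thresholds are all hypothetical choices for this example; real data usually needs domain-specific bin edges.

```rust
/// Map each value to a label by counting how many ascending thresholds it
/// reaches. Requires labels.len() == thresholds.len() + 1.
/// NaN values fail every comparison and therefore land in the first bin.
pub fn discretize(values: &[f64], thresholds: &[f64], labels: &[&str]) -> Vec<String> {
    assert_eq!(
        labels.len(),
        thresholds.len() + 1,
        "need exactly one more label than thresholds"
    );
    values
        .iter()
        .map(|&v| {
            let bin = thresholds.iter().filter(|&&t| v >= t).count();
            labels[bin].to_string()
        })
        .collect()
}

/// Label each consecutive pair of values as UP, DOWN, or FLAT, treating
/// changes within +/- eps as FLAT. Output has one fewer element than input.
pub fn trend_states(values: &[f64], eps: f64) -> Vec<String> {
    values
        .windows(2)
        .map(|w| {
            let delta = w[1] - w[0];
            let label = if delta > eps {
                "UP"
            } else if delta < -eps {
                "DOWN"
            } else {
                "FLAT"
            };
            label.to_string()
        })
        .collect()
}
```

Either output is a sequence over a small finite alphabet, which is the input shape a sequence model over categorical states expects.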
Poor Fit
- Raw Continuous Data: Unprocessed sensor readings, audio waveforms, high-frequency financial data
- Extremely Large Alphabets: >1000 unique states without preprocessing
- Real-Time Streaming: Microsecond-latency requirements (though batch processing is efficient)
- Unstructured Data: Images, videos, raw binary data without categorical interpretation
Testing
# Run all tests
# Run specific test suites
# Run examples
Documentation
- Complete Documentation - Comprehensive guides and API reference
- API Reference - Online API documentation
- Examples - Production-ready examples with validation
- Changelog - Version history and changes
Dependencies
[dependencies]
rayon = "1.10.0" # Parallel batch processing
smallvec = "1.13.0" # Memory-efficient small collections
Minimal dependencies for core functionality and memory optimization.
License
MIT License - see LICENSE file.
Performance Note: The library handles alphabets of up to roughly 100 unique states efficiently, with memory usage typically under 100MB. For larger alphabets, consider preprocessing such as clustering, dimensionality reduction, or hierarchical categorization.
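One simple way to shrink an alphabet before training is sketched below. It is a different, cruder technique than the clustering or hierarchical categorization mentioned above: it keeps the `max_symbols` most frequent symbols and folds everything else into a single catch-all symbol. `cap_alphabet` and the `OTHER` label are hypothetical names for this example, not part of the library.

```rust
use std::collections::{HashMap, HashSet};

/// Keep the `max_symbols` most frequent symbols and replace all others
/// with `other`. Ties are broken alphabetically so the result is deterministic.
pub fn cap_alphabet(seq: &[&str], max_symbols: usize, other: &str) -> Vec<String> {
    // Count how often each symbol occurs.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &s in seq {
        *counts.entry(s).or_insert(0) += 1;
    }

    // Rank by descending frequency, then by symbol name.
    let mut ranked: Vec<(&str, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));

    let keep: HashSet<&str> = ranked.iter().take(max_symbols).map(|&(s, _)| s).collect();

    seq.iter()
        .map(|&s| if keep.contains(s) { s.to_string() } else { other.to_string() })
        .collect()
}
```

The trade-off is that rare symbols become indistinguishable from each other, so anomalies that depend on which rare symbol occurred are lost; the same mapping must also be applied to sequences at detection time.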