FormicaX: Rust-Based Clustering Library for Stock Market Analysis.

Overview
FormicaX is a high-performance, Rust-based library designed for stock market analysis and prediction using OHLCV (Open, High, Low, Close, Volume) data. Leveraging Rust's safety and speed, FormicaX implements advanced machine learning clustering algorithms to generate predictive insights for stock trading. The library is tailored for developers and data scientists building trading applications or conducting financial research.
The name FormicaX, derived from "Formica" (Latin for ant), reflects the library's design principles:
- Collaboration: Like ants in a colony, FormicaX's algorithms work together to process data efficiently.
- Adaptability: Ants adapt to complex environments; FormicaX adapts to diverse market patterns.
- Resilience: Ant colonies are robust; FormicaX handles large datasets with Rust's performance.
- Exploration: Ants explore for resources; FormicaX uncovers hidden patterns in data.
- Simplicity: Individual ants are simple, yet powerful collectively; FormicaX offers a modular, user-friendly API.
The "X" signifies excellence, exploration, and extensibility, highlighting the library’s advanced and flexible capabilities.
Supported clustering algorithms:
- K-Means Clustering: Partitions data into K clusters by minimizing variance within clusters.
- Hierarchical Clustering: Builds a hierarchy of clusters using a bottom-up or top-down approach.
- DBSCAN: Groups data points into clusters based on density, handling noise and outliers.
- Gaussian Mixture Models (GMM): Models data as a mixture of Gaussian distributions for probabilistic clustering.
- Affinity Propagation: Identifies exemplars to form clusters without a predefined number of clusters.
- Self-Organizing Maps (SOM): Uses neural network-based dimensionality reduction to map data into a 2D grid.
FormicaX processes large datasets efficiently, making it suitable for real-time or near-real-time trading applications. It offers a modular framework for integration with trading systems or data pipelines.
Features
- High Performance: Built in Rust for speed and memory safety, optimized for large OHLCV datasets.
- Advanced Trading: VWAP-based strategies, real-time signal generation, and performance monitoring.
- Flexible Input: Supports OHLCV data in CSV, JSON, or custom formats via a configurable data loader.
- Multiple Algorithms: Implements six clustering algorithms for diverse analytical approaches.
- Customizable Parameters: Fine-tune algorithm hyperparameters for specific trading strategies.
- Prediction Outputs: Generates cluster-based predictions, including trend identification and anomaly detection.
- Extensibility: Modular design allows integration of new algorithms or preprocessing steps.
- Cross-Platform: Compatible with Linux, macOS, and Windows (experimental).
Installation
Prerequisites
- Rust: Version 1.80.0 or later, installed via rustup.
- Dependencies: Managed automatically by Cargo via Cargo.toml.
Steps
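- Clone the repository (rustic-ml/FormicaX on GitHub).
- Build the library: cargo build --release
- Add FormicaX as a path dependency in the [dependencies] section of your Cargo.toml, pointing at "./FormicaX".
- (Optional) Run tests: cargo test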
Usage
Basic Clustering Example
Cluster OHLCV data using K-Means and generate predictions:
use ;
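As a rough illustration of what K-Means does with OHLCV bars, here is a minimal, dependency-free sketch of Lloyd's algorithm. It is not FormicaX's API: the Ohlcv struct, the features, nearest, and kmeans functions, and the choice of features (bar return and relative range) are all assumptions made for this example.

```rust
/// One OHLCV bar (timestamp omitted for brevity). Hypothetical type for this sketch.
#[derive(Debug, Clone, Copy)]
pub struct Ohlcv {
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: f64,
}

/// Turn a bar into a two-element feature vector: [bar return, relative range].
pub fn features(bar: &Ohlcv) -> [f64; 2] {
    [
        (bar.close - bar.open) / bar.open,
        (bar.high - bar.low) / bar.open,
    ]
}

fn sq_dist(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    (a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)
}

/// Index of the centroid closest to `p` (ties go to the lower index).
pub fn nearest(p: &[f64; 2], centroids: &[[f64; 2]]) -> usize {
    let mut best = 0;
    for (i, c) in centroids.iter().enumerate() {
        if sq_dist(p, c) < sq_dist(p, &centroids[best]) {
            best = i;
        }
    }
    best
}

/// Lloyd's algorithm. Centroids are seeded with the first `k` points so the
/// result is deterministic; production code would typically use k-means++.
/// Returns the final centroids and the cluster label of every point.
pub fn kmeans(points: &[[f64; 2]], k: usize, max_iters: usize) -> (Vec<[f64; 2]>, Vec<usize>) {
    assert!(k > 0 && k <= points.len(), "need 1 <= k <= number of points");
    let mut centroids: Vec<[f64; 2]> = points[..k].to_vec();
    // usize::MAX marks "not yet assigned", so the first pass never counts as converged.
    let mut labels = vec![usize::MAX; points.len()];

    for _ in 0..max_iters {
        // Assignment step: attach every point to its nearest centroid.
        let new_labels: Vec<usize> = points.iter().map(|p| nearest(p, &centroids)).collect();

        // Update step: move each centroid to the mean of its members.
        let mut sums = vec![[0.0f64; 2]; k];
        let mut counts = vec![0usize; k];
        for (p, &l) in points.iter().zip(&new_labels) {
            sums[l][0] += p[0];
            sums[l][1] += p[1];
            counts[l] += 1;
        }
        for j in 0..k {
            if counts[j] > 0 {
                let n = counts[j] as f64;
                centroids[j] = [sums[j][0] / n, sums[j][1] / n];
            }
        }

        // Stop once the assignments no longer change.
        let converged = new_labels == labels;
        labels = new_labels;
        if converged {
            break;
        }
    }
    (centroids, labels)
}
```

A caller would map each bar through features, pass the resulting slice to kmeans with the desired cluster count and iteration cap, and use nearest to assign new bars to the learned centroids.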
Trading Strategy Example
Implement a VWAP-based trading strategy:
use ;
Data Format
OHLCV data should be structured, e.g., in CSV:
timestamp,open,high,low,close,volume
2025-07-01T09:30:00,100.5,102.0,99.8,101.2,100000
2025-07-01T09:31:00,101.3,103.5,100.7,102.8,120000
...
Custom data loaders for formats like JSON can be implemented via the DataLoader trait.
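To make the CSV layout above concrete, the following sketch parses it with the standard library only. The DataLoader trait shown here is a hypothetical stand-in, since this README does not give the real trait's signature; Ohlcv, ParseError, and CsvLoader are likewise illustrative names.

```rust
use std::fmt;

/// One parsed row. The timestamp is kept as text to avoid a date-time dependency.
#[derive(Debug, Clone, PartialEq)]
pub struct Ohlcv {
    pub timestamp: String,
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: f64,
}

/// Error carrying the 1-based line number of the offending row.
#[derive(Debug)]
pub struct ParseError {
    pub line: usize,
    pub message: String,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "line {}: {}", self.line, self.message)
    }
}

impl std::error::Error for ParseError {}

/// Hypothetical loader trait (an assumption; the real FormicaX trait may differ).
pub trait DataLoader {
    fn load(&self, input: &str) -> Result<Vec<Ohlcv>, ParseError>;
}

/// Loader for the `timestamp,open,high,low,close,volume` layout.
pub struct CsvLoader;

impl DataLoader for CsvLoader {
    fn load(&self, input: &str) -> Result<Vec<Ohlcv>, ParseError> {
        let mut bars = Vec::new();
        for (idx, line) in input.lines().enumerate() {
            let line = line.trim();
            // Skip blank lines and the header row.
            if line.is_empty() || line.starts_with("timestamp") {
                continue;
            }
            let cols: Vec<&str> = line.split(',').map(str::trim).collect();
            if cols.len() != 6 {
                return Err(ParseError {
                    line: idx + 1,
                    message: format!("expected 6 columns, found {}", cols.len()),
                });
            }
            // Parse one numeric column, tagging failures with line and column.
            let num = |i: usize| -> Result<f64, ParseError> {
                cols[i].parse::<f64>().map_err(|e| ParseError {
                    line: idx + 1,
                    message: format!("column {}: {}", i + 1, e),
                })
            };
            bars.push(Ohlcv {
                timestamp: cols[0].to_string(),
                open: num(1)?,
                high: num(2)?,
                low: num(3)?,
                close: num(4)?,
                volume: num(5)?,
            });
        }
        Ok(bars)
    }
}
```

A JSON loader could implement the same trait and return the same Vec of bars, which keeps the clustering code independent of the input format.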
Algorithm Configuration
Configure hyperparameters, e.g.:
- K-Means:
let kmeans = new; // 5 clusters, 200 iterations
- DBSCAN:
let dbscan = DBSCANnew; // Epsilon = 0.5, MinPts = 5
- GMM:
let gmm = new; // 4 components, convergence threshold
See API documentation for details.
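To show how the two DBSCAN hyperparameters interact, here is a small self-contained sketch of the textbook algorithm. It is an illustration under assumed names (Label, region_query, dbscan), not FormicaX's implementation: eps is the neighborhood radius, and min_pts is the number of points (including the point itself) a neighborhood needs before its center counts as a core point.

```rust
/// Label assigned to each point by `dbscan`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Label {
    Noise,
    Cluster(usize),
}

/// Euclidean distance between two points of equal dimension.
fn dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Indices of all points within `eps` of point `i` (including `i` itself).
fn region_query(points: &[Vec<f64>], i: usize, eps: f64) -> Vec<usize> {
    (0..points.len())
        .filter(|&j| dist(&points[i], &points[j]) <= eps)
        .collect()
}

/// Brute-force DBSCAN, O(n^2) distance checks. Fine for a sketch, slow for large data.
pub fn dbscan(points: &[Vec<f64>], eps: f64, min_pts: usize) -> Vec<Label> {
    let mut labels: Vec<Option<Label>> = vec![None; points.len()];
    let mut cluster = 0;

    for i in 0..points.len() {
        if labels[i].is_some() {
            continue; // already visited
        }
        let neighbors = region_query(points, i, eps);
        if neighbors.len() < min_pts {
            // Not a core point; may still be claimed later as a border point.
            labels[i] = Some(Label::Noise);
            continue;
        }

        // `i` is a core point: start a new cluster and expand it.
        labels[i] = Some(Label::Cluster(cluster));
        let mut queue = neighbors;
        let mut head = 0;
        while head < queue.len() {
            let j = queue[head];
            head += 1;
            let current = labels[j];
            match current {
                Some(Label::Cluster(_)) => continue,
                Some(Label::Noise) => {
                    // Border point: joins the cluster but is not expanded.
                    labels[j] = Some(Label::Cluster(cluster));
                    continue;
                }
                None => {}
            }
            labels[j] = Some(Label::Cluster(cluster));
            let reachable = region_query(points, j, eps);
            if reachable.len() >= min_pts {
                // `j` is also a core point, so its neighbors are density-reachable.
                queue.extend(reachable);
            }
        }
        cluster += 1;
    }

    labels
        .into_iter()
        .map(|l| l.unwrap_or(Label::Noise))
        .collect()
}
```

Raising eps or lowering min_pts merges more points into clusters; the opposite settings push more points into Label::Noise, which is the behavior anomaly detection relies on.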
Supported Algorithms
- K-Means Clustering: Identifies market regimes by partitioning data.
- Hierarchical Clustering: Visualizes hierarchical relationships in stock data.
- DBSCAN: Detects anomalous trading patterns via density-based clustering.
- Gaussian Mixture Models (GMM): Models complex market behaviors probabilistically.
- Affinity Propagation: Exploratory analysis without predefined cluster counts.
- Self-Organizing Maps (SOM): Visualizes patterns via 2D mapping.
Building Trading Strategies
Clustering-Based Strategy
Integrate clustering algorithms into a trading pipeline:
- Preprocess: Normalize data using Preprocessor.
- Cluster: Apply clustering algorithms.
- Predict: Assign new data to clusters.
- Trade: Map clusters to signals (buy/sell/hold).
- Backtest: Use Backtester (optional).
Example:
let clusters = kmeans.predict?;
if clusters == 1 else if clusters == 2
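The preprocess, predict, and trade steps can be sketched without any library types. Everything below is an assumption for illustration: zscore stands in for the normalization step, predict is a nearest-centroid assignment, and the mapping of cluster 1 to buy and cluster 2 to sell is a guess at what the branches above are meant to do.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal {
    Buy,
    Sell,
    Hold,
}

/// Column-wise z-score normalization using the population standard deviation.
/// Columns with zero variance are mapped to 0.0. All rows must have equal length.
pub fn zscore(rows: &[Vec<f64>]) -> Vec<Vec<f64>> {
    if rows.is_empty() {
        return Vec::new();
    }
    let n = rows.len() as f64;
    let dims = rows[0].len();
    let mut out = rows.to_vec();
    for d in 0..dims {
        let mean = rows.iter().map(|r| r[d]).sum::<f64>() / n;
        let var = rows.iter().map(|r| (r[d] - mean).powi(2)).sum::<f64>() / n;
        let sd = var.sqrt();
        for r in out.iter_mut() {
            r[d] = if sd > 0.0 { (r[d] - mean) / sd } else { 0.0 };
        }
    }
    out
}

/// Assign a point to the nearest centroid; `None` if there are no centroids.
pub fn predict(point: &[f64], centroids: &[Vec<f64>]) -> Option<usize> {
    centroids
        .iter()
        .enumerate()
        .map(|(i, c)| {
            let d2: f64 = c.iter().zip(point).map(|(a, b)| (a - b).powi(2)).sum();
            (i, d2)
        })
        .min_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}

/// Map a cluster id to a trading signal.
/// ASSUMPTION: cluster 1 means "buy" and cluster 2 means "sell"; in practice the
/// meaning of each cluster has to be established by inspecting the fitted model.
pub fn signal_for(cluster: usize) -> Signal {
    match cluster {
        1 => Signal::Buy,
        2 => Signal::Sell,
        _ => Signal::Hold,
    }
}
```

Keeping the cluster-to-signal mapping in one small function makes it easy to swap in a different interpretation after backtesting.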
VWAP-Based Strategy
Use VWAP (Volume Weighted Average Price) for real-time trading:
use ;
// Create VWAP calculator
let mut vwap_calc = session_based;
// Create signal generator
let mut signal_gen = new;
// Process real-time data
for ohlcv in real_time_data
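For readers unfamiliar with VWAP, the sketch below shows the underlying arithmetic: a running sum of typical price times volume, divided by the running volume, reset at the start of each session. SessionVwap, vwap_signal, and the band parameter are hypothetical names for this example, and the rule "buy above VWAP, sell below" is only one possible convention (a mean-reversion strategy would invert it).

```rust
#[derive(Debug, Clone, Copy)]
pub struct Ohlcv {
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal {
    Buy,
    Sell,
    Hold,
}

/// Running VWAP for one trading session.
#[derive(Debug, Default)]
pub struct SessionVwap {
    pv_sum: f64,     // sum of typical_price * volume
    volume_sum: f64, // sum of volume
}

impl SessionVwap {
    pub fn new() -> Self {
        Self::default()
    }

    /// Clear the accumulators, e.g. at the session open.
    pub fn reset(&mut self) {
        *self = Self::default();
    }

    /// Fold one bar into the running totals and return the updated VWAP.
    pub fn update(&mut self, bar: &Ohlcv) -> Option<f64> {
        let typical = (bar.high + bar.low + bar.close) / 3.0;
        self.pv_sum += typical * bar.volume;
        self.volume_sum += bar.volume;
        self.value()
    }

    /// Current VWAP, or `None` before any volume has traded.
    pub fn value(&self) -> Option<f64> {
        if self.volume_sum > 0.0 {
            Some(self.pv_sum / self.volume_sum)
        } else {
            None
        }
    }
}

/// ASSUMED rule: buy when the close is more than `band` (a fraction, e.g. 0.01)
/// above VWAP, sell when it is more than `band` below, otherwise hold.
pub fn vwap_signal(close: f64, vwap: f64, band: f64) -> Signal {
    if close > vwap * (1.0 + band) {
        Signal::Buy
    } else if close < vwap * (1.0 - band) {
        Signal::Sell
    } else {
        Signal::Hold
    }
}
```

In a streaming loop, each incoming bar would be passed to update and the returned value compared with the bar's close through vwap_signal.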
Contributing
- Fork the repository.
- Create a feature branch (git checkout -b feature/your-feature).
- Commit changes (git commit -m "Add your feature").
- Push branch (git push origin feature/your-feature).
- Open a pull request.
See CONTRIBUTING.md for details. Report issues at GitHub Issues.
Building and Testing
- Build: cargo build --release
- Test: cargo test
- Docs: cargo doc --open
API Documentation
Generated via cargo doc --open.
Limitations
- Beta Status: Not recommended for production trading.
- Windows: Experimental support.
- Data Size: Large datasets may require streaming for scalability.
License
Apache-2.0. See LICENSE.
Contact
- GitHub: rustic-ml/FormicaX
- Issues: GitHub Issues
- Discord: Community Discord