# HFT Backtesting Engine (Rust)
> [!NOTE]
> This project is currently under active development. Built-in functionality, APIs, and the overall architecture are subject to breaking changes. It is not yet ready for production use.

A high-performance, event-driven backtesting engine written in Rust, designed for simulating tick-by-tick trading strategies with strict spread handling.
## 🚀 Features
* **Event-Driven Architecture**: Simulates market conditions row-by-row (tick-by-tick) rather than vectorized, ensuring realistic order execution.
* **High Performance**: Uses `tickparser` with memory-mapped input for allocation-free parsing in the hot loop.
* **Strict Spread Simulation**:
    * **Long**: Open at **Ask**, Close at **Bid**.
    * **Short**: Open at **Bid**, Close at **Ask**.
* **Built-in Risk Management**: Handles Stop Loss (SL) and Take Profit (TP) internally on every tick.
* **Flexible Strategy API**: Implement the `Strategy` trait to define custom logic.
* **Comprehensive Analysis**: Calculates Sharpe Ratio, Max Drawdown, Win Rate, and exports detailed trade logs to JSON.
* **Data Validation & Quality Scoring**: Built-in CLI to analyze tick data quality, detect missing time gaps, and heuristically infer asset types.
* **Live CLI Dashboard**: Real-time progress and metrics via `indicatif`.
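
The spread and SL/TP rules listed above can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: `Side`, `Quote`, `Exit`, and all function names are hypothetical stand-ins, and checking the stop loss before the take profit when both trigger on the same tick is an assumption.

```rust
// Hypothetical stand-in types for illustration; the crate's real types may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Side {
    Long,
    Short,
}

#[derive(Clone, Copy, Debug)]
pub struct Quote {
    pub bid: f64,
    pub ask: f64,
}

#[derive(Debug, PartialEq)]
pub enum Exit {
    StopLoss,
    TakeProfit,
}

/// Longs open at the ask, shorts open at the bid.
pub fn open_price(side: Side, q: Quote) -> f64 {
    match side {
        Side::Long => q.ask,
        Side::Short => q.bid,
    }
}

/// Longs close at the bid, shorts close at the ask.
pub fn close_price(side: Side, q: Quote) -> f64 {
    match side {
        Side::Long => q.bid,
        Side::Short => q.ask,
    }
}

/// Checks SL/TP against the price the position would actually close at.
/// Assumption: the stop loss takes priority if both levels are hit.
pub fn check_exit(side: Side, q: Quote, sl: Option<f64>, tp: Option<f64>) -> Option<Exit> {
    let px = close_price(side, q);
    let (hit_sl, hit_tp) = match side {
        Side::Long => (sl.map_or(false, |s| px <= s), tp.map_or(false, |t| px >= t)),
        Side::Short => (sl.map_or(false, |s| px >= s), tp.map_or(false, |t| px <= t)),
    };
    if hit_sl {
        Some(Exit::StopLoss)
    } else if hit_tp {
        Some(Exit::TakeProfit)
    } else {
        None
    }
}

/// Profit or loss in price units multiplied by volume.
pub fn pnl(side: Side, entry: f64, exit: f64, volume: f64) -> f64 {
    match side {
        Side::Long => (exit - entry) * volume,
        Side::Short => (entry - exit) * volume,
    }
}
```

Under these rules, a position opened and closed on the same quote loses exactly the spread, which is the cost a strict spread simulation is meant to capture.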
## 📦 Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
backtest_rs = { path = "path/to/backtest_rs" } # Local checkout; alternatively use a `git` dependency
```
## 🛠️ CLI Usage
The project includes a CLI built with `clap` that can either validate datasets or run backtests.
### 1. Validate Dataset
Check the health, missing data gaps, and data quality score of your tick CSV file:
```bash
cargo run --release -- validate "path/to/tick_data.csv"
```
This generates a detailed report including:
* Total Ticks, Price Range, and Volatility
* Average and Maximum Spread
* Data Quality Score (penalizing for missing weekday data)
* Extracted list of largest time gaps (ignoring standard weekends)
* Inferred Asset Class (e.g., Gold, Forex Pair)
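
To make the "largest time gaps (ignoring standard weekends)" item concrete, here is one way such a scan could work. It is a sketch under stated assumptions, not the logic in `src/validation.rs`: timestamps are taken as Unix seconds in UTC, a gap counts as a weekend if it covers a Saturday and lasts at most three days, and `Gap`, `is_weekend_gap`, and `largest_gaps` are hypothetical names.

```rust
const DAY: i64 = 86_400;

/// Day of week for a Unix timestamp in seconds (UTC): 0 = Sunday .. 6 = Saturday.
pub fn weekday(ts: i64) -> i64 {
    // 1970-01-01 was a Thursday (index 4).
    (ts.div_euclid(DAY) + 4).rem_euclid(7)
}

/// Assumption: a "standard weekend" gap covers a Saturday and lasts at most 3 days.
pub fn is_weekend_gap(prev: i64, next: i64) -> bool {
    let covers_saturday =
        (prev.div_euclid(DAY)..=next.div_euclid(DAY)).any(|d| weekday(d * DAY) == 6);
    covers_saturday && next - prev <= 3 * DAY
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Gap {
    pub start: i64,
    pub end: i64,
    pub seconds: i64,
}

/// Returns the `top_n` largest non-weekend gaps of at least `min_gap_secs`.
pub fn largest_gaps(timestamps: &[i64], min_gap_secs: i64, top_n: usize) -> Vec<Gap> {
    let mut gaps: Vec<Gap> = timestamps
        .windows(2)
        .filter_map(|w| {
            let (start, end) = (w[0], w[1]);
            let seconds = end - start;
            if seconds >= min_gap_secs && !is_weekend_gap(start, end) {
                Some(Gap { start, end, seconds })
            } else {
                None
            }
        })
        .collect();
    // Largest gaps first.
    gaps.sort_by(|a, b| b.seconds.cmp(&a.seconds));
    gaps.truncate(top_n);
    gaps
}
```

A quality score could then penalize the total weekday time covered by the returned gaps; how the real score is weighted is not specified here.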
### 2. Run Backtest
Run a simulation using your specified strategy and dataset:
```bash
cargo run --release -- backtest "path/to/tick_data.csv"
```
The backtest renders a terminal interface with `indicatif` that updates in place, showing:
* **Progress**: Progress bar, ETA, and bytes processed
* **Results Summary**: Automatically printed upon completion.
## 💻 Library Usage
You can also use `backtest_rs` as a library and implement your own strategies.
### 1. Define Your Strategy
Implement the `Strategy` trait. You get a `Context` to place orders (`buy`, `sell`, `close_all`) and access account state.
```rust
use backtest_rs::{Context, Strategy, Tick};

struct MyStrategy;

impl Strategy for MyStrategy {
    fn on_tick(&mut self, tick: &Tick, ctx: &mut Context) {
        // Example: buy when Bid > 2000.0 and no position is currently open.
        if tick.bid > 2000.0 && ctx.active_positions.is_empty() {
            ctx.buy(1.0, Some(1995.0), Some(2010.0)); // Volume 1.0, SL, TP
        }
    }
}
```
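
Most strategies carry state between ticks. The sketch below shows one possible shape for a stateful strategy that buys when the mid price moves above its simple moving average. To keep it compilable on its own, `Tick`, `Context`, and `Strategy` are redefined here as simplified stand-ins that only mirror the parts of the API used above; in a real project you would import them from `backtest_rs` instead. `SmaBreakout` and its SL/TP offsets are hypothetical.

```rust
use std::collections::VecDeque;

// Simplified stand-ins so this sketch compiles alone; the real definitions
// live in `backtest_rs` and will differ in detail.
pub struct Tick {
    pub bid: f64,
    pub ask: f64,
}

#[derive(Default)]
pub struct Context {
    pub active_positions: Vec<(f64, Option<f64>, Option<f64>)>,
}

impl Context {
    pub fn buy(&mut self, volume: f64, sl: Option<f64>, tp: Option<f64>) {
        self.active_positions.push((volume, sl, tp));
    }
}

pub trait Strategy {
    fn on_tick(&mut self, tick: &Tick, ctx: &mut Context);
}

/// Hypothetical strategy: go long when the mid price exceeds its rolling SMA.
pub struct SmaBreakout {
    window: usize,
    prices: VecDeque<f64>,
    sum: f64,
}

impl SmaBreakout {
    pub fn new(window: usize) -> Self {
        Self { window, prices: VecDeque::with_capacity(window + 1), sum: 0.0 }
    }
}

impl Strategy for SmaBreakout {
    fn on_tick(&mut self, tick: &Tick, ctx: &mut Context) {
        let mid = (tick.bid + tick.ask) / 2.0;

        // SMA of the previous `window` ticks, available once the window is full.
        let sma = if self.prices.len() == self.window {
            Some(self.sum / self.window as f64)
        } else {
            None
        };

        // Update the rolling window in O(1) per tick.
        self.prices.push_back(mid);
        self.sum += mid;
        if self.prices.len() > self.window {
            if let Some(old) = self.prices.pop_front() {
                self.sum -= old;
            }
        }

        if let Some(sma) = sma {
            if mid > sma && ctx.active_positions.is_empty() {
                // Illustrative levels: SL 5.0 below and TP 10.0 above the entry ask.
                ctx.buy(1.0, Some(tick.ask - 5.0), Some(tick.ask + 10.0));
            }
        }
    }
}
```

Keeping a running sum avoids re-summing the window on every tick, which matters when `on_tick` runs millions of times per backtest.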
### 2. Initialize and Run the Engine
```rust
use backtest_rs::{Analysis, Engine};

fn main() -> anyhow::Result<()> {
    // `MyStrategy` is the type defined in step 1.
    let strategy = MyStrategy;
    let mut engine = Engine::new(strategy, 100_000.0); // $100k starting balance

    // Returns an `Analysis` struct
    let analysis = engine.run("path/to/tick_data.csv")?;

    println!("Total Return: {:.2}%", analysis.total_return_pct);
    analysis.save_json("results.json")?;

    Ok(())
}
```
See `examples/main.rs` for a complete example that implements a simple RSI-based strategy. Run it with:
```bash
cargo run --release --example main -- "path/to/tick_data.csv"
```
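
For readers who want to sanity-check the reported metrics, the following sketch computes Win Rate, Max Drawdown, and Sharpe Ratio from raw inputs. It is not the code in `src/analysis.rs`: the function names are hypothetical, and the Sharpe ratio here is a plain per-period ratio using the sample standard deviation with no annualization, which may differ from the convention the engine uses.

```rust
/// Fraction of trades with strictly positive PnL (0.0 for an empty slice).
pub fn win_rate(pnls: &[f64]) -> f64 {
    if pnls.is_empty() {
        return 0.0;
    }
    pnls.iter().filter(|&&p| p > 0.0).count() as f64 / pnls.len() as f64
}

/// Largest peak-to-trough decline of the equity curve, as a fraction of the peak.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::NEG_INFINITY;
    let mut worst = 0.0_f64;
    for &e in equity {
        peak = peak.max(e);
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}

/// Mean excess return divided by the sample standard deviation of returns.
/// Returns `None` with fewer than two samples or zero variance.
pub fn sharpe_ratio(returns: &[f64], risk_free: f64) -> Option<f64> {
    let n = returns.len();
    if n < 2 {
        return None;
    }
    let mean = returns.iter().sum::<f64>() / n as f64;
    let var = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / (n as f64 - 1.0);
    let sd = var.sqrt();
    if sd == 0.0 {
        None
    } else {
        Some((mean - risk_free) / sd)
    }
}
```

For example, an equity curve of 100, 120, 90, 130 has a maximum drawdown of 25%, since the worst decline is from the peak of 120 down to 90.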
## 📂 Project Structure
* `src/lib.rs`: Library entry point exporting modules.
* `src/main.rs`: CLI Entry point handling `validate` and `backtest` subcommands.
* `src/strategy.rs`: `Strategy` trait and built-in examples (SMA).
* `src/engine.rs`: Core event loop, order matching, PnL calculation, and terminal progress bar.
* `src/types.rs`: Data structures (`Tick`, `Position`, `Trade`, etc.).
* `src/math.rs`: Helper functions for indicators (SMA, EMA).
* `src/validation.rs`: Data quality analysis, missing days checking, and gap detection algorithms.
* `src/analysis.rs`: Performance metrics and JSON export logic.
## ⚡ Performance — tickparser Integration
The engine now uses [tickparser](crates/tickparser/) as its **default parsing backend**, replacing the previous `csv` + `fast_float` + `chrono` pipeline.

| Stage | Previous pipeline | tickparser |
| --- | --- | --- |
| CSV parsing | `csv::ByteRecord` (UTF-8 validation, heap alloc/row) | Zero-copy byte scanning via `mmap` |
| Timestamp | `chrono::NaiveDateTime::parse_from_str` (tries 6 formats) | `parse_timestamp_bytes`: direct byte arithmetic |
| Float | `fast_float::parse` (requires `&[u8]` → `&str`) | `parse_float_bytes`: integer accumulation, no string conversion |
| I/O | Buffered `File::read` | Memory-mapped file (`memmap2`) |
### Running Benchmarks
```bash
# Benchmark all parser modes (V1 naive, V2 mmap, V3 parallel, V3 stream)
cargo run --release --bin bench -p tickparse
```
The benchmark generates synthetic tick data (1M–100M rows) and compares all parser versions. V3 parallel is the **default** — it uses rayon + mmap for maximum throughput.
### What's Different in the Engine
* **No `csv` crate**: The engine uses `tickparse::mmap_parser::MmapTickIterator` directly on raw bytes.
* **No format guessing**: Timestamps are always `YYYY.MM.DD HH:MM:SS.mmm` — parsed via byte offsets in ~10 ns/row.
* **No per-row allocation**: The iterator yields `Tick` structs from mmap'd memory without heap allocation.
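
To illustrate what "byte offsets" and "integer accumulation" mean in practice, here is a self-contained sketch of both techniques. It is not tickparser's source: `parse_timestamp_ms` and `parse_price` are hypothetical names, the output unit (milliseconds since the Unix epoch) is an assumption, and the float routine deliberately ignores exponents and other edge cases.

```rust
/// Accumulates ASCII digits into an integer; `None` on any non-digit byte.
fn digits(b: &[u8]) -> Option<i64> {
    let mut v = 0i64;
    for &c in b {
        if !c.is_ascii_digit() {
            return None;
        }
        v = v * 10 + (c - b'0') as i64;
    }
    Some(v)
}

/// Days since 1970-01-01 for a proleptic Gregorian date (standard civil-date arithmetic).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y - era * 400;
    let mp = (m + 9) % 12; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Parses `YYYY.MM.DD HH:MM:SS.mmm` (exactly 23 bytes) using fixed offsets.
pub fn parse_timestamp_ms(b: &[u8]) -> Option<i64> {
    if b.len() != 23
        || b[4] != b'.' || b[7] != b'.' || b[10] != b' '
        || b[13] != b':' || b[16] != b':' || b[19] != b'.'
    {
        return None;
    }
    let (y, mo, d) = (digits(&b[0..4])?, digits(&b[5..7])?, digits(&b[8..10])?);
    let (h, mi, s) = (digits(&b[11..13])?, digits(&b[14..16])?, digits(&b[17..19])?);
    let ms = digits(&b[20..23])?;
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || s > 59 {
        return None;
    }
    Some(((days_from_civil(y, mo, d) * 24 + h) * 60 + mi) * 60_000 + s * 1_000 + ms)
}

/// Parses a plain decimal such as `2000.55` by accumulating digits into an
/// integer mantissa and dividing by a power of ten. No exponent support.
pub fn parse_price(b: &[u8]) -> Option<f64> {
    let (neg, body) = match b.split_first() {
        Some((&b'-', rest)) => (true, rest),
        _ => (false, b),
    };
    let (mut mantissa, mut scale) = (0u64, 1.0f64);
    let (mut seen_dot, mut seen_digit) = (false, false);
    for &c in body {
        match c {
            b'0'..=b'9' => {
                mantissa = mantissa.checked_mul(10)?.checked_add((c - b'0') as u64)?;
                seen_digit = true;
                if seen_dot {
                    scale *= 10.0;
                }
            }
            b'.' if !seen_dot => seen_dot = true,
            _ => return None,
        }
    }
    if !seen_digit {
        return None;
    }
    let v = mantissa as f64 / scale;
    Some(if neg { -v } else { v })
}
```

Because both functions read directly from a byte slice, they can operate on a memory-mapped file without first building a `String` per field, which is where the allocation savings come from.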
## 📄 License
BSD 3-Clause License