backtest_rs 0.1.0

High-performance, event-driven HFT backtesting engine. Supports tick data and strict spread simulation.
# HFT Backtesting Engine (Rust)

> [!NOTE]
> This project is currently under active development. Built-in functionality, APIs, and the overall architecture are subject to breaking changes. It is not yet ready for production use.

A high-performance, event-driven backtesting engine written in Rust, designed for simulating tick-by-tick trading strategies with strict spread handling.

## 🚀 Features

*   **Event-Driven Architecture**: Simulates market conditions row-by-row (tick-by-tick) rather than vectorized, ensuring realistic order execution.
*   **High Performance**: Uses `tickparser` with memory-mapped input, so the hot loop parses ticks without per-row heap allocation.
*   **Strict Spread Simulation**:
    *   **Long**: Open at **Ask**, Close at **Bid**.
    *   **Short**: Open at **Bid**, Close at **Ask**.
*   **Built-in Risk Management**: Handles Stop Loss (SL) and Take Profit (TP) internally on every tick.
*   **Flexible Strategy API**: Implement the `Strategy` trait to define custom logic.
*   **Comprehensive Analysis**: Calculates Sharpe Ratio, Max Drawdown, Win Rate, and exports detailed trade logs to JSON.
*   **Data Validation & Quality Scoring**: Built-in CLI to analyze tick data quality, detect missing time gaps, and heuristically infer asset types.
*   **Live CLI Dashboard**: Real-time progress and metrics via `indicatif`.
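
To make the spread and SL/TP rules above concrete, here is a minimal, self-contained sketch. The types and function names (`Side`, `Quote`, `open_price`, `close_price`, `exit_triggered`, `round_trip_pnl`) are hypothetical and exist only for illustration; they are not the crate's actual API. The sketch also assumes that SL/TP levels are compared against the closing side of the book, which is a common convention but is not confirmed by this README.

```rust
/// Hypothetical types for illustration; not the crate's definitions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Side {
    Long,
    Short,
}

#[derive(Debug, Clone, Copy)]
struct Quote {
    bid: f64,
    ask: f64,
}

/// Price at which a new position on `side` is opened.
fn open_price(side: Side, q: Quote) -> f64 {
    match side {
        Side::Long => q.ask,  // buyers pay the ask
        Side::Short => q.bid, // sellers receive the bid
    }
}

/// Price at which an existing position on `side` is closed.
fn close_price(side: Side, q: Quote) -> f64 {
    match side {
        Side::Long => q.bid,
        Side::Short => q.ask,
    }
}

/// True if the stop loss or take profit would trigger on this quote.
/// Assumption: levels are checked against the closing side of the book.
fn exit_triggered(side: Side, q: Quote, sl: Option<f64>, tp: Option<f64>) -> bool {
    let px = close_price(side, q);
    match side {
        Side::Long => sl.map_or(false, |s| px <= s) || tp.map_or(false, |t| px >= t),
        Side::Short => sl.map_or(false, |s| px >= s) || tp.map_or(false, |t| px <= t),
    }
}

/// Profit or loss per unit of volume for a full open/close round trip.
fn round_trip_pnl(side: Side, entry: Quote, exit: Quote) -> f64 {
    let open = open_price(side, entry);
    let close = close_price(side, exit);
    match side {
        Side::Long => close - open,
        Side::Short => open - close,
    }
}
```

One consequence of this model is that opening and closing on the same quote always loses exactly the spread, for longs and shorts alike.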

## 📦 Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
backtest_rs = { path = "." } # Local checkout; use a git URL instead if depending on the repository
```

## 🛠 CLI Usage

The project includes a CLI, built with `clap`, that either validates a dataset or runs a backtest.

### 1. Validate Dataset
Check the health, missing data gaps, and data quality score of your tick CSV file:
```bash
cargo run --release -- validate "path/to/tick_data.csv"
```
This generates a detailed report including:
* Total Ticks, Price Range, and Volatility
* Average and Maximum Spread
* Data Quality Score (penalizing for missing weekday data)
* Extracted list of largest time gaps (ignoring standard weekends)
* Inferred Asset Class (e.g., Gold, Forex Pair)
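
As a rough illustration of the gap extraction described above, the sketch below finds the largest gaps between consecutive ticks and skips any gap that touches a Saturday or Sunday. It is not the code in `src/validation.rs`: the function names, the use of Unix timestamps in seconds (UTC), and the "touches a weekend day" rule are all assumptions made for the example.

```rust
const SECS_PER_DAY: i64 = 86_400;

/// Day of week for a Unix timestamp in seconds (UTC): 0 = Sunday ... 6 = Saturday.
fn weekday(ts: i64) -> i64 {
    // 1970-01-01 was a Thursday (index 4).
    (ts.div_euclid(SECS_PER_DAY) + 4).rem_euclid(7)
}

/// True if any calendar day between `start` and `end` (inclusive) is a weekend day.
fn spans_weekend(start: i64, end: i64) -> bool {
    let first = start.div_euclid(SECS_PER_DAY);
    let last = end.div_euclid(SECS_PER_DAY);
    (first..=last).any(|day| matches!(weekday(day * SECS_PER_DAY), 0 | 6))
}

/// Returns up to `top_n` gaps of at least `min_gap_secs`, largest first,
/// ignoring gaps that overlap a weekend. Input must be sorted ascending.
fn largest_gaps(timestamps: &[i64], min_gap_secs: i64, top_n: usize) -> Vec<(i64, i64)> {
    let mut gaps: Vec<(i64, i64)> = timestamps
        .windows(2)
        .map(|w| (w[0], w[1]))
        .filter(|&(a, b)| b - a >= min_gap_secs && !spans_weekend(a, b))
        .collect();
    // Sort by gap length, descending.
    gaps.sort_by_key(|&(a, b)| std::cmp::Reverse(b - a));
    gaps.truncate(top_n);
    gaps
}
```

A real validator would likely also account for market holidays and session hours, which this sketch ignores.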

### 2. Run Backtest
Run a simulation using your specified strategy and dataset:
```bash
cargo run --release -- backtest "path/to/tick_data.csv"
```

The engine uses `indicatif` for a terminal display that updates in place during the backtest and then reports the outcome:
*   **Progress**: Progress bar, ETA, and bytes processed.
*   **Results Summary**: Printed automatically upon completion.

## 💻 Library Usage

You can also use `backtest_rs` as a library by implementing your own strategies.

### 1. Define Your Strategy

Implement the `Strategy` trait. You get a `Context` to place orders (`buy`, `sell`, `close_all`) and access account state.

```rust
use backtest_rs::{Context, Strategy, Tick};

struct MyStrategy;

impl Strategy for MyStrategy {
    // Called by the engine once for every tick in the dataset.
    fn on_tick(&mut self, tick: &Tick, ctx: &mut Context) {
        // Example: open a long when the bid exceeds 2000.0 and no position is open.
        if tick.bid > 2000.0 && ctx.active_positions.is_empty() {
            ctx.buy(1.0, Some(1995.0), Some(2010.0)); // volume 1.0, SL 1995.0, TP 2010.0
        }
    }
}
```

### 2. Initialize and Run the Engine

```rust
use backtest_rs::{Analysis, Engine};

fn main() -> anyhow::Result<()> {
    // `MyStrategy` is the type defined in step 1.
    let strategy = MyStrategy;
    let mut engine = Engine::new(strategy, 100_000.0); // starting balance: $100k

    // Run the backtest over the tick file; returns an `Analysis` struct.
    let analysis: Analysis = engine.run("path/to/tick_data.csv")?;

    println!("Total Return: {:.2}%", analysis.total_return_pct);
    analysis.save_json("results.json")?; // export the results to JSON

    Ok(())
}
```
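
For readers unfamiliar with the metrics reported in the analysis, the following sketch shows common textbook definitions of max drawdown, win rate, and an unannualized Sharpe ratio. These helper functions are hypothetical and are not taken from `src/analysis.rs`; the crate may use different conventions (for example annualization, a non-zero risk-free rate, or population rather than sample variance).

```rust
/// Maximum peak-to-trough decline of an equity curve, as a fraction of the peak.
fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &e in equity {
        peak = peak.max(e);
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}

/// Fraction of trades with strictly positive PnL; 0.0 for empty input.
fn win_rate(pnls: &[f64]) -> f64 {
    if pnls.is_empty() {
        return 0.0;
    }
    pnls.iter().filter(|&&p| p > 0.0).count() as f64 / pnls.len() as f64
}

/// Unannualized Sharpe ratio: mean return divided by the sample standard
/// deviation, with the risk-free rate assumed to be zero.
/// Returns None when fewer than two returns are given or the variance is zero.
fn sharpe(returns: &[f64]) -> Option<f64> {
    let n = returns.len();
    if n < 2 {
        return None;
    }
    let mean = returns.iter().sum::<f64>() / n as f64;
    let var = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / (n as f64 - 1.0);
    if var == 0.0 {
        None
    } else {
        Some(mean / var.sqrt())
    }
}
```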

See `examples/main.rs` for a complete example that implements a simple RSI-based strategy. Run it with:
```bash
cargo run --release --example main -- "path/to/tick_data.csv"
```

## 📂 Project Structure

*   `src/lib.rs`: Library entry point exporting modules.
*   `src/main.rs`: CLI entry point handling the `validate` and `backtest` subcommands.
*   `src/strategy.rs`: `Strategy` trait and built-in examples (SMA).
*   `src/engine.rs`: Core event loop, order matching, PnL calculation, and terminal progress bar.
*   `src/types.rs`: Data structures (`Tick`, `Position`, `Trade`, etc.).
*   `src/math.rs`: Helper functions for indicators (SMA, EMA).
*   `src/validation.rs`: Data quality analysis, missing days checking, and gap detection algorithms.
*   `src/analysis.rs`: Performance metrics and JSON export logic.

## ⚡ Performance — tickparser Integration

The engine now uses [tickparser](crates/tickparser/) as its **default parsing backend**, replacing the previous `csv` + `fast_float` + `chrono` pipeline.

| Component | Before | After (tickparser) |
|---|---|---|
| CSV parsing | `csv::ByteRecord` (UTF-8 validation, heap alloc/row) | Zero-copy byte scanning via `mmap` |
| Timestamp | `chrono::NaiveDateTime::parse_from_str` (tries 6 formats) | `parse_timestamp_bytes` — direct byte arithmetic |
| Float | `fast_float::parse` (requires `&[u8]` → `&str`) | `parse_float_bytes` — integer accumulation, no string conversion |
| I/O | Buffered `File::read` | Memory-mapped file (`memmap2`) |
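
The "integer accumulation" approach in the Float row can be sketched as follows. This is an illustrative stand-in, not the actual `parse_float_bytes` from the parser crate: the function name `parse_decimal_bytes` is hypothetical, and the sketch assumes plain decimal input with an optional leading minus sign and no exponent.

```rust
/// Parses an ASCII decimal such as b"2001.35" or b"-0.5" directly from bytes,
/// without creating a &str. Hypothetical helper for illustration only.
/// Returns None for empty input, stray characters, or mantissa overflow.
fn parse_decimal_bytes(bytes: &[u8]) -> Option<f64> {
    let (negative, digits) = match bytes.first() {
        Some(&b'-') => (true, &bytes[1..]),
        _ => (false, bytes),
    };

    let mut mantissa: u64 = 0; // all digits, with the decimal point removed
    let mut frac_digits: i32 = 0; // number of digits after the point
    let mut seen_dot = false;
    let mut seen_digit = false;

    for &b in digits {
        match b {
            b'0'..=b'9' => {
                mantissa = mantissa.checked_mul(10)?.checked_add(u64::from(b - b'0'))?;
                seen_digit = true;
                if seen_dot {
                    frac_digits += 1;
                }
            }
            b'.' if !seen_dot => seen_dot = true,
            _ => return None,
        }
    }
    if !seen_digit {
        return None;
    }

    // One division by a power of ten converts the scaled integer to f64.
    let value = mantissa as f64 / 10f64.powi(frac_digits);
    Some(if negative { -value } else { value })
}
```

For typical price strings (well under 16 significant digits) both the mantissa and the power of ten are exactly representable as `f64`, so the single division yields the correctly rounded value.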

### Running Benchmarks

```bash
# Benchmark all parser modes (V1 naive, V2 mmap, V3 parallel, V3 stream)
cargo run --release --bin bench -p tickparse
```

The benchmark generates synthetic tick data (1M–100M rows) and compares all parser versions. V3 parallel is the **default**; it combines rayon and mmap for maximum throughput.

### What's Different in the Engine

*   **No `csv` crate**: The engine uses `tickparse::mmap_parser::MmapTickIterator` directly on raw bytes.
*   **No format guessing**: Timestamps are always `YYYY.MM.DD HH:MM:SS.mmm` — parsed via byte offsets in ~10 ns/row.
*   **No per-row allocation**: The iterator yields `Tick` structs from mmap'd memory without heap allocation.
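
Because the timestamp layout is fixed, each field sits at a known byte offset. The sketch below shows one way such a parser could look; it is not the crate's `parse_timestamp_bytes`. The names `digits`, `days_from_civil`, and `parse_timestamp_ms` are hypothetical, the output unit (milliseconds since the Unix epoch) and the UTC interpretation are assumptions, and the date conversion uses Howard Hinnant's well-known days-from-civil algorithm.

```rust
/// Reads `len` ASCII digits starting at `start`; None if any byte is not a digit.
fn digits(b: &[u8], start: usize, len: usize) -> Option<i64> {
    b.get(start..start + len)?.iter().try_fold(0i64, |acc, &c| {
        c.is_ascii_digit().then(|| acc * 10 + i64::from(c - b'0'))
    })
}

/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let doy = (153 * (if m > 2 { m - 3 } else { m + 9 }) + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parses `YYYY.MM.DD HH:MM:SS.mmm` into milliseconds since the Unix epoch.
/// Assumption: timestamps are UTC. Returns None on any layout mismatch.
fn parse_timestamp_ms(b: &[u8]) -> Option<i64> {
    // Check the length and the separators at their fixed positions.
    if b.len() != 23
        || b[4] != b'.' || b[7] != b'.' || b[10] != b' '
        || b[13] != b':' || b[16] != b':' || b[19] != b'.'
    {
        return None;
    }
    let (y, mo, d) = (digits(b, 0, 4)?, digits(b, 5, 2)?, digits(b, 8, 2)?);
    let (h, mi, s) = (digits(b, 11, 2)?, digits(b, 14, 2)?, digits(b, 17, 2)?);
    let ms = digits(b, 20, 3)?;
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || s > 59 {
        return None;
    }
    let days = days_from_civil(y, mo, d);
    Some(((days * 24 + h) * 60 + mi) * 60_000 + s * 1_000 + ms)
}
```

The sketch does only integer arithmetic on a borrowed byte slice, so it performs no heap allocation; it does not validate the day against the month length.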

## 📄 License

BSD 3-Clause License