driller 0.10.2

A clean HTTP load-test drill. Ansible-style YAML plans, Rust runtime, RPS and percentiles per run -- no fancy bits.

# CLI reference

## `driller run` -- execute a benchmark or ad-hoc request

```
Usage: driller run [OPTIONS] [URL]
```

### Arguments

| Argument | Description |
|---|---|
| `[URL]` | Target URL for ad-hoc testing (creates a synthetic GET request) |

### Run-specific options

| Flag | Short | Description |
|---|---|---|
| `--base-url <URL>` | `-u` | Override the base URL from the benchmark file |
| `--concurrency <N>` | `-p` | Number of concurrent requests (default: 1) |
| `--iterations <N>` | `-i` | Number of iterations (default: 1) |
| `--duration <DURATION>` | `-d` | Run for a fixed wall-clock duration (e.g. `30s`, `5m`, `1h`) |
| `--rampup <N>` | `-e` | Ramp-up time in seconds (default: 0) |

`--duration` and `--iterations` are mutually exclusive. When neither is given,
the default is 1 iteration.

When both a URL and `--benchmark` are provided, the benchmark file supplies the
plan and the URL sets the base URL.

### Examples

```bash
# Quick smoke test
driller run http://localhost:3000/health

# 10 concurrent users, 100 total iterations
driller run http://localhost:3000/api -p 10 -i 100 --stats

# Run a benchmark file for 60 seconds with overridden concurrency
driller run --benchmark bench.yml --duration 60s --concurrency 20 --stats

# Override the base URL to point at staging
driller run --benchmark bench.yml --base-url http://staging:3000 --stats
```

## Global options

These flags work with both `driller run` and the legacy `driller --benchmark`
form:

| Flag | Short | Description |
|---|---|---|
| `--benchmark <FILE>` | `-b` | Sets the benchmark file |
| `--stats` | `-s` | Shows request statistics |
| `--report <FILE>` | `-r` | Sets a report output file |
| `--compare <FILE>` | `-c` | Sets a comparison baseline file |
| `--threshold <MS>` | `-t` | Threshold in ms for comparison |
| `--relaxed-interpolations` | | Do not panic on missing interpolations |
| `--no-check-certificate` | | Disables SSL certificate verification |
| `--tags <TAGS>` | | Comma-separated tags to include |
| `--skip-tags <TAGS>` | | Comma-separated tags to exclude |
| `--list-tags` | | List all benchmark tags |
| `--list-tasks` | | List benchmark tasks (applies tag filters) |
| `--quiet` | `-q` | Disables output |
| `--timeout <SECONDS>` | `-o` | Request timeout in seconds (default: 10) |
| `--nanosec` | `-n` | Shows statistics in nanoseconds |
| `--verbose` | `-v` | Toggle verbose output |

## Configuration precedence

When using `driller run` with a benchmark file, values are resolved in three
layers (last wins):

1. **Defaults** -- concurrency=1, iterations=1, rampup=0
2. **Benchmark YAML** -- values from the file override defaults
3. **CLI flags** -- `--concurrency`, `--iterations`, etc. override the file
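
The last-wins rule can be sketched as a small shell function. This only illustrates the layering and is not driller's actual resolution code; `resolve` is a hypothetical helper, and an empty string stands for "this layer did not set the value".

```bash
# Hypothetical helper (not part of driller): resolve DEFAULT FILE_VALUE CLI_VALUE
# An empty string means "this layer did not set the value".
resolve() {
  local value="$1"                      # layer 1: built-in default
  if [ -n "$2" ]; then value="$2"; fi   # layer 2: benchmark YAML
  if [ -n "$3" ]; then value="$3"; fi   # layer 3: CLI flag
  printf '%s\n' "$value"
}

resolve 1 "" ""    # prints 1  (default only)
resolve 1 4  ""    # prints 4  (file overrides default)
resolve 1 4  20    # prints 20 (flag overrides file)
```

The third call corresponds to a benchmark file that sets a concurrency of 4 being run with `--concurrency 20`.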

## Tokio runtime selection

`--worker-threads <N>` (short form `-w <N>`) selects which Tokio runtime drives
the benchmark:

| `--worker-threads` | Runtime built                                |
|--------------------|----------------------------------------------|
| omitted or `1`     | current-thread runtime (single OS thread)    |
| `N >= 2`           | multi-thread runtime with `N` worker threads |
| `0` or invalid     | rejected at CLI parse time                   |
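
Read as a decision rule, the table looks roughly like the function below. It is a sketch for wrapper scripts, not driller's own argument parser; `runtime_for` is a hypothetical name, and "invalid" is assumed here to mean anything that is not a non-negative integer.

```bash
# Hypothetical helper (not part of driller): maps a --worker-threads value
# to the runtime described in the table above.
runtime_for() {
  local n="${1-1}"                      # omitted argument behaves like 1
  case "$n" in
    ''|*[!0-9]*) echo "rejected"; return 1 ;;   # assumed meaning of "invalid"
  esac
  if [ "$n" -eq 0 ]; then
    echo "rejected"; return 1
  elif [ "$n" -eq 1 ]; then
    echo "current-thread"
  else
    echo "multi-thread ($n workers)"
  fi
}

runtime_for              # prints: current-thread
runtime_for 4            # prints: multi-thread (4 workers)
runtime_for 0 || true    # prints: rejected
```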

The default (current-thread) has the lowest per-request overhead and the most
predictable performance across workload sizes. The multi-thread runtime helps
on some workloads and hurts on others: in the measurements below, its delta
against current-thread ranged from about +65 % (small responses) to about
-9 % (medium responses). No single `N` dominates across payload sizes; tune to
your workload and measure.

Rough guidance from internal measurements on a 28-core macOS host hitting a
local target server with `-p 256` over a 10 s window:

| Response body size          | Best `N`                 | Multi-thread Δ vs current-thread (at best `N`)          |
|-----------------------------|--------------------------|---------------------------------------------------------|
| Small (~3 B - 8 KB)         | 2                        | +55 % to +65 %                                          |
| Medium (~64 KB)             | 1 (or 2)                 | -2 % to -9 % (multi-thread does not beat current-thread) |
| Large (~512 KB - 2 MB)      | 16 - 28                  | +17 % to +39 %                                          |
| Very large (~10 MB)         | any (insensitive to `N`) | -4 % to -6 %                                            |
| Latency-bound (slow server) | 2                        | +5 % to +11 %                                           |

These numbers are illustrative, not portable -- the exact crossover points
depend on host CPU topology, network stack, and target behavior. Measure on
your own setup if `--worker-threads` matters to you.
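
One way to measure is to sweep several `--worker-threads` values while holding everything else fixed. The sketch below only prints the commands so they can be reviewed first. `TARGET`, `sweep_commands`, and the list of worker counts are assumptions for illustration; the flags are the ones documented in this reference, and `-p 256 -d 10s` mirrors the measurement setup described above.

```bash
# Assumed target; replace with your own endpoint.
TARGET="${TARGET:-http://localhost:3000/api}"

# Hypothetical helper: print one driller invocation per worker-thread count.
sweep_commands() {
  local n
  for n in "$@"; do
    printf 'driller run %s -p 256 -d 10s -w %s --stats\n' "$TARGET" "$n"
  done
}

# Worker counts to compare; 1 is the current-thread baseline.
sweep_commands 1 2 4 8 16
```

To run the sweep rather than print it, pipe the output to `sh` and compare the `--stats` output for each `N`.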

### Examples (continued)

```bash
# Default current-thread runtime (matches behavior before 0.10.2)
driller run http://localhost:3000/api -p 64 -i 1000 --stats

# Try multi-thread with 4 workers for a large-body workload
driller run http://localhost:3000/large-payload -p 256 -d 30s -w 4 --stats
```

## Legacy invocation

The original `driller --benchmark <FILE>` form continues to work. It is
equivalent to `driller run --benchmark <FILE>`.