ai-dispatch 8.13.0

Multi-AI CLI team orchestrator
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
# ai-dispatch — Multi-AI CLI Team Orchestrator

## Current Status (v8.3.0)

**Foundation (v1.x–v3.x):** Task dispatch, batch parallel, worktree isolation, retry chains, webhook notifications, prompt templates, TUI multipane with charts, modular architecture, agent benchmark, zombie cleanup, UTF-8 safety, task classifier with capability matrix.

**Agent Ecosystem (v4.x–v5.x):** Custom agent TOML format + registry, agent store with versioning, per-agent analytics, hook system, prompt compaction, agent memory (blackboard), task tree, workgroup summary, investigation lead, shared findings, broadcast, fast query, setup wizard, skill packages, merge safety.

**Intelligent Routing (v7.0–v7.5):**
- `--json` for show/board, trust tiers, strengths scoring, knowledge relevance filtering
- `--context-from` result forwarding, shared workspace, knowledge compaction
- Empty diff guard, foreground task timeout, `--cascade` model cascade
- Conditional batch chains (`on_success`/`on_fail`/`conditional` in TOML)
- Episodic memory with append-only versioning, success-weighted injection, multi-query search
- Budget-aware cost-efficiency routing, `--judge` flag for automatic AI review with auto-retry

**Collective Intelligence (v7.6–v7.7):**
- `--parent` for thread composition, completion summary generation, sibling context injection
- `--peer-review` scored critique, `--metric` for best-of quality measurement

**Autonomous Experiments + TUI (v7.8):**
- `aid experiment run` metric-driven dispatch loop, TUI tree view with workgroup grouping
- Rolling context compression, batch milestone SQL, throttled metrics

**Code Health + Performance (v7.9):**
- File-split refactoring (run_bestof.rs, selection_scoring.rs, prompt_context.rs extracted)
- Milestone tag stripping from agent output
- Binary size: 9.5MB → 3.1MB (strip + LTO + ureq→curl)
- SQLite indexes on hot query paths (tasks, events), opt-level="z"

**Ecosystem Maturity (v8.0–v8.2):**
- Pre-dispatch validation, structured diff in `aid show --json`, loop detection
- Per-model capability scoring (`capability: f64` on AgentModel), model-weighted auto-selection
- Custom IDs (`--id` on `aid run` and `aid group create`, `id`/`group_id` in batch TOML)
- Work scope verification (`--scope` for post-completion file boundary checking)
- Cursor CLI 3-tier model support (auto/composer-1.5/opus-4.6-thinking/gpt-5.4-high)

**Live Task Control (v8.3):**
- `aid stop` / `aid kill` — graceful SIGTERM+wait and forced SIGKILL task termination
- `aid steer` — inject guidance into running PTY tasks mid-flight
- `Stopped` task status with full TUI/board/webhook integration

State is stored under `~/.aid` by default, or `AID_HOME` when overridden.

## Roadmap

### v0.8 delivered

- add caller-injected workgroups with `aid group` and `aid run --group`
- wire workgroup filtering into `board`, `watch`, and batch task dispatch
- preserve shared context across retries, background runs, and artifact inspection
- add workgroup lifecycle commands while keeping historical task tags on old tasks

### v0.9 delivered

- TUI scoped by task ID and workgroup (`aid watch --tui t-xxxx`, `aid watch --tui --group wg-xxxx`)
- `aid explain <task-id>` — dispatch task logs to cheap AI for failure summarization
- task dependency DAGs in batch files (`depends_on` + topological sort + level-based parallel dispatch)
- clippy-clean codebase, 129 tests passing

### v1.0 delivered

- fix worktree branch reuse bug (stale branches checked out at old commit)
- fix zombie background tasks (dead processes shown as Running forever)
- fix UTF-8 boundary panic in codex/opencode adapters (multi-byte chars)
- shared CARGO_TARGET_DIR for worktree tasks (build cache reuse)
- CLI consolidation: 17 commands → 11 (`show`, `ask`, `watch --quiet`, `config agents`)

## Problem

When using a primary AI (Claude Code) as a dispatcher to coordinate multiple AI CLI tools (Gemini, Codex, OpenCode, Cursor Agent), the workflow suffers from:

1. **No output standardization** — each tool has different stdout formats, stderr mixing, and exit conventions
2. **No progress visibility** — background tasks run blind until completion
3. **Boilerplate errors** — forgetting `-o json` for gemini, missing no-op guards for codex, piping through `tail` that destroys logs
4. **No audit trail** — task inputs, outputs, timing, and token costs scatter across temp files
5. **Manual worktree management** — creating/cleaning git worktrees for parallel code tasks is tedious

## Solution

A single CLI binary that wraps all AI CLI tools behind a unified dispatch/watch/audit interface. The dispatcher (human or AI) uses `aid` commands instead of raw `gemini`/`codex`/`opencode` calls.

## Core Design Principles

- **Zero config** — auto-detects installed AI CLIs, works immediately
- **Artifact-based** — every task produces inspectable files, never ephemeral stdout
- **Git-native** — automatic worktree creation/cleanup for code tasks
- **Cost-aware** — tracks token usage per task and per agent, reports totals
- **Dispatcher-friendly** — output designed for both human reading and AI parsing

## Architecture

```
┌─────────────────────────────────────┐
│           aid (CLI binary)          │
├──────┬──────┬──────┬───────┬────────┤
│ run  │ watch│ audit│ board │ usage  │  ← subcommands
├──────┴──────┴──────┴───────┴────────┤
│           Task Manager              │
│  ┌────────┐ ┌────────┐ ┌────────┐  │
│  │ Agent  │ │ Watch  │ │ Store  │  │
│  │Registry│ │ Engine │ │(SQLite)│  │
│  └────┬───┘ └────┬───┘ └────┬───┘  │
│       │          │          │       │
├───────┴──────────┴──────────┴───────┤
│         Agent Adapters              │
│  ┌──────┐ ┌─────┐ ┌────────┐       │
│  │Gemini│ │Codex│ │OpenCode│ ...   │
│  └──────┘ └─────┘ └────────┘       │
└─────────────────────────────────────┘
```

## Subcommands

### `aid run` — Dispatch a task

```bash
# Research task (gemini)
aid run gemini "What is the DODO V2 PMM mechanism?" -o /tmp/dodo_research.md

# Code task with auto-worktree (codex)
aid run codex "Implement DODO calldata encoding" --worktree feat/dodo-calldata --dir ./

# Code task with free model (opencode)
aid run opencode "Add type annotations to src/lib.rs" --model mimo-v2-flash-free --dir ./

# Retry failed runs with exponential backoff
aid run codex "Fix the retry path" --dir ./ --verify auto --retry 2

# Reuse shared context from a workgroup
aid run codex "Implement the TUI filter row" --group wg-a3f1 --dir ./

# Background dispatch (returns task ID immediately)
aid run codex "Add tests for quote handler" --bg --worktree feat/quote-tests --dir ./
# => Task t-3a7f started in background
```

**What `aid run` does under the hood:**
1. Creates worktree (if `--worktree`)
2. Injects agent-specific best practices into the prompt:
   - Codex: no-op guard, commit message format
   - Gemini: nothing (prompt passthrough)
   - OpenCode: model selection
3. Launches the agent process, capturing full stdout+stderr to `~/.aid/logs/<task_id>.jsonl`
4. If `--bg` is set, persists a detached worker spec under `~/.aid/jobs/`
5. Records task metadata in SQLite

### `aid group` — Reuse shared context

```bash
aid group create dispatch --context "Shared repo rules, API constraints, and rollout notes"
aid group list
aid group show wg-a3f1
```

Each workgroup stores caller-injected context once. Tasks launched with `aid run --group <id>`
inherit that shared context before any per-task file context is injected. Batch tasks can also
set `group = "wg-a3f1"` in TOML, and both `aid board` and `aid watch` can filter to that group.

### `aid watch` — Live progress dashboard

```bash
aid watch            # Text mode for running tasks
aid watch t-3a7f     # Follow a specific task
aid watch --group wg-a3f1
aid watch --tui      # Interactive ratatui dashboard
aid watch --quiet    # Block until current running tasks finish
aid watch --quiet t-3a7f  # Block until one task finishes
```

```
┌ ai-dispatch board ──────────────────────────────┐
│                                                  │
│ [t-3a7f] codex: Add tests for quote handler      │
│   Status: RUNNING (2m 13s)                       │
│   Progress: 8 shell calls, 12 steps              │
│   Last: cargo test -p sr-service (5s ago)        │
│                                                  │
│ [t-b201] gemini: Research Balancer V3 hooks      │
│   Status: DONE (47s, 3,201 tokens)               │
│   Output: /tmp/balancer_research.md              │
│                                                  │
│ [t-c8e9] opencode: Type annotations              │
│   Status: DONE (1m 02s, FREE)                    │
│   Worktree: /tmp/wt-c8e9 (feat/type-annotations)│
│   Changes: 3 files, +42 -8                       │
│                                                  │
└──────────────────────────────────────────────────┘
```

### `aid show` — Inspect task artifacts

```bash
aid show t-3a7f             # Default: events + stderr + diff stat
aid show t-3a7f --diff      # Full worktree diff
aid show t-3a7f --output    # Print output file
aid show t-3a7f --log       # Print raw log file
aid show t-3a7f --explain   # Dispatch AI summary (creates child task)
```

Default output:
```
Task: t-3a7f — codex: Add tests for quote handler
Duration: 3m 47s
Tokens: 45,210 (codex/gpt-5.4)
Worktree: /tmp/wt-3a7f (feat/quote-tests)

Changes:
  M crates/sr-service/src/api_quote.rs  (+12 -3)
  A crates/sr-service/tests/quote_test.rs  (+87)

Tests: cargo test -p sr-service — 42 passed, 0 failed
Commit: a1b2c3d "feat: add quote handler tests"

Watcher Events:
  14:30:01  Started
  14:30:15  ... 3 shell calls, 5 steps
  14:31:02  BUILD: cargo check passed
  14:32:30  TEST: 42 passed; 0 failed
  14:33:48  COMMIT: [feat/quote-tests a1b2c3d] feat: add quote handler tests
  14:33:48  COMPLETED — 12 shell calls, 18 steps, 45210 tokens
```

### `aid board` — Summary of all tasks

```bash
aid board                    # All tasks
aid board --today            # Today's tasks
aid board --running          # Only running
aid board --group wg-a3f1    # Only tasks in one workgroup
aid board --mine             # Only tasks from the current caller session
```

```
Today: 5 tasks | 3 done | 1 running | 1 failed
Total tokens: 231,408 | Est. cost: $0.34

ID       Agent    Status  Duration  Tokens   Branch
t-3a7f   codex    DONE    3m 47s    45,210   feat/quote-tests
t-b201   gemini   DONE    47s       3,201    —
t-c8e9   opencode DONE    1m 02s    FREE     feat/type-annotations
t-d4f1   codex    RUN     2m 13s    ~28,000  feat/dodo-calldata
t-e5g2   codex    FAIL    1m 30s    22,105   fix/parse-error
```

### `aid usage` — Cost and budget visibility

```bash
aid usage
```

Shows:

- task-history usage by agent
- configured budget windows from `~/.aid/config.toml`
- external counters for tools such as Claude Code

### `aid config` — Manage agent registry

```bash
aid config agents            # List detected agents
aid config agents add foo    # Register custom agent
aid config prompts           # Show prompt templates
```

## Agent Adapters

Each agent adapter encapsulates the CLI-specific quirks:

### Gemini Adapter
```
Command: gemini -o json -p "{prompt}" 2>/dev/null
Extract: jq -r '.response'
Tokens:  jq '.stats.models[].tokens.total'
Quirks:  stderr "Loaded cached credentials" must be suppressed
```

### Codex Adapter
```
Command: codex exec --json "{prompt}" -C {dir}
Stream:
  - item.started / item.completed for tool calls and agent messages
  - turn.completed for usage totals
Extract:
  - command_execution.command / aggregated_output
  - usage.input_tokens / usage.cached_input_tokens / usage.output_tokens
Prompt injection:
  - Append: "\nIMPORTANT: If no changes are needed, do NOT commit. Print 'NO_CHANGES_NEEDED: <reason>'."
  - Append: "\nCommit with message: '{conventional_commit_msg}'"
Quirks: Must capture the full JSONL stream; usage only arrives at turn completion
```

### OpenCode Adapter
```
Command: opencode run -m "{model}" --dir {dir} "{prompt}"
Models: opencode/mimo-v2-flash-free (free), opencode-go/glm-5 (cheap)
Quirks: --dir must point to git repo for file writes
```

## Watcher Engine

The watcher runs as a background thread per task, monitoring the agent's log file:

```
Log file → JSON / text classifier → Event stream → Board writer
                                             → SQLite store
```

**Patterns detected:**
| Signal | Event | Agent |
|--------|-------|-------|
| `item.started` + `command_execution.command` | Tool, build, test, commit | codex |
| `item.completed` + `agent_message.text` | Reasoning | codex |
| `turn.completed.usage.*_tokens` | Completion usage | codex |
| `test result:` / `Finished` / `error[` | Build, test, error | codex, opencode |
| JSON `.stats.models[].tokens.total` | Completion usage | gemini |
| `NO_CHANGES_NEEDED` | No-op detected | codex |

## Storage

SQLite at `~/.aid/aid.db`:

```sql
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,          -- t-xxxx
    agent TEXT NOT NULL,           -- gemini, codex, opencode
    prompt TEXT NOT NULL,
    status TEXT DEFAULT 'pending', -- pending, running, done, failed
    parent_task_id TEXT,
    caller_kind TEXT,
    caller_session_id TEXT,
    worktree_path TEXT,
    worktree_branch TEXT,
    log_path TEXT,
    output_path TEXT,
    tokens INTEGER,
    duration_ms INTEGER,
    model TEXT,
    cost_usd REAL,
    created_at DATETIME,
    completed_at DATETIME
);

CREATE TABLE events (
    task_id TEXT REFERENCES tasks(id),
    timestamp DATETIME,
    event_type TEXT,  -- tool_call, reasoning, build, test, commit, completion, error
    detail TEXT,
    metadata TEXT
);
```

## Budget Config

`aid usage` reads `~/.aid/config.toml`:

```toml
[[usage.budget]]
name = "codex-dev"
agent = "codex"
window = "24h"
token_limit = 1000000
cost_limit_usd = 15.0

[[usage.budget]]
name = "claude-code"
plan = "max"
window = "5h"
request_limit = 200
external_requests = 120
```

## Prompt Templates

Built-in templates that get injected per agent:

```toml
[templates.codex.guard]
append = """
IMPORTANT: If no changes are needed, do NOT create an empty commit.
Instead, print 'NO_CHANGES_NEEDED: <reason>' and exit."""

[templates.codex.commit]
append = "Commit with message: '{msg}'"

[templates.codex.verify]
append = "After changes, run: {cmd}. Fix any errors before committing."
```

## Technology Choice

**Rust + tokio + clap** recommended because:
- Single static binary, no runtime deps
- tokio for async process management + file watching
- clap for CLI parsing (derive mode)
- rusqlite for embedded storage
- ratatui for optional TUI dashboard (`aid watch`)
- Same ecosystem as the user's main projects

**Alternative**: Go would also work well (goroutines, cobra CLI, single binary).

## File Structure

```
ai-dispatch/
├── Cargo.toml
├── DESIGN.md              ← this file
├── src/
│   ├── main.rs            ← clap CLI entry point
│   ├── agent/
│   │   ├── mod.rs         ← Agent trait
│   │   ├── gemini.rs      ← Gemini adapter
│   │   ├── codex.rs       ← Codex adapter
│   │   └── opencode.rs    ← OpenCode adapter
│   ├── watcher.rs         ← Log watcher engine
│   ├── store.rs           ← SQLite task/event store
│   ├── board.rs           ← Board rendering (text + TUI)
│   └── templates.rs       ← Prompt template engine
├── templates/
│   └── default.toml       ← Default prompt templates
└── tests/
    ├── gemini_adapter.rs
    ├── codex_adapter.rs
    └── watcher_test.rs
```

## MVP Scope (v0.1)

1. `aid run gemini/codex` — dispatch with correct flags + log capture
2. `aid watch` — text-mode progress board (no TUI yet)
3. `aid board` — list all tasks with status
4. `aid audit <id>` — show task details + git diff
5. Watcher: codex log patterns only
6. SQLite storage
7. Built-in codex prompt templates (no-op guard, commit format)

## Roadmap

### v8.0 — Programmable Orchestration (done)
- Pre-dispatch validation, structured diff in `aid show --json`, loop detection, scaled-up batch docs

### v8.1 — Model-Level Scoring (done)
- Per-model capability scoring (`capability: f64` on AgentModel), model-weighted selection, config display with cap scores, workgroup sort + creator

### v8.2 — Custom IDs, Scope & Cursor Tiers (done)
- Custom readable IDs: `--id` on `aid run` and `aid group create`, `id`/`group_id` fields in batch TOML
- Work scope verification: `--scope` flag and batch `scope` field for post-completion file boundary checking
- Cursor CLI 3-tier model support: auto (cheap), composer-1.5 (standard), opus-4.6-thinking/gpt-5.4-high (premium)
- Cursor headless mode: `--force` and `--stream-partial-output` flags, model metadata extraction, cost tracking

### v8.3 — Live Task Control (done)
- `aid stop <id>` / `aid kill <id>` — graceful (SIGTERM + 5s wait + SIGKILL) and forced task termination
- `aid steer <id> "message"` — inject guidance into running PTY tasks via file-based signal
- `Stopped` status with full integration: TUI, board, webhook, batch validation, summary

### v8.4 — Platform Evolution (next)
- `aid store publish` — publish local agent/skill packages to community store
- Daemon mode — `aid daemon` as persistent service via Unix socket
- Agent protocol v2 — unified structured event protocol for all agents