# flocks
Parallel AI coding agent orchestrator. Dispatches `claude -p` tasks to flocks of agents in isolated kap devcontainers with load balancing across subscriptions.
## Commands
```
cargo check # fast compile check
cargo test # run all tests (64 currently)
cargo run -- --help # CLI help
cargo run -- init # scaffold flocks.toml
cargo run -- init --project /path # scaffold with specific kap project
cargo run --release -- run --tasks tasks.jsonl # dispatch tasks
cargo run --release -- run --tasks tasks.jsonl --supervise # dispatch + auto-retry loop
cargo run -- status # show agent status (grouped, with elapsed times + stall warnings)
cargo run -- logs <task-id> # show agent output log
cargo run -- logs <task-id> -f # follow log (tail -f)
cargo run -- logs <task-id> -n 50 # last 50 lines
cargo run -- retry --all-failed # retry failed tasks with prior context
cargo run -- retry --pick fix-123 # retry specific task
```
## Architecture
Single Rust binary (tokio, clap). kap is a library dependency at `../kap`. It is a path dependency, so both repos must be siblings under `~/oss/`.
```
src/main.rs — CLI: up, run, status, stop, land, down
src/config.rs — flocks.toml parsing (containers, sources, validation, workspace)
src/task.rs — JSONL task loading + script-based discovery
src/dispatch.rs — dispatch loop, container load balancer, worktree lifecycle, kap exec, state persistence
```
### Data flow
```
flocks.toml → config → tasks (JSONL/script) → dispatch loop
→ create host worktree (.worktrees/<name>)
→ kap::container::exec(bash -lc "cd /workspace/.worktrees/<name> && claude -p '<prompt>' --dangerously-skip-permissions")
→ persist state to .flocks/state.json
→ clean up worktree on completion
```
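The exec step above can be sketched as a small string-building helper. This is a minimal illustration, not the code in `src/dispatch.rs`: the function names `shell_quote`, `container_worktree_path`, and `build_agent_script` are hypothetical, and single-quote escaping of the prompt is an assumption about how the prompt is made safe for `bash -lc`.

```rust
/// Quote a string for a POSIX shell by wrapping it in single quotes.
/// An embedded single quote becomes '\'' (close quote, escaped quote, reopen).
fn shell_quote(s: &str) -> String {
    format!("'{}'", s.replace('\'', r"'\''"))
}

/// Map a worktree name to its path as seen inside the container.
/// `workspace_mount` corresponds to the `workspace_mount` config key.
fn container_worktree_path(workspace_mount: &str, name: &str) -> String {
    format!("{}/.worktrees/{}", workspace_mount.trim_end_matches('/'), name)
}

/// Build the script string handed to `bash -lc` (hypothetical helper).
fn build_agent_script(
    workspace_mount: &str,
    name: &str,
    prompt: &str,
    max_turns: Option<u32>,
) -> String {
    let dir = container_worktree_path(workspace_mount, name);
    let mut script = format!(
        "cd {} && claude -p {} --dangerously-skip-permissions",
        shell_quote(&dir),
        shell_quote(prompt)
    );
    // `max_turns` is optional; omitting it leaves the turn count unlimited.
    if let Some(n) = max_turns {
        script.push_str(&format!(" --max-turns {n}"));
    }
    script
}
```

Quoting the prompt matters because task prompts are free text: an unescaped apostrophe would otherwise terminate the shell string early.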
### Key design decisions
- **Worktrees on host**, inside the project dir at `.worktrees/` so Docker mount makes them visible at `/workspace/.worktrees/` in the container
- **Login shell** (`bash -lc`) for kap exec so PATH includes mise shims and `~/.local/bin/claude`
- **Host vs container paths**: worktrees are created with host-side git, but `claude -p` runs inside the container. Mount point configurable via `workspace_mount` (default `/workspace`)
- **kap as lib dep**: `kap::container::up/down/exec` called directly, not shell-outs. kap's `container.rs` returns `Result` (no `process::exit`)
- **Stale cleanup**: `create_worktree()` prunes stale worktrees and deletes old branches before creating new ones
- **Auto .gitignore**: `create_worktree()` best-effort appends `.worktrees/` to the project's `.gitignore`
- **`claude -p` max_turns**: configurable via `max_turns` in flocks.toml (maps to `--max-turns`). Omit for unlimited.
- **Agent timeout**: `agent_timeout_secs` (default 1800 = 30 min, 0 = disabled). Uses `tokio::time::timeout` around the agent future.
- **Supervise mode**: `--supervise` runs a mechanical retry loop (dispatch → wait → retry failures with context → repeat), similar to Ralph's bash loop but built in. Failed agents get the output of their prior attempt injected into the retry prompt.
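The supervise loop described in the last bullet can be sketched roughly as follows. This is a synchronous, std-only illustration under stated assumptions: the `Outcome` enum, `retry_prompt`, and `supervise` are hypothetical names, the wording of the retry prompt is invented, and the real loop runs agents concurrently on tokio rather than one by one.

```rust
use std::collections::HashMap;

/// Result of one agent attempt (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Success,
    Failed { output: String },
}

/// Append the prior attempt's output to the original prompt (assumed format).
fn retry_prompt(original: &str, prior_output: &str) -> String {
    format!("{original}\n\nPrevious attempt failed. Its output was:\n{prior_output}")
}

/// Run every `(id, prompt)` task once, then re-run failures for up to
/// `max_retries` further rounds. Returns the ids that are still failing.
fn supervise<F>(tasks: &[(String, String)], max_retries: u32, mut run_agent: F) -> Vec<String>
where
    F: FnMut(&str, &str) -> Outcome,
{
    // Original prompts by task id, so each retry starts from the original text.
    let originals: HashMap<&str, &str> =
        tasks.iter().map(|(id, p)| (id.as_str(), p.as_str())).collect();
    let mut pending: Vec<(String, String)> = tasks.to_vec();
    let mut round = 0;
    loop {
        // Dispatch the current round and collect failures with their output.
        let mut failed = Vec::new();
        for (id, prompt) in &pending {
            if let Outcome::Failed { output } = run_agent(id, prompt) {
                failed.push((id.clone(), output));
            }
        }
        if failed.is_empty() || round >= max_retries {
            return failed.into_iter().map(|(id, _)| id).collect();
        }
        round += 1;
        // Only failed tasks go into the next round, with prior output injected.
        pending = failed
            .into_iter()
            .map(|(id, output)| {
                let prompt = retry_prompt(originals[id.as_str()], &output);
                (id, prompt)
            })
            .collect();
    }
}
```

With `max_retries = 3`, a task that always fails is attempted four times in total: the initial dispatch plus three retry rounds.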
## Config (flocks.toml)
```toml
max_agents = 5 # global concurrency cap
stagger_delay_secs = 2 # delay between agent launches
# max_turns = 50 # optional: claude -p --max-turns N
# agent_timeout_secs = 1800 # default 30 min; 0 = disabled
# max_retries = 3 # retry rounds in --supervise mode
# Credentials: API subscriptions. Agents round-robin across these.
# "default" env = use the container's existing token (no injection).
[[credentials]]
name = "sub-1"
env = "CLAUDE_TOKEN_1" # env var name to read token from
# Containers: sandboxed kap devcontainers that agents run in.
[[containers]]
name = "local"
max_worktrees = 3 # agents per container
kap_project = "/Users/peter/oss/nitrocop"
# workspace_mount = "/workspace" # default; container mount path
[source]
type = "jsonl" # or "script"
path = "tasks.jsonl"
# discover = "python3 generate_tasks.py" # for type = "script"
[workspace]
branch_prefix = "flocks/"
land_target = "main"
```
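To show how `max_agents`, `max_worktrees`, and `[[credentials]]` might interact at dispatch time, here is a minimal scheduling sketch. The `Container` struct, `pick_container`, and `pick_credential` are hypothetical, and the least-loaded selection policy is an assumption; the config only states that there is a global cap, a per-container cap, and round-robin across credentials.

```rust
/// Hypothetical runtime view of one `[[containers]]` entry.
struct Container {
    name: String,
    max_worktrees: usize, // per-container cap from flocks.toml
    active: usize,        // agents currently running in this container
}

/// Pick the index of the least-loaded container that still has a free slot.
/// Returns None when the global `max_agents` cap is reached or every
/// container is full. Ties go to the container listed first.
fn pick_container(containers: &[Container], max_agents: usize) -> Option<usize> {
    let total: usize = containers.iter().map(|c| c.active).sum();
    if total >= max_agents {
        return None;
    }
    containers
        .iter()
        .enumerate()
        .filter(|(_, c)| c.active < c.max_worktrees)
        .min_by_key(|(_, c)| c.active)
        .map(|(i, _)| i)
}

/// Round-robin over credential names by launch order.
fn pick_credential(credentials: &[String], launch_index: usize) -> Option<&str> {
    if credentials.is_empty() {
        None
    } else {
        Some(credentials[launch_index % credentials.len()].as_str())
    }
}
```

A dispatcher built this way would call `pick_container` before each launch and wait (for example, for `stagger_delay_secs` or for an agent to finish) whenever it returns `None`.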
## Testing
Unit tests: `cargo test` (tests span config, task, and dispatch).
End-to-end: requires a running kap container. See the "End-to-end testing" section of `PLAN.md` for step-by-step instructions. Summary:
1. `kap list` to verify container is running
2. Create tasks.jsonl + flocks.toml
3. `cargo run --release -- run --tasks tasks.jsonl`
4. `cargo run -- status` to check results
5. Clean up: `git worktree prune`, delete branches, `rm -rf .worktrees .flocks`
## What's implemented vs TODO
See `PLAN.md` for the detailed backlog. Key gaps:
- No TUI dashboard (ratatui live view)