autorize 0.2.1

Iterative-improvement harness: runs an agent CLI in sandboxed git worktrees against a scoring command, keeping improvements until a deadline fires.

autorize

autorize is a generic iterative-improvement harness. You point it at a project, a scoring command, and an agent CLI, and it runs the agent in sandboxed git worktrees against the score — keeping improvements, discarding regressions — until a deadline fires.

It generalizes the autoresearch pattern into a small Rust CLI you can point at any repo.

How it works

For each iteration, autorize:

  1. Creates a fresh git worktree off the autorize/<name> tracking branch.
  2. Builds a prompt from your program.md, the boundary rules, the last 10 iteration records, and the diff of the best iteration so far.
  3. Spawns your agent (any CLI — Claude Code, a shell script, anything) inside the worktree with a hard wall-clock budget. On timeout the whole process group gets SIGTERM, then SIGKILL after 5 s.
  4. Stages the agent's changes and rejects the iteration if its diff touches a deny_paths glob.
  5. Runs your scoring command, parses the score from its stdout (raw float, regex capture, or a JSON path, selected by objective.parse), and compares it against the best score seen so far.
  6. Better? Commits onto autorize/<name> and advances the tracking branch. Worse / no-op / denied / invalid? Discards the worktree.
  7. Appends an IterationRecord to iterations.jsonl and rewrites state.json atomically so you can Ctrl-C (or crash) at any point and autorize resume picks up cleanly.
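
Steps 4 to 6 amount to a small decision function. The Python sketch below illustrates that logic only; it is not autorize's actual (Rust) code. The function name, the outcome labels, the use of fnmatch for glob matching, and the choice to treat the first valid score as an improvement are all assumptions.

```python
from fnmatch import fnmatch
from typing import Optional, Sequence


def classify_iteration(
    changed_paths: Sequence[str],
    deny_paths: Sequence[str],
    score: Optional[float],
    best: Optional[float],
    direction: str,
) -> str:
    """Return a hypothetical outcome label for one iteration."""
    if not changed_paths:
        return "noop"  # the agent changed nothing
    # fnmatch's "*" also matches "/", so ".autorize/**" covers nested paths
    # here; a real glob implementation may behave differently.
    if any(fnmatch(p, pat) for p in changed_paths for pat in deny_paths):
        return "denied"
    if score is None:
        return "invalid"  # scoring failed or its output did not parse
    if best is None:
        return "improved"  # assumption: the first valid score is kept
    better = score < best if direction == "min" else score > best
    return "improved" if better else "worse"
```

Only an "improved" result would be committed onto the tracking branch; every other label corresponds to discarding the worktree.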

The loop exits when the total deadline fires, max_iterations is hit, or max_consecutive_noops is reached.
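
The exit conditions can be read as one check made before each iteration. This is a minimal Python sketch under stated assumptions: stop_reason and its return strings are hypothetical names, and the README does not say how max_consecutive_noops = 0 is treated, so the sketch does not special-case it.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stop_reason(
    now: datetime,
    deadline: datetime,
    iterations_done: int,
    max_iterations: int,
    consecutive_noops: int,
    max_consecutive_noops: int,
) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if now >= deadline:
        return "deadline"
    # max_iterations = 0 means unbounded, as in the config comments.
    if max_iterations > 0 and iterations_done >= max_iterations:
        return "max_iterations"
    if consecutive_noops >= max_consecutive_noops:
        return "max_consecutive_noops"
    return None
```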

Install

cargo install --path .

Linux only for v1.

Quickstart

# 1. Scaffold an experiment under .autorize/<name>/
autorize init myexp

# 2. Edit .autorize/myexp/config.toml and .autorize/myexp/program.md
#    - point `objective.command` at your scoring script
#    - point `agent.command` at your agent CLI
#    - set a deadline (`total_budget = "4h"` or `deadline = "..."`)

# 3. Commit your repo (autorize refuses dirty trees by default), then run:
autorize run myexp

# 4. Check progress from another shell:
autorize status myexp

# 5. If the loop dies, restart it:
autorize resume myexp

Subcommands

Command                  What it does
autorize init <name>     Scaffold .autorize/<name>/{config.toml,program.md}.
autorize run <name>      Run the loop until deadline / cap / noop streak.
autorize status <name>   One-shot summary from state.json + iterations.jsonl.
autorize resume <name>   Recover after a crash; any in-progress iter is recorded as killed and the loop continues.
autorize llms            Print an exhaustive agent-targeted markdown reference (config schema, on-disk layout, IterationRecord, state machine).

autorize run accepts --allow-dirty if you need to start with uncommitted changes outside .autorize/.

Config (.autorize/<name>/config.toml)

[experiment]
name = "myexp"
description = "..."

[objective]
command   = "bash score.sh"        # prints the score to stdout
direction = "min"                  # "min" | "max"
parse     = { kind = "float" }     # or { kind = "regex", pattern = "score=([0-9.]+)" }
                                   # or { kind = "jq",    path = ".metrics.loss" }
timeout   = "60s"
fail_mode = "invalid"              # "invalid" | "worst" | "abort"

[boundaries]
allow_paths = ["src/**/*.py"]      # prompt-only in v1
deny_paths  = [".autorize/**"]     # ENFORCED via diff

[setup]
command = ""
timeout = "5m"

[teardown]
command = ""
timeout = "1m"

[iteration]
budget                = "5m"
max_iterations        = 0          # 0 = unbounded
keep_worktrees        = false
max_consecutive_noops = 5

[schedule]
total_budget = "4h"                # OR (exactly one):
# deadline   = "2026-05-21T09:00:00-07:00"

[agent]
command     = "claude --print {prompt_file}"   # {prompt_file}, {workdir}, {iter}
workdir_var = "AUTORIZE_WORKDIR"
stdin       = "none"                            # "none" | "prompt"

[agent.env]
ANTHROPIC_API_KEY = "$ANTHROPIC_API_KEY"
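
To make the [agent] and [iteration] settings concrete, here is a hedged Python sketch of expanding the command template and enforcing a wall-clock budget with SIGTERM followed by SIGKILL on the process group. It illustrates the technique on POSIX systems and is not autorize's implementation; expand_command and run_with_budget are hypothetical names, and whether autorize splits the command like a shell is an assumption.

```python
import os
import shlex
import signal
import subprocess
import sys


def expand_command(template, prompt_file, workdir, iteration):
    """Substitute {prompt_file}, {workdir}, {iter} and split into argv."""
    values = {
        "{prompt_file}": prompt_file,
        "{workdir}": workdir,
        "{iter}": str(iteration),
    }
    for placeholder, value in values.items():
        # Quote each value so paths with spaces survive shlex.split.
        template = template.replace(placeholder, shlex.quote(value))
    return shlex.split(template)


def run_with_budget(argv, cwd, budget_s, grace_s=5.0):
    """Run argv in cwd; return (exit_code, timed_out)."""
    # start_new_session=True puts the child in its own process group, so
    # killpg also reaches any processes the agent itself spawned.
    proc = subprocess.Popen(argv, cwd=cwd, start_new_session=True)
    try:
        return proc.wait(timeout=budget_s), False
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGTERM)
        try:
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            os.killpg(proc.pid, signal.SIGKILL)
            proc.wait()
        return proc.returncode, True
```

With budget = "5m", the call would look like run_with_budget(argv, worktree_path, 300), leaving the default 5 s grace period before SIGKILL.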

program.md lives next to config.toml and is freeform instructions for the agent — included verbatim at the top of every prompt.
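
The three objective.parse kinds from the config above can be pictured with a short Python sketch. It is an approximation under stated assumptions: parse_score is a hypothetical helper, the "jq" branch handles only simple dotted paths such as ".metrics.loss", and returning None stands in for whatever fail_mode does with an unparseable score.

```python
import json
import re
from typing import Optional


def parse_score(stdout: str, kind: str, pattern: str = "", path: str = "") -> Optional[float]:
    """Return the score, or None when the output cannot be parsed."""
    if kind not in ("float", "regex", "jq"):
        raise ValueError(f"unknown parse kind: {kind!r}")
    try:
        if kind == "float":
            return float(stdout.strip())
        if kind == "regex":
            # The first capture group holds the score.
            match = re.search(pattern, stdout)
            return float(match.group(1)) if match else None
        # kind == "jq": walk a simple dotted path through the JSON document.
        value = json.loads(stdout)
        for key in path.strip(".").split("."):
            value = value[key]
        return float(value)
    except (ValueError, KeyError, TypeError, IndexError):
        return None
```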

On-disk layout

<repo>/
  .autorize/<name>/
    config.toml
    program.md
    state.json             # atomic checkpoint of loop state
    iterations.jsonl       # durable append-only log
    iter-0001/
      prompt.md            # what the agent saw
      changes.diff         # captured diff
      agent.stdout
      agent.stderr
    iter-0002/
    ...
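
The "atomic checkpoint" and "append-only log" comments describe a common pattern: write to a temporary file and rename it over state.json, and append one JSON line per record to iterations.jsonl. The Python sketch below shows that pattern under assumptions; the helper names and the "iter" field are hypothetical, and the real IterationRecord schema is documented by autorize llms, not here.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state_atomically(path: Path, state: dict) -> None:
    """Write to a temp file in the same directory, fsync, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace is atomic on POSIX when both paths share a filesystem,
        # so readers see either the old file or the new one, never a mix.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def append_record(path: Path, record: dict) -> None:
    """Append one JSON object per line and flush it to disk."""
    with open(path, "a") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())
```

A crash between the two calls leaves the log one record ahead of the checkpoint, which is the kind of gap a resume step would have to reconcile.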

The tracking branch autorize/<name> records each accepted iteration as its own commit, so git log autorize/<name> is your improvement history and git diff main..autorize/<name> is the cumulative change (assuming main is the branch you started from).

Example

See examples/pi-digits/ for an end-to-end demo where a mock agent nudges a number in value.txt toward π:

cp -r examples/pi-digits/. /tmp/pi-demo
cd /tmp/pi-demo
git init -b main
git add .
git -c user.email=a@b -c user.name=a commit -m init
autorize run pi
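
For a sense of what a scoring command for this kind of demo could look like, here is a hypothetical Python scorer. It is not the script shipped in examples/pi-digits/; it assumes value.txt holds a single number and that the objective uses direction = "min" with parse = { kind = "float" }.

```python
import math
from pathlib import Path


def score(text: str) -> float:
    """Absolute distance between the number in text and pi (lower is better)."""
    return abs(float(text.strip()) - math.pi)


def main(path: str = "value.txt") -> None:
    """Print the score as a bare float on stdout for the harness to parse."""
    print(score(Path(path).read_text()))
```

Saved as a script that ends by calling main(), it could be wired in with an objective.command such as "python3 score.py"; the file name is an assumption.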

Status

v1 is feature-complete on Linux. Out of scope for v1: parallel iterations, Pareto scoring, web/TUI, macOS, token accounting, retry/backoff, remote storage, allow-path enforcement (allow_paths is prompt-only).

License

AGPL-3.0-or-later.