Crate harness_loop_engine

§harness-loop-engine — loop engineering for harness-rs

Agent = Model + Harness. A harness wraps a single agent call. A loop wraps the harness: it runs that call again and again, on a cadence, with state, verification, budgets, and gates — driving toward a goal over time instead of in one shot. This crate is harness-rs’s loop layer.

“Loop engineering is replacing yourself as the person who prompts the agent. You design the system that does it instead.” You remain the engineer responsible for that system.

The building blocks already live elsewhere in harness-rs — scheduling (harness-scheduler), worktrees (harness-sandbox), sub-agents (harness-loop), memory (harness-core), MCP (harness-mcp). What this crate adds is the orchestration discipline that turns those parts into a loop you can trust:

LoopLevel — maturity levels L1 (report) → L2 (assisted) → L3 (unattended). A loop earns autonomy in stages.
HumanGate — the proceed-or-escalate decision, tied to the level. Built-ins: AlwaysEscalate, AllowlistGate, CallbackGate.
ActionExecutor — the side-effect handoff after a verified L3 auto-approval. Built-ins: ApprovalOnlyExecutor, CallbackActionExecutor.
TokenBudget — a per-round spend ceiling, because unattended loops spend without bound if you let them.
LoopSpec — the inert, serializable description of a loop.
LoopEngine — the runner: recall state → isolate → maker sub-agent → checker sub-agent → gate → record state.
LoopScheduler — runs loops on their cadence.
patterns — the seven named production loops (daily triage, PR babysitter, CI sweeper, …), each a ready-made LoopSpec.
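To make the level and gate relationship concrete, here is a minimal, self-contained sketch of how an allowlist-style gate could decide. The types are local stand-ins that reuse names from the list above; their fields, variants, and the `decide` signature are assumptions for illustration, not this crate's actual API.

```rust
// Illustrative stand-ins only: these mirror names from this crate's docs,
// but every field, variant, and signature here is an assumption.
use std::collections::HashSet;

/// Maturity levels from report-only up to unattended.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoopLevel {
    L1Report,
    L2Assisted,
    L3Unattended,
}

/// The gate's verdict for one proposed action.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GateDecision {
    Proceed,
    Escalate(String),
}

/// A proposal, reduced to the two fields this sketch needs.
pub struct ProposedAction {
    pub kind: String,
    pub verified: bool,
}

pub trait HumanGate {
    fn decide(&self, level: LoopLevel, action: &ProposedAction) -> GateDecision;
}

/// Escalates everything, whatever the level.
pub struct AlwaysEscalate;

impl HumanGate for AlwaysEscalate {
    fn decide(&self, _level: LoopLevel, _action: &ProposedAction) -> GateDecision {
        GateDecision::Escalate("gate always escalates".to_string())
    }
}

/// Proceeds only at L3, only for verified actions, only for allowlisted kinds.
pub struct AllowlistGate {
    allowed: HashSet<String>,
}

impl AllowlistGate {
    pub fn new<I, S>(kinds: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        AllowlistGate {
            allowed: kinds.into_iter().map(Into::into).collect(),
        }
    }
}

impl HumanGate for AllowlistGate {
    fn decide(&self, level: LoopLevel, action: &ProposedAction) -> GateDecision {
        if level != LoopLevel::L3Unattended {
            return GateDecision::Escalate(format!("level {:?} may not auto-proceed", level));
        }
        if !action.verified {
            return GateDecision::Escalate("action was not verified".to_string());
        }
        if !self.allowed.contains(&action.kind) {
            return GateDecision::Escalate(format!("kind '{}' is not allowlisted", action.kind));
        }
        GateDecision::Proceed
    }
}
```

In this sketch every failed condition returns an escalation with a reason, so a human reading the round report can see why the loop did not act on its own.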

§The anatomical loop

  schedule (cadence)
       │
       ▼
  recall STATE / memory ──► isolated worktree (sandbox)
       │                          │
       │                          ▼
       │                  maker sub-agent  (proposes)
       │                          │
       │                          ▼
       │                  checker sub-agent  (tests + gates)
       │                          │
       │                          ▼
       │                     human gate? ──┬─ safe/allowlisted ─► action executor
       │                                   └─ risky/ambiguous ──► escalate
       ▼                                            │
  write STATE / memory  ◄────────────────────────── recurse next tick
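The diagram above can be read as a single function that runs one round. The sketch below is an assumption-laden simplification: the maker, checker, gate, and executor are plain closures, the state spine is a `Vec<String>`, and the worktree isolation step is omitted. None of these names or signatures are taken from this crate's real API.

```rust
// Illustrative sketch: all types and signatures are assumptions made for
// this example, not this crate's actual API. Sandbox isolation is omitted.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RoundOutcome {
    /// The checker rejected the maker's proposal.
    Rejected(String),
    /// The gate sent the verified proposal to a human.
    Escalated(String),
    /// The gate approved and the executor returned a receipt.
    Executed(String),
}

/// The state spine: one entry appended per round, read back on the next tick.
#[derive(Debug, Default)]
pub struct State {
    pub history: Vec<String>,
}

pub fn run_round<M, C, G, E>(
    state: &mut State,
    maker: M,
    checker: C,
    gate: G,
    executor: E,
) -> RoundOutcome
where
    M: Fn(&[String]) -> String,        // recalls state, proposes a change
    C: Fn(&str) -> Result<(), String>, // verifies the proposal
    G: Fn(&str) -> bool,               // true means safe to auto-proceed
    E: Fn(&str) -> String,             // performs the side effect, returns a receipt
{
    // 1. Recall state and let the maker propose.
    let proposal = maker(state.history.as_slice());

    // 2. The checker verifies, 3. the gate decides, 4. the executor acts.
    let outcome = match checker(&proposal) {
        Err(reason) => RoundOutcome::Rejected(reason),
        Ok(()) if gate(&proposal) => RoundOutcome::Executed(executor(&proposal)),
        Ok(()) => RoundOutcome::Escalated(proposal.clone()),
    };

    // 5. Record the round so the next tick can recall it.
    state.history.push(format!("{:?}", outcome));
    outcome
}
```

Note that the state write happens on every path, including rejection and escalation, so the next tick always sees what the previous round did.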

§Two debts to watch

Loop engineering names two failure modes that accrue silently. This crate makes them visible rather than solving them — they are engineering responsibilities, not features:

Intent debt — the drift between what a loop was meant to do and what it actually does. Antidote: LoopSpec::intent is a required, one-sentence statement of purpose, injected into every maker turn and printed in every report. Review it as the loop evolves.
Comprehension debt — the gap between what the loop ships and what humans still understand about its behaviour. Antidote: the maker/checker split, the recorded state spine, and rendered reports keep a legible trail of every round.
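One way to picture the intent antidote is a spec type that cannot be built without an intent and that stamps it on both the maker prompt and the report. The sketch below uses a hypothetical `SpecSketch` type; its constructor, validation rules, and string formats are assumptions, not the behaviour of this crate's `LoopSpec`.

```rust
// Illustrative sketch: `SpecSketch` and its methods are hypothetical
// stand-ins, not this crate's `LoopSpec`.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpecSketch {
    pub name: String,
    pub intent: String,
}

impl SpecSketch {
    /// Refuses to build a spec without a one-line statement of purpose.
    pub fn new(name: &str, intent: &str) -> Result<SpecSketch, String> {
        let intent = intent.trim();
        if intent.is_empty() {
            return Err("intent is required".to_string());
        }
        if intent.contains('\n') {
            return Err("intent must fit on one line".to_string());
        }
        Ok(SpecSketch {
            name: name.to_string(),
            intent: intent.to_string(),
        })
    }

    /// Prefixes the intent to the task text for a maker turn.
    pub fn maker_prompt(&self, task: &str) -> String {
        format!("Intent: {}\n\nTask: {}", self.intent, task)
    }

    /// First line of a rendered report, so reviewers always see the intent.
    pub fn report_header(&self) -> String {
        format!("[{}] intent: {}", self.name, self.intent)
    }
}
```

Because the intent appears in every prompt and every report, a reviewer can compare it against what the loop actually did each round and notice drift early.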

§Safety stance

Verification stays on you — unattended loops make unattended mistakes. Defaults are conservative: L1 makers are strictly read-only, the default gate for every level is AlwaysEscalate, and L3 auto-proceed requires an explicit AllowlistGate. Graduate a loop’s level only as you build trust in it.

Modules§

patterns: The production-loop catalogue.

Structs§

ActionReceipt: Evidence that an auto-approved action was handed off to its executor.
AllowlistGate: Auto-proceeds only for verified actions whose kind is on the allowlist, and only at L3. Everything else escalates. This is the workhorse gate for unattended loops with a narrow blast radius.
AlwaysEscalate: Never auto-proceeds — every proposal is escalated. This is the correct gate for L1 loops (and a safe default for anything you’re unsure about).
ApprovalOnlyExecutor: Default executor: records that a gate approved the action but performs no external side effect. This keeps LoopEngine::new safe while making the missing production handoff visible in the round report.
BudgetState: Running tally of spend within a round, checked against a TokenBudget.
CallbackActionExecutor: Wrap a synchronous callback as an ActionExecutor. Use this for application-specific handoffs without defining a bespoke type.
CallbackGate: Wraps an arbitrary closure as a gate — for custom policies (budget-aware, time-of-day, denylist, MCP-scope checks, …).
LoopEngine: Binds a LoopSpec to the live pieces it needs to run: a model, the maker/checker tool sets, an isolation sandbox, a gate, and (optionally) memory for the state spine.
LoopScheduler: Ticks registered loops on their cadence.
LoopSpec: Declarative definition of a single loop.
ProposedAction: A change the maker produced and the checker verified, presented to the gate for a proceed-or-escalate decision.
RoundReport: The full record of a round — the maker/checker reports, token spend, the gate decision, and the outcome. Suitable for delivery to a channel and for writing to memory.
StdoutSink: Prints deliverable reports to stdout.
TokenBudget: A declarative spend ceiling for a single round of a loop.
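The BudgetState and TokenBudget entries above describe a running tally checked against a ceiling. A minimal sketch of that idea follows. The field names, the two ceilings (tokens and model calls), and the `record` method are assumptions chosen for illustration; the real types may track different limits.

```rust
// Illustrative sketch: field names, limits, and the `record` method are
// assumptions, not this crate's actual API.

/// Which ceiling a round crossed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BudgetLimit {
    Tokens,
    ModelCalls,
}

/// Declarative ceilings for a single round.
#[derive(Debug, Clone, Copy)]
pub struct TokenBudget {
    pub max_tokens: u64,
    pub max_model_calls: u32,
}

/// Running tally of spend within a round.
#[derive(Debug, Default)]
pub struct BudgetState {
    pub tokens: u64,
    pub model_calls: u32,
}

impl BudgetState {
    /// Adds one model call's usage, then reports the first ceiling crossed.
    pub fn record(&mut self, tokens: u64, budget: &TokenBudget) -> Result<(), BudgetLimit> {
        self.tokens = self.tokens.saturating_add(tokens);
        self.model_calls = self.model_calls.saturating_add(1);
        if self.tokens > budget.max_tokens {
            return Err(BudgetLimit::Tokens);
        }
        if self.model_calls > budget.max_model_calls {
            return Err(BudgetLimit::ModelCalls);
        }
        Ok(())
    }
}
```

A runner built on this sketch would call `record` after each model call and stop the round on the first `Err`, reporting which limit was hit.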

Enums§

ActionError
BudgetLimit: Which ceiling a round crossed.
GateDecision: The gate’s verdict for one proposed action.
LoopLevel: How much autonomy a loop is trusted with.
RoundOutcome: What one round of a loop did.

Traits§

ActionExecutor: Carries out a verified, auto-approved action.
HumanGate: Decides what happens to a verified proposal. Implementations encode the human-gate policy; the engine consults this once per round.
LoopSink: Where a finished round’s report goes. Implement this to route reports to Slack, email, a file, a tracker — anywhere. The default is stdout.
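As a rough illustration of routing reports somewhere other than stdout, the sketch below defines a sink that writes one line per report to any `std::io::Write` target. The trait shape, the `deliver` method, and the `ReportSketch` type are hypothetical; consult the LoopSink and RoundReport items for the real signatures.

```rust
// Illustrative sketch: the trait shape and report type are assumptions,
// not this crate's `LoopSink` or `RoundReport`.
use std::io::{self, Write};

/// A report reduced to the two fields this sketch needs.
pub struct ReportSketch {
    pub loop_name: String,
    pub outcome: String,
}

pub trait LoopSink {
    fn deliver(&mut self, report: &ReportSketch) -> io::Result<()>;
}

/// Writes one line per report to any `Write` target, such as a file or a buffer.
pub struct WriterSink<W: Write> {
    pub out: W,
}

impl<W: Write> LoopSink for WriterSink<W> {
    fn deliver(&mut self, report: &ReportSketch) -> io::Result<()> {
        writeln!(self.out, "{}: {}", report.loop_name, report.outcome)
    }
}
```

Being generic over `Write` keeps the sink testable with an in-memory `Vec<u8>` while the same code can target a file in production.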

Functions§

default_gate_for: The gate a level implies when the caller doesn’t specify one.