omk 0.5.0

A Rust runtime for Kimi CLI. Turns prompts into proof-backed engineering runs with gates, worktrees, and replay.
# OMK Roadmap

This roadmap tracks the path from the current Wire-first beta MVP to the
`omk goal` autonomous engineering runtime.

## North Star

```bash
omk goal run "Build or transform this project until it is proof-backed ready" --until-ready
```

The system should plan, research, spawn agents, assign tasks, verify results,
recover from failures, and stop only with a proof-backed terminal status.
With an explicit delivery policy, the north star extends to one-command,
end-to-end repository delivery: task-scoped worktrees, branches, PRs,
review/fix loops, cleanup, integrator PR, and gated merge into the protected
baseline.
The first interface stays terminal-native/TUI-first: fast install, one command,
live textual orchestration updates, and no graphical UI dependency.

Positioning is locked in `docs/COMPETITIVE_POSITIONING.md`: OMK is a local,
repo-native, proof-driven autonomous software engineering runtime, not a hosted
agent clone, visual app builder, or IDE chat product.

## Stage 0 - Current Foundation

Status: current beta MVP.

- Kimi-native asset sync, install, doctor, rollback.
- Scheduler-backed `omk team run`.
- Wire worker runtime.
- Event logs.
- Proof and failure artifacts.
- Run/proof/HUD inspection.
- Verification gates.
- `omk goal` durable scaffold: state, planning artifacts, validated task
  graph with retry/lease metadata, bounded Wire-backed agent waves with
  policy-validated follow-ups and worker-pool caps, post-mutation gate reruns,
  controller review/security evidence, pause/resume/cancel with worker
  interruption, budget enforcement and recovery, deterministic replay, explicit
  local integrator accept/reject, oracle evidence, and honest ready/not-ready
  proof.
- GitHub CI and coverage.

## Stage 1 - Goal State Core

Target: make goals durable and inspectable.

- Add `<omk-state-dir>/goals/<goal-id>/` state layout (XDG: `~/.local/state/omk/goals/`, legacy: `~/.omk/state/goals/`).
- Add `omk goal run/status/show/list/proof/replay/budget/verify/execute/review/pause/resume/cancel`.
- Persist normalized goal, constraints, budgets, and terminal criteria.
- Emit goal lifecycle events.
- Write `failure.json` for blocked or failed goals.
- Add JSON and Markdown output for `goal show`.
- Add backward-compatible reads for restored or older goal state.

Exit criteria:

- A goal can be created, inspected, replayed, budget-checked, verified locally,
  cancelled, and resumed after process restart.
- State transitions have tests.
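
The operator-facing part of those state transitions can be sketched as a small pure function, which is the kind of unit that is easy to cover with tests. This is only an illustration: the `GoalStatus` and `Command` names, and the assumption that `Cancelled`, `Ready`, and `NotReady` are terminal, are hypothetical and not taken from the OMK source.

```rust
/// Hypothetical goal statuses; the real OMK state names may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Running,
    Paused,
    Cancelled,
    Ready,
    NotReady,
}

/// Operator commands named in the roadmap (`pause`, `resume`, `cancel`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    Pause,
    Resume,
    Cancel,
}

impl GoalStatus {
    /// Assumption: these statuses never change again once reached.
    pub fn is_terminal(self) -> bool {
        matches!(
            self,
            GoalStatus::Cancelled | GoalStatus::Ready | GoalStatus::NotReady
        )
    }
}

/// Returns the next status, or an error message for an illegal transition.
pub fn apply(status: GoalStatus, command: Command) -> Result<GoalStatus, String> {
    match (status, command) {
        (GoalStatus::Running, Command::Pause) => Ok(GoalStatus::Paused),
        (GoalStatus::Paused, Command::Resume) => Ok(GoalStatus::Running),
        (GoalStatus::Running | GoalStatus::Paused, Command::Cancel) => {
            Ok(GoalStatus::Cancelled)
        }
        // Everything else, including any command on a terminal status.
        (s, c) => Err(format!("illegal transition: {c:?} while {s:?}")),
    }
}
```

Keeping the transition table in one `match` means a test can enumerate every `(status, command)` pair and assert the expected outcome.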

## Stage 2 - Planning and Oracle

Target: make goals testable before execution.

- Generate PRD or goal brief.
- Generate technical plan.
- Generate test spec.
- Generate decision log.
- Build task graph with dependencies plus read/write sets.
- Define the oracle for completion.
- Block execution when the oracle is missing.

Exit criteria:

- Greenfield and rewrite fixture goals produce different oracle shapes.
- `blocked_on_human` is used when success criteria are ambiguous.
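
One way to picture the "block execution when the oracle is missing" rule together with the dependency graph is the sketch below. The `Plan` type, its fields, and `PlanError` are hypothetical names chosen for illustration, and read/write sets are left out to keep it short; it is not OMK's planner.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical planning output; field names are illustrative only.
pub struct Plan {
    /// Task id -> ids of the tasks it depends on.
    pub deps: BTreeMap<String, BTreeSet<String>>,
    /// How completion is judged; `None` means no oracle has been defined.
    pub oracle: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PlanError {
    MissingOracle,
    UnknownDependency(String),
    Cycle,
}

/// Returns a dependency-respecting task order, and refuses to produce one
/// when the oracle is missing.
pub fn execution_order(plan: &Plan) -> Result<Vec<String>, PlanError> {
    if plan.oracle.is_none() {
        return Err(PlanError::MissingOracle);
    }
    // Every dependency must name a task that exists in the graph.
    for deps in plan.deps.values() {
        for dep in deps {
            if !plan.deps.contains_key(dep) {
                return Err(PlanError::UnknownDependency(dep.clone()));
            }
        }
    }
    let mut done: BTreeSet<String> = BTreeSet::new();
    let mut order = Vec::new();
    while done.len() < plan.deps.len() {
        // Collect every unfinished task whose dependencies are all done.
        let ready: Vec<String> = plan
            .deps
            .iter()
            .filter(|(id, deps)| !done.contains(*id) && deps.is_subset(&done))
            .map(|(id, _)| id.clone())
            .collect();
        if ready.is_empty() {
            // Remaining tasks wait on each other, so the graph has a cycle.
            return Err(PlanError::Cycle);
        }
        for id in ready {
            done.insert(id.clone());
            order.push(id);
        }
    }
    Ok(order)
}
```

Using `BTreeMap` and `BTreeSet` keeps the resulting order deterministic, which fits the roadmap's emphasis on replayable runs.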

## Stage 3 - Agent Orchestration

Target: let the goal controller create and manage work.

- Launch role-specific agents through existing team/runtime surfaces.
- Land the first bounded `goal-agent-execute` wave on existing scheduler/Wire
  primitives.
- Capture project mutation diff and changed-file evidence from the agent wave.
- Rerun gates after agent mutations.
- Record controller review/security evidence after the bounded agent wave.
- Allow agents to propose tasks.
- Require controller validation before mutating the task graph.
- Track heartbeats, leases, retries, stale work, and task ownership.
- Support bounded concurrency and cost/time budgets.
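
The heartbeat, lease, and retry tracking listed for this stage could look roughly like the following. The `Lease` record, the `LeaseAction` outcomes, and the `max_attempts` parameter are assumptions made for this sketch; OMK's scheduler may model these differently.

```rust
use std::time::Duration;

/// Hypothetical lease record. Times are offsets from an arbitrary run start
/// so the check stays deterministic and easy to test.
pub struct Lease {
    pub task_id: String,
    pub owner: String,
    pub last_heartbeat: Duration,
    pub ttl: Duration,
    pub attempts: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LeaseAction {
    /// The owner is still heartbeating within the lease window.
    Keep,
    /// The lease expired and the task may be handed to another worker.
    Retry,
    /// The lease expired and the retry allowance is used up.
    Fail,
}

/// Decides what to do with a task based on its lease and attempt count.
pub fn check_lease(lease: &Lease, now: Duration, max_attempts: u32) -> LeaseAction {
    // `saturating_sub` avoids a panic if `now` is earlier than the heartbeat.
    let expired = now.saturating_sub(lease.last_heartbeat) > lease.ttl;
    if !expired {
        LeaseAction::Keep
    } else if lease.attempts < max_attempts {
        LeaseAction::Retry
    } else {
        LeaseAction::Fail
    }
}
```

In this sketch a `Fail` outcome is the point where a controller would write explicit proof evidence instead of retrying again.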

Exit criteria:

- A goal can execute multiple dependent tasks.
- Failed tasks retry or produce explicit proof evidence.

## Stage 4 - Verification Wall

Target: make readiness proof-backed.

- Run configured gates by project type.
- Capture command evidence and artifacts.
- Add compatibility/golden gates for rewrite goals.
- Add security and dependency gates for hardening goals.
- Add benchmark gates for performance goals.

Exit criteria:

- `ready` cannot be emitted while required gates are failing.
- `ready` requires oracle evidence matching the goal class.
- `not_ready` includes the failing evidence.
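
These exit criteria amount to a small decision rule, sketched below under stated assumptions. `GateResult`, `Verdict`, and the `oracle-evidence` label are hypothetical names for illustration and do not reflect OMK's actual proof schema.

```rust
/// Hypothetical gate outcome; field names are illustrative only.
pub struct GateResult {
    pub name: String,
    pub required: bool,
    pub passed: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Ready,
    /// Carries the names of everything that blocked readiness.
    NotReady { failing: Vec<String> },
}

/// `Ready` only when every required gate passed and the oracle evidence
/// matches the goal class; otherwise `NotReady` with the failing evidence.
pub fn verdict(gates: &[GateResult], oracle_evidence_ok: bool) -> Verdict {
    let mut failing: Vec<String> = gates
        .iter()
        .filter(|g| g.required && !g.passed)
        .map(|g| g.name.clone())
        .collect();
    if !oracle_evidence_ok {
        failing.push("oracle-evidence".to_string());
    }
    if failing.is_empty() {
        Verdict::Ready
    } else {
        Verdict::NotReady { failing }
    }
}
```

Because `NotReady` cannot be constructed without its `failing` list, a not-ready result always carries the evidence that caused it.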

## Stage 5 - Worktree and Integration Flow

Target: make parallel work safe.

- Treat `master` / `main` as read-only baselines; all slices land through PRs.
- Use task-scoped worktrees and branches as the canonical coordination path.
  GitHub PRs carry proof evidence, write scopes, and verification wall output
  for human, Codex, Kimi, Claude, and future `omk goal` workers.
- Create isolated worktrees for independent task slices; fall back to branches
  when worktree creation is not possible.
- Record one task per slice with owner, write scope, dependencies, gates,
  branch, and PR link.
- Merge accepted slices through an integrator task. Current local integrator
  acceptance is explicit through `omk goal accept`; rejection remains visible
  through `omk goal reject`.
- Detect access conflicts before dispatch. Agent-proposed follow-ups are
  already rejected when their paths conflict with existing tasks, whether the
  overlap is found after path normalization, between parent and child paths,
  or between read and write sets, unless dependency ordering serializes the
  access.
- Support partial acceptance of completed subgoals.
- Preserve changelog and docs updates during integration.

Exit criteria:

- Two independent slices can run concurrently and integrate deterministically.
- Conflicting read/write access sets block dispatch or require a plan change.
- Every integrated slice has a task id, branch, PR, and verification evidence.
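
The access-conflict check described in this stage can be approximated as follows. This is a simplified sketch: `AccessSet`, the string-based path normalization (which ignores `..` and symlinks), and the `serialized_by_dependency` flag are assumptions, not OMK's implementation.

```rust
/// Hypothetical per-task access declaration, using repo-relative paths.
pub struct AccessSet {
    pub reads: Vec<String>,
    pub writes: Vec<String>,
}

/// Splits a path into components, dropping empty segments and `.`.
/// An empty result stands for the repository root.
fn normalize(path: &str) -> Vec<String> {
    path.split('/')
        .filter(|part| !part.is_empty() && *part != ".")
        .map(str::to_string)
        .collect()
}

/// True when the paths are equal or one is an ancestor of the other.
fn overlaps(a: &str, b: &str) -> bool {
    let (a, b) = (normalize(a), normalize(b));
    let shared = a.len().min(b.len());
    a[..shared] == b[..shared]
}

fn any_overlap(xs: &[String], ys: &[String]) -> bool {
    xs.iter().any(|x| ys.iter().any(|y| overlaps(x, y)))
}

/// Write/write and read/write overlaps conflict; read/read does not.
pub fn conflicts(a: &AccessSet, b: &AccessSet) -> bool {
    any_overlap(&a.writes, &b.writes)
        || any_overlap(&a.writes, &b.reads)
        || any_overlap(&a.reads, &b.writes)
}

/// A conflicting pair is acceptable only when one task depends on the
/// other, so the accesses cannot happen concurrently.
pub fn must_reject(a: &AccessSet, b: &AccessSet, serialized_by_dependency: bool) -> bool {
    conflicts(a, b) && !serialized_by_dependency
}
```

For example, a task writing `src/goal/` conflicts with one reading `./src/goal/state.rs`, while two tasks that only read the same file do not conflict.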

## Stage 6 - Self-Review and Hardening

Target: move from useful automation to autonomous engineering quality.

- Add specialist reviewer, security, performance, and test-engineer loops.
- Add "break it" challenge passes.
- Add anti-slop cleanup pass.
- Add dependency rationale checks.
- Add threat-model artifact for security-sensitive goals.

Exit criteria:

- A goal proof records independent review results.
- Known gaps are explicit and cannot be hidden by a final summary.

## Stage 7 - GitHub Output

Target: turn long-running goals into reviewable and mergeable delivery
artifacts.

- Generate PRs from task-scoped branches instead of writing to `master` /
  `main`.
- Render a PR or draft PR body from a goal result without implicit network
  mutation.
- Attach proof summary to PR body.
- Attach task id, owner, write scope, and verification wall output.
- Link changed files, gates, known gaps, and decisions.
- Support release-candidate output for GitHub-only releases.
- Under an explicit `auto-pr` / `gated` policy, run the whole delivery loop:
  slice PRs, review/fix iterations, integrator PR, CI/proof gates, and final
  merge into the protected baseline.

Exit criteria:

- `omk goal open-pr latest --dry-run` creates a reviewable PR draft with proof
  evidence.
- `omk goal` can map accepted task graph nodes to branches and PR links.
- End-to-end mode records every created PR, review, fix task, merge decision,
  integrator result, and final baseline commit in `proof.json`.
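
Rendering a PR body "without implicit network mutation" suggests a pure function from a goal result to Markdown text. The sketch below assumes a hypothetical `SliceResult` type and section layout; it does not reflect the real `proof.json` schema or the output of `omk goal open-pr`.

```rust
/// Hypothetical per-slice result; not OMK's actual proof schema.
pub struct SliceResult {
    pub task_id: String,
    pub owner: String,
    pub write_scope: Vec<String>,
    /// Pairs of (gate name, passed).
    pub gates: Vec<(String, bool)>,
    pub known_gaps: Vec<String>,
}

/// Renders a Markdown PR body. Pure function: no network or git access.
pub fn render_pr_body(result: &SliceResult) -> String {
    let mut body = String::new();
    body.push_str(&format!("## Task {}\n\n", result.task_id));
    body.push_str(&format!("- Owner: {}\n", result.owner));
    body.push_str(&format!(
        "- Write scope: {}\n\n",
        result.write_scope.join(", ")
    ));
    body.push_str("## Verification\n\n");
    for (name, passed) in &result.gates {
        let mark = if *passed { "pass" } else { "FAIL" };
        body.push_str(&format!("- {name}: {mark}\n"));
    }
    body.push_str("\n## Known gaps\n\n");
    // State the absence of gaps explicitly rather than omitting the section.
    if result.known_gaps.is_empty() {
        body.push_str("- none recorded\n");
    }
    for gap in &result.known_gaps {
        body.push_str(&format!("- {gap}\n"));
    }
    body
}
```

Since the function only builds a string, a dry run can print the body for review and leave any actual PR creation to a separate, explicitly authorized step.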

## Stage 8 - Long-Horizon Reliability

Target: let goals run for days safely.

- Harden pause/resume across active worker interruption and machine restarts. The current runtime already persists pause/resume across separate CLI invocations and interrupts active Wire-backed goal workers when an operator pauses or cancels a goal.
- Harden goal replay into deterministic crash-recovery replay. The current replay JSON is already stable across separate CLI invocations because replay timestamps come from persisted goal event evidence.
- Harden remaining multi-day budget controls beyond the current wall-clock `--budget-time`, Wire-derived `--budget-tokens` / `--budget-usd`, `budget-add` recovery path, and Wire worker per-task budget timeout.
- Quarantine stale workers after expired leases. The current scheduler cleanup already writes a durable marker, emits `worker_dead` evidence, prevents later stale-worker dispatch/outbox reuse, and fails fast when all workers are dead.
- Add crash recovery tests.
- Add operator notifications.

Exit criteria:

- A multi-hour fixture goal can survive process restart and continue.
- Operators can answer "what is it doing?" without reading raw logs.
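
The budget controls mentioned in this stage (wall-clock time, tokens, and USD) can be pictured as a check of usage against optional limits. The `Budget`, `Usage`, and `Exceeded` types are hypothetical, and tracking dollars as integer cents is a choice made for this sketch rather than something the roadmap specifies.

```rust
use std::time::Duration;

/// Hypothetical budget limits; `None` means no limit is configured.
#[derive(Default)]
pub struct Budget {
    pub time: Option<Duration>,
    pub tokens: Option<u64>,
    /// Assumption: money is tracked in whole cents to avoid float rounding.
    pub usd_cents: Option<u64>,
}

/// Hypothetical usage snapshot accumulated over the life of a goal.
pub struct Usage {
    pub elapsed: Duration,
    pub tokens: u64,
    pub usd_cents: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Exceeded {
    Time,
    Tokens,
    Usd,
}

/// Lists every configured limit that the current usage has gone past.
pub fn exceeded(budget: &Budget, usage: &Usage) -> Vec<Exceeded> {
    let mut out = Vec::new();
    if matches!(budget.time, Some(limit) if usage.elapsed > limit) {
        out.push(Exceeded::Time);
    }
    if matches!(budget.tokens, Some(limit) if usage.tokens > limit) {
        out.push(Exceeded::Tokens);
    }
    if matches!(budget.usd_cents, Some(limit) if usage.usd_cents > limit) {
        out.push(Exceeded::Usd);
    }
    out
}
```

Returning the full list of exceeded limits, instead of stopping at the first, gives an operator a complete answer about why a goal stopped.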

## Not Yet

These are deliberately out of early scope:

- Guaranteed production-ready output for arbitrary underspecified ideas.
- Unbounded recursive agent spawning.
- Automatic paid API or infrastructure provisioning.
- Rewriting very large projects without first building compatibility oracles.
- Silent force-push or destructive repository operations.