omk 0.5.0 - Docs.rs

# Troubleshooting Guide

> For goal-specific recovery procedures (PR creation failures, CI blockers, merge conflicts, review wall rejections, and budget issues), see [`TROUBLESHOOTING_GOAL.md`](TROUBLESHOOTING_GOAL.md).

## Installation Issues

### `cargo install --git https://github.com/ekhodzitsky/oh-my-kimi.git` fails

Ensure Rust 1.78+ is installed:

```bash
rustc --version
```
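
Reading the version by eye is easy to get wrong on nightly or vendor builds, so the comparison can be scripted. The following is a minimal sketch; `version_at_least` and `rustc_version` are hypothetical helper names, and the only assumption about `rustc --version` is that the version is its second whitespace-separated field.

```bash
# Return 0 when a "MAJOR.MINOR[.PATCH]" version string ($1) is at least
# MAJOR ($2) . MINOR ($3). Hypothetical helper, not an omk or cargo command.
version_at_least() {
    have_major=${1%%.*}          # text before the first dot
    have_rest=${1#*.}            # text after the first dot
    have_minor=${have_rest%%.*}  # text before the next dot
    [ "$have_major" -gt "$2" ] ||
        { [ "$have_major" -eq "$2" ] && [ "$have_minor" -ge "$3" ]; }
}

# Print the version field of `rustc --version`, for example "1.78.0".
rustc_version() {
    rustc --version | awk '{ print $2 }'
}
```

A possible use is `version_at_least "$(rustc_version)" 1 78 || echo "Rust is older than 1.78"`.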

OMK is not published to crates.io yet. Use GitHub Release assets or
`cargo install --git`.

On Linux, install the usual build dependencies if OpenSSL/pkg-config errors
appear:

```bash
sudo apt-get install pkg-config libssl-dev
```

### `omk` command not found

Check that the install location is on your `PATH`:

```bash
echo "$PATH"
ls ~/.cargo/bin/omk ~/.local/bin/omk 2>/dev/null
```

The GitHub install script handles this automatically for common shells.
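
If the install script did not update your shell, the manual check above can be turned into a small report. This is a sketch only; `dir_on_path` and `check_omk_install` are hypothetical helpers, and the two directories are the install locations already named in this section.

```bash
# Return 0 when directory $1 is one of the colon-separated entries in $PATH.
# Hypothetical helper, not part of omk.
dir_on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# For each install location, report an executable omk binary and whether
# its directory is on PATH. Locations without a binary print nothing.
check_omk_install() {
    for dir in "$HOME/.cargo/bin" "$HOME/.local/bin"; do
        if [ -x "$dir/omk" ]; then
            if dir_on_path "$dir"; then
                echo "$dir/omk: found, on PATH"
            else
                echo "$dir/omk: found, but $dir is not on PATH"
            fi
        fi
    done
}
```

Running `check_omk_install` with no output means neither location holds an executable `omk`, which points back to the installation step rather than to `PATH`.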

## Runtime Issues

### Goal Recovery Workflow

When a goal stops unexpectedly or blocks, use this ordered recovery path:

**1. Inspect the current state:**

```bash
omk goal status latest
omk goal show latest
omk goal replay latest --format text
```

**2. If the goal is paused, resume it:**

```bash
omk goal resume latest
```

**3. If PR creation failed:**

Check delivery policy and GitHub auth:

```bash
omk goal show latest --format json | jq '.delivery_policy'
omk doctor  # verify gh CLI auth and repo push access
```

If the PR exists but OMK lost the URL, record it manually:

```bash
omk goal proof latest --format md
# Find the slice branch, open PR via gh pr create or GitHub UI
# Then resume: omk goal resume latest
```

**4. If a slice PR failed CI or review:**

Find the slice branch and review artifacts:

```bash
omk goal proof latest --format md
omk goal show latest --format json | jq '.artifacts[] | select(.kind | contains("review"))'
# Look for slice branch names like omk/goal/.../...
```

Check out the branch, fix locally, commit and push. Then re-run review:

```bash
omk goal review latest
```

If CI is flaky (not a real code failure), retry the slice:

```bash
omk goal verify latest   # re-run gates locally first
omk goal execute latest  # re-dispatch the slice agent if needed
```

If the slice is irredeemable, reject it and plan a replacement:

```bash
omk goal reject latest --reason "slice N failed security review, needs rewrite"
```

**5. If review blockers persist after multiple cycles:**

Inspect the review wall confidence and anti-slop score:

```bash
omk goal proof latest --format json | jq '.known_gaps'
# Look for anti_slop_confidence > 0.5; this spawns a cleanup task automatically
```

If the controller keeps creating cleanup tasks but never converges, consider:
- Narrowing the slice write scope
- Adding explicit acceptance criteria to the task graph
- Accepting the blocker as a known gap with `--reason`

**6. If the integrator branch has a merge conflict:**

```bash
omk goal proof latest --format md
# Note conflict files from merge-conflict artifact
```

Manually rebase the conflicting slice branch onto the latest master:

```bash
git fetch origin              # refresh origin/master before rebasing
git checkout <slice-branch>
git rebase origin/master
# Resolve conflicts, then:
git push --force-with-lease
```

Then re-verify and accept:

```bash
omk goal verify latest   # re-run gates on integrator branch
omk goal accept latest --summary "manual conflict resolution accepted"
```

**7. If partial acceptance is needed (some slices good, some bad):**

Reject the bad slices individually:

```bash
omk goal reject latest --reason "slice B failed architect review"
```

Accept the good slices and create an integrator with the subset:

```bash
omk goal accept latest --summary "accept slices A, C, D; reject B"
omk goal execute latest  # re-run integrator with remaining slices
```

**8. If the goal needs more budget:**

```bash
omk goal budget latest
omk goal budget-add latest --time 2h --tokens 100000 --usd 5
omk goal resume latest
```

**9. If workers are stale or the run looks hung:**

```bash
omk team cleanup --dry-run
omk team cleanup --older-than 1
omk goal resume latest
```

**10. If the goal is blocked on human oracle:**

Refine the goal text with testable criteria and re-run:

```bash
omk goal run "Refined goal with explicit acceptance criteria" --until-ready
```

### Kimi CLI not found

Verify from the same shell where you run OMK:

```bash
kimi --version
which kimi
omk doctor
```

Install and authenticate Kimi CLI using the official upstream docs:
https://www.kimi.com/code/docs

### Kimi auth mismatch

Workers may fail early when the `kimi` binary exists but auth is not valid for
the current shell.

```bash
kimi auth status
which kimi
omk doctor
```

Complete the Kimi login flow, then retry the team command.

### Kimi Wire initialize fails

Symptoms include parse errors before a worker turn begins.

```bash
kimi info
cargo build --bin omk
```

Compare the local protocol report with `docs/KIMI_UPSTREAM.md`. Kimi CLI
versions may add extension fields in `initialize.result`; OMK should parse those
as structured JSON evidence rather than a closed schema.

### Team run hangs or looks stuck

Start with the CLI views:

```bash
omk team health <team-name>
omk run show latest
omk proof show latest
```

Then inspect the state files:

```bash
ls ~/.local/state/omk/team/<team-name>
ls ~/.local/state/omk/team/<team-name>/workers
```

If you use the legacy state root, inspect `~/.omk/state/team/<team-name>`
instead. OMK prefers `~/.omk/state` when `~/.omk/` already exists; otherwise it
uses `~/.local/state/omk`.
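
The selection rule above can be expressed as a short shell function so that inspection commands work for either layout. This is a sketch of the rule as stated here, not of OMK's actual lookup code, which may consult other inputs; `omk_state_root` is a hypothetical helper name.

```bash
# Print the state root the rule above selects for a home directory
# ($1, defaulting to $HOME): the legacy root when <home>/.omk/ exists,
# the ~/.local/state root otherwise. Hypothetical helper, not part of omk.
omk_state_root() {
    home=${1:-$HOME}
    if [ -d "$home/.omk" ]; then
        echo "$home/.omk/state"
    else
        echo "$home/.local/state/omk"
    fi
}
```

For example, `ls "$(omk_state_root)/team/<team-name>/workers"` lists worker state without needing to know which root is active.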

### Run has no proof

Check whether the run wrote a failure artifact:

```bash
omk run show latest
omk proof show latest
find ~/.local/state/omk/team \( -name failure.json -o -name proof.json \)
```

Failed or interrupted runs should produce `failure.json`. Not-ready proof output
usually means a required verification gate failed or never ran.
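
To triage several directories at once, each one can be classified by the evidence file it holds. The sketch below assumes `proof.json` and `failure.json` sit directly inside the directory passed in; the real state layout may nest them differently. `run_evidence` is a hypothetical helper name.

```bash
# Classify directory $1 by the evidence file it contains.
# Assumption: the files sit directly inside $1. When both exist,
# "proof" wins. Hypothetical helper, not an omk command.
run_evidence() {
    if [ -f "$1/proof.json" ]; then
        echo "proof"
    elif [ -f "$1/failure.json" ]; then
        echo "failure"
    else
        echo "none"
    fi
}
```

A loop such as `for d in ~/.local/state/omk/team/*/; do printf '%s\t%s\n' "$(run_evidence "$d")" "$d"; done` then prints one line per directory. Entries marked `none` are the ones with neither a proof nor a failure artifact.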

### Greenfield goal proof stays `not_ready`

This is expected for the current `omk goal` MVP until all evidence exists. Check
the proof first:

```bash
omk goal show latest
omk goal proof latest --format md
omk goal replay latest --format text
```

Common missing evidence:

- no local gates ran;
- required gates failed;
- `omk goal execute latest` has not run;
- `omk goal review latest` has not attached review and security evidence;
- agent changes exist, but the integration loop has not accepted, committed, or
  opened a PR for them.

The proof distinguishes **engineering-ready evidence** from **product-ready
release acceptance**. Passing gates plus agent/review evidence means the result
is ready for engineering handoff. Product readiness still requires human
acceptance, PR/release work, and any product or positioning decisions.

### Greenfield goal has no gates

`omk goal verify` auto-detects gates from project files. A blank directory has
no reliable oracle, so the proof records the gap instead of pretending success.
For the greenfield acceptance demo, start with a tiny project fixture:

```bash
cargo new omk-goal-greenfield-demo
cd omk-goal-greenfield-demo
omk setup
omk goal run "Build a tiny local-only Rust CLI with add/list commands and tests" --max-agents 1
omk goal verify latest
```

For non-Rust projects, add a project-native manifest (`package.json`,
`pyproject.toml`, `go.mod`) or define explicit gates in `.omk/gates.toml`.

### `omk goal execute` cannot start workers

Goal execution needs a Wire-capable Kimi runtime:

```bash
kimi --version
kimi auth status
omk goal execute latest
```

For offline tests, `MOCK_KIMI` must point at an executable wire-compatible mock,
not just be set to arbitrary text:

```bash
MOCK_KIMI=/path/to/mock-kimi-wire omk goal execute latest
```
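
A quick pre-flight check can catch the "arbitrary text" mistake before a run starts. This sketch verifies only that `MOCK_KIMI` names an executable file; it cannot tell whether that program actually speaks the Wire protocol. `mock_kimi_ok` is a hypothetical helper, not an omk subcommand.

```bash
# Return 0 when MOCK_KIMI names a regular file with the execute bit set.
# Prints the reason to stderr otherwise. Hypothetical helper.
mock_kimi_ok() {
    if [ -z "${MOCK_KIMI:-}" ]; then
        echo "MOCK_KIMI is not set" >&2
        return 1
    fi
    if [ ! -f "$MOCK_KIMI" ] || [ ! -x "$MOCK_KIMI" ]; then
        echo "MOCK_KIMI=$MOCK_KIMI is not an executable file" >&2
        return 1
    fi
}
```

One way to use it is `mock_kimi_ok && omk goal execute latest`, with `MOCK_KIMI` exported in the same shell.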

If Kimi is unavailable, `goal run`, `goal show`, `goal verify`, `goal proof`,
and `goal replay` still produce useful planning and gate artifacts, but the
proof should remain `not_ready` because bounded agent execution evidence is
missing.

### Dry-run demo tries to use real Kimi

Use the dry-run switch, not only `MOCK_KIMI`:

```bash
NORTH_STAR_DRY_RUN=1 bash scripts/north_star_demo.sh
```

Dry-run mode forces mock execution and isolated `HOME`/`XDG_*` state even when
the real `kimi` binary is installed. If `MOCK_KIMI` is set to a path, ensure it
is executable.

### GitHub delivery policy blocks PR or merge

In the current release, `omk goal open-pr latest --dry-run` only renders the PR
title and body locally as evidence. Network mutation, PR creation, and merges
are reserved for a future gated delivery mode that requires an explicit
delivery policy. Recovery today:

```bash
omk goal proof latest --format md
omk goal open-pr latest --dry-run --format markdown
```

Use the rendered body as the handoff, then create the PR manually from a
task-scoped branch.

### CI, review, or merge conflict blocks readiness

Do not accept the goal as ready while the proof names blockers. Use the proof
and replay first:

```bash
omk goal replay latest --format text
omk goal proof latest --format md
```

If CI failed, rerun the failing gate locally and attach new evidence with
`omk goal verify latest`. If review found blockers, create a focused fix task
or run another bounded execution pass before `omk goal review latest`. If merge
conflicts are unsafe to resolve automatically, keep the proof `not_ready` and
record the conflict files, branches, and manual recovery step in the PR body.

### Goal blocked on human oracle

Vague requests such as "make this app great" or "build a product users love"
can stop as `blocked_on_human`. Rewrite the goal with testable behavior,
explicit constraints, and gates:

```bash
omk goal run "Build a local-only Rust CLI named taskline with add/list commands, tasks.txt storage, command tests, no network access, and no new dependencies"
```

If the goal depends on taste, pricing, legal review, credentials, or external
business judgment, capture that as a human decision before expecting autonomous
execution to continue.

### Goal rejected by the integrator

`omk goal reject latest --reason <text>` keeps the proof `not_ready`, records
`integration_evidence.status = rejected`, and writes a rollback-plan artifact
under the goal's `artifacts/integration/` directory. Inspect it before starting
the next slice:

```bash
omk goal proof latest --format json
omk goal show latest
```

The next attempt should either revert the rejected changed-file scope or replace
it in a new task-scoped branch/worktree, then rerun verify, execute, review, and
acceptance.

### Goal needs more budget

When wall-clock, token, or USD limits are exhausted, the goal status becomes
`needs_more_budget` instead of silently continuing:

```bash
omk goal budget latest
omk goal budget-add latest --time 1h
omk goal budget-add latest --tokens 500000 --usd 5
```

Budget extensions are explicit operator decisions and are recorded in
`budget-checkpoints.jsonl`.

### Kimi assets drift

If agent behavior does not match expected roles/hooks/skills:

```bash
omk kimi doctor
omk kimi sync --dry-run
omk kimi sync
```

If sync changed the wrong files:

```bash
omk kimi rollback --dry-run
omk kimi rollback
```

### Web dashboard port already in use

Use another port or inspect the existing process:

```bash
omk hud --web --port 8081   # start the dashboard on a different port
lsof -i :8080               # or see which process is holding port 8080
```

## Performance Issues

### Slow team run

- Check Kimi CLI responsiveness: `time kimi --version`.
- Reduce worker count: `omk team run 1:executor "task"`.
- Check free disk space for the active state root.

```bash
df -h ~/.local/state
df -h ~/.omk/state
```

### High memory usage

- Limit concurrent teams.
- Prefer smaller worker counts until the task truly benefits from parallelism.
- Prune old state after a dry run:

```bash
omk cleanup --teams --dry-run
omk team cleanup --dry-run --older-than 7
```

## Getting Help

1. Run diagnostics: `omk doctor`
2. Check run evidence: `omk run show latest`
3. Check proof evidence: `omk proof show latest`
4. Open a GitHub issue with:
   - `omk --version`
   - `omk doctor`
   - the command you ran
   - relevant `events.jsonl`, `proof.json`, or `failure.json` excerpts
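
The first three items in that list can be gathered into one file before opening an issue. This is a minimal sketch: `collect_omk_report` is a hypothetical helper that runs only `omk --version` and `omk doctor`, both named above, and records the command line you pass in. Excerpts from `events.jsonl`, `proof.json`, or `failure.json` still need to be added by hand.

```bash
# Write version, doctor output, and the command you ran into one text file.
# $1: output file; remaining arguments: the command line to record.
# Hypothetical helper, not an omk subcommand.
collect_omk_report() {
    out=$1
    shift
    {
        echo "## omk --version"
        omk --version 2>&1
        echo
        echo "## omk doctor"
        omk doctor 2>&1
        echo
        echo "## command"
        echo "$*"
    } > "$out"
}
```

For example, `collect_omk_report omk-report.txt omk goal resume latest` writes `omk-report.txt` in the current directory. Review the file for secrets or private paths before attaching it.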