rho-cli 0.1.28

Rho CLI tools for encrypted agent collaboration, dataset publishing, controlled runs, and result release workflows
<div align="center">

<img src="docs/assets/rho-icon.png" alt="rho" width="104" height="104">

# rho

**A partial solution to the lethal trifecta for agents.**

A secure, decentralized network for agentic data science: collaborate
end-to-end-encrypted over git repos (and other storage) with anyone, then let
agents run tool calls against private data — without the agent ever touching the
data, the keys, or the network it could leak through.

[![crates.io](https://img.shields.io/crates/v/rho-cli?logo=rust&color=E05D44)](https://crates.io/crates/rho-cli)
[![CI](https://github.com/madhavajay/rho/actions/workflows/ci.yml/badge.svg)](https://github.com/madhavajay/rho/actions/workflows/ci.yml)
[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](./LICENSE)
![Platforms](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-555)

</div>

---

## rho = Pi + Gondolin + Nostr + Git

Modern agents are dangerous in exactly the situation data science needs them:
pointed at **private data**, fed **untrusted content** (a collaborator's code, a
dataset description, a web page), and handed the **ability to communicate
outward**. Simon Willison calls that combination the
[**lethal trifecta**](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/) —
any one of the three is fine; together they let an attacker turn "summarize this"
into "exfiltrate everything."

rho doesn't pretend prompt injection is solved. Instead it **breaks the
trifecta apart** and wires four proven pieces together so the agent never holds
all three capabilities at once:

| Piece | Role in rho |
|---|---|
| 🤖 **[Pi](https://pi.dev)** | The agent harness. Plans, reads allowed files, writes code, and *proposes* tool calls, but is treated as untrusted and can never self-approve a protected action. |
| **[Gondolin](https://github.com/earendil-works/gondolin)** | Local Linux micro-VM sandbox. Protected code runs here with host-side, default-deny network and read-only data mounts: JavaScript-programmable policy, not the honor system. |
| 🔑 **[Nostr](https://nostr.com)** | Decentralized identity + messaging. Every account has a permanent `id/rho/…` and a Nostr controller key; encrypted messages and signed records replicate across relays, no central server required. |
| 🌳 **[Git](https://git-scm.com)** | The collaboration substrate. Repos are the workspace, **files are the protocol**, and everything sensitive is encrypted *in the git objects* while staying readable in your working tree. |

The result: collaborators share work over ordinary git repos, the agent does the
thinking on **mock data**, and the one moment real data is touched happens in a
sandbox, behind a deterministic host check, after an explicit human grant.

## How it works

rho's central bet is a hard split between **planning** and **protected
execution**:

```mermaid
flowchart LR
    subgraph U[Collaborator · untrusted]
      A[Pi agent] -->|writes code +<br/>signed request| R[encrypted request<br/>in git PR]
    end
    R --> H
    subgraph O[Owner · host-controlled]
      H{deterministic<br/>validation} -->|sig ✓ · digest ✓<br/>· policy ✓| G[⧉ Gondolin<br/>micro-VM]
      H -.->|reject| X[no run]
      G -->|real data<br/>read-only| OUT[result]
    end
    OUT -->|encrypted to<br/>requester| REL[release in git]
```

- **Agents propose, they don't decide.** A collaborator's agent may write code
  and emit a *signed request* — it cannot grant itself access, run against real
  data, or release an output. Those are deterministic host actions plus optional
  human approval.
- **Protected tools only ever run in one place.** `rho run` is the single trust
  boundary: it validates the request signature and code digest, checks policy,
  then executes inside Gondolin with a fixed environment, synthetic DNS, and a
  default-deny network. The agent that wrote the code is nowhere near it.
- **Files are the protocol.** Identities, permissions, requests, approval grants,
  run receipts, and results are all inspectable, versioned files. Sensitive ones
  are encrypted with [`age`](https://github.com/FiloSottile/age)-based recipient
  envelopes via git clean/smudge filters: the ciphertext lives in git, the
  plaintext only in authorized working trees.
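
The order of checks described above (signature, then code digest, then policy) can be sketched in a few lines of Python. This is an illustration, not rho's implementation: the request fields, the `make_request` and `validate_request` helpers, and the policy shape are all hypothetical, and HMAC-SHA256 stands in for the public-key signatures a real system would use, because the Python standard library has no asymmetric signing.

```python
import hashlib
import hmac


def code_digest(code: bytes) -> str:
    """SHA-256 of the submitted code, as a hex string."""
    return hashlib.sha256(code).hexdigest()


def sign(key: bytes, payload: bytes) -> str:
    # ASSUMPTION: HMAC is a stand-in for a real public-key signature.
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def _payload(request: dict) -> bytes:
    # Hypothetical canonical form: the fields the signature covers.
    return f"{request['tool']}:{request['dataset']}:{request['code_sha256']}".encode()


def make_request(tool: str, dataset: str, code: bytes, key: bytes) -> dict:
    """Build a signed request (what the collaborator side would emit)."""
    request = {"tool": tool, "dataset": dataset, "code_sha256": code_digest(code)}
    request["signature"] = sign(key, _payload(request))
    return request


def validate_request(request: dict, code: bytes, key: bytes,
                     allowed_tools: set) -> tuple:
    """Deterministic host-side check; returns (ok, reason)."""
    if not hmac.compare_digest(sign(key, _payload(request)), request["signature"]):
        return False, "bad signature"
    if code_digest(code) != request["code_sha256"]:
        return False, "code digest mismatch"
    if request["tool"] not in allowed_tools:
        return False, "tool not permitted by policy"
    return True, "ok"
```

Every branch is a pure function of the request, the code bytes, and the owner's policy, so the same inputs always give the same verdict and the agent that wrote the code has no say in it.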

### Twins: mock data for thinking, real data for answers

Every dataset is a **twin** — a private `real` side that never leaves the owner's
machine, and a `mock` side that's committed to the repo. Collaborators (and their
agents) develop and test against the mock; the real side is mounted read-only,
inside the sandbox, only after the owner grants the exact action and input
hashes. Mock generation preserves shape and semantics while minimizing leakage
from the source.

```text
datasets/prices/
  dataset.yaml          # name, uuid, schema — committed
  mock/prices-mock.csv  # shareable twin — committed
~/rho/alice/.../private/prices/real/prices-real.csv   # never committed
```
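
The source does not describe how rho generates mock data, so the following is only a rough sketch of the general idea. It keeps the real file's column names and inferred column types, and draws every value from a seeded random generator rather than from the real rows. The `infer_kind` and `make_mock` helpers and the value ranges are hypothetical.

```python
import csv
import io
import random


def infer_kind(values):
    """Classify a column of strings as 'int', 'float', or 'str'."""
    for kind, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return kind
        except ValueError:
            continue
    return "str"


def make_mock(real_csv: str, rows: int, seed: int = 0) -> str:
    """Return CSV text with the same header and column types as real_csv.

    No real value is copied: numbers come from a fixed range and
    strings are placeholders derived from the column name.
    """
    rng = random.Random(seed)
    reader = csv.DictReader(io.StringIO(real_csv))
    records = list(reader)
    fields = reader.fieldnames or []
    kinds = {f: infer_kind([r[f] for r in records]) for f in fields}

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for i in range(rows):
        row = {}
        for f in fields:
            if kinds[f] == "int":
                row[f] = rng.randint(0, 100)
            elif kinds[f] == "float":
                row[f] = round(rng.uniform(0, 100), 2)
            else:
                row[f] = f"{f}_{i}"
        writer.writerow(row)
    return out.getvalue()
```

A sketch like this preserves only the schema. Preserving distributions or cross-column relationships while still limiting leakage is a harder problem that it does not attempt.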

## Quickstart

**CLI** — straight from crates.io, or from source:

```sh
# from crates.io
cargo install rho-cli --bin rho

# or from source (Rust toolchain required)
git clone git@github.com:madhavajay/rho.git
cd rho
./install.sh          # cargo install --path . --bin rho

rho --version
```

**Desktop app** — download from the
[Releases](https://github.com/madhavajay/rho/releases) page *(macOS / Linux /
Windows binaries coming soon)*.

Create an identity backed by your GitHub handle and a freshly generated Nostr
controller key:

```sh
rho id init --github alice --generate-ssh-key --display-name "Alice"
```

### End-to-end: from empty repo to released result

The whole collaboration lives in one git repo, mostly in one PR. `--profile`
selects which identity acts (here **alice** owns the data, **bob** collaborates).

**Stage 1 — Owner creates the project**

A single command runs `git init`, sets up governance, creates the GitHub repo, and makes the initial push, all signed.

```sh
rho --profile alice repo create alice/genomics --public --yes
```

**Stage 2 — Collaborator joins, owner admits**

Bob opens a join PR carrying his public identity; alice admits him on the same
PR and merges.

```sh
rho --profile bob   repo join alice/genomics --pr
rho --profile alice repo admit-pr 1 --pr
rho --profile alice repo merge-pr 1 --merge --delete-branch
```

**Stage 3 — Owner publishes a twin dataset**

The private `real` side stays local; the `mock` side is committed for everyone
to develop against.

```sh
rho --profile alice dataset --name prices \
  --real data/private/prices-real.csv \
  --mock data/mock/prices-mock.csv
rho --profile alice publish alice <uuid>
```

**Stage 4 — Collaborator requests a run**

Bob (or his agent) writes code against the mock, then submits a *signed* request
for a real-data run — opening a PR. He cannot run it himself.

```sh
rho --profile bob request submit-run req-001 alice/genomics \
  --to alice --tool run_real --dataset prices \
  --code workspace/sum_prices.py \
  --command "python3 sum_prices.py DATASET_CSV" --tier real --pr
```
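
The request above names `workspace/sum_prices.py` without showing it. A minimal sketch of what such a script might contain follows, under two assumptions the source does not confirm: the dataset has a numeric column called `price`, and `DATASET_CSV` reaches the script as a file path in its first argument.

```python
import csv
import sys


def sum_column(path: str, column: str = "price") -> float:
    """Sum one numeric column of a CSV file.

    ASSUMPTION: the column name 'price' is hypothetical.
    """
    total = 0.0
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            total += float(row[column])
    return total


# Run only when a dataset path was actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(sum_column(sys.argv[1]))
```

Because the script only reads a path and prints a number, the same file can be developed against the committed mock CSV and later run unchanged against the real side inside the sandbox.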

**Stage 5 — Owner approves and runs in the sandbox**

alice's host validates the signature and code digest, then executes inside
Gondolin against the real data — results pushed back to the same PR.

```sh
rho --profile alice run approve-pr 3 --runner gondolin --pr
```

**Stage 6 — Owner releases, collaborator verifies**

The result is encrypted to bob and released; bob verifies the full chain from
request → grant → receipt → result.

```sh
rho --profile alice result release req-001 --to bob --pr
rho --profile bob   result verify req-001
```
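
One way to picture the request → grant → receipt → result chain is as a sequence of records in which each one commits to the digest of its predecessor. The sketch below illustrates that idea only. The record layout and the `link` and `verify_chain` helpers are hypothetical, and it leaves out the signature checks that rho's real verification also performs.

```python
import hashlib
import json
from typing import Optional

EXPECTED_ORDER = ["request", "grant", "receipt", "result"]


def digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


def link(kind: str, body: dict, parent: Optional[dict]) -> dict:
    """Create a record that points at its parent by digest."""
    parent_digest = digest(parent) if parent is not None else None
    return {"kind": kind, "body": body, "parent": parent_digest}


def verify_chain(records: list) -> bool:
    """Check record order and that every parent digest matches."""
    if [r["kind"] for r in records] != EXPECTED_ORDER:
        return False
    if records[0]["parent"] is not None:
        return False
    return all(cur["parent"] == digest(prev)
               for prev, cur in zip(records, records[1:]))
```

Changing any earlier record changes its digest, so every later link stops matching and the verifier rejects the chain.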

Nothing proprietary lands in history: every stage is plain `git` plus signed,
encrypted files. A reviewer can read each governance change in the diff.

## Commands

| Group | Commands |
|---|---|
| **Identity** | `rho id init · show · export · import · list · verify-github` |
| **Repo & collaboration** | `rho repo create · join · admit-pr · merge-pr · sync · doctor · protect-path · install-filters · create-pr` |
| **Data (twins)** | `rho dataset --name … · set --public · bind · list · remove` · `rho publish` |
| **Requests & runs** | `rho request submit-run · pending · review` · `rho run approve-pr · proposal-action · grant-action · controlled-action · status` |
| **Results** | `rho result release · release-pr · verify` |
| **Crypto** | `rho crypto sign · verify · view` |
| **Repo plumbing** | `rho status · commit · gh · env · version` |

Every command takes `--profile <identity>` for multi-identity work and aims to
infer the rest (root, `--from`, SSH key, `gh` account) from context. Run
`rho --help` for the grouped list, or `rho repo doctor` to validate a checkout.

## Desktop app

rho also ships as a [Tauri desktop app](desktop) (macOS / Linux / Windows) that
drives the same `rho_core` engine directly — no CLI shell-out. It collapses the
flow above to **Add** and **Create**: roles are auto-detected from repo state and
outsiders auto-join, so the cryptography and git choreography stay out of the way.

## Status

rho is early and built in the open. What works end-to-end today (covered by
scenario tests in `tests/e2e/`):

- ✅ GitHub-backed identities with Nostr controller keys, signed and exportable
- ✅ Join / admit / merge collaboration over real PRs
- ✅ Recipient-encrypted inbox paths and transparent repo-key paths via git filters
- ✅ Signed governance, approval grants, run receipts, and a verifiable result chain
- ✅ Twin datasets (real/mock), incl. Hugging Face public sources
- ✅ Sandboxed real-data runs through Gondolin with host-side network/FS policy
- ✅ Tamper-rejection across envelopes, signatures, and code digests

The roadmap — pluggable storage/transport/identity providers, a Nostr relay for
discovery and chat, and first-class `join`/`admit` ergonomics — lives in
[TODO.md](TODO.md). Design notes are in [docs/](docs/architecture/overview.md),
and the identity model is written up in [identity.md](identity.md).

## Development

```sh
./rho <command>     # run the debug build via the dev shim, e.g. ./rho status
./test.sh           # unit + scenario tests
bash tests/e2e/local-git-pi-sandbox-encrypted.sh                 # cached e2e
RHO_LOCAL_GIT_PI_LIVE=1 bash tests/e2e/local-git-pi-sandbox-encrypted.sh   # live Pi + Gondolin
```

Agent contributors: see [AGENTS.md](AGENTS.md).

## License

[Apache-2.0](./LICENSE)