rho-cli 0.1.29

Rho CLI tools for encrypted agent collaboration, dataset publishing, controlled runs, and result release workflows

rho

A partial solution to the lethal trifecta for agents.

A secure, decentralized network for agentic data science: collaborate end-to-end-encrypted over git repos (and other storage) with anyone, then let agents run tool calls against private data — without the agent ever touching the data, the keys, or the network it could leak through.


rho = Pi + Gondolin + Nostr + Git

Modern agents are dangerous in exactly the situation data science needs them: pointed at private data, fed untrusted content (a collaborator's code, a dataset description, a web page), and handed the ability to communicate outward. Simon Willison calls that combination the lethal trifecta — any one of the three is fine; together they let an attacker turn "summarize this" into "exfiltrate everything."

rho doesn't pretend prompt injection is solved. Instead it breaks the trifecta apart and wires four proven pieces together so the agent never holds all three capabilities at once:

| Piece | Role in rho |
| --- | --- |
| 🤖 Pi | The agent harness. Plans, reads allowed files, writes code, and proposes tool calls, but is treated as untrusted and can never self-approve a protected action. |
| Gondolin | Local Linux micro-VM sandbox. Protected code runs here with host-side, default-deny network and read-only data mounts: JavaScript-programmable policy, not the honor system. |
| 🔑 Nostr | Decentralized identity + messaging. Every account has a permanent id/rho/… and a Nostr controller key; encrypted messages and signed records replicate across relays, no central server required. |
| 🌳 Git | The collaboration substrate. Repos are the workspace, files are the protocol, and everything sensitive is encrypted in the git objects while staying readable in your working tree. |

The result: collaborators share work over ordinary git repos, the agent does the thinking on mock data, and the one moment real data is touched happens in a sandbox, behind a deterministic host check, after an explicit human grant.

How it works

rho's central bet is a hard split between planning and protected execution:

flowchart LR
    subgraph U[Collaborator · untrusted]
      A[Pi agent] -->|writes code +<br/>signed request| R[encrypted request<br/>in git PR]
    end
    R --> H
    subgraph O[Owner · host-controlled]
      H{deterministic<br/>validation} -->|sig ✓ · digest ✓<br/>· policy ✓| G[⧉ Gondolin<br/>micro-VM]
      H -.->|reject| X[no run]
      G -->|real data<br/>read-only| OUT[result]
    end
    OUT -->|encrypted to<br/>requester| REL[release in git]

  • Agents propose, they don't decide. A collaborator's agent may write code and emit a signed request — it cannot grant itself access, run against real data, or release an output. Those are deterministic host actions plus optional human approval.
  • Protected tools only ever run in one place. rho run is the single trust boundary: it validates the request signature and code digest, checks policy, then executes inside Gondolin with a fixed environment, synthetic DNS, and a default-deny network. The agent that wrote the code is nowhere near it.
  • Files are the protocol. Identities, permissions, requests, approval grants, run receipts, and results are all inspectable, versioned files. Sensitive ones are encrypted with age-based recipient envelopes via git clean/smudge filters — the ciphertext lives in git, the plaintext only in authorized working trees.
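
The deterministic host check described above can be illustrated with a minimal Python sketch. It is not rho's implementation: the `RunRequest` fields, the `validate` function, and the grant tuple layout are hypothetical, and signature verification is left out because it would need a cryptography library. The sketch covers only the digest and policy steps, recomputing the code digest and looking up an explicit grant before anything runs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRequest:
    """Hypothetical request record; rho's real file format may differ."""
    requester: str
    tool: str
    dataset: str
    code_sha256: str  # digest the requester committed to in the request


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def validate(request: RunRequest, code: bytes,
             grants: set[tuple[str, str, str]]) -> list[str]:
    """Return rejection reasons; an empty list means the run may proceed."""
    reasons = []
    # Digest check: the code about to run must be the code that was requested.
    if sha256_hex(code) != request.code_sha256:
        reasons.append("code digest mismatch")
    # Policy check: default-deny, so only an explicit grant allows the run.
    if (request.requester, request.tool, request.dataset) not in grants:
        reasons.append("no grant for requester/tool/dataset")
    return reasons


code = b"print('hello')\n"
request = RunRequest("bob", "run_real", "prices", sha256_hex(code))
grants = {("bob", "run_real", "prices")}

print(validate(request, code, grants))                    # []
print(validate(request, code + b"# tampered\n", grants))  # ['code digest mismatch']
```

Because the check is a pure function of the request, the code bytes, and the grant set, the same inputs always produce the same decision, with no model in the loop.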

Twins: mock data for thinking, real data for answers

Every dataset is a twin — a private real side that never leaves the owner's machine, and a mock side that's committed to the repo. Collaborators (and their agents) develop and test against the mock; the real side is mounted read-only, inside the sandbox, only after the owner grants the exact action and input hashes. Mock generation preserves shape and semantics while minimizing leakage from the source.

datasets/prices/
  dataset.yaml          # name, uuid, schema — committed
  mock/prices-mock.csv  # shareable twin — committed
~/rho/alice/.../private/prices/real/prices-real.csv   # never committed
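
To make the twin idea concrete, here is a rough Python sketch of one way a mock side could be derived from a real CSV. It is an assumption for illustration only and does not describe rho's mock generator: `make_mock` and `is_number` are hypothetical helpers, column types are inferred from the first data row alone, and numeric values are drawn uniformly at random, so the mock keeps the header and column types but none of the real values or their distribution.

```python
import csv
import io
import random


def is_number(text: str) -> bool:
    """True if the cell parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def make_mock(real_csv: str, rows: int, seed: int = 0) -> str:
    """Build a mock CSV with the real header and column types but synthetic values."""
    reader = csv.reader(io.StringIO(real_csv))
    header = next(reader)
    sample = next(reader)  # used only to infer each column's type

    rng = random.Random(seed)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for i in range(rows):
        row = [
            f"{rng.uniform(0, 100):.2f}" if is_number(value) else f"{name}-{i}"
            for name, value in zip(header, sample)
        ]
        writer.writerow(row)
    return out.getvalue()


# Illustrative input; the column names and values are made up.
real = "ticker,price\nACME,12.50\nGLOBEX,99.10\n"
print(make_mock(real, rows=3))
```

A generator this simple trades realism for low leakage: code written against it will see the right columns and types, but any statistic it computes on the mock is meaningless.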

Quickstart

CLI — straight from crates.io, or from source:

# from crates.io
cargo install rho-cli --bin rho

# or from source (Rust toolchain required)
git clone git@github.com:madhavajay/rho.git
cd rho
./install.sh          # cargo install --path . --bin rho

rho --version

Desktop app: prebuilt binaries for macOS / Linux / Windows are coming soon on the Releases page.

Create an identity backed by your GitHub handle and a freshly generated Nostr controller key:

rho id init --github alice --generate-ssh-key --display-name "Alice"

End-to-end: from empty repo to released result

The whole collaboration lives in one git repo, mostly in one PR. --profile selects which identity acts (here alice owns the data, bob collaborates).

Stage 1 — Owner creates the project

A single command runs git init, sets up governance, creates the GitHub repo, and makes the initial push, all signed.

rho --profile alice repo create alice/genomics --public --yes

Stage 2 — Collaborator joins, owner admits

Bob opens a join PR carrying his public identity; alice admits him on the same PR and merges.

rho --profile bob   repo join alice/genomics --pr
rho --profile alice repo admit-pr 1 --pr
rho --profile alice repo merge-pr 1 --merge --delete-branch

Stage 3 — Owner publishes a twin dataset

The private real side stays local; the mock side is committed for everyone to develop against.

rho --profile alice dataset --name prices \
  --real data/private/prices-real.csv \
  --mock data/mock/prices-mock.csv
rho --profile alice publish alice <uuid>

Stage 4 — Collaborator requests a run

Bob (or his agent) writes code against the mock, then submits a signed request for a real-data run — opening a PR. He cannot run it himself.

rho --profile bob request submit-run req-001 alice/genomics \
  --to alice --tool run_real --dataset prices \
  --code workspace/sum_prices.py \
  --command "python3 sum_prices.py DATASET_CSV" --tier real --pr
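
For illustration, a script like `workspace/sum_prices.py` could look like the sketch below. The source does not show the file, so everything here is assumed: the `price` column name, the `sum_column` helper, and the idea that the `DATASET_CSV` token in `--command` is replaced by a file path that reaches the script as its first argument.

```python
import csv
import sys
from decimal import Decimal
from typing import Iterable


def sum_column(lines: Iterable[str], column: str = "price") -> Decimal:
    """Sum one numeric column of CSV text whose first line is a header."""
    total = Decimal("0")
    for row in csv.DictReader(lines):
        # Decimal avoids binary floating-point rounding on currency-like values.
        total += Decimal(row[column])
    return total


# Runs only when a CSV path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], newline="") as handle:
        print(sum_column(handle))
```

Keeping the logic in a function that accepts any iterable of lines makes it easy to exercise against the mock twin before requesting a real-data run.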

Stage 5 — Owner approves and runs in the sandbox

Alice's host validates the signature and code digest, then executes inside Gondolin against the real data. Results are pushed back to the same PR.

rho --profile alice run approve-pr 3 --runner gondolin --pr

Stage 6 — Owner releases, collaborator verifies

The result is encrypted to bob and released; bob verifies the full chain from request → grant → receipt → result.

rho --profile alice result release req-001 --to bob --pr
rho --profile bob   result verify req-001
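
One way to picture a verifiable request → grant → receipt → result chain is as records that each carry the digest of their predecessor. The Python sketch below is a generic hash-chain illustration, not rho's on-disk format or verification logic: the record fields, `link`, and `verify_chain` are hypothetical, and the signature checks a real verifier would perform are omitted.

```python
import hashlib
import json
from typing import Optional


def digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def link(kind: str, body: dict, parent: Optional[dict]) -> dict:
    """Create a record that commits to its parent by digest."""
    return {
        "kind": kind,
        "body": body,
        "parent": digest(parent) if parent is not None else None,
    }


def verify_chain(records: list[dict]) -> bool:
    """Check record order and that every record commits to the one before it."""
    if [r["kind"] for r in records] != ["request", "grant", "receipt", "result"]:
        return False
    if records[0]["parent"] is not None:
        return False
    return all(child["parent"] == digest(parent)
               for parent, child in zip(records, records[1:]))


# Illustrative record bodies; the field names are made up.
request = link("request", {"id": "req-001", "tool": "run_real"}, None)
grant = link("grant", {"approved_by": "alice"}, request)
receipt = link("receipt", {"runner": "gondolin"}, grant)
result = link("result", {"total": "3.75"}, receipt)
chain = [request, grant, receipt, result]

print(verify_chain(chain))  # True
grant["body"]["approved_by"] = "mallory"  # tamper with a middle record
print(verify_chain(chain))  # False
```

Changing any earlier record changes its digest, so the next record's `parent` no longer matches and verification fails.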

Nothing proprietary lands in history: every stage is plain git plus signed, encrypted files. A reviewer can read each governance change in the diff.

Commands

| Group | Commands |
| --- | --- |
| Identity | rho id init · show · export · import · list · verify-github |
| Repo & collaboration | rho repo create · join · admit-pr · merge-pr · sync · doctor · protect-path · install-filters · create-pr |
| Data (twins) | rho dataset --name … · set --public · bind · list · remove · rho publish |
| Requests & runs | rho request submit-run · pending · review · rho run approve-pr · proposal-action · grant-action · controlled-action · status |
| Results | rho result release · release-pr · verify |
| Crypto | rho crypto sign · verify · view |
| Repo plumbing | rho status · commit · gh · env · version |

Every command takes --profile <identity> for multi-identity work and aims to infer the rest (root, --from, SSH key, gh account) from context. Run rho --help for the grouped list, or rho repo doctor to validate a checkout.

Desktop app

rho also ships as a Tauri desktop app (macOS / Linux / Windows) that drives the same rho_core engine directly — no CLI shell-out. It collapses the flow above to Add and Create: roles are auto-detected from repo state and outsiders auto-join, so the cryptography and git choreography stay out of the way.

Status

rho is early and built in the open. What works end-to-end today (covered by scenario tests in tests/e2e/):

  • ✅ GitHub-backed identities with Nostr controller keys, signed and exportable
  • ✅ Join / admit / merge collaboration over real PRs
  • ✅ Recipient-encrypted inbox paths and transparent repo-key paths via git filters
  • ✅ Signed governance, approval grants, run receipts, and a verifiable result chain
  • ✅ Twin datasets (real/mock), incl. Hugging Face public sources
  • ✅ Sandboxed real-data runs through Gondolin with host-side network/FS policy
  • ✅ Tamper-rejection across envelopes, signatures, and code digests

The roadmap — pluggable storage/transport/identity providers, a Nostr relay for discovery and chat, and first-class join/admit ergonomics — lives in TODO.md. Design notes are in docs/, and the identity model is written up in identity.md.

Development

./rho <command>     # run the debug build via the dev shim, e.g. ./rho status
./test.sh           # unit + scenario tests
bash tests/e2e/local-git-pi-sandbox-encrypted.sh                 # cached e2e
RHO_LOCAL_GIT_PI_LIVE=1 bash tests/e2e/local-git-pi-sandbox-encrypted.sh   # live Pi + Gondolin

Agent contributors: see AGENTS.md.

License

Apache-2.0