rho-cli 0.1.22

Rho CLI tools for encrypted agent collaboration, dataset publishing, controlled runs, and result release workflows
# Rho TODO

This file is the single tracker for Rho improvement work. It consolidates the
completed v1 work, the remaining unimplemented or deferred work, and the review
notes that were previously split across `improvements.md` and
`improvements2.md`.

## Done In V1

### Workflow Commands

- [x] Add `rho repo join` for participant export, membership update, branch
  creation, and next-step guidance.
- [x] Add `rho repo admit` for participant admission, encrypted inbox policy,
  filter installation, doctor checks, branch creation, and next-step guidance.
- [x] Add first-class request review, approve-request, request submission,
  message send, result release, result verification, and flow doctor workflows.
- [x] Add `rho repo create-pr` and keep GitHub-specific PR creation in the
  provider layer.

### Governance And Provenance

- [x] Sign governance files with the owner key and verify governance by default
  in `rho repo doctor`.
- [x] Commit signed approval grants.
- [x] Commit signed run receipts.
- [x] Verify the collaborator-side result chain from signed request through
  approval grant, run receipt, result signature, and encrypted result payload.
- [x] Commit real-input digest evidence without leaking local private input
  paths.
- [x] Enforce key revocations in repo-aware verification paths.

### Protected Paths And Policy

- [x] Represent recipient-encrypted paths in `rho/policy/permissions.yaml`.
- [x] Represent repo-key transparent paths in `rho/policy/permissions.yaml`
  with `read.repo_key: true`.
- [x] Generate managed `.gitattributes` filter entries from permissions policy.
- [x] Share segment-aware protected-path glob matching across policy, Git
  filters, doctor, and audit checks.
- [x] Expand protected-path and explicit-bundle recipients to include sender
  and admins where required.
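
The shared matcher above is described only as segment-aware. As a rough illustration of what that can mean, here is a minimal sketch assuming `*` stays inside one `/`-separated segment and `**` spans zero or more whole segments. Those semantics and the helper names `match_segments` and `match_within_segment` are assumptions for illustration, not necessarily what Rho implements.

```rust
/// Returns true when `path` matches `pattern`, comparing one `/`-separated
/// segment at a time (assumed semantics, see the note above).
fn path_matches_pattern(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = path.split('/').collect();
    match_segments(&pat, &segs)
}

/// Recursive matcher over whole segments.
fn match_segments(pat: &[&str], segs: &[&str]) -> bool {
    match pat.split_first() {
        // Pattern exhausted: match only if the path is exhausted too.
        None => segs.is_empty(),
        // `**` may absorb zero or more whole segments.
        Some((first, rest)) if *first == "**" => {
            (0..=segs.len()).any(|skip| match_segments(rest, &segs[skip..]))
        }
        // Any other pattern segment must match exactly one path segment.
        Some((first, rest)) => match segs.split_first() {
            Some((seg, seg_rest)) => {
                match_within_segment(first.as_bytes(), seg.as_bytes())
                    && match_segments(rest, seg_rest)
            }
            None => false,
        },
    }
}

/// Matches inside a single segment; `*` matches any run of bytes, and can
/// never cross a `/` because the input was already split on it.
fn match_within_segment(pat: &[u8], seg: &[u8]) -> bool {
    match pat.split_first() {
        None => seg.is_empty(),
        Some((c, rest)) if *c == b'*' => {
            (0..=seg.len()).any(|skip| match_within_segment(rest, &seg[skip..]))
        }
        Some((c, rest)) => match seg.split_first() {
            Some((s, seg_rest)) => c == s && match_within_segment(rest, seg_rest),
            None => false,
        },
    }
}
```

With these assumed semantics, `data/*/secret.yaml` matches `data/a/secret.yaml` but not `data/a/b/secret.yaml`, which a trailing-`*` prefix check cannot express. The backtracking is exponential in the worst case, so a production matcher would likely bound or memoize it.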

### Crypto And Tamper Checks

- [x] Bind recipient envelopes to repo, path, purpose, and request metadata.
- [x] Add negative coverage for transparent repo-key envelope ciphertext
  mutation.
- [x] Add negative coverage for recipient-envelope ciphertext mutation,
  non-recipient open, and context mismatch.
- [x] Make Git-filter smudge reject tampered ciphertext for active recipients.
- [x] Make request review, message verification, and result verification reject
  tampered protected envelopes before continuing.

### Payloads, Manifests, And Local Safety

- [x] Store file-backed result payloads as sidecar attachments instead of large
  inline YAML values.
- [x] Allow public-only and mock-only dataset manifests so a request can target
  a public/mock interface before a private real twin exists.
- [x] Add `rho dataset set <name> --public <source>` for normal dataset editing
  of external public variants, including Hugging Face dataset sources.
- [x] Add profile-global `rho dataset bind`, `rho dataset list`, and
  `rho dataset remove` commands for reusable local private dataset bindings.
- [x] Use name-based dataset bundle paths by default while keeping UUIDs inside
  manifests.
- [x] Harden encrypted tar extraction against hostile paths and links.
- [x] Remove the non-cryptographic digest fallback.
- [x] Remove shared primitive shell-outs for `date`, `uuidgen`, `shasum`, and
  `file`.
- [x] Replace remaining shared-root line-based YAML helpers with typed
  manifests for request and approval paths.
- [x] Parse encrypted-envelope detection from the top-level YAML `kind`.
- [x] Parse repo-key fallback naming from `rho/repo.yaml` as a typed manifest.
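
For the tar-hardening item above, the core of a fail-closed extractor is refusing any entry whose name or type could write outside the destination. The sketch below shows one way to do that check with the standard library only. The names `safe_join`, `checked_target`, and `EntryKind` are hypothetical and are not taken from the Rho source.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical entry types; a real tar reader exposes its own equivalent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EntryKind {
    File,
    Directory,
    Symlink,
    Hardlink,
}

/// Joins an archive entry name onto `dest`, returning `None` when the name
/// is empty, absolute, or contains `..`.
fn safe_join(dest: &Path, entry: &str) -> Option<PathBuf> {
    let mut out = dest.to_path_buf();
    let mut pushed = false;
    for component in Path::new(entry).components() {
        match component {
            Component::Normal(part) => {
                out.push(part);
                pushed = true;
            }
            // A leading `./` is harmless and simply skipped.
            Component::CurDir => {}
            // Fail closed on anything that could leave `dest`.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if pushed { Some(out) } else { None }
}

/// Rejects link entries outright, then applies the path check.
fn checked_target(dest: &Path, name: &str, kind: EntryKind) -> Option<PathBuf> {
    if !matches!(kind, EntryKind::File | EntryKind::Directory) {
        return None;
    }
    safe_join(dest, name)
}
```

This check is purely lexical. It does not notice a symlink that already exists inside the destination directory, so a real extractor would normally unpack into a fresh directory as well.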

### Repository Layout And Migration

- [x] Use deterministic GitHub inbox paths under
  `rho/messages/inbox/id/github/<handle>/...`.
- [x] Move non-migration e2e fixtures and examples to canonical inbox paths.
- [x] Keep legacy handle-only inbox paths only in migration tests and migration
  documentation.
- [x] Add `rho repo migrate-inbox-paths`.
- [x] Add doctor migration advisories for legacy inbox paths.
- [x] Remove the unused `transcrypt` submodule.

### Tests

- [x] Cover generated `.gitattributes` patterns against the shared matcher.
- [x] Cover transparent repo-key crypto, including `repo doctor` validation.
- [x] Cover recipient auto-encryption.
- [x] Cover encryption audit.
- [x] Cover protected-path refresh.
- [x] Cover key revocation.
- [x] Cover join/admit.
- [x] Cover result release and verification.
- [x] Cover local Git encrypted collaboration.
- [x] Cover local Git Pi sandbox encrypted collaboration.
- [x] Cover the PR fixture flow.

## Unimplemented / Deferred

These items are consolidated from `improvements.md` and `improvements2.md`.
They are intentionally unchecked, except for the final doc-consolidation item,
which is done.

- [ ] Migrate the encryption format to `age` while preserving application-level
  context binding for repo id, request id, recipient id, path, and purpose.
  This should replace the hand-rolled X25519 -> HKDF-SHA256 ->
  ChaCha20Poly1305 recipient envelope and the long-lived repo-key transparent
  envelope with `age` (`age-encryption.org/v1`). Carry the existing tamper,
  non-recipient, and context-mismatch negative tests across the swap.
- [ ] Add participant self-signatures when a concrete multi-provider identity
  flow needs participant-originated attestations in addition to owner governance
  signatures.
  When added, `rho repo doctor --governance` should verify both the owner
  signature and the participant self-signature on each bundle.
- [ ] Add provider modules beyond GitHub instead of extending workflow commands
  with provider-specific logic.
- [ ] Fix recipient-envelope clean-filter churn so re-adding a refreshed
  plaintext file does not appear modified solely because it resealed with fresh
  randomness.
- [ ] Decide whether deterministic resealing, a filter protocol change, or a
  higher-level refresh workflow is the right fix for clean-filter churn.
  If deterministic resealing is considered, verify it does not weaken AEAD
  nonce-uniqueness guarantees; the `age` migration may change the right answer.
- [ ] Add richer tool-specific schemas for inputs, outputs, and runner
  capabilities beyond simple argv templates.
- [ ] Add a migration command for older duplicated role/admin state, with
  `--dry-run`.
- [ ] Make `rho repo doctor` emit non-fatal migration advisories for legacy
  duplicated role/admin state.
- [ ] Migrate older live rehearsal repos that predate committed approval grants,
  run receipts, and result provenance artifacts.
- [ ] Continue normal typed-manifest cleanup outside the shared-root trust paths
  already converted.
- [ ] Add `rho dataset fetch <name> --variant <public|mock|real>` to materialize
  external dataset sources into `.rho/external/datasets/...` on demand.
- [ ] Add dataset interface/schema checks so a private `real` twin can be
  verified against the public or mock variant it claims to match.
- [ ] Add per-project dataset binding overrides if profile-global bindings are
  not specific enough for a future workflow.
- [ ] Add more external dataset source kinds beyond the first Hugging Face
  manifest support, including plain Git/HTTP object sources.
- [ ] Review remaining intentional process calls and either document or replace
  them: `curl`, `chmod`, and `python3`.
- [ ] Replace `chmod +x` process execution with `std::fs::set_permissions`.
- [ ] Replace the GitHub profile-proof `curl` call with an HTTP crate or
  document `curl` as an intentional integration point.
- [ ] Replace production-path `unwrap()` calls with `RhoResult` errors where
  malformed input can currently panic.
- [ ] Add a Rust source-size lint, modeled as a custom integration test wired
  into `lint.sh`, that fails production Rust source files over 500 lines so
  large modules are refactored before they keep growing.
- [x] Consolidate the split improvement docs into this single `TODO.md`.
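
The source-size lint item above could start from a helper like the following sketch, which walks a directory and reports `.rs` files over a line limit. The function name `oversized_sources` and the `MAX_LINES` constant are illustrative assumptions. Only the 500-line threshold and the idea of a custom integration test come from the item itself.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Line limit taken from the TODO item; the constant name is hypothetical.
const MAX_LINES: usize = 500;

/// Recursively collects `.rs` files under `dir` that have more than
/// `max_lines` lines, returning each offending path with its line count.
fn oversized_sources(dir: &Path, max_lines: usize) -> io::Result<Vec<(PathBuf, usize)>> {
    let mut offenders = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            offenders.extend(oversized_sources(&path, max_lines)?);
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("rs") {
            let lines = fs::read_to_string(&path)?.lines().count();
            if lines > max_lines {
                offenders.push((path, lines));
            }
        }
    }
    // Sort so that failure output is stable across runs.
    offenders.sort();
    Ok(offenders)
}
```

An integration test would then call `oversized_sources` on the production source directory with `MAX_LINES` and assert that the returned list is empty, printing the offenders on failure. Note that `is_dir` follows symlinks, so a tree containing symlink cycles would need an extra guard.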

## Review Notes

These notes came from the follow-up review that was previously in
`improvements2.md`.

### Verified Claims

The v1 completion claim is defensible as worded: the scope is self-defined and
the implemented items are real, not aspirational.

| Claim | Reality in code |
|---|---|
| CR1: `file_digest` is in-process SHA-256, no `fallback-` path | `lib.rs` streams via `sha2`; tests assert no fallback prefix |
| CR2: `date`/`uuidgen`/`file`/`shasum` shell-outs removed | `now_rfc3339`, `uuid_like`, and `mime_type` are in-process; `shasum` is replaced by the in-process digest from CR1 |
| CR4: `path_matches_pattern` is segment-aware glob | `lib.rs` has a recursive matcher, not the old trailing-`*` hack |
| Context binding is folded into the HKDF `info` input | stripping context breaks decryption, not just signature verification |
| e2e coverage exists for the listed flows | the `rho-*.sh` e2e scripts cover the claimed features |
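
To make the HKDF `info` row concrete: if the context is serialized into the `info` input, any change to the context changes the derived key, so decryption fails rather than only a signature check. The sketch below shows just the serialization step, using length-prefixed fields so that two different contexts can never encode to the same bytes. The struct, the field order, and the `rho-envelope-context-v1` label are assumptions for illustration. The actual encoding used by Rho is not described here.

```rust
/// Hypothetical context fields, named after the binding listed in this file
/// (repo id, request id, recipient id, path, purpose).
struct EnvelopeContext<'a> {
    repo_id: &'a str,
    request_id: &'a str,
    recipient_id: &'a str,
    path: &'a str,
    purpose: &'a str,
}

/// Serializes the context into bytes suitable for an HKDF `info` input.
/// Each field is prefixed with its length as a big-endian u32, so field
/// boundaries are unambiguous.
fn context_info(ctx: &EnvelopeContext) -> Vec<u8> {
    // Assumed domain-separation label; not taken from the Rho source.
    let mut info = b"rho-envelope-context-v1".to_vec();
    for field in [ctx.repo_id, ctx.request_id, ctx.recipient_id, ctx.path, ctx.purpose] {
        info.extend_from_slice(&(field.len() as u32).to_be_bytes());
        info.extend_from_slice(field.as_bytes());
    }
    info
}
```

The length prefixes matter: with plain concatenation, repo `ab` plus request `c` would encode the same as repo `a` plus request `bc`. The resulting bytes would be passed as `info` to the HKDF expand step, which is omitted here because it needs an external crate.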

### Pushback To Keep Visible

- The phrase "100% complete / no blockers" is only accurate for the chosen v1
  scope. The unchecked items above remain real post-v1 work.
- Recipient-envelope clean-filter churn is a UX/correctness gap that may deserve
  promotion out of post-v1.
- Context binding is sound where high-level commands pass the expected context,
  but a future command that forgets to pass it would fall back to context-free
  verification, which remains a risk.
- Revocation checks are repo-aware; future verification paths should avoid
  silently skipping revocation when `--repo-root` is available.
- Remaining process calls should either be deliberate integration points or
  removed. Current review targets are `curl`, `chmod`, and `python3`.
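
Of the three process calls, `chmod` is the simplest to replace in-process. A minimal Unix-only sketch using `std::fs::set_permissions` follows. The function name `make_executable` is hypothetical.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Adds the execute bit for user, group, and other without spawning `chmod`.
/// Unlike a bare `chmod +x`, this does not consult the process umask.
fn make_executable(path: &Path) -> io::Result<()> {
    let mut perms = fs::metadata(path)?.permissions();
    perms.set_mode(perms.mode() | 0o111);
    fs::set_permissions(path, perms)
}
```

Because `PermissionsExt` only exists on Unix targets, a cross-platform build would need to gate this behind `#[cfg(unix)]`.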

### Solid Architecture To Preserve

- Approval authority is bound to an owner-signed in-repo grant; merged PRs are
  transport, not authority.
- The auditable provenance chain is:
  `request -> code digest -> runner -> input commitment -> output digest -> result`.
- Tar extraction is fail-closed with hostile fixture coverage.
- Context binding is part of the signed payload and key derivation where it is
  applied.