# Rho TODO
This file is the single tracker for Rho improvement work. It consolidates the
completed v1 work, the remaining unimplemented or deferred work, and the review
notes that were previously split across `improvements.md` and
`improvements2.md`.
## Done In V1
### Workflow Commands
- [x] Add `rho repo join` for participant export, membership update, branch
creation, and next-step guidance.
- [x] Add `rho repo admit` for participant admission, encrypted inbox policy,
filter installation, doctor checks, branch creation, and next-step guidance.
- [x] Add first-class request review, approve-request, request submission,
message send, result release, result verification, and flow doctor workflows.
- [x] Add `rho repo create-pr` and keep GitHub-specific PR creation in the
provider layer.
### Governance And Provenance
- [x] Sign governance files with the owner key and verify governance by default
in `rho repo doctor`.
- [x] Commit signed approval grants.
- [x] Commit signed run receipts.
- [x] Verify the collaborator-side result chain from signed request through
approval grant, run receipt, result signature, and encrypted result payload.
- [x] Commit real-input digest evidence without leaking local private input
paths.
- [x] Enforce key revocations in repo-aware verification paths.
### Protected Paths And Policy
- [x] Represent recipient-encrypted paths in `rho/policy/permissions.yaml`.
- [x] Represent repo-key transparent paths in `rho/policy/permissions.yaml`
with `read.repo_key: true`.
- [x] Generate managed `.gitattributes` filter entries from permissions policy.
- [x] Share segment-aware protected-path glob matching across policy, Git
filters, doctor, and audit checks.
- [x] Expand protected-path and explicit-bundle recipients to include sender
and admins where required.
### Crypto And Tamper Checks
- [x] Bind recipient envelopes to repo, path, purpose, and request metadata.
- [x] Add negative coverage for transparent repo-key envelope ciphertext
mutation.
- [x] Add negative coverage for recipient-envelope ciphertext mutation,
non-recipient open, and context mismatch.
- [x] Make Git-filter smudge reject tampered ciphertext for active recipients.
- [x] Make request review, message verification, and result verification reject
tampered protected envelopes before continuing.
### Payloads, Manifests, And Local Safety
- [x] Store file-backed result payloads as sidecar attachments instead of large
inline YAML values.
- [x] Allow public-only and mock-only dataset manifests so a request can target
a public/mock interface before a private real twin exists.
- [x] Add `rho dataset set <name> --public <source>` for normal dataset editing
of external public variants, including Hugging Face dataset sources.
- [x] Add profile-global `rho dataset bind`, `rho dataset list`, and
`rho dataset remove` commands for reusable local private dataset bindings.
- [x] Use name-based dataset bundle paths by default while keeping UUIDs inside
manifests.
- [x] Harden encrypted tar extraction against hostile paths and links.
- [x] Remove the non-cryptographic digest fallback.
- [x] Remove shared primitive shell-outs for `date`, `uuidgen`, `shasum`, and
`file`.
- [x] Replace remaining shared-root line-based YAML helpers with typed
manifests for request and approval paths.
- [x] Parse encrypted-envelope detection from the top-level YAML `kind`.
- [x] Parse repo-key fallback naming from `rho/repo.yaml` as a typed manifest.
- [x] Migrate newly written recipient and transparent repo-key envelopes to
`age` while preserving recipient-envelope context binding by encrypting a
checked inner context alongside the payload, and keeping legacy envelopes
readable.
- [x] Add participant self-signatures to exported identity bundles and verify
them in `rho repo doctor`.
- [x] Fix recipient-envelope clean-filter churn by reusing a decryptable staged
index envelope when its plaintext and context already match the worktree file.
- [x] Decide the clean-filter churn approach: use index-envelope reuse as the
safe default, not deterministic resealing, because deterministic resealing
would need more careful nonce/key analysis and age does not expose an AAD or
deterministic sealing mode for this use case.
### Repository Layout And Migration
- [x] Use deterministic GitHub inbox paths under
`rho/messages/inbox/id/github/<handle>/...`.
- [x] Move non-migration e2e fixtures and examples to canonical inbox paths.
- [x] Keep legacy handle-only inbox paths only in migration tests and migration
documentation.
- [x] Add `rho repo migrate-inbox-paths`.
- [x] Add doctor migration advisories for legacy inbox paths.
- [x] Remove the unused `transcrypt` submodule.
### Tests
- [x] Cover generated `.gitattributes` patterns against the shared matcher.
- [x] Cover transparent repo-key crypto, including `repo doctor` validation.
- [x] Cover recipient auto-encryption.
- [x] Cover encryption audit.
- [x] Cover protected-path refresh.
- [x] Cover key revocation.
- [x] Cover join/admit.
- [x] Cover result release and verification.
- [x] Cover local Git encrypted collaboration.
- [x] Cover local Git Pi sandbox encrypted collaboration.
- [x] Cover the PR fixture flow.
## High Priority: Pluggable Rho Architecture
These items define the next architecture track. Git/GitHub terminology may stay
where the current workflow depends on it, but new core APIs should model the
generic Rho concepts first and let GitHub/Git implement them as one adapter.
### Core Direction
Rho should split what GitHub currently bundles into separate adapter layers:
1. Identity provider: proves who an actor is.
2. Storage backend: stores signed and encrypted Rho artifacts.
3. Discovery and transport: helps peers find each other and exchange updates.
GitHub currently acts as identity proof, storage, PR workflow, and notification
surface. Future Rho should make GitHub one provider among many.
Nostr should not replace Rho identities. The core identity namespace should stay
provider-agnostic, for example:
```text
id/github/madhavajay
id/email/me@madhavajay.com
id/orcid/0000-0002-...
id/domain/madhavajay.com
```
A Nostr-compatible key should act as a portable global controller key that can
claim, update, and replicate records about those Rho identities. In other words,
Nostr provides signing, discovery, realtime subscriptions, and redundant relay
publication for Rho identity/resource records; Rho still defines the identity
model, verification rules, policy, storage semantics, and encrypted artifact
handling.
The authority should remain in Rho artifacts:
- signed identity bundles
- signed governance files
- permissions policy
- approval grants
- run receipts
- result signatures
- encrypted payloads
- revocation records
Backends can store, announce, or move these artifacts. They should not decide
whether an action is authorized.
### First Testable Slice: Onboarding To Messaging
The first useful end-to-end test should be user onboarding, not a generic relay
benchmark. A new user should be able to create or connect a Nostr-compatible
controller key, publish their Rho account bindings to the relay, and send/receive
messages without extra setup.
- [ ] During onboarding, create or connect a Nostr-compatible controller key.
- [ ] Support both "create new Rho controller key" and "connect existing Nostr
signer" paths.
- [ ] Keep private key custody local/user-controlled. Do not require users to
paste an `nsec` into Rho.
- [ ] Bind provider-specific Rho identities such as `id/github/...`,
`id/email/...`, `id/orcid/...`, and `id/domain/...` to the controller key.
- [ ] Publish the account binding, resource-resolution record, relay hints, and
messaging inbox location to the Rho relay during onboarding.
- [ ] Publish an initial relay-list record so clients know where to read/write
that user's Rho records.
- [ ] Add a user directory backed by signed Rho identity/resource records so a
user can search by handle, display name, provider identity, or verified
organisation.
- [ ] Make directory results resolve to a trusted controller public key, current
relay hints, and messaging inbox metadata before chat is enabled.
- [ ] Show trust/verification state in directory results: self-trusted, Rho
verified, organisation verified, unverified claim, expired, or revoked.
- [ ] Let a user explicitly trust a resolved public key from the directory and
persist that trust decision in local/user-controlled Rho state.
- [x] Make basic messaging work immediately after onboarding using the relay as
the default live inbox.
- [x] Let the live chat flow start directly from a trusted directory result.
- [x] Add a local-dev onboarding fixture that creates two test identities,
publishes their bindings, searches the directory, trusts the resolved public
keys, subscribes to each other's live updates, and sends a message.
- [x] Keep the local relay blind to private chat plaintext by publishing only
encrypted chat envelopes in the smoke flow.
- [x] Replace the local smoke-test ECDH envelope with a production private
message suite. The local smoke now uses NIP-44 v2 payloads and verifies the
published NIP-44 test vector before sending live chat.
- [ ] Add NIP-17 gift wrapping for private chat metadata reduction once the base
NIP-44 encrypted payload path is wired into the product client.
- [x] Add real Nostr event signature creation and verification for onboarding
and messaging fixtures.
- [x] Treat this onboarding-to-message flow as the first acceptance test for the
relay, resolver, and identity binding model.
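One possible shape for the directory gate described above, as a minimal sketch. The type and method names (`TrustState`, `DirectoryResult`, `chat_enabled`, `trust_locally`) are hypothetical. The rule that only self-trusted and verified states enable chat is an assumption, not a settled policy.

```rust
/// Trust states mirror the directory item above; names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TrustState {
    SelfTrusted,
    RhoVerified,
    OrganisationVerified,
    UnverifiedClaim,
    Expired,
    Revoked,
}

/// Hypothetical resolved directory entry.
#[derive(Debug, Clone)]
struct DirectoryResult {
    rho_id: String,
    controller_pubkey: Option<String>,
    relay_hints: Vec<String>,
    inbox: Option<String>,
    trust: TrustState,
}

impl DirectoryResult {
    /// Chat needs a resolved key, at least one relay hint, inbox metadata,
    /// and a trust state that is neither unverified, expired, nor revoked.
    fn chat_enabled(&self) -> bool {
        let resolved = self.controller_pubkey.is_some()
            && !self.relay_hints.is_empty()
            && self.inbox.is_some();
        let trusted = matches!(
            self.trust,
            TrustState::SelfTrusted | TrustState::RhoVerified | TrustState::OrganisationVerified
        );
        resolved && trusted
    }

    /// Records an explicit local trust decision. Only an unverified claim can
    /// be upgraded; expired and revoked entries stay blocked. Returns whether
    /// the state changed.
    fn trust_locally(&mut self) -> bool {
        if self.trust == TrustState::UnverifiedClaim {
            self.trust = TrustState::SelfTrusted;
            true
        } else {
            false
        }
    }
}
```

Persisting the trust decision in local Rho state is out of scope for this sketch.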
### Pluggable Provider, Storage, Transfer, And Crypto Foundation
- [ ] Treat GitHub as a non-privileged adapter that happens to provide identity
proof, storage, proposal workflow, and notifications in the current v1 flow.
- [ ] Keep Rho artifacts authoritative: signed identity bundles, signed
governance files, permissions policy, approval grants, run receipts, result
signatures, encrypted payloads, and revocation records.
- [ ] Make backends store, announce, or move artifacts without deciding whether
an action is authorized.
- [ ] Add provider-neutral core records for `RhoIdentity`, `RhoSpace`, and
`SignedChangeSet`.
- [ ] Add a pluggable crypto/key-suite layer before adding more provider
features. Identities should be able to advertise multiple encryption
capabilities without command code hardcoding one algorithm.
- [ ] Treat a Nostr-compatible `secp256k1` key as the default global controller
key for users that want federated identity discovery.
- [ ] Do not require the global Nostr/controller key to sign every artifact.
Use it for identity bindings, resource-resolution records, login challenges,
key delegation, rotation, and revocation; allow separate delegated keys for
device signing, dataset signing, encryption, sessions, and recovery.
- [ ] Make `age-x25519` the default encryption suite for new protected
artifacts.
- [ ] Keep legacy `x25519-hkdf-sha256` as a read-only compatibility suite until
old envelopes no longer need to be opened.
- [ ] Add a shared key capability resolver that chooses an encryption suite from
recipient identity bundles and policy requirements.
- [ ] Keep envelope context binding suite-independent: every suite must preserve
repo, path, purpose, request/message metadata, and recipient checks before
plaintext is accepted.
- [ ] Add tests proving old envelopes still open and new envelopes choose the
preferred suite.
- [ ] Leave room for future suites such as Nostr private-message wrapping,
KMS/HSM-backed keys, local secure enclave keys, and other public-key
encryption formats.
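A minimal sketch of the shared key capability resolver, assuming recipients advertise suites as a plain list and policy supplies an allow-list. The `Suite` enum, `WRITE_PREFERENCE`, and `choose_write_suite` are hypothetical names. Only the two suite identifiers come from the items above.

```rust
/// Suites named in this section; the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Suite {
    AgeX25519,
    LegacyX25519HkdfSha256,
}

impl Suite {
    /// Legacy envelopes stay readable, but new artifacts must not use them.
    fn writable(self) -> bool {
        matches!(self, Suite::AgeX25519)
    }
}

/// Preference order for newly written envelopes, most preferred first.
const WRITE_PREFERENCE: &[Suite] = &[Suite::AgeX25519];

/// Picks the first preferred suite that is writable, allowed by policy, and
/// advertised by every recipient. Returns `None` rather than silently
/// downgrading when no suite satisfies all three conditions.
fn choose_write_suite(recipients: &[Vec<Suite>], policy_allowed: &[Suite]) -> Option<Suite> {
    if recipients.is_empty() {
        return None;
    }
    WRITE_PREFERENCE.iter().copied().find(|suite| {
        suite.writable()
            && policy_allowed.contains(suite)
            && recipients.iter().all(|caps| caps.contains(suite))
    })
}
```

Adding a future suite would then mean extending the enum and the preference list, without touching command code.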
### Provider-Neutral Model
Rho should grow toward provider-neutral core records:
```text
RhoIdentity
  id
  provider
  handle or public key
  public signing keys
  public encryption keys
  provider proofs
  revocations

RhoSpace
  id
  members
  policies
  storage locators
  discovery locators

SignedChangeSet
  base_state_digest
  author
  changed paths
  artifact digests
  signatures
  optional approval metadata

RhoIdentityBinding
  rho_id: id/github/madhavajay
  controller: nostr:npub1...
  resources
  revision
  relay hints
  signatures

RhoAttestation
  issuer
  rho_id
  controller
  claim event
  status
  method
  expiry
```
Git commits and PRs are one implementation of `SignedChangeSet`. Object stores,
Dropbox folders, local folders, and peer-to-peer transfer can use the same Rho
artifact semantics with different storage mechanics.
The implementation needs one explicit naming decision before wider API churn:
whether the canonical string form is `id/{provider}/{subject}` with optional
`rho://id/{provider}/{subject}` URI wrapping, or whether `rho://id/...` is the
canonical internal identifier. The current user-facing direction favors the
short `id/...` form.
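Whichever form becomes canonical, a parser can accept both and normalize to one. The sketch below assumes the short `id/...` form is canonical and treats `rho://` as an optional wrapper. `RhoId` and its methods are hypothetical names.

```rust
use std::fmt;

/// Hypothetical provider-neutral identifier.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RhoId {
    provider: String,
    subject: String,
}

impl RhoId {
    /// Accepts `id/{provider}/{subject}` with an optional `rho://` prefix.
    fn parse(input: &str) -> Result<RhoId, String> {
        let short = input.strip_prefix("rho://").unwrap_or(input);
        let rest = short
            .strip_prefix("id/")
            .ok_or_else(|| format!("expected `id/...`, got `{}`", input))?;
        // Split on the first `/` only, so a subject may contain further slashes.
        let (provider, subject) = rest
            .split_once('/')
            .ok_or_else(|| format!("missing subject in `{}`", input))?;
        if provider.is_empty() || subject.is_empty() {
            return Err(format!("empty provider or subject in `{}`", input));
        }
        Ok(RhoId {
            provider: provider.to_string(),
            subject: subject.to_string(),
        })
    }

    /// URI-wrapped form of the same identifier.
    fn to_uri(&self) -> String {
        format!("rho://{}", self)
    }
}

impl fmt::Display for RhoId {
    /// Renders the short form, assumed canonical in this sketch.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "id/{}/{}", self.provider, self.subject)
    }
}
```

Because both spellings parse to the same value, the naming decision only affects `Display`, not callers.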
### Atomic Change Manifests
- [ ] Add signed atomic change manifests as the portable unit of change for Git
and non-Git storage.
- [ ] Commit Git-backed change manifests under a path such as
`rho/changes/<change-id>/change.yaml` plus `change.rhosig.yaml`.
- [ ] Require every upsert file referenced by a change manifest to be present
and match its declared digest.
- [ ] Require deletes to match the previous digest when one is declared.
- [ ] Add `expected_previous` checks to prevent blind overwrites on non-Git
backends.
- [ ] Run policy checks over the changed file list before accepting a change.
- [ ] Require protected files to remain encrypted according to permissions
policy regardless of storage backend.
- [ ] Add deterministic state digests so a resulting Rho space state can be
verified outside Git history.
- [ ] Support exporting a Git commit as a Rho change set and applying a Rho
change set to a non-Git backend.
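The acceptance checks in this list could be sketched as a single validation pass. This is an illustration under stated assumptions: `FileOp` and `check_change` are hypothetical names, digests are plain strings computed elsewhere with real SHA-256, and policy and signature checks are left out.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory view of one manifest entry.
enum FileOp {
    Upsert { path: String, sha256: String },
    Delete { path: String, previous_sha256: Option<String> },
}

/// `current` maps path -> digest for the base state. `staged` maps path ->
/// digest of the bytes actually supplied with the change.
fn check_change(
    current: &HashMap<String, String>,
    staged: &HashMap<String, String>,
    files: &[FileOp],
    expected_previous: &[(String, String)],
) -> Result<(), String> {
    // Blind-overwrite guard: every declared precondition must still hold.
    for (path, digest) in expected_previous {
        if current.get(path) != Some(digest) {
            return Err(format!("expected_previous mismatch: {}", path));
        }
    }
    for op in files {
        match op {
            // Upserts must be present and match their declared digest.
            FileOp::Upsert { path, sha256 } => match staged.get(path) {
                None => return Err(format!("upsert file missing: {}", path)),
                Some(actual) if actual != sha256 => {
                    return Err(format!("digest mismatch: {}", path));
                }
                Some(_) => {}
            },
            // Deletes must match the previous digest when one is declared.
            FileOp::Delete { path, previous_sha256: Some(prev) } => {
                if current.get(path) != Some(prev) {
                    return Err(format!("delete precondition failed: {}", path));
                }
            }
            FileOp::Delete { previous_sha256: None, .. } => {}
        }
    }
    Ok(())
}
```

The function fails on the first violation, so a partially valid change is never accepted.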
Example change manifest shape:
```yaml
version: 1
kind: rho_change_set
change:
  id: rho://change/rho/chg-...
  space_id: rho://repo/github/org/project
  base_state:
    digest: sha256:...
    storage_ref: git:commit:...
  author: rho://id/github/alice
  created_at: "2026-06-16T00:00:00Z"
  message: "Submit request req-prices-total-001"
files:
  - path: rho/messages/inbox/id/github/owner/req-001/request.yaml
    op: upsert
    sha256: "..."
    bytes: 1234
    media_type: application/x-yaml
    encrypted: true
  - path: workspace/sum_prices.py
    op: upsert
    sha256: "..."
    bytes: 456
    media_type: text/x-python
    encrypted: false
  - path: old/path.yaml
    op: delete
    previous_sha256: "..."
expected_previous:
  - path: rho/membership.yaml
    sha256: "..."
signatures:
  - path: rho/changes/chg-...rhosig.yaml
```
For Git, this manifest can be committed alongside the changed files under:
```text
rho/changes/<change-id>/change.yaml
rho/changes/<change-id>/change.rhosig.yaml
```
For object stores or Dropbox-like backends, the apply sequence should be:
1. Upload all file blobs by content hash.
2. Upload the signed change manifest.
3. Atomically advance the space head from `base_state.digest` to the new state
digest using compare-and-swap or the backend's closest equivalent.
4. Readers accept the new head only after all referenced blobs are available and
verified.
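The ordering of these four steps can be illustrated with an in-memory stand-in for an object store. `MemStore`, `ApplyError`, and the method names are hypothetical, digests are caller-supplied strings, and a real backend would need a conditional-write primitive in place of the single-threaded comparison used here.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum ApplyError {
    UnknownManifest(String),
    MissingBlob(String),
    HeadMoved { expected: Option<String>, actual: Option<String> },
}

/// In-memory stand-in for an object store; not a real backend.
#[derive(Default)]
struct MemStore {
    blobs: HashMap<String, Vec<u8>>,         // content digest -> bytes
    manifests: HashMap<String, Vec<String>>, // new state digest -> referenced blob digests
    head: Option<String>,                    // current state digest
}

impl MemStore {
    /// Step 1: store a blob under a digest computed by the caller.
    fn put_blob(&mut self, digest: &str, bytes: &[u8]) {
        self.blobs.insert(digest.to_string(), bytes.to_vec());
    }

    /// Step 2: store the manifest that names the blobs of the new state.
    fn put_manifest(&mut self, new_state: &str, blob_digests: &[&str]) {
        let refs = blob_digests.iter().map(|d| d.to_string()).collect();
        self.manifests.insert(new_state.to_string(), refs);
    }

    /// Steps 3 and 4: move the head only if every referenced blob is present
    /// and the head still equals the manifest's base state (compare-and-swap).
    fn advance_head(&mut self, base: Option<&str>, new_state: &str) -> Result<(), ApplyError> {
        let refs = self
            .manifests
            .get(new_state)
            .ok_or_else(|| ApplyError::UnknownManifest(new_state.to_string()))?;
        if let Some(missing) = refs.iter().find(|d| !self.blobs.contains_key(*d)) {
            return Err(ApplyError::MissingBlob(missing.clone()));
        }
        if self.head.as_deref() != base {
            return Err(ApplyError::HeadMoved {
                expected: base.map(str::to_string),
                actual: self.head.clone(),
            });
        }
        self.head = Some(new_state.to_string());
        Ok(())
    }
}
```

A writer that loses the race sees `HeadMoved` and has to rebase its change set on the new head before retrying.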
### Storage Backends
- [ ] Define a small `StorageBackend` contract for `get`, `put`, `list`,
`head`, `propose`, and `fetch_proposal`.
- [ ] Add a Git storage backend where branches/PRs are proposals and commits
are state transitions.
- [ ] Add a `local-fs` storage backend for tests and local-only workflows.
- [ ] Add object-storage support, preferably S3/R2-style immutable blobs plus
signed head objects with compare-and-swap semantics.
- [ ] Consider Dropbox-style folder sync only after conflict semantics and state
digests are explicit.
- [ ] Treat Nostr as a pointer/announcement backend for small signed records,
not as canonical storage for large artifacts.
- [ ] Treat Iroh as peer transport for content-addressed artifact transfer, not
as policy authority.
Possible storage contract:
```rust
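// Sketch only: `Digest`, `Entry`, `StateDigest`, `ProposalId`, and
// `SignedChangeSet` are placeholders for types the core crate would define;
// `RhoResult<T>` is assumed to be the crate's generic result alias.
trait StorageBackend {
    fn get(&self, path_or_hash: &str) -> RhoResult<Vec<u8>>;
    fn put(&mut self, path: &str, bytes: &[u8], expected_previous_digest: Option<&Digest>)
        -> RhoResult<Digest>;
    fn list(&self, prefix: &str) -> RhoResult<Vec<Entry>>;
    fn head(&self, space_id: &str) -> RhoResult<StateDigest>;
    fn propose(&mut self, change_set: &SignedChangeSet) -> RhoResult<ProposalId>;
    fn fetch_proposal(&self, id: &ProposalId) -> RhoResult<SignedChangeSet>;
}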
```
Backend mapping:
- `git`: branches and PRs are proposals; commits are state transitions.
- `local-fs`: direct file writes; useful for tests and local-only workflows.
- `s3` / `r2`: immutable blobs plus signed head objects; updates use
compare-and-swap style checks.
- `dropbox`: user-friendly folder sync with revision checks; Rho signatures
remain authoritative.
- `nostr`: good for small signed announcements, identity bundles, endpoint
discovery, and proposal pointers; not good for large artifacts.
- `iroh`: good for content-addressed peer-to-peer artifact transfer.
The first non-Git backend should probably be `local-fs` or object storage. Those
will force the cleanest storage abstraction before adding consumer sync systems
like Dropbox.
### Identity Providers
- [ ] Add identity provider modules beyond GitHub instead of extending workflow
commands with provider-specific logic.
- [ ] Add provider-neutral identity and proof structs.
- [ ] Keep external identity providers as proof of control over Rho keys; do not
require them to become Rho signing keys.
- [ ] Add Nostr controller binding where an existing Rho identity such as
`id/github/madhavajay` or `id/email/me@madhavajay.com` is claimed by a Nostr
public key.
- [ ] Add verifier attestations so Rho or another trusted verifier can say it
checked both sides of a binding, for example GitHub OAuth/gist/repo proof or
email OTP plus a Nostr-signed challenge.
- [ ] Support one controller key managing multiple Rho identities, and one Rho
identity having multiple current or delegated controller keys.
- [ ] Add explicit key delegation, rotation, revocation, and recovery records
instead of permanently equating a Rho identity with one Nostr key.
- [ ] Support Nostr controller records with separate Rho encryption
capabilities and delegated keys.
- [ ] Consider `email-domain` proofs through `/.well-known/rho.json` or DNS TXT.
- [ ] Consider `email-otp` only for contact/recovery, not durable authority.
- [ ] Avoid publishing raw email identities to public relays by default. Support
private or blinded email bindings where public correlation is not intended.
- [ ] Consider `did:web` and `did:key` for a standards-aligned future path.
- [ ] Treat X/Twitter-style profile proofs as possible but high platform/API
risk.
Initial Nostr work should bind an existing Rho identity bundle to a Nostr public
key. Do not require replacing GitHub OAuth, SSH signatures, GitHub commit
signing, or existing Rho signatures immediately.
Suggested event vocabulary:
- `kind:30382`: Rho identity/resource record, addressable by
`d = id/{provider}/{subject}`.
- `kind:30383`: Rho verifier attestation, signed by Rho or another verifier.
- `kind:30384`: Rho controller delegation, rotation, or revocation.
- `kind:10002`: standard NIP-65 relay-list event for relay discovery.
Use standard indexed tags where possible, especially `d` for addressable
records and `p` for controller references. Custom tags such as `rho-id` are
useful for readability, but clients should not assume every public relay indexes
multi-character tags efficiently.
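The suggested vocabulary maps naturally onto a small enum. In this sketch the kind numbers are the suggestions listed above, not registered assignments, and `RhoEventKind` with its methods is a hypothetical name. The addressable range follows NIP-01, which treats kinds from 30000 up to but not including 40000 as addressable by kind, pubkey, and `d` tag.

```rust
/// Event kinds suggested in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RhoEventKind {
    IdentityRecord,
    VerifierAttestation,
    ControllerDelegation,
    RelayList,
}

impl RhoEventKind {
    /// Maps a raw Nostr kind number to a known Rho kind, if any.
    fn from_kind(kind: u16) -> Option<Self> {
        match kind {
            30382 => Some(Self::IdentityRecord),
            30383 => Some(Self::VerifierAttestation),
            30384 => Some(Self::ControllerDelegation),
            10002 => Some(Self::RelayList),
            _ => None,
        }
    }

    /// Raw Nostr kind number for this record type.
    fn kind(self) -> u16 {
        match self {
            Self::IdentityRecord => 30382,
            Self::VerifierAttestation => 30383,
            Self::ControllerDelegation => 30384,
            Self::RelayList => 10002,
        }
    }

    /// True for kinds that relays replace per (kind, pubkey, d tag).
    fn is_addressable(self) -> bool {
        (30000..40000).contains(&self.kind())
    }
}
```

The relay-list kind is replaceable per pubkey rather than addressable, so it needs no `d` tag.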
Example Nostr identity record:
```yaml
version: 1
identity:
  rho_id: id/github/madhavajay
  controller: nostr:npub1...
  revision: 3
resources:
  profile:
    uri: https://rho.biovault.net/id/github/madhavajay
  inbox:
    uri: wss://relay.biovault.net
  manifest:
    uri: https://rho.biovault.net/.well-known/rho/id/github/madhavajay.json
relays:
  - wss://relay.biovault.net
  - wss://relay.partner.example
keys:
  - kind: signing
    algorithm: secp256k1-schnorr
    public_key: "..."
  - kind: encryption
    algorithm: age-x25519
    public_key: "..."
```
Example verifier attestation:
```yaml
version: 1
attestation:
  rho_id: id/github/madhavajay
  controller: nostr:npub1...
  issuer: rho.biovault.net
  claim_event: "nostr:event:..."
  status: verified
  method: github-oauth
  verified_at: "2026-06-16T00:00:00Z"
  expires_at: "2027-06-16T00:00:00Z"
```
### Discovery, Notifications, And Chat
- [ ] Use Nostr relays as Rho's realtime control plane for identity bundle
publication, latest profile pointers, storage locator advertisement, peer
endpoint advertisement, and contact notifications.
- [ ] Implement a Rho directory resolver that queries multiple relays, verifies
event IDs and signatures, selects the newest valid addressable record by
`(kind, pubkey, d-tag)`, checks the Rho revision, and evaluates trusted
attestations.
- [ ] Track relay delivery state for each published Rho record and retry failed
relays without re-signing the event.
- [ ] Publish the same signed small Rho records to multiple relays for
redundancy. Do not depend on automatic relay-to-relay federation.
- [ ] Add a repair/replication worker that queries configured relays, verifies
signed records, republishes missing records, and reports whether replica
targets are satisfied.
- [ ] Consider NIP-77 / Negentropy sync later for partner relays, but do not
require it for the first implementation.
- [ ] Add Nostr discovery publish/resolve commands, for example
`rho discovery publish nostr` and
`rho discovery resolve rho://id/nostr/npub...`.
- [ ] Add notifications for GitHub PRs or external backend proposals without
making Nostr authoritative for approvals.
- [ ] Add lightweight Nostr chat for human-to-human, human-to-agent, and
agent-status coordination.
- [ ] Prefer newer private-message paths for private chat where possible, such
as the NIP-44 payloads and NIP-17 gift wrapping tracked in the onboarding slice.
- [ ] Keep sensitive Rho artifacts inside Rho encrypted envelopes or a verified
storage/transport backend even when Nostr announces them.
- [ ] Add desktop realtime subscriptions so the app can show a live Rho inbox
without polling GitHub.
- [x] Add WebSocket subscriptions for live Rho record updates using Nostr `REQ`
filters, `EVENT` messages, and `EOSE`.
- [ ] Add reconnect handling with timestamp overlap and event-id
deduplication.
- [ ] Do not use Nostr as the canonical data store, the authority for
approvals, the private artifact transport, or the source of truth for policy.
- [ ] Define a small Rho Nostr event vocabulary for contact requests, external
backend proposal notifications, request/proposal/result pointers, and chat.
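The record-selection step of the directory resolver could look roughly like the sketch below. `RecordEvent` and `select_latest` are hypothetical names, and the `verified` flag stands in for real event-id and signature checks, which are not performed here. Rho revision checks and attestation evaluation are also left out. The tie-break, where the lexically lowest id wins on equal timestamps, follows NIP-01.

```rust
use std::collections::HashMap;

/// Minimal view of an event returned by a relay.
#[derive(Debug, Clone, PartialEq)]
struct RecordEvent {
    id: String,
    pubkey: String,
    kind: u16,
    d_tag: String,
    created_at: u64,
    verified: bool, // placeholder for id and signature verification
}

/// Keeps one event per (kind, pubkey, d-tag): the newest verified one.
fn select_latest(events: &[RecordEvent]) -> HashMap<(u16, String, String), RecordEvent> {
    let mut latest: HashMap<(u16, String, String), RecordEvent> = HashMap::new();
    for ev in events.iter().filter(|e| e.verified) {
        let key = (ev.kind, ev.pubkey.clone(), ev.d_tag.clone());
        let replace = match latest.get(&key) {
            None => true,
            Some(cur) => {
                ev.created_at > cur.created_at
                    || (ev.created_at == cur.created_at && ev.id < cur.id)
            }
        };
        if replace {
            latest.insert(key, ev.clone());
        }
    }
    latest
}
```

Feeding the merged results of several relays through one call gives the same answer regardless of which relay replied first.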
Suggested event content:
```json
{
  "type": "rho.contact-request",
  "from": "rho://id/nostr/npub...",
  "to": "rho://id/nostr/npub...",
  "intent": "github-pr-review",
  "repo": "github:madhavajay/project",
  "pr": 12,
  "rho_request_id": "req-example",
  "fallback": {
    "iroh_endpoint": "...",
    "git_remote": "..."
  }
}
```
Suggested live subscription for one Rho identity:
```json
["REQ", "rho-id-github-madhavajay", {
  "kinds": [30382, 30383, 30384],
  "#d": ["id/github/madhavajay"]
}]
```
Suggested broad subscription for Rho identity updates:
```json
["REQ", "rho-identities", {
  "kinds": [30382, 30383, 30384],
  "since": 1781600000
}]
```
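For the reconnect item in the list above, the `since` value of a resumed subscription can be rewound by a small overlap, with replays dropped by event id. A minimal sketch follows. `SubscriptionCursor` and its methods are hypothetical, and the set of seen ids is unbounded here, which a real client would need to cap.

```rust
use std::collections::HashSet;

/// Tracks what one subscription has already delivered.
#[derive(Default)]
struct SubscriptionCursor {
    newest_seen: Option<u64>,
    seen_ids: HashSet<String>,
}

impl SubscriptionCursor {
    /// Returns true the first time an event id is seen, false for replays.
    fn accept(&mut self, event_id: &str, created_at: u64) -> bool {
        if !self.seen_ids.insert(event_id.to_string()) {
            return false;
        }
        let newest = self.newest_seen.map_or(created_at, |t| t.max(created_at));
        self.newest_seen = Some(newest);
        true
    }

    /// `since` for the next `REQ`: rewind by `overlap_secs` so events that
    /// arrived out of order around the disconnect are fetched again.
    /// `None` means nothing has been seen yet, so no `since` should be sent.
    fn resume_since(&self, overlap_secs: u64) -> Option<u64> {
        self.newest_seen.map(|t| t.saturating_sub(overlap_secs))
    }
}
```

The overlap trades a few duplicate deliveries for not missing events whose `created_at` is slightly older than the newest one already seen.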
Nostr chat use cases:
- human-to-human chat
- human-to-agent coordination
- agent status updates
- request review notifications
- "please fetch this Rho proposal" messages
Nostr chat should mostly carry coordination text and pointers.
### Peer Transport
- [ ] Add an Iroh peer transport for direct artifact exchange.
- [ ] Add commands such as `rho peer serve`,
`rho peer connect rho://id/nostr/npub...`, and
`rho sync --transport iroh`.
- [ ] Use Nostr to tell peers how to find each other and Iroh to move bytes.
- [ ] Verify every received object by digest, signature, policy, and revocation
state before accepting it.
Suggested role split:
```text
Nostr:
  live notifications
  identity discovery
  peer reachability
  lightweight chat
  pointers to Rho artifacts

Rho storage:
  git / object store / dropbox / local folder / iroh blobs
  signed manifests
  encrypted payloads
  approval and provenance chain

Iroh:
  direct peer transfer
  relay-assisted connection setup
  larger artifact exchange
```
### Cloudflare Relay
- [x] Continue the `./relay/` Rust Cloudflare Worker relay as the first Nostr
relay/discovery deployment target.
- [x] Prioritize local dev mode before deployment so the relay protocol,
storage behavior, and resolver can be tested without Cloudflare.
- [x] Configure the Cloudflare account id
`a13177c2a05d56bdcc668c431ece4bba`.
- [x] Use `relay.biovault.net` as the first deployed test hostname.
- [ ] Add persistent relay state and WebSocket fan-out, likely with Durable
Objects if the Rust Worker support path is acceptable.
- [ ] Keep the relay deployable without making it a Rho authority.
- [x] Keep NIP-11 relay metadata available over HTTP.
- [x] Add a native local Rust relay server with in-memory event storage,
subscription replay, and live WebSocket fan-out.
- [ ] Start with one Durable Object / one local in-memory relay for small
private-directory use, then decide whether sharding is needed by public key,
event kind, tenant, or time bucket.
### Design Rules For The Next Architecture
- [ ] Rho signatures and policies remain authoritative.
- [ ] Backends are adapters, not policy engines.
- [ ] Nostr controller keys anchor global identity claims but do not replace
provider-specific identities such as GitHub, email, ORCID, or domain proofs.
- [ ] Provider-specific actions can keep native authorship. For example, a GitHub
commit can remain authored by `id/github/madhavajay` while Rho resolution
shows that this provider identity is controlled by `nostr:npub1...`.
- [ ] Git commits remain useful history, but Rho change manifests should define
the portable atomic unit.
- [ ] Discovery records are hints until signatures verify.
- [ ] Storage records are untrusted until Rho validates digest, signature,
policy, and revocation state.
- [ ] Chat messages can coordinate, but protected actions still require typed
Rho requests and signed approvals.
- [ ] Sensitive payloads should stay in Rho encrypted envelopes even when
announced over Nostr.
- [ ] Nostr relay redundancy means publishing the same signed event to multiple
relays and repairing missing copies. Public relays are best-effort unless
there is an explicit partner retention agreement.
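The redundancy rule in the last item reduces to a simple report once each relay has been queried. This sketch assumes the caller has already checked each relay for a verified copy. `ReplicaReport` and `replica_report` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Outcome of checking one signed record across the configured relays.
#[derive(Debug, PartialEq)]
struct ReplicaReport {
    holding: Vec<String>,
    missing: Vec<String>, // relays the repair worker should republish to
    satisfied: bool,      // true when at least `target` relays hold a copy
}

/// `relay_has_record` maps relay URL -> whether a verified copy was found.
fn replica_report(relay_has_record: &BTreeMap<String, bool>, target: usize) -> ReplicaReport {
    let mut holding = Vec::new();
    let mut missing = Vec::new();
    for (relay, has) in relay_has_record {
        if *has {
            holding.push(relay.clone());
        } else {
            missing.push(relay.clone());
        }
    }
    let satisfied = holding.len() >= target;
    ReplicaReport { holding, missing, satisfied }
}
```

Republishing to the `missing` relays reuses the already signed event, so repair never needs the controller key.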
### Naming To Resolve
- [ ] Choose provider-neutral names for Git/GitHub-backed ideas before renaming
user-facing commands. Candidate concepts include space, change set, proposal,
storage backend, identity provider, discovery locator, and transport.
## Unimplemented / Deferred
These items are consolidated from `improvements.md` and `improvements2.md`.
They are intentionally unchecked.
- [ ] Add richer tool-specific schemas for inputs, outputs, and runner
capabilities beyond simple argv templates.
- [ ] Add a migration command for older duplicated role/admin state, with
`--dry-run`.
- [ ] Make `rho repo doctor` emit non-fatal migration advisories for legacy
duplicated role/admin state.
- [ ] Migrate older live rehearsal repos that predate committed approval grants,
run receipts, and result provenance artifacts.
- [ ] Continue normal typed-manifest cleanup outside the shared-root trust paths
already converted.
- [ ] Add `rho dataset fetch <name> --variant <public|mock|real>` to materialize
external dataset sources into `.rho/external/datasets/...` on demand.
- [ ] Add dataset interface/schema checks so a private `real` twin can be
verified against the public or mock variant it claims to match.
- [ ] Add per-project dataset binding overrides if profile-global bindings are
not specific enough for a future workflow.
- [ ] Add more external dataset source kinds beyond the first Hugging Face
manifest support, including plain Git/HTTP object sources.
- [ ] Review remaining intentional process calls and either document or replace
them: `curl`, `chmod`, and `python3`.
- [ ] Replace `chmod +x` process execution with `std::fs::set_permissions`.
- [ ] Replace the GitHub profile-proof `curl` call with an HTTP crate or
document `curl` as an intentional integration point.
- [ ] Replace production-path `unwrap()` calls with `RhoResult` errors where
malformed input can currently panic.
- [ ] Add a Rust source-size lint, modeled as a custom integration test wired
into `lint.sh`, that fails production Rust source files over 500 lines so
large modules are refactored before they keep growing.
- [x] Consolidate the split improvement docs into this single `TODO.md`.
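The source-size lint item could start from a helper like the one below, called from an integration test. This is a sketch using only the standard library. `oversized_sources` is a hypothetical name, the directory walk follows symlinks, and deciding which directories count as production source is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects `.rs` files under `root` with more than `max_lines`
/// lines, returning each path with its line count in sorted order.
fn oversized_sources(root: &Path, max_lines: usize) -> io::Result<Vec<(PathBuf, usize)>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "rs") {
                let lines = fs::read_to_string(&path)?.lines().count();
                if lines > max_lines {
                    found.push((path, lines));
                }
            }
        }
    }
    found.sort();
    Ok(found)
}
```

An integration test would then assert that `oversized_sources(Path::new("src"), 500)` returns an empty list and print the offending paths otherwise.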
## Review Notes
These notes came from the follow-up review that was previously in
`improvements2.md`.
### Verified Claims
The v1 completion claim is defensible as worded: the scope is self-defined and
the implemented items are real, not aspirational.

| Claim | Evidence |
| --- | --- |
| CR1: `file_digest` is in-process SHA-256, no `fallback-` path | `lib.rs` streams via `sha2`; tests assert no fallback prefix |
| CR2: `date`/`uuidgen`/`file`/`shasum` shell-outs removed | `now_rfc3339`, `uuid_like`, and `mime_type` are in-process |
| CR4: `path_matches_pattern` is segment-aware glob | `lib.rs` has a recursive matcher, not the old trailing-`*` hack |
| Context binding is enforced before payload use | new age envelopes encrypt and check an inner context; legacy envelopes bind context through HKDF `info` |
| e2e coverage exists for the listed flows | the `rho-*.sh` e2e scripts cover the claimed features |
### Pushback To Keep Visible
- The phrase "100% complete / no blockers" is only accurate for the chosen v1
scope. The unchecked items above remain real post-v1 work.
- Recipient-envelope clean-filter churn is fixed by reusing a decryptable staged
index envelope when the plaintext and context already match. Future filter work
should preserve that property.
- Context binding is sound where high-level commands pass expected context. New
commands should continue to require or derive the expected context before
accepting decrypted envelope payloads.
- Revocation checks are repo-aware; future verification paths should avoid
silently skipping revocation when `--repo-root` is available.
- Remaining process calls should either be deliberate integration points or
removed. Current review targets are `curl`, `chmod`, and `python3`.
### Solid Architecture To Preserve
- Approval authority is bound to an owner-signed in-repo grant; merged PRs are
transport, not authority.
- The auditable provenance chain is:
`request -> code digest -> runner -> input commitment -> output digest -> result`.
- Tar extraction is fail-closed with hostile fixture coverage.
- Context binding is part of the signed payload and is enforced by envelope
decryption checks before plaintext is accepted.
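The context-binding rule in the last item amounts to a fail-closed comparison between the context a command expects and the context recovered from the envelope. In this minimal sketch, `EnvelopeContext` and `accept_plaintext` are hypothetical names. The fields follow the binding listed earlier in this file (repo, path, purpose, request metadata), and decryption itself is out of scope.

```rust
/// Context a recipient envelope is bound to.
#[derive(Debug, Clone, PartialEq, Eq)]
struct EnvelopeContext {
    repo: String,
    path: String,
    purpose: String,
    request_id: Option<String>,
}

/// Releases plaintext only if the decrypted inner context equals what the
/// calling command expected; on mismatch the payload is dropped unread.
fn accept_plaintext(
    expected: &EnvelopeContext,
    inner: &EnvelopeContext,
    plaintext: Vec<u8>,
) -> Result<Vec<u8>, String> {
    if inner != expected {
        return Err("envelope context mismatch".to_string());
    }
    Ok(plaintext)
}
```

Taking `expected` as a required argument means a new command cannot reach the plaintext without first stating which context it expects.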