rayfish 0.1.0

# Rayfish

P2P mesh VPN powered by [iroh](https://iroh.computer). Connects peers by cryptographic identity (EndpointId), not IP address. Networks use dual-stack addressing: stable IPv4 in 100.64.0.0/10 (CGNAT, FNV-1a of identity) and stable IPv6 in 200::/7 (blake3 of identity, 120-bit, never rotates).
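
A minimal sketch of the IPv4 half of that addressing, assuming a 32-bit FNV-1a over the identity bytes, masked into the 22 host bits of 100.64.0.0/10. The function names, the way the collision index is mixed into the hash input, and the absence of any reserved-address handling are assumptions for illustration, not the crate's actual `membership` code.

```rust
use std::net::Ipv4Addr;

/// 32-bit FNV-1a over a byte slice (standard offset basis and prime).
fn fnv1a32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Map an identity plus a collision index into 100.64.0.0/10.
/// ASSUMPTION: appending the index as 4 little-endian bytes is illustrative only.
fn sketch_derive_ipv4(identity: &[u8; 32], collision_index: u32) -> Ipv4Addr {
    let mut input = identity.to_vec();
    input.extend_from_slice(&collision_index.to_le_bytes());
    // A /10 prefix leaves 22 host bits.
    let host_bits = fnv1a32(&input) & 0x003f_ffff;
    let base = u32::from(Ipv4Addr::new(100, 64, 0, 0));
    Ipv4Addr::from(base | host_bits)
}
```

Because the address is a pure function of the identity (and index), the same peer lands on the same address on every run; every result falls between 100.64.0.0 and 100.127.255.255.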

## Build

```bash
cargo -q build                 # add --features tor for Tor transport, --features otel for OTLP span export
cargo -q check
cargo -q test
cargo -q clippy
cargo bench                    # Criterion microbenchmarks of the per-packet data path (benches/forward.rs)
```

The crate is split into a library (`src/lib.rs`, the daemon's modules as `pub mod`)
and a thin binary (`src/main.rs`, the `ray` CLI/IPC client, `use rayfish::…`). The
split exists so benchmarks (`benches/`) and integration tests can reach the internal
data path; `cargo install` builds the binary against the in-package library unchanged.

## Run

The daemon (`ray daemon`) owns the TUN device and iroh endpoint and runs as a system service. CLI commands talk to it over Unix-socket IPC.

```bash
sudo ray up                    # install+start the service, then activate the VPN
ray create [--open] [--name n] [--hostname h] [--tor]   # closed by default; --open = public network. Prints room id (public key)
ray join <room-id-or-invite> [--name alias] [--hostname h] [--auto-accept-firewall] [--tor]  # join by room id or one-time invite code; --auto-accept-firewall auto-installs suggested rules (managed node/server)
ray leave <net> | nuke <net>   # nuke = publish empty record then leave
ray hostname <net> <name>      # change hostname on existing network
ray status                     # all networks (works without daemon); per-host traffic, member count excludes self
ray <cmd> --json               # global flag: machine-readable JSON for status/firewall show/files/invite list/requests/admin list (color + spinners off)
ray report                     # bundle logs+metrics, open a pre-filled GitHub issue
ray up [--hostname h] | down   # activate / standby (TUN + DNS), daemon stays running; --hostname sets your default name

ray invite <net> [--expires 7d] [--hostname H]       # coordinator-only: mint single-use invite (+QR); --hostname binds an authoritative name (overrides joiner choice, rejected on collision) the holder takes on join
ray invite <net> --reusable [--expires 30d]          # mint a reusable (multi-use, expiring) key for unattended fleets; rides the signed blob, no hostname binding. Servers: ray join <key> --hostname H --auto-accept-firewall
ray invite <net> list|revoke <id>          # list / revoke invites (reusable keys tagged; revoke propagates via the blob)
ray requests <net>             # coordinator-only: peers awaiting live approval
ray accept <net> <id> | deny <net> <id>    # admit / reject a pending join request
ray connect <contact-id> [--hostname h]    # request a direct 2-peer connection by the peer's contact id (no room id/invite); blocks as pending until they approve
ray connections [approve <id>]             # list incoming connect requests (default) / approve one → mints a 2-peer network with the requester pre-approved
ray contact [id|rotate]        # print (default) or rotate your shareable contact id (also shown at the top of `ray status`)
ray admin <net> add <id> | list            # coordinator-only: grant the network key (co-coordinator) / list key-holders
ray firewall show|default|add|remove ...               # per-device local firewall. Default posture: inbound TCP/UDP denied, inbound ICMP allowed, outbound allowed. `firewall default allow|deny` sets the inbound default
ray apply <spec> [--prune] [--dry-run] [--invite-missing] [--example]   # declarative deploy (YAML only): create closed nets + suggest firewall + report membership gap
ray firewall suggest <net> --subject H [--allow peer:proto:ports] [--deny peer:proto:ports]  # coordinator-only: suggest rules on any network (rides the signed blob). Subject/peer `*` = all hosts / any peer. Token grammar: `proto:ports` (tcp:22, udp:53, tcp:*, any:*) or bare proto (icmp, any, tcp). An allow-list ⇒ whitelist (catch-all deny appended); denies-only ⇒ blacklist
ray firewall pending <net> | accept <net> | deny <net>  # review/accept/discard queued suggested rules (manual consent queue). On a TTY, `pending` is an interactive picker (↑↓ move · enter accept · d deny · a all · q done) resolving rules per-record; piped/`--json` falls back to a static table
ray firewall auto-accept <net> on|off  # toggle this node's auto-install of suggested rules for a network (on = install current queue)
ray mdns on|off                # local peer discovery (default on)
ray send <file> <peer>         # file sharing; ray files [accept <id> [--output dir]]
ray pair [<ticket>|backup|restore <code>]              # multi-device identity
ray pair backup [--1password [--vault V] [--item T]]   # encrypted key backup; --1password stores the enc1 blob in 1Password (op CLI)
ray pair restore [<code>|--1password [--vault V] [--item T]]   # restore from a code or from 1Password
ray completions <shell>
ray version | ray --version | ray -V        # print the compiled rayfish version
ray update [--check] [--force]              # self-update to the latest GitHub release; --check reports current vs latest without installing. Replaces this binary, then (if the service is installed) restarts the daemon onto it (needs root)
```

**Privilege & access (Tailscale operator model):** the always-root daemon does privileged work; clients are unprivileged. The IPC socket is mode `0666`; authority comes from a per-request `SO_PEERCRED` UID check in `DaemonState::check_authorized()`, not socket permissions. Reads (`status`, `*… show`, `files`) are open to any local user; mutating commands need root or the configured `operator_uid`; `set-operator` is root-only. Only `install`, `restart`, `uninstall`, `set-operator`, and `daemon` need `sudo`; everything else (incl. `up`/`down`) is IPC. `ray up`/`install` auto-grant operator to `$SUDO_USER`.
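
The access rule above can be sketched as a pure function over the caller's UID (the value the daemon would read from the socket peer credentials). The `Access` enum and `is_authorized` name are hypothetical; the real check lives in `DaemonState::check_authorized()` and may group commands differently.

```rust
/// Privilege class of an IPC request (illustrative grouping, not the real enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Access {
    Read,     // e.g. status, show, files
    Mutate,   // e.g. join, leave, firewall add, up/down
    RootOnly, // e.g. set-operator
}

/// Decide whether a local caller may run a request.
/// `peer_uid` is the UID reported for the connected socket peer;
/// `operator_uid` is the configured operator, if any.
fn is_authorized(peer_uid: u32, operator_uid: Option<u32>, access: Access) -> bool {
    match access {
        // Reads are open to any local user.
        Access::Read => true,
        // Mutations need root or the configured operator.
        Access::Mutate => peer_uid == 0 || Some(peer_uid) == operator_uid,
        // Changing the operator itself is root-only.
        Access::RootOnly => peer_uid == 0,
    }
}
```

Keeping the decision in one function of `(uid, operator, access)` is what lets the socket stay world-writable: the file mode grants nothing by itself.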

```bash
sudo ray install | restart | uninstall      # manage the service unit/plist
sudo ray set-operator <user>                 # authorize a user to run ray without sudo
```

### Cross-compile & deploy

```bash
just cross                     # build for x86_64 Linux
just deploy <ip>               # cross-build release + install + start daemon
just deploy-dev <ip>           # same, debug build
```

## Architecture

```
App → TUN (100.64.x.x / 200::x) → rayfish → iroh QUIC datagrams → peer
```

A single iroh Endpoint and TUN device are shared across all networks. Each network gets its own ALPN (`rayfish/net/<pubkey-prefix>`); the `ProtocolRouter` dispatches incoming connections by ALPN to per-network handlers.
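
A rough sketch of ALPN-keyed dispatch, assuming the ALPN is built from a hex-encoded public key prefix. `AlpnRouter`, `network_alpn`, and the 16-character prefix length are hypothetical stand-ins; the real `ProtocolRouter` holds connection handlers rather than network names.

```rust
use std::collections::HashMap;

/// Build a per-network ALPN from a hex-encoded network public key.
/// ASSUMPTION: the 16-character prefix length is illustrative only.
fn network_alpn(pubkey_hex: &str) -> Vec<u8> {
    let prefix: String = pubkey_hex.chars().take(16).collect();
    format!("rayfish/net/{prefix}").into_bytes()
}

/// Minimal stand-in for a router: ALPN bytes -> network name.
#[derive(Default)]
struct AlpnRouter {
    handlers: HashMap<Vec<u8>, String>,
}

impl AlpnRouter {
    /// Register the handler (here just a name) for one network's ALPN.
    fn register(&mut self, alpn: Vec<u8>, network: &str) {
        self.handlers.insert(alpn, network.to_string());
    }

    /// Look up which network an incoming connection belongs to.
    /// Returns `None` for an ALPN no network has registered.
    fn dispatch(&self, alpn: &[u8]) -> Option<&str> {
        self.handlers.get(alpn).map(String::as_str)
    }
}
```

With one shared endpoint, the ALPN negotiated during the QUIC handshake is the only thing that tells networks apart, so an unknown ALPN simply has no handler.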

### Modules

- `src/main.rs` — thin clap CLI + IPC client; service install/start (`cmd_up`, `install_and_start_service`), `cmd_install`/`cmd_restart`/`cmd_uninstall_service` (root-gated), `cmd_set_operator`, `cmd_pair`. `ray daemon` (hidden) runs the foreground daemon loop.
- `src/daemon.rs` — daemon process: `DaemonState` (endpoint + TUN + PeerTable + ProtocolRouter), `NetworkHandle` per active network (with per-network `invite_lock`), `NetworkState` (carries access `mode`, `suggested_firewall` from the blob, in-memory `pending` join requests, `pending_suggestions` manual-consent queue), IPC server, accept handling (`CoordinatorAcceptState`/`MemberAcceptState` via `AcceptHandler`), reconnect loop, DHT publisher, group poller, activate/deactivate (up/down), nuke, invite/approval + firewall/file/pairing IPC handlers, `apply_suggested_firewall` (materializes suggested rules at the blob-apply sites), DNS table updates. Coordinator admission gate lives in `CoordinatorAcceptState::handle_connection` → `admit_peer` (open mode / valid invite / pre-approved) vs queue-as-pending (closed). `register_coordinator_handler` (shared by create, restore, and admin-promotion) registers `CoordinatorAcceptState` and flips the stored `NetworkRole` to `Coordinator`; `promote_to_coordinator` swaps a live member to that handler when it receives an `AdminGrant`. On a fresh join the daemon dials in `coordinator_dial_order` (minter first, then the remaining `is_coordinator` members from the verified blob); `gossip_targets` picks live coordinator peers for `InviteShare`/`InviteUsed` broadcasts. The admit path assigns IPv4 via `membership::assign_ip` (lowest free collision index in the roster). The `ray connect` handlers (`connect`/`list_connections`/`approve_connection`/`rotate_contact`) live here too; `ProtocolRouter` holds the `pending_connects`/`approved_connects`/`outgoing_connects` `DashMap`s and the `CONNECT_ALPN` accept arm; `create_network_inner` takes `direct` + `pre_approve` to mint a direct 2-peer network.
- `src/ipc.rs` — `IpcMessage` enum (requests + responses incl. `InviteCreate`/`InviteList`/`InviteRevoke`/`Requests`/`AcceptRequest`/`DenyRequest` + `InviteCreated`/`InviteListResponse`/`PendingRequests`; `Join` carries optional `invite` secret + `coordinator` to dial directly; `ray connect` adds `Connect`/`Connections`/`ApproveConnection`/`ContactId`/`RotateContact` + `ContactIdResponse`, `StatusResponse.contact_id`, and `NetworkRole::Direct` (display-only)), `MsgpackCodec` (length-prefixed msgpack over Unix socket), socket at `/var/run/rayfish/rayfish.sock`.
- `src/identity.rs` — persistent Ed25519 keypair (`~/.config/rayfish/secret_key`); device certs (`create/store/load_device_cert`).
- `src/onepassword.rs` — `op` CLI wrapper for `ray pair backup/restore --1password`: `op_available`/`store` (create-or-update an item, secret blob piped via stdin) / `read`. Transports the existing `enc1…` encrypted backup blob to/from a 1Password item; CLI-side only.
- `src/invite.rs` — coordinator-only **single-use** invite ledger (`~/.config/rayfish/invites/<network>.toml`, written `0600`): `Invite { id, secret_hash, created, expires, status }`, `InviteStore` (`mint`/`redeem`/`revoke`/`list`/`restore`/`record_shared`/`burn_by_hash`, single-use + expiry, blake3-hashed secrets — only the hash is persisted); `redeem` burns a secret at admission, `restore` un-burns it if `admit_peer` then rejects the join (hostname/IP collision) so the holder isn't locked out; `encode_invite_code`/`decode_invite_code` = `bs58(network_pubkey(32) || coordinator(32) || secret(16))`. Never published in the GroupBlob. **Cross-coordinator gossip:** `record_shared(id, secret_hash, expires)` inserts a received `InviteShare` (from the minting coordinator) into the local ledger so this coordinator can validate and burn it; `burn_by_hash(secret_hash)` marks a shared entry used when an `InviteUsed` gossip arrives after another coordinator redeems it. **Reusable keys are not here** — they live in the signed blob (`membership::ReusableKey`); `generate_secret`/`encode_invite_code` are shared by both.
- `src/membership.rs` — `IdentityProvider`, IPv4/IPv6 derivation, `MemberList`/`ApprovedList`, `GroupBlob { members, approved, suggested_firewall, name, reusable_keys }` with canonical msgpack + blake3 hashing (`canonical_group_bytes`/`group_blob_hash` thread `SuggestedFirewall` + `reusable_keys`; BTreeMap keys ⇒ canonical bytes for free); `Member`/`ApprovedEntry` carry optional `user_identity` + `device_cert`, a boolean `is_coordinator` flag (set by `ray admin add`, published in the blob so joiners can discover co-coordinators), and a `collision_index: u32` (serde default 0). `assign_ip(members, identity) -> (Ipv4Addr, u32)` picks the lowest free collision index so the coordinator assigns a per-member `(ip, index)` at admission; `validate_member`/`validate_approved` check the stored IP against `derive_ip_with_index(identity, collision_index)`; `validate_no_duplicate_ips(&[Member])` rejects a roster with duplicate IPs; `resolve_ip_tiebreak(Vec<Member>) -> Vec<Member>` re-seats contested entries in identity order (lowest keeps its index, others re-roll) and is run by reconverge before applying a fetched roster. `ReusableKey { id, created, expires, revoked }` is a multi-use join key keyed by hex `blake3(secret)`; `ReusableKey::from_secret` mints one, `revoke_reusable` flips its `revoked` flag (exact/prefix id), and the pure `validate_reusable_key(keys, secret, now)` is the admission decision (present + not-revoked + not-expired). `SuggestedFirewall`/`HostSuggestions` live in `ray-proto` (`policy.rs`) so they cross IPC, ride in the blob, and parse from a `ray apply` spec uniformly.
- `src/transport.rs` — iroh endpoint setup, per-network ALPN; identity-level `CONNECT_ALPN` (`rayfish/connect/1`) for the `ray connect` handshake; optional Tor transport (`tor` feature). The shared endpoint binds a **fixed UDP port** `RAYFISH_LISTEN_PORT` (41383) instead of an ephemeral one, so the port is stable across restarts and can be manually port-forwarded for guaranteed direct reachability (iroh still does automatic NAT traversal/UPnP/PCP, discovery, and relay fallback on top). If the fixed port is already in use the daemon logs a warning and falls back to an ephemeral port (`0.0.0.0:0`) so it always starts. The port is hard-coded (not per-network — there is one shared endpoint), so a manual forward benefits only one node per LAN; if multi-node-per-LAN manual forwarding is ever needed, make `RAYFISH_LISTEN_PORT` configurable.
- `src/tun.rs` — async dual-stack TUN (IPv4 /10 + IPv6 /128), split into `TunReader`/`TunWriter`; `configure_ipv6()` assigns the TUN's own IPv6 address at creation (Linux netlink via rtnetlink, macOS ifconfig); `route_peer_range()` installs the `200::/7` peer-range route into the TUN and **must run after link-up** (called from `DaemonState::activate()` post-`set_link_up`) — on Linux the kernel won't install an IPv6 connected route while the link is down, so peer traffic would otherwise leak out the host's default IPv6 route (Linux: rtnetlink `RouteMessageBuilder`; macOS: explicit `route add -inet6 -net 200::/7`). Idempotent across `up`/`down` cycles.
- `src/forward.rs` — TUN ↔ peer forwarding via dual-stack routing lookup; firewall enforcement; labeled drop counters; resolves transport keys to user identities via `DeviceUserMap`.
- `src/dht.rs` — one pkarr record per network (blob hash + seed peers); only the coordinator (per-network secret key) can publish. Plus a per-user **contact record** (`_rayfish_contact`, signed by the contact key) mapping `contact_pubkey → current endpoint` for `ray connect` (`publish_contact`/`resolve_contact`); a TTL/2 active-gated publisher (`spawn_contact_publisher`) keeps it fresh.
- `src/control.rs` — length-prefixed msgpack control protocol over QUIC streams (`JoinRequest`, `JoinPending`, Welcome, MemberApproved, MeshHello, BlobUpdated, `AdminGrant`, `InviteShare`, `InviteUsed`, …); `DeviceCert`, `PairMsg`. A fresh joiner sends `JoinRequest { invite_secret, hostname, device_cert }` first; the coordinator replies `Welcome`, `JoinPending`, or `JoinDenied` on the same stream. `AdminGrant` carries the per-network secret key to a member over the authenticated mesh ALPN (coordinator → co-coordinator). `ConnectMsg` (`Request`/`Pending`/`Approved`/`Denied`) is a separate enum for the `ray connect` friend-request handshake over `CONNECT_ALPN`, framed with the generic `send_framed`/`recv_framed` helpers. `InviteShare { id, secret_hash, expires }` is gossiped by the minting coordinator to other coordinators when a single-use invite is minted; `InviteUsed { secret_hash }` is gossiped when one is redeemed — so any coordinator can validate and burn cross-minted invites. Receivers guard both messages: they are ignored if the sending peer is not `is_coordinator` in the verified roster.
- `src/peers.rs` — `PeerTable` (dual v4/v6 DashMaps), `DeviceUserMap`. A peer keeps one virtual IP across every network it joins, so each `PeerEntry` holds a *set* of connections (`network → Connection`); `lookup_v4/v6` return a `PeerRoute` (a deterministically-chosen connection + all shared networks, for union reachability/firewall checks). A multi-homed peer stays reachable while it shares one live connection; `remove_peer_from_network()` drops a single network's route without unrouting the peer, while `remove()` drops it everywhere.
- `src/config.rs` — network config (`~/.config/rayfish/networks.toml`): per-network secret/public key, `my_hostname`, `group_mode` (open/restricted, persisted at create), `auto_accept_firewall` (this node auto-installs suggested rules; per-network, set by `ray join --auto-accept-firewall` / `ray firewall auto-accept`), `admins` (local record of identities granted the network key), `direct` (auto-minted 2-peer `ray connect` network — tags it in `ray status`, suppresses its room id); `AppConfig.operator_uid`, `AppConfig.default_hostname` (personal fallback name set by `ray up --hostname`, used when create/join omit `--hostname`), `AppConfig.contact_secret_key` (rotatable per-user `ray connect` contact key, lazily generated via `contact_secret`/`rotate_contact_secret`).
- `src/apply.rs` — declarative deploy spec for `ray apply`: `DeploySpec { networks: BTreeMap<network, SuggestedFirewall> }` — each network maps **directly** to its firewall subjects (no `firewall:` wrapper). **YAML only** (most readable; `load` rejects non-`.yaml`/`.yml` and parses via the `config` crate's YAML format; note the `config` crate **lowercases keys**, so network/host names must be lowercase), `expected_hosts()` (union of subject + peer hostnames across the spec's networks, **excluding the `*` wildcard**), and the `EXAMPLE_SPEC` (YAML, includes the wildcard Minecraft case) printed by `--example`. The orchestrator itself lives in `main::ipc_apply`.
- `src/firewall.rs` — per-device firewall (direction/proto/port/peer + optional arrival-`network`), `ArcSwap` for lock-free reads, dual-stack packet parsing; `firewall.toml`. **Defaults are direction-aware**: `FirewallConfig.default_inbound` (serde-default `Deny`) + `default_outbound` (serde-default `Allow`), so unsolicited inbound TCP/UDP is denied out of the box. Inbound ICMP-allow is **not** hard-coded: `FirewallConfig::default()` seeds one ordinary, removable `allow in icmp` rule (`default_icmp_rule()`, `origin: Local`) that the first-match scan matches before the deny default — so `ping` works out of the box, the rule shows in `ray firewall show`, and removing it (`ray firewall remove <i>`) makes the deny default cover ICMP too. The optional `network` field (`None` = any, back-compat) scopes a rule to traffic on one network, so a multi-homed host can restrict a peer per-network (e.g. allow `:8080` only from peers reached via `db`). `RuleOrigin` (`Local` | `Network(net)`) records provenance so network reconvergence replaces the `Network(net)` set without touching hand-added `Local` rules. `materialize_suggestions` builds a subject's inbound rules from a network's `SuggestedFirewall` (hostname-keyed; the `*` subject targets every node, merged with the node's own subject; peer hostnames resolved to identities against the blob's member list, with the `*` peer key meaning any peer/`PeerFilter::Any`; each port-spec token is `proto:ports` or a bare proto keyword (`tcp:22`, `udp:53`, `tcp:*`, `icmp`, `any`) — comma-separated tokens expand to one rule each; an allow-list appends a network-scoped catch-all deny ⇒ whitelist, denies-only ⇒ blacklist, empty subject ⇒ open). `SharedFirewall::replace_network_rules` swaps one network's suggested set, leaving `Local` + other networks. `format_firewall_show` tags suggested rules `(suggested by <net>)`.
- `src/dns.rs` — Magic DNS server on `127.0.0.1:53` (A/AAAA/PTR/SOA for `*.ray`); forward `HostnameTable` + `ReverseLookupTable`. `sync_network_hostnames` rebuilds a network's forward+reverse entries from its member roster (used on every roster update so renames/joins/leaves reflect immediately).
- `src/dns_config.rs` — OS DNS config (`DnsConfigurator` trait). macOS: SCDynamicStore. Linux detection chain: systemd-resolved D-Bus → NetworkManager D-Bus → resolvectl → resolvconf → `/etc/resolv.conf`.
- `src/hostname.rs` / `src/network_name.rs` — hostname + local-alias generation and collision resolution (`resolve_collision` appends `-1`, `-2`, … on a clash, e.g. `dario` → `dario-1`).
- `src/stats.rs` — iroh-metrics `ForwardMetrics`/`PeerMetrics`, Prometheus export on `:9090`; `ForwardMetrics::snapshot()` reads counters into a serializable `MetricsSnapshot` for `ray report`.
- **CLI presentation** (dependency-light, all gated on `style::is_enabled()` = TTY + not `NO_COLOR`/`--json`): `src/style.rs` — 256-color ANSI palette + glyphs (`dot_online`/`dot_offline`/`check`/`cross`/`marker`/`latency`); `set_plain(true)` forces everything off (used by `--json`). `src/layout.rs` — ANSI-width-aware borderless column aligner (`Cell`/`columns`, via `unicode-width`); `main::table()` is the shared header+rows helper every list routes through. `src/progress.rs` — `indicatif` spinner factory (stderr, hidden when plain) for slow ops (`join`, service start, file download). `src/picker.rs` — `crossterm` inline (no alt-screen) interactive list for `ray firewall pending`; returns per-rule accept/deny `Resolution`s the CLI sends as `FirewallResolveSuggestions`. Firewall rules cross IPC as `ray_proto::ipc::FirewallRuleView` (pre-stringified, `Eq`/`Hash`) so the CLI renders/serializes and the daemon value-matches queued rules.
- `src/logdir.rs` — daemon log directory (`/var/log/rayfish` on Linux, `/Library/Logs/rayfish` on macOS). The daemon writes rolling daily files there via `tracing-appender` (set up in `main::init_tracing`); `ray report` bundles them.
- `src/shutdown.rs` — SIGINT/SIGTERM via `CancellationToken`. `src/audit.rs` — append-only audit log (`~/.config/rayfish/audit.log`, TSV `timestamp\tevent\tip\tendpoint_id`); `AuditLog` is held by `PeerTable` (`PeerTable::with_audit`, constructed in the daemon), which logs a `connect` on a peer's first connection in a network and a `disconnect` when its last connection in a network is dropped (or the peer is removed for identity rotation). Best-effort: the daemon runs without auditing if the log can't be opened.
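
As one illustration of the module notes above, the single-use invite lifecycle described for `src/invite.rs` (mint, burn on `redeem`, un-burn on `restore`) can be sketched as a small state machine. The types below are hypothetical and simplified: the hash is passed in as opaque bytes (the real ledger uses blake3), timestamps are plain seconds, and persistence and gossip are omitted.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Status {
    Unused,
    Used,
    Revoked,
}

#[derive(Debug)]
struct Invite {
    expires: u64, // seconds; illustrative clock
    status: Status,
}

#[derive(Debug, PartialEq, Eq)]
enum RedeemError {
    Unknown,
    Expired,
    AlreadyUsed,
    Revoked,
}

/// Ledger keyed by the secret's hash; the secret itself is never stored.
#[derive(Default)]
struct Ledger {
    by_hash: HashMap<[u8; 32], Invite>,
}

impl Ledger {
    fn mint(&mut self, secret_hash: [u8; 32], expires: u64) {
        self.by_hash.insert(secret_hash, Invite { expires, status: Status::Unused });
    }

    fn revoke(&mut self, secret_hash: &[u8; 32]) {
        if let Some(invite) = self.by_hash.get_mut(secret_hash) {
            invite.status = Status::Revoked;
        }
    }

    /// Burn the invite at admission time; fails if unknown, spent, revoked, or expired.
    fn redeem(&mut self, secret_hash: &[u8; 32], now: u64) -> Result<(), RedeemError> {
        let invite = self.by_hash.get_mut(secret_hash).ok_or(RedeemError::Unknown)?;
        match invite.status {
            Status::Used => return Err(RedeemError::AlreadyUsed),
            Status::Revoked => return Err(RedeemError::Revoked),
            Status::Unused => {}
        }
        if now >= invite.expires {
            return Err(RedeemError::Expired);
        }
        invite.status = Status::Used;
        Ok(())
    }

    /// Un-burn after a rejected admission so the holder is not locked out.
    fn restore(&mut self, secret_hash: &[u8; 32]) {
        if let Some(invite) = self.by_hash.get_mut(secret_hash) {
            if invite.status == Status::Used {
                invite.status = Status::Unused;
            }
        }
    }
}
```

Burning before the admission decision and restoring on rejection keeps the invite single-use under concurrency while still letting a joiner retry after a hostname or IP collision.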

### Key flows

- **Create:** generate per-network `SecretKey` → derive addresses → build initial `GroupBlob` → publish blob + signed pkarr record → persist keys + `group_mode` → print public key as the room id. Closed (`Restricted`) by default; `--open` for a public network.
- **Access modes & admission:** the room id (network public key) is a published discovery key, **never** an admission credential. **Open** networks auto-admit any peer that reaches a coordinator. **Closed** networks gate admission three ways: a one-time **invite** (`ray invite` → `bs58(pubkey || coordinator || secret)`, redeemed+burned under `invite_lock`, coordinator-only local state — gossiped to other coordinators via `InviteShare`/`InviteUsed` so any coordinator can redeem a cross-minted invite), a **reusable key** (`ray invite --reusable` → same code grammar, but the hash rides the signed `GroupBlob.reusable_keys` so it is multi-use, expiring, and revocation propagates via the blob; validated with `validate_reusable_key`, admits non-authoritatively — joiner-chosen hostname, suffix on collision), or **live approval** (unknown peer queued in `NetworkState.pending`, surfaced via `ray requests`, admitted with `ray accept`). The admission handler is `CoordinatorAcceptState`; **any node holding the network key** runs it — `register_coordinator_handler` is called at startup for nodes with a persisted network key, and `promote_to_coordinator` swaps to it at runtime on `AdminGrant`. The admitting coordinator also assigns the joiner's IPv4 via `assign_ip` (lowest free collision index in the live roster).
- **Join handshake:** resolve pkarr record → fetch + verify `GroupBlob` → dial coordinators in `coordinator_dial_order` (invite-pinned minter first, then the remaining `is_coordinator` members from the verified blob, skipping self) until one replies `Welcome`. On each connection the joiner speaks first, sending `JoinRequest { invite_secret? }`; the coordinator replies `Welcome` (admitted), `JoinPending` (closed, awaiting `ray accept`; the joiner retries with backoff on the *same* coordinator until welcomed, so `JoinPending` is not a fallback trigger), or `JoinDenied`. The coordinator matches the secret first against its local single-use ledger, then against the verified blob's `reusable_keys`; a single-use match burns the invite, a reusable match does not. `ray join <reusable-key> --hostname H --auto-accept-firewall` is the unattended-server path. Once welcomed, the joiner connects to the other members with `MeshHello` and polls pkarr for blob updates. Reconnecting/restoring members use the legacy coordinator-speaks-first handshake (`initial = false`).
- **Gatekeeper:** any coordinator (any node holding the network key) can approve identities and broadcast `MemberApproved`; once approved, any peer can welcome that identity. This means admission of a fresh joiner survives any single coordinator being offline — the joiner dials across the full coordinator set. The coordinator need not be online for *member* reconnects at all.
- **DHT (single-record):** one pkarr record per network signed by the per-network secret key. The pkarr address *is* the network public key, so records can't be spoofed (MITM-resistant). `spawn_group_poller()` polls the record every 60s and refetches the blob when the hash changes.
- **Reachability model (segmentation-first):** a network is a reachability boundary — two peers can exchange packets iff they share ≥1 network (a QUIC connection only exists within a shared network, so this is enforced by connection existence). Coarse access is the network split itself; the per-device firewall is the fine-grained layer (directional, port-, and network-scoped). Declarative provisioning of networks + suggested firewalls is `ray apply` (Phase B).
- **Firewall (local + coordinator suggestions):** the firewall is per-device, first-match-wins, persisted in `firewall.toml`, with a stateful conntrack so return traffic for outbound flows passes under a deny default. **Default posture is secure-by-default for inbound:** with no user rules, inbound TCP/UDP is **denied** (no listening port is exposed when you join an open/public network), inbound ICMP is **allowed** (so `ping`/reachability works out of the box), and outbound (any proto) is **allowed** (you initiate freely; conntrack lets return traffic back in). The config splits the old single `default_action` into `default_inbound` (serde-default `Deny`) and `default_outbound` (serde-default `Allow`). Inbound ICMP-allow rides as a seeded, removable `allow in icmp` rule in `FirewallConfig::default()` (not a hard-coded special case) — it's a visible first-match rule, so deleting it makes the deny default cover ICMP. `ray firewall add` **inserts at the front** (newest wins under first-match) and **merges by selector** (drops any existing rule with the same direction/proto/port/peer/network via `firewall::same_selector`, ignoring action), so `deny in icmp` after the seed makes deny prevail and toggling allow↔deny never accumulates dead rules. This applies to **all installs on upgrade** — an older `firewall.toml` missing the new fields deserializes straight into the secure posture (no back-compat carve-out; the seeded ICMP rule only ships with a fresh config, so an install that already has a `firewall.toml` keeps exactly its own rules). `ray firewall default allow` flips inbound TCP/UDP back to permissive (old behavior); `ray firewall default deny` is the explicit secure default; neither touches the outbound default. On **any** network the coordinator (any network-key holder) can **suggest firewall rules** to nodes — suggestions are advisory and decoupled from any "trusted" flag (which no longer exists). 
The suggestions ride in the signed `GroupBlob` (keyed by subject hostname, authored before peers exist; the `*` subject targets every node), and each node **materializes** the rules targeting its own hostname (+ the `*` subject) — resolving peer hostnames → identities from the same blob's member list (the `*` peer key = any peer), expanding each `proto:ports` token (e.g. `tcp:22`, `icmp`, `tcp:*`, `any`), and appending a network-scoped catch-all deny when there's an allow-list (whitelist mode; denies-only is blacklist; empty subject is open). Consent is a **per-node, per-network** choice: a node either **auto-accepts** suggestions (`ray join --auto-accept-firewall`, or `ray firewall auto-accept <net> on`, persisted as `config.auto_accept_firewall`: installs via `replace_network_rules` + saves) or queues them for manual `ray firewall accept|deny` (`pending_suggestions`). Hostname authority (so an auto-accepted "allow from alice" resolves to the real alice) comes from **invite binding**, not a network flag. Rules are re-materialized on every verified reconverge — driven by the 60s group poller, or by a **payload-free** `BlobUpdated`/`MemberSync` *trigger* that makes the node reconverge from the network-key-signed pkarr record (`reconverge_and_apply`/`fetch_verified_blob`), never from peer-supplied data; `Local` rules are never touched by reconvergence. Trust model: suggestions are consumed only from the verified blob (network-key-signed pkarr record → hash → blob → rules), never from a peer control message — control messages are triggers only, carrying no roster/firewall payload. Paired devices resolve to one user identity via `DeviceUserMap`.
- **Multiple admins = shared network key.** An admin is any machine holding the per-network secret; `ray admin add <net> <id>` grants the key to a member over the network's authenticated mesh ALPN (`AdminGrant`), making it a co-coordinator that can publish the signed blob, suggest firewall rules, and **admit fresh joiners**. The granting coordinator also sets `is_coordinator = true` on the grantee in the roster and republishes so the full coordinator set is visible in the signed blob — joiners use this to discover co-coordinators for dial-fallback. The grantee persists the key and, on receiving `AdminGrant`, calls `promote_to_coordinator` to swap its accept handler from `MemberAcceptState` to `CoordinatorAcceptState`. `ray admin list <net>` shows the local node + granted identities (local record; the shared key is not attributable).
- **Declarative apply (`ray apply`):** reconcile networks against a spec (`ray apply deploy.yaml`) — **YAML only** (`load` rejects non-`.yaml`/`.yml`). A spec is a `networks:` map of `<name> → SuggestedFirewall` (each network maps directly to its firewall subjects; the `*` subject/peer expresses "all hosts"/"any peer"). The orchestrator (`main::ipc_apply`) fetches `Status` once, then for each spec network: `Create` (a closed network) if absent (never joins), then publishes the network's firewall block as suggestions (idempotent — always replaces the live set). `--prune` publishes exactly the spec's subjects, dropping out-of-band suggestions for hosts no longer mentioned; without it, spec subjects merge over the live set (so `apply` never silently drops a subject). `--dry-run` echoes the normalized spec as YAML; `--example` prints a template. **Membership diff:** expected hosts = union of subject + peer hostnames across the spec's networks (excluding the `*` wildcard); joined hosts = this node + peer hostnames from `Status`. The gap is reported as `ray invite <net> --hostname <missing>` commands; `--invite-missing` mints the one-time hostname-bound invites via IPC. Because an invite-bound hostname is authoritative (overrides the joiner's `--hostname`, collisions rejected), the firewall subject/peer hostnames in the spec are exactly the names the admitted nodes carry — so suggestions always resolve the peers they name. No lock file; the live signed blob is state.
- **Direct connections (`ray connect`):** a friend-request flow for linking two peers with no shared room id or invite code. Each node has a standing, **rotatable contact key** (`AppConfig.contact_secret_key`, distinct from the transport identity and per-network keys), published to pkarr while active (`dht::publish_contact`, `_rayfish_contact` record = `contact_pubkey → current endpoint`) and advertised over `CONNECT_ALPN` (`rayfish/connect/1`). `ray connect <contact-id>` resolves the contact id → endpoint, dials `CONNECT_ALPN`, and sends `ConnectMsg::Request{from_contact_id, from_endpoint, hostname}`; the recipient queues it (`ProtocolRouter.pending_connects`) and replies `Pending`, so the initiator polls with backoff (`spawn_connect_retry`). `ray connections approve <id>` mints a fresh 2-peer network via `create_network_inner(.., direct=true, pre_approve=Some((peer, hostname)))` — restricted mode, auto-named `me-peer`, with the requester already in the `ApprovedList` so the published blob carries the approval. The minter records `(room_id, coordinator)` in `approved_connects`; the initiator's next poll gets `ConnectMsg::Approved` and joins the network normally (pre-approved → `Welcome`), then flags it `direct` in config (`join_direct`). A direct network is a real network — firewall, DNS, mesh all apply — but `ray status` shows it as role `[direct]` (`NetworkRole::Direct`, display-only) and hides the non-shareable room id. **Edge cases:** an offline recipient yields a clean "contact offline" error (publisher is active-gated); maps are keyed by transport endpoint id so they survive contact-key rotation (old contact id stops resolving once its 300s record TTLs out); duplicate requests are idempotent; and if both peers connect *and* approve each other simultaneously, only the higher `endpoint.id()` mints (the lower defers via `outgoing_connects`), so exactly one network forms.
- **File sharing:** `ray send` adds the file to iroh-blobs and sends a `FileOffer` over `FILES_ALPN`; receiver queues it; `ray files accept` fetches the blob by hash and verifies it.
- **Pairing:** primary issues a ticket (`bs58(endpoint_id || secret)`) over `PAIR_ALPN`; secondary authenticates and receives a `DeviceCert` binding its transport key to the primary's user identity. Backup/restore encrypts the identity key (argon2 + chacha20poly1305) into an `enc1…` base58 blob (`make_backup_blob`). `--1password` (alias `--op`) on `ray pair backup`/`restore` transports that blob to/from a 1Password item (default title `Rayfish Identity`, optional `--vault`) via the `op` CLI (`src/onepassword.rs`, create-or-update, secret piped via stdin not argv). 1Password is transport only — the blob stays password-encrypted, so a vault compromise alone can't unlock the key. All `op` calls are CLI-side in the user's context, never from the root daemon.
- **Hostname change:** `ray hostname` propagates immediately and is coordinator-authoritative. The coordinator keeps a continuous per-member control reader (`spawn_coordinator_control_reader`) on each member connection; a member's rename re-sends `MeshHello` over the existing connection, the coordinator resolves collisions (`name`/`name-1`/…), updates the roster + DNS, republishes the blob, then broadcasts a payload-free `MemberSync` *trigger*. The member applies its requested name optimistically and is corrected when it reconverges the roster from the signed record (on the `MemberSync` trigger, or the 60s poller). The coordinator renaming itself runs the same republish+broadcast directly. Receivers rebuild their DNS from the roster on every verified reconverge (triggered by `MemberSync`/`BlobUpdated` or the poller) via `apply_roster_to_dns` → `dns::sync_network_hostnames` (the roster is the single source of truth for `*.ray`), which also clears stale names for departed peers. Hostname authority at admission follows the **invite binding** (not a network flag): a join carrying an invite-bound hostname (`ray invite --hostname`) is assigned that exact name, and a clash with a different identity is rejected — no silent rename — so no admitted peer can claim another's name to take its suggested firewall rules (`hostname::admission_hostname`). A joiner-chosen (free) hostname keeps collision resolution (`name` → `name-1`).
- **Reconnection:** per-peer reader detects drop → coordinator removes the dead peer; joiner reconnects with exponential backoff (1s–30s) then re-sends `MeshHello`.
- **Leave:** on `ray leave`, the leaving node gracefully closes its connections with `forward::LEAVE_CODE` before local teardown. Peers see `DisconnectEvent.intentional = true`: the coordinator prunes the member from the roster, republishes the blob, then broadcasts a payload-free `MemberSync` trigger so other members reconverge from the (already-republished) signed record and drop it immediately; the 60s group poller is the backstop. A plain timeout/reset is *not* intentional, so an offline (but not departed) peer stays a known member.
- **up/down:** the daemon (endpoint, IPC, blob store, metrics) is always-on; the active VPN state (TUN up + system DNS + connected networks) is toggled by `activate()`/`deactivate()` tracked in `DaemonState.active`.
- **Report:** `ray report` → daemon `build_report()` gathers sysinfo + a `ForwardMetrics::snapshot()` + the *sanitized* `StatusResponse` (no secret keys) + recent log files, writes a `.tgz` to `/tmp`, and chowns it to the calling UID. The CLI prints the path and opens a pre-filled GitHub issue (`REPORT_REPO_URL`) for the user to attach the bundle. The bundle is local-first, so the user reviews it before sharing; a managed upload service can later replace the GitHub step.
- **Self-update (`ray update`):** queries the GitHub releases API (`rayfish/rayfish`, the same repo the `install.sh` bootstrap installer pulls from) for the latest tag, compares it to `CARGO_PKG_VERSION` with `semver` (`version_is_newer`, only upgrades when strictly newer unless `--force`), maps the host OS/arch to the published asset (`release_asset_name` → `ray-{os}-{arch}`), downloads the binary + its `.sha256`, **verifies SHA-256** before touching anything, then atomically swaps the running binary via the `self-replace` crate. If the system service is installed it re-runs `ensure_service_installed` and `restart_service_and_wait` (shared with `ray restart`) so the daemon comes back on the new binary. Needs root when the service is installed or the binary's directory isn't user-writable (`require_root`); `--check` and `ray version`/`--version` need no root. The raw release binaries are not archived, so no tar/gzip on this path.
- **Tor (optional):** `--tor` adds `TorCustomTransport` alongside relay; onion address derived from the iroh `SecretKey`. Needs a Tor daemon (`ControlPort 9051`).
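
The hostname rules in the **Hostname change** bullet (invite-bound names are exact and rejected on a clash with a different identity; free names fall back to `name` → `name-1`) can be sketched as follows. This is a minimal illustration, not the crate's `hostname::admission_hostname`: the `Identity` alias, the `Requested` enum, the `admit` helper, and the error string are all hypothetical, and treating a name already held by the same identity as available is an assumption.

```rust
use std::collections::HashMap;

/// Stand-in for a peer identity (hypothetical; the real key type is not shown here).
type Identity = u64;

/// How the joiner's hostname was chosen.
enum Requested<'a> {
    /// Name bound into the invite by the operator: authoritative.
    InviteBound(&'a str),
    /// Name picked by the joiner: may be renamed on collision.
    Free(&'a str),
}

/// Hypothetical admission helper: returns the hostname to assign, or an error
/// when an invite-bound name is already held by a different identity.
fn admit(
    roster: &HashMap<String, Identity>,
    joiner: Identity,
    req: Requested,
) -> Result<String, String> {
    // Assumption: a name is available if nobody holds it or the joiner already does.
    let available = |name: &str| roster.get(name).map_or(true, |owner| *owner == joiner);
    match req {
        Requested::InviteBound(name) if available(name) => Ok(name.to_string()),
        // No silent rename for invite-bound names: the join is rejected.
        Requested::InviteBound(name) => {
            Err(format!("hostname {name} is held by another identity"))
        }
        Requested::Free(name) if available(name) => Ok(name.to_string()),
        // Free names get the first unused numeric suffix: name-1, name-2, ...
        Requested::Free(name) => Ok((1u32..)
            .map(|n| format!("{name}-{n}"))
            .find(|candidate| available(candidate.as_str()))
            .expect("a finite roster always leaves some suffix free")),
    }
}
```

With a roster where `web` belongs to identity `1`, a free request for `web` from identity `2` resolves to `web-1`, while an invite-bound request for `web` from identity `2` is refused.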

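The reconnect schedule from the **Reconnection** bullet (exponential backoff, 1s–30s) might look like the sketch below. Only the two bounds come from the text; the doubling factor, the absence of jitter, and the `backoff_delay` name are assumptions made for illustration.

```rust
use std::time::Duration;

/// Lower and upper bounds taken from the text (1s to 30s).
const BACKOFF_MIN: Duration = Duration::from_secs(1);
const BACKOFF_MAX: Duration = Duration::from_secs(30);

/// Hypothetical helper: delay to wait before reconnect attempt `attempt` (0-based).
/// Assumes the delay doubles on every attempt and that no jitter is applied.
fn backoff_delay(attempt: u32) -> Duration {
    // `checked_shl` returns None once the shift reaches the bit width,
    // so very large attempt counts saturate instead of wrapping.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    BACKOFF_MIN.saturating_mul(factor).min(BACKOFF_MAX)
}
```

Under these assumptions the delays run 1s, 2s, 4s, 8s, 16s and then stay at 30s; after the wait the joiner would re-send `MeshHello` on the fresh connection.
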
## Conventions

- Use `cargo -q` for all cargo commands; `tracing` for logging (INFO default, `RUST_LOG` to override). The daemon also writes rolling daily log files under the directory returned by `logdir::log_dir()` (console output is unchanged for CLI commands). `main::init_tracing` composes the layers (console + file + optional OTLP) and returns a `LogGuard` that must stay alive for the lifetime of the process.
- Tracing carries spans, not just flat events: network lifecycle handlers (`create/join/leave/nuke_network`) use `#[tracing::instrument]`, and the per-peer reader (`forward::spawn_peer_reader`) + reconnect loop wrap their tasks in `info_span!("peer"/"reconnect", net=…, peer=…)` so report-bundle logs are correlatable per peer/network.
- `otel` feature (off by default): adds a `tracing-opentelemetry` layer exporting spans over OTLP/HTTP. Activated at runtime only when `OTEL_EXPORTER_OTLP_ENDPOINT` (or `..._TRACES_ENDPOINT`) is set; the provider is flushed on shutdown via `LogGuard::drop`.
- Panics are fail-fast in the daemon: `main::install_panic_hook` (set only for `ray daemon`) records the panic via `tracing::error!` and synchronously appends it to `panic.log` in the log dir, then calls `std::process::abort()`. The service unit restarts the daemon (`Restart=on-failure` / launchd `KeepAlive`). `ray report` bundles `panic.log` and, when the file is present, flags it in the issue title/body. A live-but-broken daemon would not trip the restart, so we crash cleanly rather than limp along.
- Never share I/O resources (TUN, sockets, streams) behind a Mutex — split into read/write halves. Avoid Mutex generally: prefer channels, atomics, or `RwLock`/`ArcSwap` for fast non-async state.
- ALPN per network: `rayfish/net/<pubkey-prefix>` (first 16 hex chars). File ALPN `rayfish/files/1`, pairing ALPN `rayfish/pair/1`.
- TUN MTU 1280 (IPv6 minimum link MTU, RFC 8200 §5; matches WireGuard/Tailscale). Wire format (control + IPC): 4-byte BE length + msgpack body.
- Room id = per-network public key string (discovery only). On a closed network, joining needs a one-time invite or operator approval; on an open network the room id alone admits. Invite code = `bs58(pubkey || coordinator || secret)`. Local aliases (adjective-noun-noun) are display-only.
- Config under `~/.config/rayfish/`: `secret_key`, `device_cert`, `networks.toml`, `firewall.toml`, `invites/<network>.toml` (coordinator-only).
- Always update docs (CLAUDE.md, README.md) after finishing a feature or significant change.
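
The wire format named above (4-byte big-endian length followed by a msgpack body) can be sketched with the standard library alone. This is an illustration under stated assumptions: the body is treated as opaque bytes (msgpack encoding and decoding are left out), the `write_frame`/`read_frame` names are hypothetical, and the `MAX_FRAME` cap is an assumed safety limit that does not come from this document.

```rust
use std::io::{self, Read, Write};

/// Assumed upper bound on a frame body; not taken from the source.
const MAX_FRAME: usize = 1 << 20;

/// Write one frame: 4-byte big-endian length, then the (already encoded) body.
fn write_frame<W: Write>(w: &mut W, body: &[u8]) -> io::Result<()> {
    if body.len() > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "body exceeds cap"));
    }
    // Safe cast: MAX_FRAME is far below u32::MAX.
    let len = body.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(body)
}

/// Read one frame, rejecting lengths above the cap before allocating.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header) as usize;
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds cap"));
    }
    let mut body = vec![0u8; len];
    // A short read surfaces as `UnexpectedEof` rather than a partial frame.
    r.read_exact(&mut body)?;
    Ok(body)
}
```

Checking the length before allocating keeps a malformed or hostile header from forcing a multi-gigabyte buffer, which is the main reason to cap frames on a control channel.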