saorsa-core 0.5.0

Saorsa - Core P2P networking library with DHT, QUIC transport, and four-word addresses
# Saorsa Core Agents API (AGENTS_API.md)

Copyright (C) 2024 Saorsa Labs Limited — Licensed under AGPL-3.0-or-later.

This document is written for LLM agents and autonomous clients that interact with Saorsa Core. It describes the stable agent-facing API surface, object models, addressing, and example flows for building large-scale peer-to-peer applications with identity, storage, messaging, real-time media, and groups.

Status: evolving, backwards-compatible where possible. All production calls are panic-free and authenticated by design. Examples use Rust-style signatures and JSON payloads, but the protocol is language-agnostic.


## Principles

- Trust-minimized: Identities, groups, and data are addressed by four-word identifiers hashed to 32-byte keys.
- Zero-panic production: No `unwrap`/`expect`/`panic!` in production code. Errors are explicit and typed.
- End-to-end encryption: ML-DSA for identity auth; MLS and PQC symmetric crypto for content.
- Two-tier storage: DHT distribution with FEC-sealed containers; optional local member disks and friend-mesh repair.
- Human-verifiable addressing: Four-word addressing prevents lookalike/phishing through a constrained dictionary and checksum.
- Structured telemetry and tracing: All subsystems emit structured events for observability.


## Addressing and Keys

- Four-Word Address: `[Word; 4]` (e.g., "river-spark-honest-lion"). Validated by the Four Word Networking (FWN) dictionary and encoding rules.
- Identity Key (`Key`): `blake3(utf8(join(words,'-')))` → 32 bytes.
- Context Keys: `compute_key(context: &str, content: &[u8])` → 32 bytes. Used for derived records such as device sets, manifests, websites, and virtual disks.
- Network Endpoints: Public IP endpoints can be represented as four-word strings for display (FW4/FW6 encodings).
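
The two derivations above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `identity_preimage`, `context_preimage`, `identity_key`, and `compute_key_sketch` are hypothetical names, and the hash is passed in as a closure so the sketch has no dependencies (in practice it would be BLAKE3, for example `blake3::hash` from the `blake3` crate). Plain concatenation of context and content is assumed from the `blake3("disk" || ID)` form used later in this document.

```rust
/// 32-byte key, matching the size of `Key` described above.
type Key32 = [u8; 32];

/// Bytes hashed for an identity key: the four words joined by '-'.
fn identity_preimage(words: &[&str; 4]) -> Vec<u8> {
    words.join("-").into_bytes()
}

/// Bytes hashed for a context key.
/// Assumption: plain concatenation `context || content`, with no length prefix.
fn context_preimage(context: &str, content: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(context.len() + content.len());
    buf.extend_from_slice(context.as_bytes());
    buf.extend_from_slice(content);
    buf
}

/// Identity key = hash(utf8(join(words, '-'))).
fn identity_key(words: &[&str; 4], hash: impl Fn(&[u8]) -> Key32) -> Key32 {
    hash(&identity_preimage(words))
}

/// Context key = hash(context || content), under the assumption above.
fn compute_key_sketch(context: &str, content: &[u8], hash: impl Fn(&[u8]) -> Key32) -> Key32 {
    hash(&context_preimage(context, content))
}
```

With a real BLAKE3 binding, the closure would be something like `|b| *blake3::hash(b).as_bytes()`. Validation against the FWN dictionary is deliberately left out here and should happen before any key is computed.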


## Object Model (Core Records)

- IdentityPacketV1
  - `v: u8`
  - `words: [String; 4]` — user-chosen four words
  - `id: Key` — `blake3(utf8(join(words,'-')))`, the identity key defined under Addressing and Keys
  - `pk: Vec<u8>` — ML-DSA public key
  - `sig: Vec<u8>` — ML-DSA signature over `utf8(join(words,'-'))`
  - `endpoints: Vec<NetworkEndpoint>` — optional public reachability
  - `ep_sig: Option<Vec<u8>>` — signature over `(id || pk || CBOR(endpoints))`
  - `website_root: Option<Key>` — public website root object key (see Website Disk)
  - `device_set_root: Key` — device forwards CRDT root (derived)

- DeviceSetV1
  - OR-Set of `Forward { proto, addr, meta }` entries for the identity’s devices and forwards

- GroupIdentityPacketV1
  - `v: u8`, `words: [String;4]`, `id: Key` — group identifier
  - `group_pk: Vec<u8>`, `group_sig: Vec<u8>` — the group's ML-DSA public key and its signature over the canonical message `(id || membership_root)`
  - `members: Vec<MemberRef { member_id: Key, member_pk: Vec<u8> }>`
  - `membership_root: Key` — Merkle root of sorted member IDs
  - `created_at: u64`, `mls_ciphersuite: Option<u16>`

- GroupPacketV1
  - Current group epoch state and container bindings: `membership`, `forwards_root`, `container_root`

- GroupForwardsV1
  - `endpoints: Vec<GroupEndpoint { member_pub, forward, ts }>` per member

- ContainerManifestV1
  - `object: Key`, `fec: { k, m, shard_size }`, `assets: Vec<Key>`, `sealed_meta: Option<Key>`
  - Represents a sealed container (object root) with FEC parameters and asset references
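
`DeviceSetV1` is described above as an OR-Set of forwards. For readers unfamiliar with that CRDT, here is a minimal sketch of observed-remove semantics: every add carries a unique tag, a remove tombstones only the tags it has observed, and a merge unions both sides, so a concurrent re-add survives a remove. The `OrSet` type, the `Tag` scheme, and the trimmed-down `Forward` (without `meta`) are illustrative assumptions, not the record's actual encoding.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified forward entry; the real `Forward` also carries `meta`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Forward {
    proto: String,
    addr: String,
}

/// Unique tag per add operation: (replica id, per-replica counter). Hypothetical scheme.
type Tag = (u64, u64);

#[derive(Clone, Debug, Default)]
struct OrSet {
    adds: HashMap<Forward, HashSet<Tag>>,
    tombstones: HashSet<Tag>,
}

impl OrSet {
    /// Record an add under a fresh tag.
    fn add(&mut self, fwd: Forward, tag: Tag) {
        self.adds.entry(fwd).or_default().insert(tag);
    }

    /// Tombstone only the add-tags this replica has observed so far.
    fn remove(&mut self, fwd: &Forward) {
        if let Some(tags) = self.adds.get(fwd) {
            self.tombstones.extend(tags.iter().copied());
        }
    }

    /// An element is present if at least one of its add-tags is not tombstoned.
    fn contains(&self, fwd: &Forward) -> bool {
        self.adds
            .get(fwd)
            .map_or(false, |tags| tags.iter().any(|t| !self.tombstones.contains(t)))
    }

    /// Merge is a union of adds and tombstones, so it is commutative and idempotent.
    fn merge(&mut self, other: &OrSet) {
        for (fwd, tags) in &other.adds {
            self.adds.entry(fwd.clone()).or_default().extend(tags.iter().copied());
        }
        self.tombstones.extend(other.tombstones.iter().copied());
    }
}
```

If one device removes a forward while another concurrently re-adds it under a new tag, both replicas converge on the forward being present after merging, which is the usual add-wins behaviour of an OR-Set.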


## Identity API

Purpose: Claim and fetch identities, publish reachability and device forwards, and subscribe to changes.

- identity_claim(words: [Word; 4], pubkey: PubKey, sig: Sig) -> Result<()>
  - Validates words (FWN), verifies ML-DSA signature over `utf8(join(words,'-'))` with `pubkey`, computes `id` and `device_set_root`, then stores IdentityPacketV1 under `id` in the DHT with quorum policy.
  - Errors: invalid words, invalid pubkey/sig, DHT policy violations.

- identity_fetch(key: Key) -> Result<IdentityPacketV1>
  - Fetches identity packet by `id` key from the DHT.

- identity_publish_endpoints_signed(id_key: Key, endpoints: Vec<NetworkEndpoint>, ep_sig: Sig) -> Result<()>
  - Verifies `ep_sig` over `(id || pk || CBOR(endpoints))` using the stored identity `pk` and updates the packet.
  - Auto-computes FW4/FW6 display forms when possible.

- device_publish_forward(id_key: Key, fwd: Forward) -> Result<()>
  - Adds or updates a forward in `DeviceSetV1` under `device_set_root`. Uses delegated write auth; emits events.

- device_publish_forward_signed(id_key: Key, fwd: Forward, sig: Sig) -> Result<()>
  - Same as above with explicit signature material for authorization contexts that require proof on canonical content.

- device_subscribe(id_key: Key) -> Subscription<Forward>
  - Subscribes to forward updates for an identity.

Notes
- `WriteAuth` enforces signatures at `dht_put` time. For identity and group canonical records, signatures are verified against canonical bytes to prevent malleability.
- `NetworkEndpoint { ipv4, ipv6, fw4, fw6 }` supports both raw and four-word display addressing.
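
To make the canonical-bytes rule concrete, the following sketch builds the message that `ep_sig` covers, `(id || pk || CBOR(endpoints))`, and gates an update on a signature check. It is a hedged illustration: `endpoint_signing_bytes` and `check_endpoint_update` are hypothetical helpers, the CBOR encoding of the endpoints is taken as an input rather than produced here, the ML-DSA verifier is injected as a closure, and bare concatenation without length prefixes is assumed from the notation above.

```rust
/// Build the byte string that `ep_sig` covers: id || pk || CBOR(endpoints).
/// Assumption: bare concatenation; `endpoints_cbor` is already CBOR-encoded by the caller.
fn endpoint_signing_bytes(id: &[u8; 32], pk: &[u8], endpoints_cbor: &[u8]) -> Vec<u8> {
    let mut msg = Vec::with_capacity(id.len() + pk.len() + endpoints_cbor.len());
    msg.extend_from_slice(id);
    msg.extend_from_slice(pk);
    msg.extend_from_slice(endpoints_cbor);
    msg
}

/// Accept an endpoint update only if `ep_sig` verifies over the canonical bytes.
/// `verify` stands in for ML-DSA verification: (pk, msg, sig) -> valid?
fn check_endpoint_update(
    id: &[u8; 32],
    pk: &[u8],
    endpoints_cbor: &[u8],
    ep_sig: &[u8],
    verify: impl Fn(&[u8], &[u8], &[u8]) -> bool,
) -> Result<(), String> {
    let msg = endpoint_signing_bytes(id, pk, endpoints_cbor);
    if verify(pk, &msg, ep_sig) {
        Ok(())
    } else {
        Err("endpoint signature does not verify over canonical bytes".to_string())
    }
}
```

Because the verifier always sees bytes rebuilt from the stored `id` and `pk`, a signature produced for one identity or one endpoint list cannot be replayed against another.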


## Group API

Purpose: Create and publish group identities, maintain group state, endpoints, and storage containers.

- group_identity_create(words: [Word; 4], members: Vec<MemberRef>) -> Result<(GroupIdentityPacketV1, GroupKeyPair)>
  - Derives `id` from words, computes `membership_root` from sorted `member_id`s, generates ML-DSA group keypair, signs canonical message `(id || membership_root)`.

- group_identity_publish(packet: GroupIdentityPacketV1) -> Result<()>
  - Validates words/id and signature; stores canonical packet at `id` key.

- group_identity_fetch(id_key: Key) -> Result<GroupIdentityPacketV1>
  - Reads group identity by id key.

- group_forwards_put(fwd: &GroupForwardsV1, group_id: &[u8], policy: &PutPolicy) -> Result<PutReceipt>
  - Stores under `blake3("group-fwd" || group_id)`; used for up-to-date group forwarding metadata.

- group_forwards_fetch(group_id: &[u8]) -> Result<GroupForwardsV1>

- group_put(pkt: &GroupPacketV1, policy: &PutPolicy) -> Result<PutReceipt>
  - Stores current group epoch/control record under `blake3("group" || group_id)`.

- group_fetch(group_id: &[u8]) -> Result<GroupPacketV1>

Conventions
- Group storage containers are referenced via `GroupPacketV1.container_root` and described by `ContainerManifestV1`.
- For MLS, `mls_ciphersuite` and `proof` fields are carried as opaque bytes for verification by MLS-capable clients.
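
The `membership_root` and the canonical group message `(id || membership_root)` can be illustrated with a small sketch. The document only states that the root is a Merkle root over sorted member IDs, so the tree shape here is an assumption: leaves are `hash(member_id)`, parents are `hash(left || right)`, and an odd node is promoted unchanged. The function names are hypothetical and the hash is injected as a closure (BLAKE3 in practice).

```rust
type Key32 = [u8; 32];

/// Hash two child nodes into their parent: hash(left || right).
fn hash_pair<H: Fn(&[u8]) -> Key32>(l: &Key32, r: &Key32, hash: &H) -> Key32 {
    let mut buf = [0u8; 64];
    buf[..32].copy_from_slice(l);
    buf[32..].copy_from_slice(r);
    hash(&buf[..])
}

/// Merkle root over sorted, de-duplicated member IDs (tree shape is an assumption).
fn membership_root_sketch<H: Fn(&[u8]) -> Key32>(member_ids: &[Key32], hash: H) -> Key32 {
    let mut ids = member_ids.to_vec();
    ids.sort();
    ids.dedup();
    let mut level: Vec<Key32> = ids.iter().map(|id| hash(&id[..])).collect();
    if level.is_empty() {
        let empty: &[u8] = &[];
        return hash(empty);
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| match pair {
                [l, r] => hash_pair(l, r, &hash),
                [single] => *single, // odd node is promoted to the next level
                _ => unreachable!("chunks(2) yields one or two items"),
            })
            .collect();
    }
    level[0]
}

/// Canonical message signed by the group key: id || membership_root.
fn group_signing_bytes(id: &Key32, membership_root: &Key32) -> Vec<u8> {
    let mut msg = Vec::with_capacity(64);
    msg.extend_from_slice(id);
    msg.extend_from_slice(membership_root);
    msg
}
```

Sorting before hashing is what makes the root independent of the order in which members were supplied, so two clients holding the same member set compute the same root.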


## DHT API

Purpose: Authenticated, policy-driven reads/writes to the Trust-Weighted Kademlia (TwDHT).

- dht_put(key: Key, bytes: Bytes, policy: &PutPolicy) -> Result<PutReceipt>
  - `PutPolicy { quorum: usize, ttl: Option<Duration>, auth: Box<dyn WriteAuth> }`
  - Performs record-type aware verification (Identity/Group canonical checks, DeviceSet CRDT semantics, etc.), publishes change events, and records telemetry.
  - PutReceipt: `{ key, timestamp, storing_nodes: Vec<Vec<u8>> }`.

- dht_get(key: Key, quorum: usize) -> Result<Bytes>
  - Fetches with quorum and records telemetry.

- dht_watch(key: Key) -> Subscription<Bytes>
  - Event stream of updates for a DHT key.

- set_dht_instance(dht: Arc<TwDht>) -> bool
  - Installs a process-global DHT for the API.

Security
- Identity and Group records enforce word validity, id/key matching, and ML-DSA signature verification on canonical bytes.
- Endpoint updates require explicit `ep_sig` independent from base identity `sig`.


## Routing & Trust API

- record_interaction(peer: Vec<u8>, outcome: Outcome) -> Result<()>
  - Updates EigenTrust with interaction outcome and records telemetry.
  - `Outcome = { Ok, Timeout, BadData, Refused }`.

- eigen_trust_epoch() -> Result<()>
  - Triggers a maintenance tick; used for scheduling trust recomputations.

- route_next_hop(target: Vec<u8>) -> Option<Contact>
  - Returns a trust-weighted best next-hop `Contact { node_id, endpoint }` for target routing (simplified selection combining XOR-distance and trust).
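
One way to combine XOR distance with trust, shown here purely as an illustration, is to shortlist the `k` candidates closest to the target and then pick the most trusted among them. The document does not specify the actual weighting, so `next_hop_sketch`, the shortlist size `k`, and the `f64` trust score are assumptions.

```rust
type NodeId = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct Contact {
    node_id: NodeId,
    endpoint: String,
}

/// Kademlia-style XOR distance; comparing the arrays lexicographically
/// is the same as comparing the distances as big-endian integers.
fn xor_distance(a: &NodeId, b: &NodeId) -> NodeId {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Shortlist the `k` closest candidates, then return the one with the highest trust.
fn next_hop_sketch(target: &NodeId, candidates: &[(Contact, f64)], k: usize) -> Option<Contact> {
    let mut sorted: Vec<&(Contact, f64)> = candidates.iter().collect();
    sorted.sort_by_key(|entry| xor_distance(&entry.0.node_id, target));
    sorted
        .into_iter()
        .take(k)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|entry| entry.0.clone())
}
```

With `k = 1` this degenerates to plain closest-node routing, and larger `k` trades some proximity for trust. A production selector would likely blend the two scores rather than apply them in sequence.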


## Transport API (QUIC)

Provides low-level primitives for direct connections and typed streams. WebRTC bridging is available via messaging modules for real-time A/V.

- quic_connect(ep: &Endpoint { address: String }) -> Result<Conn { peer: Vec<u8> }>
  - Creates/initializes a P2P node if needed and connects to `address`.

- quic_open(conn: &Conn, class: StreamClass) -> Result<Stream { id: u64, class: StreamClass }>
  - `StreamClass = { Control, Mls, File, Media }` (telemetry-tagged).

Media Notes
- Real-time audio/video/screen-share flows are supported via WebRTC over QUIC with signaling handled by the messaging/webrtc bridge. Agents generally don’t open raw media streams; they invoke call flows in the Messaging API (see below) which use these transport primitives under the hood.


## Storage Control API

High-level placement and maintenance helpers for FEC-sealed content.

- place_shards(object_id: [u8; 32], count: usize) -> Vec<Vec<u8>>
  - Returns node IDs for shard placement using trust-weighted proximity.

- provider_advertise_space(free: u64, total: u64)
  - Publishes capacity to the DHT under a well-known key pattern.

- repair_request(object_id: [u8; 32]) -> RepairPlan
  - Returns `RepairPlan { object_id, missing_shards: Vec<usize>, repair_nodes: Vec<Vec<u8>> }`.
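
The arithmetic behind a repair decision can be sketched as below. It assumes an MDS-style erasure code (Reed-Solomon is a common choice) in which any `k` of the `k + m` shards are enough to rebuild an object; the document does not name the code, so treat that as an assumption. `FecParams` mirrors the `fec: { k, m, shard_size }` fields, while `RepairStatus` and `repair_status` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug)]
struct FecParams {
    k: usize,          // data shards
    m: usize,          // parity shards
    shard_size: usize, // bytes per shard
}

impl FecParams {
    fn total_shards(&self) -> usize {
        self.k + self.m
    }

    /// Payload bytes carried by one stripe of `k` data shards.
    fn stripe_data_bytes(&self) -> usize {
        self.k * self.shard_size
    }

    /// Storage overhead factor: (k + m) / k.
    fn overhead(&self) -> f64 {
        self.total_shards() as f64 / self.k as f64
    }
}

#[derive(Debug, PartialEq)]
struct RepairStatus {
    missing_shards: Vec<usize>,
    recoverable: bool,
}

/// Given the shard indices still reachable, list the missing ones and
/// report whether at least `k` shards remain (assumed sufficient to rebuild).
fn repair_status(fec: &FecParams, present: &[usize]) -> RepairStatus {
    let total = fec.total_shards();
    let mut seen = vec![false; total];
    for &i in present {
        if i < total {
            seen[i] = true;
        }
    }
    let missing_shards: Vec<usize> = (0..total).filter(|&i| !seen[i]).collect();
    let recoverable = total - missing_shards.len() >= fec.k;
    RepairStatus { missing_shards, recoverable }
}
```

With `k = 8` and `m = 4`, an object tolerates the loss of any four shards at a storage overhead of 1.5x, and the `missing_shards` list corresponds to the field of the same name in `RepairPlan`.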


## Friend Mesh Backup API

Optional cooperative backup among friends/devices with rotation.

- friend_mesh_plan(data_size: u64, mesh_config: &FriendMeshConfig) -> FriendBackupPlan
  - `FriendMeshConfig { mesh_id: Key, members: Vec<FriendMeshMember>, replication_factor, rotation_schedule }`
  - Returns `FriendBackupPlan { total_shards, shard_size, assignments: Vec<FriendBackupAssignment> }`.
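
A backup plan of this shape could be produced by a simple round-robin assignment, sketched below. The actual planner is not described in this document, so the strategy, the `rotation` offset as a model of the rotation schedule, and the names `shard_count`, `Assignment`, and `plan_assignments` are all assumptions for illustration.

```rust
/// Number of shards needed for `data_size` bytes at `shard_size` bytes each (rounded up).
fn shard_count(data_size: u64, shard_size: u64) -> u64 {
    if shard_size == 0 {
        return 0;
    }
    (data_size + shard_size - 1) / shard_size
}

#[derive(Debug, PartialEq)]
struct Assignment {
    shard: usize,
    member: usize, // index into the mesh member list
}

/// Round-robin placement: each shard gets `replication_factor` copies on
/// consecutive members, shifted by `rotation` so that assignments move over time.
fn plan_assignments(
    total_shards: usize,
    members: usize,
    replication_factor: usize,
    rotation: usize,
) -> Vec<Assignment> {
    if members == 0 {
        return Vec::new();
    }
    // Never place two copies of the same shard on one member.
    let copies = replication_factor.min(members);
    let mut out = Vec::with_capacity(total_shards * copies);
    for shard in 0..total_shards {
        for r in 0..copies {
            out.push(Assignment { shard, member: (shard + r + rotation) % members });
        }
    }
    out
}
```

Incrementing `rotation` on each scheduled rotation shifts every shard to the next member, which spreads long-term storage load evenly across the mesh.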


## Virtual Disks and Websites

Every entity (individual, group, organization, channel) can expose two logical disks:

1) Private Disk (entity-scoped, group-encrypted)
   - Root Key: `disk_root = compute_key("disk", entity_id.as_bytes())`
   - Organization: content addressed by path → object key mapping using `ContainerManifestV1` for each root object.
   - Encryption: MLS or group ML-DSA derived symmetric keys; objects are sealed and sharded with FEC `(k,m,shard_size)` across DHT, with optional member-local caches.
   - Access: membership governed; members reconstruct via DHT + local caches.

2) Website Disk (public)
   - Root Key: `website_root` in `IdentityPacketV1` or `compute_key("website", entity_id.as_bytes())` if not set.
   - Convention: `home.md` is the entry file (Markdown-only web). Assets referenced by relative paths resolve to `assets/` keyed objects in the same container or sibling keys.
   - Publishing: write manifests and assets to DHT under context keys, set/refresh `website_root` on identity.
   - Browsing: agents fetch `IdentityPacketV1`, read `website_root`, fetch `ContainerManifestV1` at root, then fetch `home.md` and linked assets.

Addressing Examples
- Individual disk: `disk_root = blake3("disk" || ID)` where `ID = blake3(utf8(join(words,'-')))`.
- Group disk: same construction with the group’s `id` from `GroupIdentityPacketV1`.
- Channel disk: derive channel key: `channel_id = compute_key("channel", group_id.as_bytes())` then `disk_root = compute_key("disk", channel_id.as_bytes())`.


## Messaging API (High-Level Service)

The MessagingService coordinates storage, transport, and encryption for direct and group messaging and calls.

Types
- `FourWordAddress` — identity handle (string form of four words)
- `ChannelId`, `ThreadId`, `MessageId` — logical identifiers
- `MessageContent` — rich content (text, reactions, attachments)
- `EncryptedMessage`, `RichMessage`, `DeliveryReceipt`, `DeliveryStatus`

Construction
- new(identity: FourWordAddress, dht_client: DhtClient) -> Result<MessagingService>
  - Wires persistence, PQC key-exchange, transport, and events.

Send & Receive
- send_message(recipients: Vec<FourWordAddress>, content: MessageContent, channel_id: ChannelId, options: SendOptions) -> Result<(MessageId, DeliveryReceipt)>
  - Encrypts per recipient and sends via transport (DHT + direct), stores locally first.

- subscribe_messages(channel_filter: Option<ChannelId>) -> Receiver<ReceivedMessage>
  - Async stream of inbound messages (decrypted and persisted before delivery).

- get_message_status(message_id: MessageId) -> Result<DeliveryStatus>
- get_message(message_id: MessageId) -> Result<RichMessage>
- mark_user_online(user: FourWordAddress) -> Result<()>
- mark_delivered(message_id: MessageId, recipient: FourWordAddress) -> Result<()>
- process_message_queue() -> Result<()>
- encrypt_message(recipient, channel_id, content) -> Result<EncryptedMessage>
- decrypt_message(encrypted) -> Result<RichMessage>

Realtime A/V Calls
- The messaging/webrtc bridge supports:
  - Direct 1:1 audio/video/screen-share over WebRTC with QUIC transport.
  - Group calls by creating an MLS-secured session; members receive dynamic group endpoints from `GroupForwardsV1`.
  - Signaling exchanged as messages; media flows are direct peer-to-peer when possible.


## Example: Building “Communitas”

Communitas is a large-scale, P2P collaboration app blending WhatsApp (messaging/calls), Dropbox (storage/sync), Slack (channels/threads), and a new Markdown-based web. It is phishing-resistant (four-word networking) and AI-friendly (structured APIs and explicit semantics).

Core Features
- Identity: users and groups claim four-word addresses; groups carry membership roots and published forwards.
- Messaging: 1:1 and group conversations with reactions, attachments, and threads.
- Calls: direct audio/video/screen-share with group calls via MLS sessions.
- Storage: end-to-end encrypted “virtual disks” per entity (user/group/channel) with DHT + local caches and FEC-sealed containers.
- Web: every entity can publish a public Markdown site (`home.md`) at its website disk root.

High-Level Flows
1) User Onboarding
   - Generate ML-DSA keypair; select four words; call `identity_claim()` with signature over words.
   - Publish device forwards (`device_publish_forward_signed`) and endpoints (`identity_publish_endpoints_signed`).

2) Creating a Group
   - Select group words; gather initial members (`MemberRef { member_id, member_pk }`).
   - Call `group_identity_create()` and then `group_identity_publish()`.
   - Store `GroupForwardsV1` to advertise callable members.

3) Private Disk Setup (Group)
   - Compute `disk_root = compute_key("disk", group_id.as_bytes())`.
   - Create a root `ContainerManifestV1` with FEC parameters and encrypted `sealed_meta` (MLS key).
   - Place shards with `place_shards(object_id, k+m)` and upload to DHT; optionally seed local caches.

4) Public Website
   - Compute or fetch `website_root`; create a container with `home.md` and assets.
   - Write container manifest under `compute_key("manifest", object_root)` and set `website_root` in identity.
   - Resolution: clients fetch identity → `website_root` → manifest → fetch `home.md` and linked assets.

5) Messaging and Calls
   - For 1:1: create or join `ChannelId` (derived from pair’s IDs), call `send_message()` and listen on `subscribe_messages()`.
   - For groups: derive `ChannelId` from group id; agents retrieve `GroupForwardsV1` to locate members and initiate MLS-secured calls; media flows over WebRTC/QUIC.

6) Friend Mesh Backup (Optional)
   - Maintain resilient copies by planning with `friend_mesh_plan()`; rotate shard assignments on schedule.

Sample Pseudocode (Agent)
```rust
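// Pseudocode. `mldsa_generate`, `mldsa_sign`, `sign_endpoints`, and `container_manifest_put`
// are placeholder helpers, not calls documented above. Assumed to be in scope:
// `delegated_sig`, `group_words`, `members`, `endpoints_for_members`, `policy`,
// `website_root`, `home_md_key`, `css_key`, `my_fw_address`, `dht`, `peer_fw`, `channel_id`.

// 1) Claim identity: sign the hyphen-joined words with a fresh ML-DSA key
let words = ["river".into(), "spark".into(), "honest".into(), "lion".into()];
let (pk, sk) = mldsa_generate();
let sig = mldsa_sign(&sk, words.join("-"));
identity_claim(words.clone(), PubKey::new(pk.clone()), Sig::new(sig)).await?;
// Identity key per "Addressing and Keys"; assumes `Key: From<[u8; 32]>`
let id_key = Key::from(*blake3::hash(words.join("-").as_bytes()).as_bytes());

// 2) Publish endpoints and forwards; `ep_sig` covers (id || pk || CBOR(endpoints))
let endpoints = vec![NetworkEndpoint { ipv4: Some(("203.0.113.10".into(), 443)), ipv6: None, fw4: None, fw6: None }];
let ep_sig = sign_endpoints(&sk, &id_key, &pk, &endpoints);
identity_publish_endpoints_signed(id_key.clone(), endpoints, ep_sig).await?;
device_publish_forward_signed(id_key.clone(), Forward::quic("203.0.113.10:443"), delegated_sig).await?;

// 3) Create group and publish forwards (the group keypair is not used again here)
let (g_pkt, _g_kp) = group_identity_create(group_words, members)?;
group_identity_publish(g_pkt.clone()).await?;
group_forwards_put(&GroupForwardsV1 { v: 1, endpoints: endpoints_for_members, proof: None }, &g_pkt.id.as_bytes()[..], &policy).await?;

// 4) Store website container: public content, so no `sealed_meta`
let manifest = ContainerManifestV1 { v: 1, object: website_root, fec: FecParams { k: 8, m: 4, shard_size: 65536 }, assets: vec![home_md_key, css_key], sealed_meta: None };
container_manifest_put(&manifest, &policy).await?;

// 5) Messaging: send a text message to one peer
let svc = MessagingService::new(my_fw_address, dht).await?;
let (_id, _receipt) = svc.send_message(vec![peer_fw], MessageContent::Text("Hi".into()), channel_id, SendOptions::default()).await?;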
```


## Anti-Phishing and Name Safety

- Four-word addresses are validated against the FWN dictionary and encoding. Because words map through a checksum-bearing scheme, close-word collisions are minimized and detectable.
- Display of endpoints and identities defaults to four-word forms (FW4/FW6) where possible.
- Agents should treat any UI string not derived from four-word encodings as untrusted.


## Error Handling and Telemetry

- All APIs return explicit `Result<T, E>`; production code never panics. Errors include descriptive messages and can carry machine-parsable codes.
- `tracing` emits JSON-structured events for: DHT puts/gets, auth failures, timeouts, stream class usage, and message delivery outcomes.
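
A typed error with a machine-parsable code might look like the sketch below. The variants, the code strings, and the `split_words` helper are hypothetical and chosen only to illustrate the pattern; the library's real error type is not reproduced in this document. The helper checks only the shape of a four-word string, not membership in the FWN dictionary.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type: each variant maps to a stable, machine-parsable code.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ApiError {
    InvalidWords { got: usize },
    InvalidSignature,
    QuorumNotMet { wanted: usize, got: usize },
    Timeout,
}

impl ApiError {
    fn code(&self) -> &'static str {
        match self {
            ApiError::InvalidWords { .. } => "E_INVALID_WORDS",
            ApiError::InvalidSignature => "E_INVALID_SIGNATURE",
            ApiError::QuorumNotMet { .. } => "E_QUORUM_NOT_MET",
            ApiError::Timeout => "E_TIMEOUT",
        }
    }
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::InvalidWords { got } => write!(f, "expected 4 non-empty words, got {}", got),
            ApiError::InvalidSignature => write!(f, "signature verification failed"),
            ApiError::QuorumNotMet { wanted, got } => {
                write!(f, "quorum not met: wanted {}, got {}", wanted, got)
            }
            ApiError::Timeout => write!(f, "operation timed out"),
        }
    }
}

impl Error for ApiError {}

/// Split "a-b-c-d" into four words, returning a typed error instead of panicking.
fn split_words(addr: &str) -> Result<[String; 4], ApiError> {
    let parts: Vec<&str> = addr.split('-').collect();
    match parts.as_slice() {
        [a, b, c, d] if parts.iter().all(|w| !w.is_empty()) => {
            Ok([a.to_string(), b.to_string(), c.to_string(), d.to_string()])
        }
        _ => Err(ApiError::InvalidWords {
            got: parts.iter().filter(|w| !w.is_empty()).count(),
        }),
    }
}
```

An agent can branch on `code()` for retry or escalation logic while logging the `Display` text for humans.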


## Security Notes

- Identity auth: ML-DSA-65 signatures; signatures over canonical content prevent malleability.
- Content encryption: PQC-friendly symmetric crypto (ChaCha20-Poly1305 via saorsa-pqc). Group content can use MLS session keys.
- Sharding: FEC `(k,m,shard_size)` improves resiliency; shard placement uses trust-weighted selection; repairs planned via `repair_request()` and optional friend-mesh rotation.
- Keys and secrets must never be persisted in plaintext. Use secure storage and zeroization where applicable.
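
As an illustration of the zeroization point, the sketch below wraps secret bytes in a type that overwrites its buffer on drop. `SecretBytes` is a hypothetical type written for this example; in real code a maintained crate such as `zeroize` is the more robust choice, since a hand-rolled wrapper does not cover copies left behind by moves or reallocations.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical wrapper that clears its contents when dropped.
struct SecretBytes(Vec<u8>);

impl SecretBytes {
    /// Overwrite every byte with zero using volatile writes,
    /// which the compiler is not allowed to optimize away.
    fn wipe(&mut self) {
        for b in self.0.iter_mut() {
            // SAFETY: `b` is a valid, exclusive reference to an initialized byte.
            unsafe { ptr::write_volatile(b, 0) };
        }
        // Keep the compiler from reordering later operations before the wipe.
        compiler_fence(Ordering::SeqCst);
    }

    /// Borrow the secret for use; callers should avoid copying it.
    fn expose(&self) -> &[u8] {
        &self.0
    }
}

impl Drop for SecretBytes {
    fn drop(&mut self) {
        self.wipe();
    }
}
```

The wrapper intentionally does not implement `Clone` or `Debug`, which makes accidental duplication or logging of key material a compile error rather than a runtime leak.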


## Compatibility and Versioning

- All top-level records are versioned (`v: u8`).
- New fields are added in a backwards-compatible manner. Unknown fields must be ignored by agents.
- Wire-compatibility is ensured through CBOR encoding of canonical records and explicit signature bytes.


## Quick Reference (Calls)

- Identity: `identity_claim`, `identity_fetch`, `identity_publish_endpoints_signed`, `device_publish_forward`, `device_publish_forward_signed`, `device_subscribe`.
- Groups: `group_identity_create`, `group_identity_publish`, `group_identity_fetch`, `group_forwards_put`, `group_forwards_fetch`, `group_put`, `group_fetch`.
- DHT: `dht_put`, `dht_get`, `dht_watch`, `set_dht_instance`.
- Routing & Trust: `record_interaction`, `eigen_trust_epoch`, `route_next_hop`.
- Transport: `quic_connect`, `quic_open`.
- Storage Control: `place_shards`, `provider_advertise_space`, `repair_request`.
- Friend Mesh: `friend_mesh_plan`.


## Implementation Notes for Agents

- Always validate four-word inputs before computing keys.
- Use canonical signing bytes as described for identities, endpoints, and groups.
- For websites, prefer immutable content addresses in manifests; update manifests atomically and then update `website_root` to point to the new object root.
- For large files, use `ContainerManifestV1` + chunked/FEC-sealed storage; place shards via `place_shards` and cache locally for fast group reads.
- For calls, prefer the MessagingService call flows; only reach for raw QUIC when building bespoke transports.