# Distributed Sync
The [`triblespace-net`](https://github.com/triblespace/triblespace-rs/tree/main/triblespace-net)
crate adds peer-to-peer synchronization over [iroh](https://www.iroh.computer/):
gossip for HEAD announcements, a DHT for content discovery, direct QUIC
for bulk transfer. The user-visible surface is a single wrapper type,
`Peer<S>`, that turns any triblespace store into a node on a
distributed graph without changing how the storage traits look from
the outside.
Enable it through the facade crate's `net` feature:
```toml
[dependencies]
triblespace = { version = "x.y.z", features = ["net"] }
```
```rust,ignore
use triblespace::net::peer::{Peer, PeerConfig};
```
## Mental Model
`Peer<S>` takes any `S: BlobStore + BlobStorePut + BranchStore<Blake3>`
and wraps it into a node that participates in the iroh network. Two
layers of behavior are bolted onto the normal storage trait calls:
- **Reads auto-drain incoming gossip.** Every call through `reader()`,
`head(id)`, or `branches()` transparently pulls any pending
`NetEvent`s from the network thread into the wrapped store and
re-publishes any deltas from external writers (e.g. another process
appended to the same pile file). This mirrors `Pile::refresh`: an
explicit `Peer::refresh` method is available for tight loops, but
normal storage use Just Works without it.
- **Writes auto-publish.** Calls through `put` / `update` delegate to
the inner store and then announce blobs to the DHT and gossip branch
HEADs to the topic mesh, all via the background network thread.
The network thread is a private implementation detail: `Peer::new`
spawns it; `Peer::drop` winds it down. Async stays jailed inside that
thread — the storage traits stay sync.
```rust,ignore
let pile = triblespace::core::repo::pile::Pile::open(path)?;
// Clone the key here: it is needed again for the Repository below.
let peer = Peer::new(pile, signing_key.clone(), PeerConfig {
    peers: vec![bootstrap_endpoint_id],
    gossip_topic: Some("my-team-graph".into()),
});
let mut repo = Repository::new(peer, signing_key, TribleSet::new())?;
// From here it's just a Repository — commit, push, pull, query.
```
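
The auto-drain and auto-publish behavior described above can be pictured with a toy wrapper. This is a minimal sketch, not the crate's implementation: `ToyPeer`, `Event`, and the integer branch ids and heads are hypothetical stand-ins, and two `std::sync::mpsc` channels play the role of the background network thread.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical, simplified stand-in for an incoming network event.
pub enum Event {
    Head { branch: u32, head: u64 },
}

/// Toy wrapper: reads drain queued events first, writes announce afterwards.
pub struct ToyPeer {
    local: HashMap<u32, u64>,     // heads of our own branches
    tracking: HashMap<u32, u64>,  // heads learned from the network
    incoming: Receiver<Event>,    // filled by the network thread
    outgoing: Sender<(u32, u64)>, // consumed by the network thread
}

impl ToyPeer {
    pub fn new(incoming: Receiver<Event>, outgoing: Sender<(u32, u64)>) -> Self {
        ToyPeer {
            local: HashMap::new(),
            tracking: HashMap::new(),
            incoming,
            outgoing,
        }
    }

    /// Apply every event that is already queued, without blocking.
    pub fn refresh(&mut self) {
        while let Ok(Event::Head { branch, head }) = self.incoming.try_recv() {
            self.tracking.insert(branch, head);
        }
    }

    /// Read path: drain pending events, then answer from local state.
    pub fn tracking_head(&mut self, branch: u32) -> Option<u64> {
        self.refresh();
        self.tracking.get(&branch).copied()
    }

    /// Read our own branch head (no network involvement in this toy).
    pub fn local_head(&self, branch: u32) -> Option<u64> {
        self.local.get(&branch).copied()
    }

    /// Write path: update the local state, then hand the new head to the
    /// network side for announcement.
    pub fn update(&mut self, branch: u32, head: u64) {
        self.local.insert(branch, head);
        // Ignore send errors: the network side may already have shut down.
        let _ = self.outgoing.send((branch, head));
    }
}
```

The point of the shape is that callers only ever see synchronous methods; anything asynchronous sits on the far side of the channels.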
## Tracking Branches
When a peer learns about a remote HEAD — via gossip arrival or an
explicit `track` call — it materializes the data as a **tracking
branch**: a local branch whose metadata carries `tracking_remote_branch`
(the remote branch id), `tracking_peer` (the publisher's key), and
`remote_name` (instead of the usual `metadata::name`). This keeps
tracking branches invisible to normal discovery: `ensure_branch(name)`
won't find them, `lookup_branch(name)` returns only your own branches,
and the `is_tracking_branch` filter lets the Peer avoid re-gossiping
its mirrors back to the network.
Tracking branches are your sandbox for remote state. Merging them into
your own same-named branch is how you "accept" the remote changes (see
the *Merge Flow* section below).
## Transports
Three protocols ride on the same iroh endpoint:
- **Gossip mesh** (HyParView + PlumTree via `iroh-gossip`): all peers
on the same topic receive every branch HEAD announcement. 81-byte
messages: a 1-byte tag, 16-byte branch id, 32-byte HEAD hash,
32-byte publisher key. Eventual delivery; duplicates deduped on
the wire.
- **DHT** (via `iroh-dht`): content discovery for blobs. On write,
`announce_provider(blob_hash)` tells the DHT "I have this blob." On
read, `find_providers(blob_hash)` returns peers to fetch from.
Content-addressed by design — any provider with the right bytes
passes blake3 verification.
- **Direct QUIC RPC** (`PILE_SYNC_ALPN = "/triblespace/pile-sync/3"`):
point-to-point operations that don't fit the gossip model —
listing a peer's branches, asking for a specific branch's HEAD,
fetching a single blob by hash, enumerating a blob's child
references. Each operation uses one stream; stream FIN signals the
end, and nil sentinels (all-zero branch ids or hashes) terminate sequences.
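
To make the 81-byte gossip message concrete, here is a sketch of an encoder and decoder for it. The field sizes come from the description above; the field order (tag, branch id, HEAD hash, publisher key) is assumed to match the order in which they are listed, and the `HeadAnnouncement` type and its methods are hypothetical names, not the crate's API.

```rust
/// Total size: 1 + 16 + 32 + 32 = 81 bytes.
pub const HEAD_MSG_LEN: usize = 81;

/// Hypothetical in-memory form of a branch HEAD announcement.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HeadAnnouncement {
    pub tag: u8,
    pub branch: [u8; 16],
    pub head: [u8; 32],
    pub publisher: [u8; 32],
}

impl HeadAnnouncement {
    /// Lay the fields out back to back (assumed order, see lead-in).
    pub fn encode(&self) -> [u8; HEAD_MSG_LEN] {
        let mut out = [0u8; HEAD_MSG_LEN];
        out[0] = self.tag;
        out[1..17].copy_from_slice(&self.branch);
        out[17..49].copy_from_slice(&self.head);
        out[49..81].copy_from_slice(&self.publisher);
        out
    }

    /// Reject anything that is not exactly 81 bytes long.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != HEAD_MSG_LEN {
            return None;
        }
        let mut branch = [0u8; 16];
        let mut head = [0u8; 32];
        let mut publisher = [0u8; 32];
        branch.copy_from_slice(&bytes[1..17]);
        head.copy_from_slice(&bytes[17..49]);
        publisher.copy_from_slice(&bytes[49..81]);
        Some(HeadAnnouncement {
            tag: bytes[0],
            branch,
            head,
            publisher,
        })
    }
}
```

A fixed-size layout like this needs no length prefixes or framing beyond the message boundary that the gossip layer already provides.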
## `track` vs `fetch`
Two primitives cover the two levels of "go get this":
- `peer.track(endpoint_id, branch_id)` — fire-and-forget. Opens a
QUIC stream to the remote, asks for its HEAD, then walks the
reachable closure of blobs (BFS over the parent-to-children graph
via `op_children`, pulling each blob through DHT-first then
peer-fallback). When the whole closure has landed locally, emits a
`NetEvent::Head` that the Peer drains into a freshly-materialized
tracking branch. The tracking branch only advances **after** every
referenced blob is in the pile — external readers either see the
old HEAD (with its complete closure) or the new HEAD (with its
closure), never a half-torn state.
- `peer.fetch::<T, Sch>(endpoint_id, handle)` — blocking single-blob
RPC. Pass a typed handle, pick what comes out: `Blob<Sch>` for
bytes-only with zero decode cost, or the decoded type (`TribleSet`,
`anybytes::View<str>`, etc.) for the deserialized value. The bytes
land in the wrapped store via `BlobStorePut::put` and the return
value is decoded from those same bytes.
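
The closure walk that `track` performs, and the rule that the tracking HEAD moves only once the whole closure is local, can be sketched as follows. This is a toy model under stated assumptions: hashes are plain `u64`s, the remote is a `HashMap` from a blob to the blobs it references (standing in for `op_children` plus the blob fetch), and `Tracking` and `track` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Toy stand-in for a 32-byte blake3 hash.
pub type Hash = u64;

/// Toy remote: maps a blob hash to the hashes it references.
pub type Remote = HashMap<Hash, Vec<Hash>>;

#[derive(Default)]
pub struct Tracking {
    pub blobs: HashSet<Hash>, // blobs already stored locally
    pub head: Option<Hash>,   // current tracking-branch HEAD
}

/// Fetch the closure of `new_head` breadth-first. The HEAD is advanced
/// only if every reachable blob could be fetched; otherwise it is left
/// untouched and `false` is returned.
pub fn track(local: &mut Tracking, remote: &Remote, new_head: Hash) -> bool {
    let mut queue = VecDeque::new();
    let mut seen = HashSet::new();
    queue.push_back(new_head);
    seen.insert(new_head);

    while let Some(hash) = queue.pop_front() {
        let children = match remote.get(&hash) {
            Some(children) => children,
            None => return false, // blob unavailable: keep the old HEAD
        };
        local.blobs.insert(hash); // "fetch" the blob into the local store
        for &child in children {
            if seen.insert(child) {
                queue.push_back(child);
            }
        }
    }

    // Every blob of the closure is local now, so the HEAD may move.
    local.head = Some(new_head);
    true
}
```

If the walk fails halfway, some blobs have already landed, but no branch points at them, so readers still see the old HEAD with its complete closure.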
For the common "pull a branch by name" workflow,
`peer.pull_branch(endpoint_id, name)` composes the two: it lists the
remote's branches, pulls each metadata blob via `fetch`, queries for
`metadata::name` to find the match, hands the matching branch to
`track`, and blocks until the tracking branch materializes. It returns
the local tracking branch id, ready to merge.
## Merge Flow
Once a tracking branch exists, merging it into its same-named local
branch is the normal Repository workflow plus one helper:
```rust,ignore
use triblespace::net::tracking::{merge_tracking_into_local, MergeOutcome};
match merge_tracking_into_local(&mut repo, tracking_id, "main")? {
    MergeOutcome::Empty => { /* tracking had no head yet */ }
    MergeOutcome::UpToDate => { /* local already at that state */ }
    MergeOutcome::Merged { new_head } => {
        // local "main" advanced: either a fast-forward or a real
        // merge commit, decided by Workspace::merge_commit.
    }
}
```
Under the hood the helper calls `ensure_branch("main")`, pulls a
workspace for the tracking branch and one for the local branch, runs
`merge_commit(remote_head)` on the local workspace, and pushes only
when needed. The `merge_commit` call chooses between a no-op, a
fast-forward, and a merge commit by walking ancestors.
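
The three-way choice between no-op, fast-forward, and merge commit can be illustrated with a small ancestor walk. This is a sketch of the general technique on a toy commit graph, not the code behind `Workspace::merge_commit`; `History`, `Decision`, `is_ancestor`, and `decide` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

pub type CommitId = u32;

/// Toy history: commit id -> parent ids.
pub type History = HashMap<CommitId, Vec<CommitId>>;

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    UpToDate,
    FastForward,
    MergeCommit,
}

/// True if `ancestor` is reachable from `from` by following parent
/// links (a commit counts as its own ancestor here).
pub fn is_ancestor(history: &History, ancestor: CommitId, from: CommitId) -> bool {
    let mut stack = vec![from];
    let mut seen = HashSet::new();
    while let Some(commit) = stack.pop() {
        if commit == ancestor {
            return true;
        }
        if seen.insert(commit) {
            if let Some(parents) = history.get(&commit) {
                stack.extend(parents.iter().copied());
            }
        }
    }
    false
}

/// Decide how `local` should absorb `remote`.
pub fn decide(history: &History, local: CommitId, remote: CommitId) -> Decision {
    if is_ancestor(history, remote, local) {
        Decision::UpToDate // local already contains the remote head
    } else if is_ancestor(history, local, remote) {
        Decision::FastForward // remote strictly extends local
    } else {
        Decision::MergeCommit // histories diverged
    }
}
```

With commits `2` and `3` both children of `1`, and `4` a merge of `2` and `3`, `decide(&h, 2, 3)` yields `MergeCommit` while `decide(&h, 3, 4)` yields `FastForward`.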
For long-running sync daemons, the same helper runs in a loop over
every tracking branch on every refresh tick.
**Convergence rounds.** When two peers diverge on the same branch:
- *Sequential gossip* (one peer's merge lands before the other's starts)
converges in one round-pair. The first side produces a merge commit
`AM` containing both original commits as parents; the second side
sees `AM`, finds its own head in `ancestors(AM)`, and fast-forwards.
- *Parallel gossip* (both peers merge before either sees the other's
merge) also converges in one round-pair — and without producing a
merge commit on the second side. Merge commits are **content-addressed**:
they carry no author-specific bits (no signature, no `created_at`,
entity id derived intrinsically from the parent set via `entity!`'s
content-hash form), so two peers merging the same parent set produce
bit-identical merge commits that dedup via blob hash alone.
Either way the system converges in one round-pair. The tests in
`triblespace-net/tests/two_peer_convergence.rs` exercise both cases
and serve as regression coverage for the property. Content-addressed
merges are also why `merge_tracking_into_local` is safe to run in a
tight polling loop without worrying about merge-commit churn.
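
Why two peers end up with the same merge commit can be shown with a toy id derivation. The sketch below assumes the id depends only on the set of parents; it uses FNV-1a as a small deterministic stand-in for blake3, and `merge_commit_id` is a hypothetical helper, not the `entity!` macro's actual derivation.

```rust
/// FNV-1a (64-bit): a tiny deterministic hash, standing in for blake3
/// in this sketch only.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &byte in bytes {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Derive a merge-commit id from the parent set alone: no author,
/// no signature, no timestamp. Parents are sorted and deduplicated so
/// the result does not depend on the order in which a peer saw them.
pub fn merge_commit_id(parents: &[u64]) -> u64 {
    let mut sorted = parents.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut bytes = Vec::with_capacity(sorted.len() * 8);
    for parent in &sorted {
        bytes.extend_from_slice(&parent.to_be_bytes());
    }
    fnv1a(&bytes)
}
```

Peer A merging `[10, 20]` and peer B merging `[20, 10]` compute the same id, so the second merge to arrive is recognized as already present.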
## Ordering Under Pressure
Gossip is eventually consistent, so under a flood of HEAD updates the
closure fetches can finish out of order: with HEAD_1 → HEAD_2 → HEAD_3,
HEAD_1's closure may take longer over the DHT and complete *after*
HEAD_3 has already advanced the tracking branch. Without protection,
HEAD_1 would clobber HEAD_3 and the branch would regress.
To prevent this, `branch_metadata` stamps every published branch
metadata blob with `metadata::updated_at: NsTAIInterval` from
`Epoch::now()`. TAI has no leap-second jumps, so the stamps do not
step backward when a leap second occurs.
`update_tracking_branch` reads the stamp from both the current and
incoming metadata and rejects updates whose timestamp is not strictly
newer — logged as `[tracking] skip stale update for branch <bid>`
for observability. The synthesized tracking branch metadata mirrors
the remote's timestamp so subsequent comparisons share a reference
frame.
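
The stale-update check reduces to a strict timestamp comparison. The sketch below is a simplified model: a single `i128` nanosecond count stands in for the `NsTAIInterval` value, and `TrackingHead` and `update_tracking` are hypothetical names rather than the crate's `update_tracking_branch`.

```rust
/// Toy tracking-branch state.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TrackingHead {
    pub head: u64,
    /// Simplified stand-in for the `metadata::updated_at` stamp.
    pub updated_at_ns: i128,
}

/// Apply `incoming` only if its stamp is strictly newer than the
/// current one. Returns whether the update was applied.
pub fn update_tracking(current: &mut Option<TrackingHead>, incoming: TrackingHead) -> bool {
    if let Some(cur) = current.as_ref() {
        if incoming.updated_at_ns <= cur.updated_at_ns {
            // Stale (or equal) stamp: skip it so the branch never regresses.
            return false;
        }
    }
    *current = Some(incoming);
    true
}
```

In the scenario above, HEAD_3 (stamp 30) lands first and is applied; when HEAD_1 (stamp 10) completes later, the comparison rejects it and the branch stays at HEAD_3.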
Tradeoff: publishing the same HEAD twice at different moments produces
different metadata blob hashes now (the timestamps differ). Gossip
convergence degrades slightly — duplicate blobs for the same semantic
state — but correctness is preserved and regressions are eliminated.
## CLI Surface
The `trible` CLI exposes sync via the `pile net` subcommand:
```
trible pile net identity [--key PATH]
    Print this node's iroh identity (generates a key if needed).

trible pile net sync <PILE> [--peers ...] [--topic T] [--key PATH]
    Long-running bidirectional sync. Without --topic, serves only
    (accepts direct pulls but doesn't gossip). With --topic, joins
    the gossip mesh and auto-merges incoming tracking branches into
    same-named local ones every tick.

trible pile net pull <PILE> <REMOTE> --branch NAME [--key PATH]
    One-shot pull of a named branch from a specific peer (REMOTE is
    the peer's iroh node id, 64-char hex). Pull-only mode: no gossip
    subscription, direct QUIC + DHT fetch, materialize a tracking
    branch, merge into local. Useful for "give me a copy of that
    project" workflows.
```
## What's Deferred
A few structural improvements have surfaced in design discussion but
are not implemented yet:
- **Incremental commit-chain advance.** Today the tracking branch only
moves when the whole reachable closure of a HEAD is local. Under
sustained gossip pressure on large histories, we could fall
arbitrarily behind. A git-like incremental walker (parallel-fetch
commit contents, advance the tracking branch commit-by-commit in
topological order) would give steady progress at the cost of
exposing intermediate states to readers.
- **`CachingStore<P>` with on-miss fetch.** A middleware that wraps a
`Peer` and does DHT-backed on-miss fetching inside `BlobStoreGet::get`,
with a policy callback for gating by size / schema / context. Would
cover the "cache eviction + lazy fetch" workflows that current
eager-only semantics can't.
- **Schema-aware traversal in `track`.** `op_children` today scans
parent blob bytes for 32-byte chunks that look like hashes. That's
cheap and peer-agnostic but pulls more than strictly necessary when
a blob contains handle-sized non-hash data. A schema-aware walker
that parses each blob as its declared schema and enumerates referenced
handles could be precise, but adds significant traversal complexity.
All three are additive: the current model stays correct as a
strict-closure / eager-only baseline that these improvements build on.
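
For reference, the byte-scanning traversal attributed to `op_children` above could look roughly like this. The sketch makes two assumptions that the text does not confirm: chunks are taken at 32-byte-aligned offsets, and "looks like a hash" means "names a blob the store already knows". `scan_children` and `Hash32` are hypothetical names.

```rust
use std::collections::HashSet;

pub type Hash32 = [u8; 32];

/// Scan a blob in aligned 32-byte chunks and keep every chunk that
/// names a known blob. Trailing bytes shorter than 32 are ignored.
pub fn scan_children(blob: &[u8], known: &HashSet<Hash32>) -> Vec<Hash32> {
    let mut children = Vec::new();
    for chunk in blob.chunks_exact(32) {
        let mut candidate = [0u8; 32];
        candidate.copy_from_slice(chunk);
        if known.contains(&candidate) {
            children.push(candidate);
        }
    }
    children
}
```

The scan needs no schema knowledge, which is what makes it peer-agnostic; the cost is that any 32-byte field that happens to equal a stored hash is treated as a reference.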