Server-side per-row LoroDoc cache with snapshot persistence.
For CRDT-backed entities (crdt: true in the manifest, the default),
every row corresponds to one LoroDoc. This store owns those docs
in memory, hydrates them on demand from a sidecar SQLite table,
writes every commit through to that table, and projects the doc state
into the JSON shape Pylon’s existing storage layer expects.
§Persistence shape
Single sidecar table:
CREATE TABLE _pylon_crdt_snapshots (
entity TEXT NOT NULL,
row_id TEXT NOT NULL,
snapshot BLOB NOT NULL,
updated_at TEXT NOT NULL,
PRIMARY KEY (entity, row_id)
);
Snapshots are full-state Loro snapshots (ExportMode::Snapshot).
Loro applies internal compaction so the snapshot size stays bounded;
we don’t track an op log separately.
§In-memory cache
Active rows live in a HashMap<(entity, row_id), Arc<Mutex<LoroDoc>>>.
First access for a row hydrates the doc from the sidecar (or creates
a fresh one). Subsequent accesses reuse the in-memory doc — required
both for correctness (Loro’s CRDT identity is per-doc-instance) and
perf (snapshot decode is ~100µs per row).
No eviction yet. Working sets up to ~100K active rows are fine on commodity hardware (~5-50 MB). For larger working sets a follow-up adds LRU eviction with snapshot reload on next access.
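The hydrate-once, reuse-afterwards behaviour described above can be sketched as follows. This is a minimal illustration using only the standard library, not the actual `LoroStore` code: `Doc` is a stand-in for `LoroDoc`, and `DocCache`, `get_or_hydrate`, and the `load_snapshot` callback are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for `LoroDoc` (hypothetical): holds raw snapshot bytes only.
#[derive(Debug, Default)]
pub struct Doc {
    pub state: Vec<u8>,
}

/// Cache key: (entity, row_id).
type RowKey = (String, String);

/// Hypothetical cache type; the real `LoroStore` fields are not shown in the source.
#[derive(Default)]
pub struct DocCache {
    docs: Mutex<HashMap<RowKey, Arc<Mutex<Doc>>>>,
}

impl DocCache {
    /// Return the cached doc for a row, hydrating it on first access.
    ///
    /// `load_snapshot` stands in for the sidecar lookup. It is only called
    /// when the row is not cached yet; `None` means "no snapshot stored",
    /// in which case a fresh, empty doc is created.
    pub fn get_or_hydrate<F>(&self, entity: &str, row_id: &str, load_snapshot: F) -> Arc<Mutex<Doc>>
    where
        F: FnOnce() -> Option<Vec<u8>>,
    {
        // The map lock is held during hydration, so two callers can never
        // create two distinct doc instances for the same row.
        let mut docs = self.docs.lock().unwrap();
        let key = (entity.to_string(), row_id.to_string());
        docs.entry(key)
            .or_insert_with(|| {
                let state = load_snapshot().unwrap_or_default();
                Arc::new(Mutex::new(Doc { state }))
            })
            .clone()
    }
}
```

Every caller for a given `(entity, row_id)` receives a clone of the same `Arc`, which is what preserves the one-instance-per-row property the text calls out as a correctness requirement.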
§Bandwidth: full snapshot per write (TODO)
Every CRDT-mode write triggers a binary WS broadcast carrying the row’s full current snapshot, not just the incremental update. Loro’s compaction bounds individual snapshots, but the per-write cost still scales with total state size, not write size.
Concrete numbers:
| Workload | Snapshot/row | Per-write fanout |
|---|---|---|
| Chat message | ~200 B | tiny |
| Boring CRUD record | ~500 B | tiny |
| Whiteboard with 1k strokes | ~30 KB | uncomfortable |
| Document with 50K-char body | ~80 KB | bad |
Multiply by connected_clients × writes_per_second to get total
broadcast bandwidth. For chat-shaped workloads it’s free. For collab
whiteboards / large documents it bites once you pass ~10 connected
clients on a hot row.
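As a rough worked example of that multiplication, the helper below computes broadcast bytes per second for one hot row. The function name and the sample rate of 2 writes per second are assumptions made for illustration; the snapshot sizes come from the table above.

```rust
/// Broadcast bandwidth for a single hot row, in bytes per second:
/// snapshot size × connected clients × writes per second.
/// Hypothetical helper, not part of the Pylon API.
pub fn broadcast_bytes_per_sec(
    snapshot_bytes: u64,
    connected_clients: u64,
    writes_per_second: u64,
) -> u64 {
    snapshot_bytes * connected_clients * writes_per_second
}

/// Chat message (~200 B), 10 clients, 2 writes/s.
pub fn chat_example() -> u64 {
    broadcast_bytes_per_sec(200, 10, 2)
}

/// Document with a 50K-char body (~80 KB), 10 clients, 2 writes/s.
pub fn document_example() -> u64 {
    broadcast_bytes_per_sec(80_000, 10, 2)
}
```

Under these assumed rates the chat row costs about 4 KB/s in total, while the large document costs about 1.6 MB/s for the same number of clients and writes.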
§Switching to incremental updates
Loro already supports export(ExportMode::updates(version_vector))
returning only the ops a peer hasn’t seen — the building block is
there. What’s missing is the per-client tracking:
- Subscribe protocol: clients tell the server “I want updates for rows X, Y, Z” instead of every CRDT write fanning out to every client. Pylon’s existing room layer is the natural transport once room semantics extend to per-row subscriptions.
- Server-side state: (client_id, entity, row_id) → version_vector, so the server knows what each client is missing. Bounded by the subscribe set; LRU-evicted with the doc cache.
- Encoder swap: notify_crdt calls encode_update_since(vv) instead of encode_snapshot() and ships frame type 0x11 (CRDT_FRAME_UPDATE) instead of 0x10 (CRDT_FRAME_SNAPSHOT). Wire format already reserves both bytes.
- New-subscriber bootstrap: first frame is still a snapshot (0x10), subsequent frames are deltas (0x11).
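The per-client tracking and the snapshot-then-delta bootstrap could look roughly like the sketch below. It is an illustration under stated assumptions, not the planned implementation: `SubscriptionState`, `next_frame`, and `forget_client` are hypothetical names, and a single `u64` counter stands in for a real Loro version vector. Only the two frame-type bytes are taken from the text.

```rust
use std::collections::HashMap;

/// Frame type bytes named in the text.
pub const CRDT_FRAME_SNAPSHOT: u8 = 0x10;
pub const CRDT_FRAME_UPDATE: u8 = 0x11;

/// Stand-in for a Loro version vector (hypothetical): one op counter.
pub type Version = u64;

/// Tracking key: (client_id, entity, row_id).
type SubKey = (String, String, String);

/// Hypothetical server-side record of what each client has already seen.
#[derive(Default)]
pub struct SubscriptionState {
    seen: HashMap<SubKey, Version>,
}

impl SubscriptionState {
    /// Choose the frame type for the next broadcast to one client and
    /// record that the client will be at `current` afterwards.
    ///
    /// Returns `(frame_type, since)`: a first frame is a snapshot with no
    /// `since`; later frames are updates starting from the version the
    /// client was last sent.
    pub fn next_frame(
        &mut self,
        client_id: &str,
        entity: &str,
        row_id: &str,
        current: Version,
    ) -> (u8, Option<Version>) {
        let key = (client_id.to_string(), entity.to_string(), row_id.to_string());
        match self.seen.insert(key, current) {
            None => (CRDT_FRAME_SNAPSHOT, None),
            Some(prev) => (CRDT_FRAME_UPDATE, Some(prev)),
        }
    }

    /// Drop all tracking for a client, for example on disconnect.
    /// The next frame it receives after resubscribing is a snapshot again.
    pub fn forget_client(&mut self, client_id: &str) {
        self.seen.retain(|(c, _, _), _| c.as_str() != client_id);
    }
}
```

This sketch assumes every frame sent is also received. Reconnects and missed frames, which the effort estimate lists under production hardening, would need a resync path on top of it.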
Estimated effort: ~2 days for a working slice plus a week of production hardening (correct VV tracking under reconnects, garbage-collecting subscriptions on disconnect, handling missed frames via resync request).
Until then this implementation is fine for chat / boring CRUD / demo workloads. Don’t run a Figma clone on it.
Structs§
- LoroStore - Server-side per-row LoroDoc cache + persistence layer.
Enums§
Constants§
- CREATE_SIDECAR_SQL - SQL to create the snapshot sidecar. Idempotent. Called by Runtime
constructor for any database where CRDT mode could be in use (always,
since crdt: true is the default).
Functions§
- ensure_sidecar - Create the sidecar table. Safe to call repeatedly.