# vantage-live design
## What this is
A `LiveTable` wraps an existing `AnyTable` (the "master") and adds a local
cache. Reads consult the cache first; misses fall through to the master and
populate the cache on the way back. Writes go to the master and invalidate
the cache. Optionally, an external event source (SurrealDB LIVE, a Kafka
topic, anything that can produce `LiveEvent`s) keeps the cache fresh
without polling.
The point is to make UI code non-blocking when it shouldn't be — scrolling
through a list of clients on a phone shouldn't wait for the network on
every page change, and editing a record shouldn't lock the form while the
write is in flight.
```rust
// A regular table, somewhere remote
let clients_remote = Client::surreal_table(db);
// Wrap it. "clients" is the cache key — caller chooses, caller owns.
let clients = LiveTable::new(
AnyTable::from_table(clients_remote),
"clients",
RedbCache::open("./cache.redb")?,
);
// LiveTable implements TableLike, so it slots into AnyTable too —
// UI code doesn't know it's talking to a cache.
let any = AnyTable::new(clients);
```
`LiveTable` implements `TableLike`, so anywhere your code accepts an
`AnyTable` it accepts a `LiveTable`. The wrapping is invisible to
consumers.
## Scope
In v1:
- Implements only the value-set traits (`ReadableValueSet`,
`WritableValueSet`, `InsertableValueSet`, `ActiveRecordSet`). Cache
stores `Record<ciborium::Value>` end-to-end — same shape `AnyTable`
uses.
- Read-side cache, keyed by caller-supplied `cache_key` plus page number.
- Writes routed to master (or a caller-supplied alternative target),
queued on a worker task so the call site doesn't block.
- Sloppy invalidation: any write or live event blows the entire cache
for that `cache_key`.
- Pluggable cache backend (`RedbCache`, `MemCache`, `NoCache`).
- Pluggable event source via the `LiveStream` trait.
Not in v1:
- The entity-shaped traits (`DataSet<E>` / `ReadableDataSet<E>` / etc.).
Callers who want typed reads deserialise the cached `Record<Value>`
themselves; we add the entity layer if a real workload demands it.
- Multi-page glue (when UI ipp > master ipp). The field is stored on
the LiveTable but ignored; we'll wire it in once a real workload
needs it.
- Per-page surgical invalidation. Sloppy is good enough until proven
otherwise.
- `RecordEdit` / snapshot-based dirty tracking. Different concern,
different crate, comes later.
- TTL-based expiry.
## Architecture
### The contract
`LiveTable` implements the standard *value-set* traits from
`vantage-dataset`: `ReadableValueSet`, `WritableValueSet`,
`InsertableValueSet`, and `ActiveRecordSet` (auto-derived from the
previous two). It also implements `TableLike`, so wrapping into
`AnyTable` works without an adapter. **No new public dataset traits.** A
consumer that already speaks `Record<Value>` doesn't need to learn
anything new.
Out of scope for v1: the entity-shaped traits (`DataSet<E>`,
`ReadableDataSet<E>`, `WritableDataSet<E>`, `ActiveEntitySet<E>`).
`AnyTable` is a `Record<Value>`-shaped abstraction anyway, so the cache
operates on records throughout. Entity-level wrapping can come later if
we find a real use case — until then, it's a layer that consumers can
build on top by deserialising records themselves.
Everything below is what the trait impls do internally — a cache lookup
in front of every read, a queue + worker behind every write, a
`LiveStream` keeping the cache honest.
### The struct
```rust
pub struct LiveTable {
master: Arc<RwLock<AnyTable>>,
cache_key: String,
cache: Arc<dyn Cache>,
custom_write_target: Option<Arc<RwLock<AnyTable>>>,
write_queue: mpsc::Sender<WriteOp>, // internal, not pub
live_stream: Option<Arc<dyn LiveStream>>,
// Master ipp is set once at construction. Changing it would
// invalidate every cached page anyway — make a new LiveTable.
master_ipp: Option<i64>,
// Pagination state from set_pagination(); used to compute
// the cache page suffix on each read.
pagination: RwLock<Option<Pagination>>,
}
```
- `master` is `Arc<RwLock<…>>` so swap doesn't break outstanding handles.
- `master_ipp` is immutable after construction; we store it but v1 doesn't
use it (we trust caller to keep UI ipp ≤ master ipp; multi-page glue
comes later).
- `custom_write_target` is `None` by default — writes go to master.
Override when you want writes to land somewhere other than where reads
came from (e.g. a "submissions" table that gets reviewed before merging
into the main one).
### Cache (infrastructure trait, not a dataset trait)
The dataset traits don't have a notion of "cache slot," so this is one
piece of new surface — but it's an internal building block, not a public
contract for consumers.
```rust
#[async_trait]
pub trait Cache: Send + Sync {
async fn get(&self, key: &str) -> Result<Option<CachedRows>>;
async fn put(&self, key: &str, rows: CachedRows) -> Result<()>;
/// Drop everything under a prefix. v1 invalidation calls this with
/// the bare `cache_key` — every page suffix below it goes.
async fn invalidate_prefix(&self, prefix: &str) -> Result<()>;
}
pub struct CachedRows {
pub rows: IndexMap<String, Record<ciborium::Value>>,
pub fetched_at: SystemTime,
}
```
Three impls in v1:
- `RedbCache` — disk-backed, takes a folder. Inside, one redb file
(`vlive.redb`) with one redb table per `cache_key`, namespaced
`__vlive__{cache_key}`. Sub-keys (`page_n`, `id/foo`) are `&str`
inside that table; values are CBOR-encoded `CachedRows`.
`invalidate_prefix(cache_key)` drops the whole redb table — O(1)-ish.
redb's exclusive file lock means one process per cache folder.
- `MemCache` — `Arc<RwLock<HashMap<String, CachedRows>>>`. Fast, fine for
tests and short-lived processes.
- `NoCache` — every method is a no-op / returns `None`. Equivalent to
bypassing the LiveTable wrapper, useful for parity tests.
### Read path — what `list_values` actually does
```rust
impl ReadableValueSet for LiveTable {
async fn list_values(&self) -> Result<IndexMap<Self::Id, Record<Self::Value>>> {
let page = self.pagination.read().clone().unwrap_or_default();
let key = format!("{}/page_{}", self.cache_key, page.get_page());
if let Some(cached) = self.cache.get(&key).await? {
return Ok(cached.rows);
}
let mut master = self.master.write().await;
master.set_pagination(Some(page));
let rows = master.list_values().await?;
self.cache
.put(&key, CachedRows { rows: rows.clone(), fetched_at: now() })
.await?;
Ok(rows)
}
async fn get_value(&self, id: &Self::Id) -> Result<Option<Record<Self::Value>>> {
// Single-row reads skip the page math:
let key = format!("{}/id/{}", self.cache_key, id);
// … same shape as list_values, but cache.get/put on the per-id key …
}
}
```
### Write path — `WriteOp` is private
```rust
// internal — not in lib.rs's public surface
enum WriteOp {
Insert { id: Id, record: Record<Value>, reply: oneshot::Sender<Result<Record<Value>>> },
Replace { id: Id, record: Record<Value>, reply: oneshot::Sender<Result<Record<Value>>> },
Patch { id: Id, partial: Record<Value>, reply: oneshot::Sender<Result<Record<Value>>> },
Delete { id: Id, reply: oneshot::Sender<Result<()>> },
DeleteAll { reply: oneshot::Sender<Result<()>> },
InsertReturnId { record: Record<Value>, reply: oneshot::Sender<Result<Id>> },
}
impl WritableValueSet for LiveTable {
async fn insert_value(&self, id: &Self::Id, record: &Record<Self::Value>)
-> Result<Record<Self::Value>>
{
let (tx, rx) = oneshot::channel();
self.write_queue
.send(WriteOp::Insert { id: id.clone(), record: record.clone(), reply: tx })
.await?;
rx.await?
}
// replace_value / patch_value / delete / delete_all → same pattern
}
impl InsertableValueSet for LiveTable {
async fn insert_return_id_value(&self, record: &Record<Self::Value>) -> Result<Self::Id> {
// same queue dispatch, awaits the InsertReturnId oneshot
}
}
```
The worker task drains the queue, applies each op against
`custom_write_target.unwrap_or(master)`, and on success calls
`cache.invalidate_prefix(&cache_key)`. Failure modes:
- Master rejects the write → the `oneshot` carries the `Err` back to the
caller. Cache stays untouched (no false invalidation).
- Worker panics → all pending `oneshot`s drop, callers get `RecvError`.
Worker is supervised; a new one starts on next `LiveTable::new`.
Fire-and-forget callers wrap `insert_value(...).await` in `tokio::spawn`
and ignore the future — same pattern that works on any `WritableValueSet`.
### LiveStream trait
```rust
#[async_trait]
pub trait LiveStream: Send + Sync {
fn subscribe(&self) -> Pin<Box<dyn Stream<Item = LiveEvent> + Send>>;
}
pub enum LiveEvent {
Changed, // generic "something moved"
Inserted { id: String },
Updated { id: String },
Deleted { id: String },
}
```
V1 treats every event the same: invalidate the whole `cache_key`. The id
variants exist for future surgical invalidation; the generic `Changed`
covers stream sources that don't deliver row-level detail.
Implementations live in their own crates / modules:
- `vantage-surrealdb` provides `SurrealLiveStream` over LIVE queries.
- A future `vantage-kafka` or app code can ship a `KafkaLiveStream`.
- Tests use a `manual` stream that lets the test push events in.
### TableLike + AnyTable
`LiveTable` implements `TableLike`, so:
```rust
let live = LiveTable::new(any_master, "clients", cache);
let any = AnyTable::new(live); // drop-in for UI / generic code
```
`TableLike` metadata methods (`table_name`, `columns`, `id_field`,
`references`) pass through to master under a read lock. `set_pagination`
stores into LiveTable's own `pagination` field — that's how the cache
key gets the right page suffix.
## Testing strategy
### The fixture
Master is a real SurrealDB seeded with the bakery dataset (the same
`scripts/start.sh` + `scripts/ingress.sh` pattern that `vantage-surrealdb`
already uses). Cache is `MemCache` for fast inner-loop tests, plus a
smaller `RedbCache` suite for the "does it survive a process restart"
case. Both are dev-dependencies — neither lives in the runtime crate.
```toml
[dev-dependencies]
vantage-redb = { path = "../vantage-redb" }
vantage-surrealdb = { path = "../vantage-surrealdb" }
bakery_model3 = { path = "../bakery_model3" }
tempfile = "3"
testcontainers = "..." # optional — see below
tokio = { version = "1", features = ["full", "test-util"] }
```
Two ways to get SurrealDB up:
- **Manual**: assume `scripts/start.sh` ran, point at `localhost:8000`.
Tests skip cleanly if no server is reachable. Same convention as
`vantage-mongodb`'s `MONGODB_URL`. Fast feedback during development.
- **testcontainers**: a `surrealdb-test-helper` module spins up a
fresh container per test class, runs ingress, hands back a `SurrealDB`
handle. Slow but hermetic — used in CI.
The helper picks based on `LIVE_TEST_MODE=container|local`, defaulting
to `local`. CI sets `container`.
### Test layout
Step-numbered like vantage-mongodb / vantage-redb:
```
tests/
├── 1_cache_trait.rs MemCache + RedbCache contract — get/put/invalidate_prefix
├── 1_live_event.rs LiveEvent matchers, manual stream wiring
├── 2_live_table_read.rs miss → master fetch → cache populated → hit
├── 2_pagination.rs different pages stored under different keys; ipp immutability
├── 3_live_table_write.rs insert/replace/patch/delete → master + cache invalidated
├── 3_custom_write_target.rs writes route to alternate table, reads stay on master
├── 3_queue_concurrency.rs N concurrent writers, ordering, oneshot reply correctness
├── 4_live_stream.rs manual stream pushes; cache invalidated on each event
└── 5_anytable_wrap.rs LiveTable wrapped via AnyTable, used through TableLike
```
The `1_*` files are pure-unit (no SurrealDB). `2_*` and up bring up
the real server.
### What each test category proves
- **Cache trait**: round-trip, TTL-shape data, prefix invalidation
matches the trait contract on every backend.
- **Read**: cache miss path populates; second read same key is a hit
(verified by stopping master mid-test and checking reads still
succeed). Single-row `get_value` keys differently from list pages.
- **Pagination**: `set_pagination(page=1)` and `set_pagination(page=2)`
produce distinct cache entries; constructing a LiveTable with
`master_ipp` doesn't change behaviour in v1 but the value is
retrievable.
- **Write**: write hits master, cache for `cache_key` is empty after,
next read repopulates from master. `custom_write_target` routes
writes elsewhere — reads on the LiveTable still see master.
- **Queue concurrency**: 100 concurrent `insert_value` calls all get
their own `oneshot::Sender` reply, no cross-talk. FIFO ordering on
the master.
- **Live stream**: `ManualLiveStream::push(LiveEvent::Updated{...})`
triggers cache invalidation; next read re-fetches from master.
Tested with both `MemCache` and `RedbCache`.
- **AnyTable**: same scenarios but go through `AnyTable::new(live)`,
proving the wrapper is invisible.
### Helpers in `tests/common/`
```
tests/common/
├── mod.rs re-exports
├── surreal.rs bring up master, run ingress, hand back AnyTable
├── manual_stream.rs ManualLiveStream — pushes events on demand
└── fixtures.rs seed test data, shared assertions
```
Same pattern as vantage-mongodb's `tests/common/`.
## Observability
Multi-layer code is hard to debug without spans. A single
`live.list_values()` call walks through: pagination state, cache key
build, cache backend lookup, possibly a master read, possibly a cache
write. When something goes wrong — stale cache, double-invalidation,
queue stuck, missed live event — staring at error messages won't tell
you which layer dropped the ball.
`tracing` (the crate) gives us spans + structured logs without baking in
a logger. Apps wire up whatever subscriber they want; we just emit.
### Dependencies
```toml
[dependencies]
tracing = "0.1"
```
`tracing-subscriber` only as dev-dep / in the CLI example, not in the
library. Tests set up a subscriber once via `tracing_subscriber::fmt::try_init()`
inside a `ctor`-style helper.
### Span boundaries
```rust
#[tracing::instrument(skip(self), fields(cache_key = %self.cache_key, page))]
async fn list_values(&self) -> Result<...> { ... }
#[tracing::instrument(skip(self, record), fields(cache_key = %self.cache_key, id = %id))]
async fn insert_value(&self, id: &Self::Id, record: &Record<Self::Value>) -> Result<...> { ... }
#[tracing::instrument(skip_all, fields(cache_key = %self.cache_key))]
async fn worker_loop(...) { ... }
#[tracing::instrument(skip_all, fields(event_kind))]
async fn handle_live_event(&self, event: LiveEvent) { ... }
```
Five span-worthy boundaries:
- **Read path** (`list_values`, `get_value`, `get_some_value`)
- **Write path** entry point (the `insert_value` etc. methods that
enqueue and await)
- **Worker loop** (drains queue, applies to master, invalidates cache)
- **Live stream loop** (consumes events, invalidates cache)
- **Cache backend operations** (one span per `get`/`put`/
`invalidate_prefix` so RedbCache slowness is visible)
### Event levels
- `error!` — only when something went wrong that the caller will see as
an `Err`. No silent error swallowing.
- `warn!` — recovery cases: cache backend errored on `put` (we still
served the master result), worker restarted after panic.
- `info!` — lifecycle: LiveTable constructed, master swapped, live
stream connected/disconnected.
- `debug!` — every cache hit/miss, every queue enqueue/drain, every
live event invalidating.
- `trace!` — full record contents on writes, full row counts on reads.
Off by default; `RUST_LOG=vantage_live=trace` flips it on.
### Structured fields
Every log line carries enough context to grep:
- `cache_key` — which LiveTable.
- `page` — which page on read paths.
- `id` — which row on write paths.
- `op` — `insert | replace | patch | delete | delete_all | insert_return_id`.
- `outcome` — `hit | miss | populated | invalidated | failed`.
Avoid logging the master's connection string, full record bodies (use
`trace!` for those), or anything that could leak credentials.
### Test integration
```rust
// tests/common/mod.rs
pub fn init_tracing() {
use tracing_subscriber::EnvFilter;
let _ = tracing_subscriber::fmt()
.with_env_filter(EnvFilter::from_default_env())
.with_test_writer()
.try_init();
}
```
Each test calls `common::init_tracing()` once. `cargo test` shows
nothing by default; `RUST_LOG=vantage_live=debug cargo test -- --nocapture`
shows the cache/queue dance.
## CLI example
`bakery_model4/examples/cli4.rs` is the reference: `db <entity>
<commands>` with `field=value` filters, `list / get / add / delete /
ref` verbs, YAML-driven entity registry. We mirror it almost exactly,
adding one thing — the cache layer is wired in between the entity
constructor and the command handler. Same UX, with cache-hit/miss
visible via `--debug`.
```
examples/
└── live_cli.rs
```
Invocation, identical surface to cli4 plus a couple of cache-related
flags:
```
db <entity> <commands>...
Flags:
--debug Show traced cache/queue activity
--cache <path> Use a redb file (default: in-memory)
--no-cache Bypass cache entirely (parity check vs cli4)
```
Sketch of the wiring (the only part that differs from cli4):
```rust
async fn run() -> Result<()> {
let config = VantageConfig::from_file("bakery_model4/config.yaml")?;
let db = connect_surrealdb_with_debug(matches.get_flag("debug")).await?;
if let Some(entity_name) = matches.get_one::<String>("entity") {
// Build the master table the same way cli4 does
let master = get_table(&config, entity_name, db)?;
let any_master = AnyTable::from_table(master);
// Pick a cache backend from flags
let cache: Arc<dyn Cache> = match (matches.get_one::<String>("cache"),
matches.get_flag("no-cache")) {
(_, true) => Arc::new(NoCache),
(Some(path), false) => Arc::new(RedbCache::open(path)?),
(None, false) => Arc::new(MemCache::default()),
};
// Wrap. Cache key matches the entity name — easy mental model.
let live = LiveTable::new(any_master, entity_name, cache);
// Wrap into AnyTable so handle_commands can be exactly cli4's body
let any_live = AnyTable::new(live);
handle_commands(any_live, commands).await?;
}
Ok(())
}
```
`handle_commands` is copied verbatim from cli4 but typed against
`AnyTable` instead of `Table<SurrealDB, EmptyEntity>`. Since LiveTable
implements `TableLike`, every cli4 verb (`list`, `get`, `add`,
`delete`, `field=value`, `ref`) keeps working without any awareness of
the cache.
### What the example demonstrates
- Plain reads: `db client list` runs once, then a second time and
shows a cache hit (visible only with `--debug`).
- Conditioned reads: `db client name=Marty list` — caller is
responsible for picking a different `cache_key` if they want this
cached separately. v1 doesn't, so this hits the master each time.
(Documented gotcha; the example surfaces it.)
- Pagination: `db product page=2 list` — different cache slot.
- Writes: `db bakery add "foo" '{"name":"X"}'` invalidates the
bakery cache; next `list` re-fetches.
- `--cache ./data.redb`: same flow, but cache survives between
invocations. Run twice in a row, second one is hot from disk.
The example doubles as a manual test harness for the cache layer
during development.