# Storage & CortEX — how to use it
This doc is the **user-level narrative** for Net's single-node storage
stack: RedEX (the log), CortEX (the fold driver and domain models), and
NetDB (the query façade). For the design rationale + implementation
plans see [`REDEX_PLAN.md`](REDEX_PLAN.md), [`CORTEX_ADAPTER_PLAN.md`](CORTEX_ADAPTER_PLAN.md),
and [`NETDB_PLAN.md`](NETDB_PLAN.md). This doc is "how do I *use* it and
what do I need to know when I do."
Target reader: an engineer writing a daemon that reads + writes
mesh-bound state.
## The three layers
```
┌──────────────────────┐
│  NetDB  (query / watch façade)          db.tasks / db.memories
└──────────┬───────────┘
           │  query filters, snapshot, snapshot_and_watch
┌──────────┴───────────┐
│  CortEX (fold driver + domain models)
└──────────┬───────────┘
           │  RedexFold<State> + change broadcast
┌──────────┴───────────┐
│  RedEX  (append-only log)
└──────────────────────┘
```
- **RedEX** is the primitive: a named monotonic log (`ChannelName` →
`RedexFile`). 20-byte index records, inline-or-heap payloads, a
tail API that delivers events in order. Single-node. Optionally
disk-backed via the `redex-disk` feature.
- **CortEX** drives folds over RedEX tails. A domain model (`tasks`,
`memories`) implements `RedexFold<State>`, which mutates state as
events arrive. The adapter owns the `Arc<RwLock<State>>` and a
change-broadcast channel for reactive consumers.
- **NetDB** bundles CortEX adapters under one handle with a unified
snapshot / restore surface. Queries go through the state; writes
go through the adapter's typed API.
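
To make the fold idea concrete, here is a minimal, standard-library-only sketch of what "a domain model mutates state as events arrive" can look like. The `Fold` trait, `TaskEvent`, `TasksState`, and `fold_log` are hypothetical stand-ins written for this illustration; they are not the crate's `RedexFold<State>` signature or its task model.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `RedexFold<State>`: apply one event to the state.
trait Fold<S> {
    fn apply(&mut self, seq: u64, event: &TaskEvent, state: &mut S);
}

/// Illustrative event type; the real task payloads may differ.
#[derive(Debug, Clone, PartialEq)]
enum TaskEvent {
    Created { id: u64, title: String },
    Completed { id: u64 },
    Deleted { id: u64 },
}

#[derive(Debug, Clone, PartialEq)]
struct Task {
    id: u64,
    title: String,
    done: bool,
}

#[derive(Debug, Default)]
struct TasksState {
    tasks: BTreeMap<u64, Task>,
    last_seq: Option<u64>,
}

struct TasksFold;

impl Fold<TasksState> for TasksFold {
    fn apply(&mut self, seq: u64, event: &TaskEvent, state: &mut TasksState) {
        match event {
            TaskEvent::Created { id, title } => {
                let task = Task { id: *id, title: title.clone(), done: false };
                state.tasks.insert(*id, task);
            }
            TaskEvent::Completed { id } => {
                if let Some(task) = state.tasks.get_mut(id) {
                    task.done = true;
                }
            }
            TaskEvent::Deleted { id } => {
                state.tasks.remove(id);
            }
        }
        // Remember how far the fold has progressed.
        state.last_seq = Some(seq);
    }
}

/// Drive the fold over an in-memory slice standing in for a log tail.
fn fold_log(log: &[TaskEvent]) -> TasksState {
    let mut state = TasksState::default();
    let mut fold = TasksFold;
    for (seq, event) in log.iter().enumerate() {
        fold.apply(seq as u64, event, &mut state);
    }
    state
}

fn main() {
    let log = vec![
        TaskEvent::Created { id: 1, title: "write docs".to_string() },
        TaskEvent::Created { id: 2, title: "ship".to_string() },
        TaskEvent::Completed { id: 1 },
        TaskEvent::Deleted { id: 2 },
    ];
    let state = fold_log(&log);
    for task in state.tasks.values() {
        println!("#{} {} done={}", task.id, task.title, task.done);
    }
    println!("last folded seq: {:?}", state.last_seq);
}
```

The point of the shape is that state is a pure function of the log prefix: replaying the same events in the same order always yields the same state, which is what makes snapshot and replay safe.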

API surface at a glance (names as they appear in Rust; TS/Python mirror
them with language-native conventions):

| Goal | API | Returns |
|---|---|---|
| Query current state | `state.query().where_*().collect()` | `Vec<T>` |
| One-shot lookup by id | `state.find_unique(id)` | `Option<&T>` |
| Filter → set | `state.find_many(&filter)` | `Vec<T>` |
| Count / exists | `state.count_where(&f)` / `state.exists_where(&f)` | `usize` / `bool` |
| React to changes | `adapter.watch().where_*().stream()` | `Stream<Item = Vec<T>>` |
| Snapshot + react | `adapter.snapshot_and_watch(watcher)` | `(Vec<T>, Stream<Item = Vec<T>>)` |
| Full-state capture | `adapter.snapshot()` | `(Vec<u8>, Option<u64>)` |
| NetDB whole-stack | `NetDb.snapshot()` / `NetDb.open_from_snapshot(...)` | one postcard blob covering every enabled model |

## Choosing durability (RedEX `FsyncPolicy`)
Set via `RedexFileConfig::default().with_fsync_policy(...)`:
```rust
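use net::adapter::net::{FsyncPolicy, RedexFileConfig};
use std::time::Duration;

// Disk-backed file that fsyncs after every 100 appends.
let cfg = RedexFileConfig::default()
    .with_persistent(true)
    .with_fsync_policy(FsyncPolicy::EveryN(100));

// Time-based alternative (assumes `Interval` takes a `Duration`).
let cfg_timed = RedexFileConfig::default()
    .with_persistent(true)
    .with_fsync_policy(FsyncPolicy::Interval(Duration::from_secs(1)));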
```

| Policy | Data at risk on crash | Good for |
|---|---|---|
| `Never` (default) | Tail since last close / explicit `sync()` | Telemetry, caches, best-effort logs |
| `EveryN(N)` | ≤ `N − 1` entries from the last sync point | Most application state (try `EveryN(100)`) |
| `Interval(d)` | ≤ `d` worth of writes | Anything that must survive kernel panic |

Invariants you can rely on:
- `close()` **always** fsyncs, regardless of policy.
- `RedexFile::sync()` always fsyncs — the explicit durability barrier.
- Torn-write recovery is fsync-independent: the dat-before-idx write
order + reopen-time truncation handle arbitrary partial writes.
- `EveryN(0)` and `EveryN(1)` both mean "fsync every append."
- `Interval` spawns one tokio background task per file, cancelled on
`close()`.
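
The `EveryN` loss bound and the `EveryN(0)` / `EveryN(1)` equivalence follow from simple counting. The sketch below is a hypothetical model of that counting (`EveryNModel` and `simulate` are names invented here, and no real fsync is issued); it is not the crate's implementation.

```rust
/// Hypothetical model of an `EveryN(n)` policy: counts appends and records
/// when an fsync would be issued. No file I/O happens here.
struct EveryNModel {
    n: u64,
    unsynced: u64,
    syncs: u64,
}

impl EveryNModel {
    fn new(n: u64) -> Self {
        // Treat 0 like 1, matching "EveryN(0) and EveryN(1) both fsync every append".
        Self { n: n.max(1), unsynced: 0, syncs: 0 }
    }

    /// Record one append; returns true if this append triggers an fsync.
    fn on_append(&mut self) -> bool {
        self.unsynced += 1;
        if self.unsynced >= self.n {
            self.unsynced = 0;
            self.syncs += 1;
            true
        } else {
            false
        }
    }
}

/// Run `appends` appends; return (fsyncs issued, worst-case unsynced entries).
fn simulate(n: u64, appends: u64) -> (u64, u64) {
    let mut model = EveryNModel::new(n);
    let mut worst = 0;
    for _ in 0..appends {
        model.on_append();
        worst = worst.max(model.unsynced);
    }
    (model.syncs, worst)
}

fn main() {
    let (syncs, worst) = simulate(100, 250);
    println!("EveryN(100), 250 appends: {} fsyncs, at most {} entries at risk", syncs, worst);
    println!("EveryN(0): {:?}, EveryN(1): {:?}", simulate(0, 10), simulate(1, 10));
}
```

With `n = 100`, the number of entries not yet covered by an fsync never exceeds 99, which is the `N − 1` bound in the table.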
## Querying
`NetDb` exposes per-model state handles via `db.tasks()` / `db.memories()`
(Rust) or `db.tasks` / `db.memories` (TS/Python property accessors). The
state behind each is held behind an `Arc<RwLock<State>>`; queries take a
brief read lock, scan the state, and return owned values.
```rust
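// Each call takes a brief read lock; the guard drops at the end of the statement.
let pending = db.tasks().state().read()
    .query()
    .where_status(TaskStatus::Pending)
    .order_by(OrderBy::CreatedDesc)
    .limit(20)
    .collect();

// `find_unique` returns `Option<&T>` borrowed from the guard, so clone the
// hit before the guard drops (assumes the model type is `Clone`).
let found = db.tasks().state().read().find_unique(id).cloned();
let count = db.memories().state().read().count_where(&filter);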
```
Query performance at a glance (Apple M1 Max, `cargo bench --bench cortex`):
- `find_unique`: ~9 ns.
- `find_many` on 1 K tasks (status filter): ~8 µs.
- `find_many` on 10 K tasks: ~140 µs.
- `count_where` on 10 K tasks: ~30 µs.
- `find_many` on 1 K memories (tag filter): ~50 µs.
Even at 10 K state size, filter queries stay in the tens to low hundreds
of microseconds, cheap enough to call inside a hot loop. Write performance
is separate: tasks ingest runs at ~3.6 M events/sec before consumer backpressure.
## Watching
`adapter.watch()` returns a builder. Chain filters the same way you
chain query filters; then call `.stream()` to start emitting.
```rust
use futures::StreamExt;
let mut stream = Box::pin(
db.tasks()
.watch()
.where_status(TaskStatus::Pending)
.order_by(OrderBy::CreatedDesc)
.stream(),
);
while let Some(current_pending) = stream.next().await {
// current_pending is the freshly-evaluated filter result.
// Only delivered when the result actually changed.
}
```
Semantics:
- **Initial emission**: the first `.next()` returns the current filter
result (immediately, no network round trip).
- **Deduplication**: subsequent emissions only fire when the filter
result differs from the previous one — renames on irrelevant rows
don't wake your consumer.
- **Single-slot channel**: if the consumer falls behind a fast fold
task, intermediate filter results are dropped; the consumer sees
the latest state on the next poll. "Drop intermediate, final state
is correct."
- **Default ordering**: if you don't set `order_by`, the watcher
defaults to `IdAsc` so Vec-equality dedup is deterministic.
- **Cancellation**: drop the stream to stop. The watcher's internal
task observes `tx.closed()` and exits.
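
The deduplication and single-slot rules above can be modelled in a few lines. This is a hypothetical, synchronous sketch (`WatchSlot`, `publish`, and `poll` are names invented for illustration, and task ids stand in for full rows); the real watcher is async and its internals may differ.

```rust
/// Hypothetical model of a watcher's output side: dedup against the last
/// published result, plus a single slot that newer results overwrite.
#[derive(Default)]
struct WatchSlot {
    last_published: Option<Vec<u64>>,
    slot: Option<Vec<u64>>,
}

impl WatchSlot {
    /// Called after each state change with the re-evaluated filter result.
    /// Returns true if the result was published to the slot.
    fn publish(&mut self, mut result: Vec<u64>) -> bool {
        // Stand-in for the default `IdAsc` ordering, so equal sets compare equal.
        result.sort_unstable();
        if self.last_published.as_ref() == Some(&result) {
            return false; // unchanged result: do not wake the consumer
        }
        self.last_published = Some(result.clone());
        // Overwrite whatever the consumer has not picked up yet.
        self.slot = Some(result);
        true
    }

    /// Consumer side: take the latest pending result, if any.
    fn poll(&mut self) -> Option<Vec<u64>> {
        self.slot.take()
    }
}

fn main() {
    let mut watch = WatchSlot::default();
    watch.publish(vec![2, 1]);
    println!("initial: {:?}", watch.poll());
    watch.publish(vec![1, 2]); // same set, deduplicated
    println!("after no-op change: {:?}", watch.poll());
    watch.publish(vec![1, 2, 3]); // intermediate, never seen by a slow consumer
    watch.publish(vec![1, 3]);
    println!("after two fast changes: {:?}", watch.poll());
}
```

A consumer that needs every intermediate state should tail the log directly; a watcher only promises that the last result it delivers is current.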
### Snapshot + watch combo
UI consumers usually want "paint what's there now, then react to
changes." `adapter.snapshot_and_watch(watcher)` is the one-liner:
```rust
use futures::StreamExt;

let watcher = db.tasks().watch().where_status(TaskStatus::Pending);
let (snapshot, stream) = db.tasks().snapshot_and_watch(watcher);
// `snapshot` is the current filter result.
render(&snapshot);
// `stream` emits the full, re-evaluated filter result on each change from
// this point forward, with no duplicate of the initial state.
let mut stream = Box::pin(stream); // pin so `.next()` works for any stream type
while let Some(current) = stream.next().await {
    render(&current);
}
```
The initial emission is consumed internally (via `skip(1)`) so the
caller doesn't double-render.
## Snapshot + restore
Per-model:
```rust
let (state_bytes, last_seq) = db.tasks().snapshot()?;
// Persist (bytes, last_seq) somewhere durable (disk, cloud, etc.)
let tasks = TasksAdapter::open_from_snapshot(
&redex,
origin_hash,
&state_bytes,
last_seq,
)?;
// Replay picks up at last_seq + 1; earlier events don't re-fold.
```
Whole-DB (covers every enabled model in one postcard blob):
```rust
let bundle = db.snapshot()?;
let db2 = NetDb::open_from_snapshot(&redex, bundle, builder_config)?;
```
Bundle encode/decode timings (postcard, 1 K entries): ~23 µs encode,
~27 µs decode. Bundles are 60-70% smaller than the pre-postcard
bincode format.
Known limitation: **postcard snapshots break on schema changes.**
postcard is not self-describing, so if you add, remove, reorder, or
retype a field on `Task`, `Memory`, or any other CortEX model, old
snapshots don't deserialize. Re-snapshot on upgrade.
## Restart behavior
On process restart with a persistent RedEX file:
1. `Redex::with_persistent_dir(...)` reconnects to the base
directory.
2. `TasksAdapter::open(&redex, origin_hash)` (or `open_from_snapshot`
if you saved state bytes) reads the tail from RedEX.
3. The fold task replays events — **every event if there's no
snapshot**; only post-snapshot events if you restored from one.
4. `adapter.wait_for_seq(seq)` is a read-after-write barrier — await
it before querying to guarantee the fold caught up.
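
The difference between a full replay and a post-snapshot replay in step 3 can be sketched with a toy fold. Everything here is hypothetical and standard-library only (`Counter`, `replay`, and the 8-byte state encoding are invented for illustration); the real adapters serialize with postcard and read from a RedEX tail.

```rust
use std::convert::TryFrom;

/// Hypothetical folded state: a running total plus the last folded sequence.
#[derive(Debug, Clone, PartialEq, Default)]
struct Counter {
    total: i64,
    last_seq: Option<u64>,
}

impl Counter {
    fn apply(&mut self, seq: u64, delta: i64) {
        self.total += delta;
        self.last_seq = Some(seq);
    }

    /// Capture (state bytes, last folded seq), mirroring the shape of `snapshot()`.
    fn snapshot(&self) -> (Vec<u8>, Option<u64>) {
        (self.total.to_le_bytes().to_vec(), self.last_seq)
    }

    /// Rebuild state from snapshot bytes; returns None on a malformed blob.
    fn restore(bytes: &[u8], last_seq: Option<u64>) -> Option<Self> {
        let raw = <[u8; 8]>::try_from(bytes).ok()?;
        Some(Self { total: i64::from_le_bytes(raw), last_seq })
    }
}

/// Replay `log` into `state`, skipping every event at or below `state.last_seq`.
/// Returns the final state and how many events were actually folded.
fn replay(log: &[i64], mut state: Counter) -> (Counter, usize) {
    let start = state.last_seq.map_or(0, |seq| seq + 1) as usize;
    let mut folded = 0;
    for (seq, delta) in log.iter().enumerate().skip(start) {
        state.apply(seq as u64, *delta);
        folded += 1;
    }
    (state, folded)
}

fn main() {
    let log = [5, -2, 7, 1, 4];

    // No snapshot: every event is folded.
    let (full, folded_all) = replay(&log, Counter::default());
    println!("full replay folded {} events -> {:?}", folded_all, full);

    // Snapshot taken after the first three events, then a "restart".
    let (early, _) = replay(&log[..3], Counter::default());
    let (bytes, last_seq) = early.snapshot();
    let restored = Counter::restore(&bytes, last_seq).expect("valid snapshot");
    let (resumed, folded_tail) = replay(&log, restored);
    println!("resumed replay folded {} events -> {:?}", folded_tail, resumed);
}
```

Both paths end in the same state; the snapshot only changes how much of the log has to be folded to get there.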
Design choices that matter here:
- RedEX files default to strictly local — there's no cross-node
fallback if local disk is wiped. Opt in to cross-node replication
per channel via
`RedexFileConfig::with_replication(Some(ReplicationConfig::new()))`;
see [`CONFIG_REPLICATION.md`](CONFIG_REPLICATION.md) for the full
operator surface.
- Retention evicts head events from memory but (in `redex-disk`
today) leaves them on disk. v2's mmap tier reconciles the two.
- Age-based retention (`retention_max_age_ns`) resets on persistent
reopen: recovered entries get "now" as their timestamp. If age
retention matters across restarts, include a timestamp in the
payload.
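
One way to follow the "include a timestamp in the payload" advice is to prefix each payload with its creation time and apply your own age check on read. The layout and helper names below (`encode_payload`, `decode_payload`, `is_expired`) are assumptions made for this sketch, not a format RedEX defines.

```rust
use std::convert::TryFrom;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical payload layout: 8-byte little-endian creation time
/// (nanoseconds since the Unix epoch) followed by the application body.
fn encode_payload(created_ns: u64, body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + body.len());
    out.extend_from_slice(&created_ns.to_le_bytes());
    out.extend_from_slice(body);
    out
}

/// Split a payload back into (creation time, body); None if it is too short.
fn decode_payload(payload: &[u8]) -> Option<(u64, &[u8])> {
    if payload.len() < 8 {
        return None;
    }
    let (head, body) = payload.split_at(8);
    let raw = <[u8; 8]>::try_from(head).ok()?;
    Some((u64::from_le_bytes(raw), body))
}

/// Age check based on the embedded timestamp, so it survives a reopen that
/// re-stamps recovered entries with "now".
fn is_expired(payload: &[u8], now_ns: u64, max_age_ns: u64) -> Option<bool> {
    let (created_ns, _) = decode_payload(payload)?;
    Some(now_ns.saturating_sub(created_ns) > max_age_ns)
}

fn main() {
    let now_ns = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock is after the Unix epoch")
        .as_nanos() as u64;
    let payload = encode_payload(now_ns, b"task created");
    let one_hour_ns = 3_600 * 1_000_000_000u64;
    println!("expired now? {:?}", is_expired(&payload, now_ns, one_hour_ns));
    println!(
        "expired in two hours? {:?}",
        is_expired(&payload, now_ns + 2 * one_hour_ns, one_hour_ns)
    );
}
```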
## What's coming (v2 territory)
The following are designed but not shipped in v1. Knowing the shape
helps you structure daemons so adopting them later is straightforward.
- **Hot → warm mmap tiering** ([`REDEX_V2_PLAN.md`](REDEX_V2_PLAN.md)).
A separate mmap-backed segment for frozen history, with transparent
reads across hot heap + warm mmap. Reduces memory pressure on
long-lived files; unblocks ~billions of retained events.
- **`ColdStore` trait** + `read_cold(start, end)`. Archive-tier
interface that reopens old segments from e.g. object storage. Per
the v2 closeout plan, held out of v2 pending real-world demand.
- **Secondary indices on CortEX models**. The `RedexIndex<K, V>`
primitive exists today (see `redex::index`); wiring it into
`MemoriesAdapter` as a tag index is pending a domain-model tweak
(`MemoryRetaggedPayload` needs to carry old + new tags so the
projection can emit symmetric insert/remove ops).
- **NetDB watchers in the Go / C bindings** (`db.tasks.watch(filter)` /
`snapshotAndWatch`). The Rust surface exists today
(`adapter.watch()` / `adapter.snapshot_and_watch(...)`) and the napi
+ pyo3 wrappers ship in `bindings/node` + `bindings/python`; the Go
and C bindings don't have a NetDB adapter surface at all yet, so
this is gated on a `cortex-ffi` crate landing first.
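
The secondary-index item above hinges on a retag event carrying both the old and the new tags. The sketch below shows why: with both sets in hand, the index update is a symmetric pair of removes and inserts. `TagIndex` is a hypothetical structure for illustration; it is not `RedexIndex<K, V>` and does not use `MemoryRetaggedPayload`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical tag index: tag -> ids of memories carrying that tag.
#[derive(Default)]
struct TagIndex {
    by_tag: BTreeMap<String, BTreeSet<u64>>,
}

impl TagIndex {
    fn insert(&mut self, tag: &str, id: u64) {
        self.by_tag.entry(tag.to_string()).or_default().insert(id);
    }

    fn remove(&mut self, tag: &str, id: u64) {
        let now_empty = match self.by_tag.get_mut(tag) {
            Some(ids) => {
                ids.remove(&id);
                ids.is_empty()
            }
            None => false,
        };
        if now_empty {
            // Drop empty buckets so stale tags do not accumulate.
            self.by_tag.remove(tag);
        }
    }

    /// Apply a retag that carries old and new tags: remove the id from tags
    /// that went away, insert it under tags that appeared.
    fn retag(&mut self, id: u64, old: &[&str], new: &[&str]) {
        let old: BTreeSet<&str> = old.iter().copied().collect();
        let new: BTreeSet<&str> = new.iter().copied().collect();
        for tag in old.difference(&new) {
            self.remove(tag, id);
        }
        for tag in new.difference(&old) {
            self.insert(tag, id);
        }
    }

    fn ids(&self, tag: &str) -> Vec<u64> {
        self.by_tag
            .get(tag)
            .map(|ids| ids.iter().copied().collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut index = TagIndex::default();
    index.insert("rust", 1);
    index.insert("db", 1);
    index.insert("rust", 2);
    index.retag(1, &["rust", "db"], &["db", "log"]);
    println!("rust -> {:?}", index.ids("rust"));
    println!("db   -> {:?}", index.ids("db"));
    println!("log  -> {:?}", index.ids("log"));
}
```

If the event carried only the new tags, the index would have no way to know which old buckets to clean up without consulting the folded state.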
## FAQ
**Q: Do I need CortEX if I just want an append-only log?**
No. RedEX alone gives you `append` + `tail` + `read_range`. CortEX
exists to materialize folded state from the log, which you only need
if you want to query "what does the current state look like" rather
than "what events arrived."
**Q: Can two processes share a persistent RedEX directory?**
No. v1 assumes single-writer. The manager doesn't file-lock; multiple
writers corrupt the idx/dat pair.
**Q: How do I reset a corrupted file?**
Close the adapter, delete the `<base>/<channel_path>/{idx,dat}` pair,
reopen. RedEX creates fresh files.
**Q: What happens if the fold panics?**
Under `FoldErrorPolicy::Stop` (default), the fold task exits and
subsequent `wait_for_seq` calls hang. Under `LogAndContinue`, the
event is skipped and state keeps advancing. Choose based on whether
you'd rather fail fast or survive bad events.
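
A small model makes the trade-off visible. The `Policy` enum and `run_fold` below are local stand-ins written for this sketch (not the crate's `FoldErrorPolicy` or fold task), with a non-numeric string playing the role of a bad event.

```rust
/// Local stand-ins for the two policies named above; not the crate's types.
#[derive(Clone, Copy)]
enum Policy {
    Stop,
    LogAndContinue,
}

/// Fold a log of string events into a sum; non-numeric events count as bad.
/// Returns (state, last seq the fold got past, whether the fold stopped early).
fn run_fold(log: &[&str], policy: Policy) -> (i64, Option<u64>, bool) {
    let mut total = 0i64;
    let mut last_seq = None;
    for (seq, raw) in log.iter().enumerate() {
        match raw.parse::<i64>() {
            Ok(value) => total += value,
            Err(err) => match policy {
                // Fail fast: the fold exits and `last_seq` never advances again.
                Policy::Stop => return (total, last_seq, true),
                // Survive: skip the bad event but keep advancing.
                Policy::LogAndContinue => eprintln!("skipping seq {}: {}", seq, err),
            },
        }
        last_seq = Some(seq as u64);
    }
    (total, last_seq, false)
}

fn main() {
    let log = ["1", "2", "oops", "4"];
    println!("Stop:           {:?}", run_fold(&log, Policy::Stop));
    println!("LogAndContinue: {:?}", run_fold(&log, Policy::LogAndContinue));
}
```

Under the stop behaviour the fold's progress is frozen before the bad event, so anything waiting for a later sequence number never sees it arrive; that is the model-level reason a `wait_for_seq` call would hang.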
**Q: Can I swap `FsyncPolicy` on a live file?**
Not today. The policy is set at `open_file` time and frozen for the
file's lifetime. Close + reopen with a different policy.
**Q: Where do benchmarks live?**
Top-level `README.md` and `net/crates/net/README.md` both carry the
perf tables. Source: `cargo bench --bench cortex` /
`cargo bench --bench redex`.