Replication Module
Implements single-primary, multi-replica replication via WAL streaming.
§Architecture
- Primary: accepts writes and streams WAL records to replicas
- Replica: read-only, connects to primary for WAL streaming
- Initial sync via snapshot transfer, then incremental WAL
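The snapshot-then-WAL sequence above can be sketched as follows. This is a minimal illustration only: `WalRecordSketch`, `SnapshotSketch`, and `ReplicaSketch` are hypothetical types invented for this example, not this module's actual API, and the assumption that a replica skips records at or below the snapshot's LSN is ours.

```rust
use std::collections::BTreeMap;

/// Hypothetical WAL record: a log sequence number plus one key/value write.
#[derive(Clone, Debug)]
struct WalRecordSketch {
    lsn: u64,
    key: String,
    value: String,
}

/// Hypothetical snapshot: the full dataset as of `lsn`.
#[derive(Clone, Debug)]
struct SnapshotSketch {
    lsn: u64,
    data: BTreeMap<String, String>,
}

/// Hypothetical replica state (not the crate's real replica type).
#[derive(Debug, Default)]
struct ReplicaSketch {
    applied_lsn: u64,
    data: BTreeMap<String, String>,
}

impl ReplicaSketch {
    /// Initial sync: replace local state with the transferred snapshot.
    fn load_snapshot(&mut self, snapshot: SnapshotSketch) {
        self.data = snapshot.data;
        self.applied_lsn = snapshot.lsn;
    }

    /// Incremental sync: apply only records newer than what is already held.
    /// Returns the number of records actually applied.
    fn apply_wal(&mut self, records: &[WalRecordSketch]) -> usize {
        let mut applied = 0;
        for record in records {
            if record.lsn <= self.applied_lsn {
                continue; // already covered by the snapshot or an earlier apply
            }
            self.data.insert(record.key.clone(), record.value.clone());
            self.applied_lsn = record.lsn;
            applied += 1;
        }
        applied
    }
}
```

Skipping records at or below `applied_lsn` makes the apply step idempotent, so a replica in this sketch can safely receive a WAL range that overlaps its snapshot.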
§Usage
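// Primary: accepts writes and streams WAL records to replicas.
let primary_options = RedDBOptions::persistent("./primary-data")
    .with_replication(ReplicationConfig::primary());
// Replica: read-only, streams WAL from the primary at the given address.
let replica_options = RedDBOptions::persistent("./replica-data")
    .with_replication(ReplicationConfig::replica("http://primary:50051"));
Modules§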
- bookmark: Causal bookmark token helpers.
- cdc: Change Data Capture (CDC), a stream of database change events.
- commit_policy: Primary commit policies (PLAN.md Phase 11.4).
- commit_waiter: Synchronous commit waiter (PLAN.md Phase 11.4, ack_n).
- failover: Coordinated zero-RPO failover (issue #833, PRD #819).
- flow_control: Write-admission flow control keyed on in-quorum replica lag (issue #826).
- lease: Serverless writer lease (PLAN.md Phase 5 / W6).
- logical: Logical replication helpers shared by replica apply and point-in-time restore.
- primary: Primary-side replication: WAL record production and snapshot serving.
- quorum: Quorum-based commit coordination (Phase 2.6 multi-region PG parity).
- replica: Replica-side replication: connects to primary, consumes WAL records.
- scheduler: Backup Scheduler, automatic periodic snapshots with optional remote upload.
- swap_db: Stay-readable re-bootstrap with an atomic dataset swap (issue #837, PRD #819).
- topology_advertiser: Server-side TopologyAdvertiser (issue #167).
Structs§
- CausalBookmark
- CommitWaiter
- FailoverCoordinator: The coordinated zero-RPO failover state machine.
- FailoverNode: A node participating in a failover.
- FailoverOutcome: The result of a completed handover.
- FailoverRequest: A request to hand the primary role from old_primary to target.
- FlowController: Ticket-based write-admission flow controller.
- LagConfig: Knobs for the lag/health computation. Kept as a small struct so the call sites (gRPC topology RPC, RedWire HelloAck builder) thread the same defaults without each one redeclaring constants.
- LeaseStore: Wraps an AtomicRemoteBackend with lease primitives. The lease object is stored under a deterministic key derived from database_key; the store reads/writes that one key.
- QuorumConfig: Quorum configuration stored alongside ReplicationConfig.
- QuorumCoordinator: Tracks per-replica region bindings and pairs them with the primary’s ack map. PrimaryReplication owns the WAL buffer + ReplicaState list; this coordinator adds the region dimension and the wait-for-quorum logic without duplicating the ack table.
- RebootstrapInProgress: A causal read was requested while the node is re-bootstrapping.
- ReplicationConfig: Configuration for replication.
- RoleAssignment: Post-handover roles of the two nodes, used to assert that the new primary advertises the new term and the old primary streams as a replica (issue #833 criterion 3).
- SwapDb: A dataset that stays readable across an atomic re-bootstrap swap.
- TopologyAdvertiser: Server-side advertiser. Zero-sized: all state is threaded through advertise()’s arguments so callers control the snapshot semantics.
- TopologyAuthGate: Predicate over the caller’s auth context; answers “does this principal have cluster:topology:read?”.
- WriterLease: One snapshot of who owns the writer lease for a database key.
Enums§
- Admission: Outcome of a write-admission attempt.
- AwaitOutcome
- BookmarkDecodeError
- CommitPolicy
- FailoverError: Why a coordinated failover could not complete without losing writes.
- FailoverMode: How a failover should be executed.
- LeaseError
- NodeRole: The replication role a node plays after a failover step.
- QuorumError: Errors raised by the quorum coordinator. The write itself succeeded on the primary WAL; these errors signal that replica acknowledgement did not reach quorum and the caller must decide whether to surface the failure or continue anyway.
- ReplicationRole: Role of this RedDB instance in a replication cluster.
Constants§
- DEFAULT_REPLICATION_TERM
- DEFAULT_REPLICA_TIMEOUT_MS: Default replica heartbeat timeout used when an operator hasn’t configured one explicitly. Matches the order of the poll_interval_ms default in ReplicationConfig (100 ms) multiplied by a generous fudge factor: five seconds without an ack flips a replica to healthy: false. Operators tune this via LagConfig.
- DEFAULT_SLOT_IDLE_TIMEOUT_MS
- DEFAULT_SLOT_RETENTION_MAX_LAG_LSN
- TOPOLOGY_READ_CAPABILITY: Capability name from ADR 0008 §1.
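The heartbeat rule described for DEFAULT_REPLICA_TIMEOUT_MS can be sketched as a small pure function. The names below are hypothetical, the 5,000 ms value is inferred from the "five seconds" in the description rather than read from the constant itself, and treating an elapsed time exactly equal to the timeout as still healthy is an assumption.

```rust
/// Assumed value, inferred from the "five seconds" in the description;
/// the real constant's definition is not shown on this page.
const SKETCH_REPLICA_TIMEOUT_MS: u64 = 5_000;

/// Returns true while the last ack is no older than `timeout_ms`.
/// `saturating_sub` keeps a replica healthy if its ack timestamp is
/// slightly ahead of `now_ms` (for example, due to clock skew).
fn is_healthy(last_ack_ms: u64, now_ms: u64, timeout_ms: u64) -> bool {
    now_ms.saturating_sub(last_ack_ms) <= timeout_ms
}
```

With a poll interval of 100 ms, a 5,000 ms timeout tolerates roughly fifty missed polls before a replica is reported as unhealthy.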
Traits§
- FailoverTransport: Cluster mutations and the clock the coordinator drives, injected so the state machine stays pure and deterministically testable.
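The injection pattern described for FailoverTransport can be illustrated with a small sketch. The trait below is a hypothetical stand-in, not the real FailoverTransport signature; `TransportSketch`, `try_handover`, and `FakeTransport` are all invented for this example, as is the rule that the target must match the primary's LSN before promotion.

```rust
/// Hypothetical stand-in for an injected transport: it exposes the clock
/// and the cluster mutations the decision logic needs, and nothing else.
trait TransportSketch {
    fn now_ms(&self) -> u64;
    fn target_lsn(&self) -> u64;
    fn promote(&mut self, term: u64);
}

/// Decision logic with no hidden I/O or wall clock: promote the target
/// only if it has caught up to `primary_lsn` before `deadline_ms`.
fn try_handover<T: TransportSketch>(
    transport: &mut T,
    primary_lsn: u64,
    deadline_ms: u64,
    new_term: u64,
) -> Result<u64, String> {
    if transport.now_ms() > deadline_ms {
        return Err("deadline exceeded".to_string());
    }
    if transport.target_lsn() < primary_lsn {
        return Err("target is behind the primary".to_string());
    }
    transport.promote(new_term);
    Ok(new_term)
}

/// Deterministic fake: fixed clock and LSN, records the promoted term.
struct FakeTransport {
    now: u64,
    lsn: u64,
    promoted_term: Option<u64>,
}

impl TransportSketch for FakeTransport {
    fn now_ms(&self) -> u64 {
        self.now
    }
    fn target_lsn(&self) -> u64 {
        self.lsn
    }
    fn promote(&mut self, term: u64) {
        self.promoted_term = Some(term);
    }
}
```

Because the clock comes from the transport, a test can pin time and replica progress to exact values and assert on the outcome without sleeping or touching a network.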