
Module replication


Replication Module

Implements single-primary, multi-replica replication via WAL streaming.

§Architecture

  • Primary: accepts writes and streams WAL records to replicas
  • Replica: read-only, connects to primary for WAL streaming
  • Initial sync via snapshot transfer, then incremental WAL

§Usage

// Primary
let options = RedDBOptions::persistent("./primary-data")
    .with_replication(ReplicationConfig::primary());

// Replica
let options = RedDBOptions::persistent("./replica-data")
    .with_replication(ReplicationConfig::replica("http://primary:50051"));

Modules§

bookmark
Causal bookmark token helpers.
cascade
Cascading replication for async read-replicas (issue #838, PRD #819).
cdc
Change Data Capture (CDC) — stream of database change events.
commit_policy
Primary commit policies (PLAN.md Phase 11.4).
commit_waiter
Synchronous commit waiter (PLAN.md Phase 11.4 — ack_n).
election
Term-based, quorum-gated automatic election (issue #834, PRD #819, ADR 0030).
failover
Coordinated zero-RPO failover (issue #833, PRD #819).
fence
Stale-term fencing for a returning ex-primary (issue #835, PRD #819, ADR 0030).
flow_control
Write-admission flow control keyed on in-quorum replica lag (issue #826).
lease
Serverless writer lease (PLAN.md Phase 5 / W6).
logical
Logical replication helpers shared by replica apply and point-in-time restore.
primary
Primary-side replication: WAL record production and snapshot serving.
quorum
Quorum-based commit coordination (Phase 2.6 multi-region PG parity).
replica
Replica-side replication: connects to primary, consumes WAL records.
rollback
Auto-rollback of a deposed primary to the common point (issue #840, PRD #819, ADR 0030).
scheduler
Backup Scheduler — automatic periodic snapshots with optional remote upload.
swap_db
Stay-readable re-bootstrap with an atomic dataset swap (issue #837, PRD #819).
topology_advertiser
Server-side TopologyAdvertiser (issue #167).
witness
Witness runtime profile (issue #836, PRD #819, ADR 0030).

Structs§

CascadeRelay
Tracks the sub-replica slots an intermediate holds and the frontiers that must propagate through the chain. Pure bookkeeping — the forwarding transport calls into it to decide what to send and what to advertise upstream.
CascadeUpstream
An intermediate replica a sub-replica may cascade from.
CausalBookmark
CommitWaiter
DivergentTail
The divergent tail removed from the live timeline: the records in (common_point_lsn, to_lsn] that never reached quorum.
DownstreamSlot
A sub-replica slot held by an intermediate.
ElectionCoordinator
The quorum-gated election state machine.
ElectionRequest
A request to run an election on behalf of candidate.
FailoverCoordinator
The coordinated zero-RPO failover state machine.
FailoverNode
A node participating in a failover.
FailoverOutcome
The result of a completed handover.
FailoverRequest
A request to hand the primary role from old_primary to target.
FileLastVoteStore
File-backed last-vote store. Persists the record alongside the node’s other durable replication state. The write is atomic (temp file + rename) so a crash mid-write never yields a torn record — either the old vote or the new one survives, never a half of each.
FileTermStore
File-backed term store. Persisted with the atomic temp-file + rename + parent-dir fsync discipline used for the durable last-vote (super::FileLastVoteStore) so a crash mid-write never yields a torn record and an adopted term cannot be silently lost.
FlowController
Ticket-based write-admission flow controller.
LagConfig
Knobs for the lag/health computation. Kept as a small struct so the call sites (gRPC topology RPC, RedWire HelloAck builder) thread the same defaults without each one redeclaring constants.
LastVote
A node’s durable voting record: the highest term it has participated in and the candidate, if any, it voted for in that term. Persisted so a restart cannot erase the fact that a vote was already cast (requirement 2).
LeaseStore
Wraps an AtomicRemoteBackend with lease primitives. The lease object is stored under a deterministic key derived from database_key; the store reads/writes that one key.
Member
A cluster member as seen by the supervisor’s membership view.
MemoryLastVoteStore
In-memory last-vote store for tests and witnesses that do not need cross-restart durability. (A witness should still persist in production; the file store is used there.)
MemoryTermStore
In-memory term store for tests and ephemeral nodes.
QuorumConfig
Quorum configuration stored alongside ReplicationConfig.
QuorumCoordinator
Tracks per-replica region bindings and pairs them with the primary’s ack map. PrimaryReplication owns the WAL buffer + ReplicaState list; this coordinator adds the region dimension and the wait-for-quorum logic without duplicating the ack table.
RebootstrapInProgress
A causal read was requested while the node is re-bootstrapping.
ReplicationConfig
Configuration for replication.
RoleAssignment
Post-handover roles of the two nodes, used to assert that the new primary advertises the new term and the old primary streams as a replica (issue #833 criterion 3).
RollbackCoordinator
The deposed-primary auto-rollback state machine.
RollbackEvent
The loud operator event payload describing a completed rollback, handed to RollbackTransport::emit_rollback_event. Mirrors crate::telemetry::operator_event::OperatorEvent::DeposedPrimaryRollback so the production transport can forward it verbatim while a test transport can capture it.
RollbackOutcome
The result of a completed rejoin.
RollbackPlan
The computed, side-effect-free rollback plan. Splitting this out lets the boundary invariant be asserted without driving any transport.
RollbackRequest
A request to auto-rollback a deposed primary to the common point and rejoin it as a replica.
StaleTermFenced
Why the term fence refused a message: the incoming term is behind the current term, so the sender is a deposed primary on a superseded timeline.
SwapDb
A dataset that stays readable across an atomic re-bootstrap swap.
TailRecord
A single record from the divergent tail that is about to be discarded.
TermFence
The stale-term fence. Wraps a durable TermStore and applies the term rule at the apply and handshake boundaries.
TopologyAdvertiser
Server-side advertiser. Zero-sized — all state is threaded through advertise()’s arguments so callers control the snapshot semantics.
TopologyAuthGate
Predicate over the caller’s auth context — answers “does this principal have cluster:topology:read?”.
VoteRequest
A request for a vote, sent by a candidate to a voter.
Voter
A voting member. Wraps the durable LastVoteStore and applies the vote rule. The voter is the seat of correctness: the watermark rule and the durable double-vote guard both live here.
WitnessSupervisor
A booted witness node: the control-plane supervisor with no data plane.
WriterLease
One snapshot of who owns the writer lease for a database key.
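
The FileLastVoteStore and FileTermStore entries above describe an atomic temp-file + rename + parent-dir fsync discipline. The following is a minimal sketch of that general pattern using only the standard library. The helper names `atomic_write` and `sync_dir` and the `.tmp` suffix are assumptions made for illustration; they are not taken from the crate.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Fsync a directory so a rename inside it is durable (Unix only).
#[cfg(unix)]
fn sync_dir(dir: &Path) -> io::Result<()> {
    File::open(dir)?.sync_all()
}

/// On other platforms this sketch skips the directory sync.
#[cfg(not(unix))]
fn sync_dir(_dir: &Path) -> io::Result<()> {
    Ok(())
}

/// Hypothetical helper: replace `path` with `bytes` so that a reader sees
/// either the old contents or the new contents, never a mix of both.
pub fn atomic_write(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // A sibling temp file keeps the rename on the same filesystem.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush the new contents to disk before they become visible.
        file.sync_all()?;
    }
    // The rename swaps the old record for the new one in a single step.
    fs::rename(&tmp, path)?;
    // Sync the parent directory so the rename itself survives a crash.
    match path.parent() {
        Some(parent) if !parent.as_os_str().is_empty() => sync_dir(parent),
        _ => sync_dir(Path::new(".")),
    }
}
```

Writing to a sibling file rather than a system temp directory matters: a rename is only atomic when source and destination live on the same filesystem.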

Enums§

Admission
Outcome of a write-admission attempt.
AwaitOutcome
BookmarkDecodeError
CascadeRefusal
Why a requested cascade source was refused and the node fell back to the primary. Surfaced (not swallowed) so a misconfiguration is observable rather than a silent performance cliff.
CommitPolicy
ElectionOutcome
The result of an election attempt.
FailoverError
Why a coordinated failover could not complete without losing writes.
FailoverMode
How a failover should be executed.
FenceBoundary
The boundary at which a term-stamped message is being admitted. Only affects diagnostics — the term rule is identical at both.
FenceVerdict
The verdict of the term fence for one incoming term-stamped message.
LastVoteError
LeaseError
MemberKind
Whether a member holds data (and can therefore be promoted to primary) or is a vote-only witness (ADR 0030 — “a node that runs only the supervisor module”).
NodeRole
The replication role a node plays after a failover step.
QuorumError
Errors raised by the quorum coordinator. The write itself succeeded on the primary WAL — these errors signal that replica acknowledgement did not reach quorum and the caller must decide whether to surface the failure or continue anyway.
RefusalReason
Why a voter refused a candidate.
ReplicaClass
How a node chooses its WAL upstream.
ReplicationRole
Role of this RedDB instance in a replication cluster.
RollbackError
Why an auto-rollback could not complete.
RuntimeProfile
Which planes a node boots.
TermStoreError
Error reading or persisting the durable current term.
UpstreamChoice
Where a node should open its WAL stream.
VoteDecision
The outcome of a voter considering a VoteRequest.
VotingState
Whether a member currently participates in voting.
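
The StaleTermFenced and FenceVerdict entries describe a term rule: a message whose term is behind the current term comes from a deposed primary and is refused. A rough sketch of such a rule follows. The `Verdict` enum and `check_term` function are hypothetical stand-ins, and the adopt-on-higher-term branch is an assumption; the crate's actual FenceVerdict variants are not shown on this page.

```rust
/// Hypothetical verdict type; the real FenceVerdict may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    /// The incoming term equals the current term.
    Admit,
    /// The incoming term is newer, so the node adopts it (assumed behavior).
    Adopt { new_term: u64 },
    /// The incoming term is stale, so the sender is on a superseded timeline.
    Fenced { incoming: u64, current: u64 },
}

/// Apply the term rule to one term-stamped message.
/// A durable implementation would persist an adopted term before admitting.
pub fn check_term(current: &mut u64, incoming: u64) -> Verdict {
    if incoming < *current {
        return Verdict::Fenced { incoming, current: *current };
    }
    if incoming > *current {
        *current = incoming;
        return Verdict::Adopt { new_term: incoming };
    }
    Verdict::Admit
}
```

Because the rule is the same at the apply and handshake boundaries, a single function like this can serve both, with the boundary passed along only for diagnostics.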

Constants§

DEFAULT_REPLICATION_TERM
DEFAULT_REPLICA_TIMEOUT_MS
Default replica heartbeat timeout, used when an operator has not configured one explicitly. It is on the order of the poll_interval_ms default in ReplicationConfig (100 ms) multiplied by a generous fudge factor: five seconds without an ack flips a replica to healthy: false. Operators tune this via LagConfig.
DEFAULT_SLOT_IDLE_TIMEOUT_MS
DEFAULT_SLOT_RETENTION_MAX_LAG_LSN
TOPOLOGY_READ_CAPABILITY
Capability name from ADR 0008 §1.
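
The arithmetic behind the DEFAULT_REPLICA_TIMEOUT_MS description can be sketched as follows. The constant names, the factor of 50 (chosen only because 100 ms × 50 gives the five seconds mentioned above), and the strict `<` comparison are all assumptions; the crate's actual health computation is not shown here.

```rust
use std::time::Duration;

/// Poll interval default, taken from the "100 ms" stated in the description.
pub const POLL_INTERVAL_MS: u64 = 100;
/// Hypothetical fudge factor that yields the five-second timeout.
pub const FUDGE_FACTOR: u64 = 50;
/// Resulting timeout: 100 ms * 50 = 5000 ms.
pub const REPLICA_TIMEOUT_MS: u64 = POLL_INTERVAL_MS * FUDGE_FACTOR;

/// A replica counts as healthy while its last ack is newer than the timeout.
/// Treating an ack exactly `timeout_ms` old as unhealthy is an assumption.
pub fn is_healthy(since_last_ack: Duration, timeout_ms: u64) -> bool {
    since_last_ack < Duration::from_millis(timeout_ms)
}
```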

Traits§

ElectionTransport
Cluster operations the candidate drives, injected so the state machine stays pure and deterministically testable. Production backs these onto the membership view, the per-peer vote RPC, the durable term store, and the FAILOVER handover; tests back them onto a scripted fake.
FailoverTransport
Cluster mutations and the clock the coordinator drives, injected so the state machine stays pure and deterministically testable.
LastVoteStore
Durable store for a node’s last vote. The contract is narrow on purpose: load returns the persisted record (or the default term 0, voted_for None when nothing was ever written), and persist makes a record durable before the caller acknowledges a grant.
RollbackTransport
Side effects the rollback coordinator drives, injected so the state machine stays pure and deterministically testable.
TermStore
Durable store for a node’s current replication term. The default (when nothing was ever written) is DEFAULT_REPLICATION_TERM, matching the term records carry before any failover.
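
To illustrate the LastVoteStore contract (load returns the default when nothing was written; persist happens before a grant is acknowledged), here is a simplified sketch with an in-memory store and a double-vote guard. The types below reuse the names LastVote and LastVoteStore for readability but are stand-ins: the real trait presumably returns errors (see LastVoteError), and the watermark rule mentioned under Voter is omitted. `MemoryStore` and `grant_vote` are hypothetical names.

```rust
/// Simplified stand-in for the durable voting record.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct LastVote {
    pub term: u64,
    pub voted_for: Option<String>,
}

/// Simplified stand-in for the store trait; error handling is omitted.
pub trait LastVoteStore {
    fn load(&self) -> LastVote;
    fn persist(&mut self, vote: &LastVote);
}

/// In-memory store: nothing written yet means term 0, voted_for None.
#[derive(Default)]
pub struct MemoryStore {
    record: Option<LastVote>,
}

impl LastVoteStore for MemoryStore {
    fn load(&self) -> LastVote {
        self.record.clone().unwrap_or_default()
    }
    fn persist(&mut self, vote: &LastVote) {
        self.record = Some(vote.clone());
    }
}

/// Hypothetical vote rule: at most one candidate is granted per term.
pub fn grant_vote<S: LastVoteStore>(store: &mut S, term: u64, candidate: &str) -> bool {
    let last = store.load();
    // A request for an older term is always refused.
    if term < last.term {
        return false;
    }
    // Double-vote guard: in the same term, only the earlier choice is re-granted.
    if term == last.term {
        if let Some(previous) = &last.voted_for {
            return previous.as_str() == candidate;
        }
    }
    // Persist first, then acknowledge the grant to the caller.
    store.persist(&LastVote { term, voted_for: Some(candidate.to_string()) });
    true
}
```

Injecting the store through a trait is what lets a test swap in the in-memory version while production uses a file-backed one.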

Functions§

plan_upstream
Decide where a node streams from, given its streaming class and an optionally-requested intermediate source.
quorum_threshold
Quorum threshold for a set of members: a strict majority of the voting members. Witnesses count; catching-up replicas do not.
randomized_election_timeout
A randomized election timeout in [base, base + jitter).
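
The two rules described for quorum_threshold and randomized_election_timeout can be sketched as below. The signatures of the crate's functions are not shown on this page, so the `Member` and `Voting` types, the function names `majority_of_voters` and `timeout_with_jitter`, and the caller-supplied random `sample` are all assumptions for illustration.

```rust
use std::time::Duration;

/// Hypothetical voting state: catching-up replicas do not vote.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Voting {
    Voting,
    CatchingUp,
}

/// Hypothetical member: witnesses hold no data but still count as voters.
pub struct Member {
    pub is_witness: bool,
    pub voting: Voting,
}

/// Strict majority of the voting members (witnesses included).
pub fn majority_of_voters(members: &[Member]) -> usize {
    let voters = members.iter().filter(|m| m.voting == Voting::Voting).count();
    voters / 2 + 1
}

/// Timeout in [base, base + jitter), driven by a caller-supplied random sample.
/// The modulo introduces slight bias, which is acceptable for a sketch.
pub fn timeout_with_jitter(base: Duration, jitter: Duration, sample: u64) -> Duration {
    let span_ms = jitter.as_millis() as u64;
    if span_ms == 0 {
        return base;
    }
    base + Duration::from_millis(sample % span_ms)
}
```

Taking the random sample as a parameter keeps the function deterministic under test; a caller would feed it from whatever random source the node already uses.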