Module rollback


Auto-rollback of a deposed primary to the common point (issue #840, PRD #819, ADR 0030).

When a former primary rejoins after a failover, it may still hold writes above the point where its log last agreed with the new primary. Those writes are the divergent tail, and the node must drop them to rejoin a single timeline. The tail is, by definition, non-committed: it sits above the commit watermark (the highest LSN durably replicated to a quorum), so removing it from the live timeline is correct (ADR 0030, NeverRollbackCommitted).

This module is the recover-to-LSN mechanism that does that drop:

  1. Plan & guard the boundary. The recover target is the common point — the LSN up to which the deposed primary’s log still agrees with the new primary (produced by the election, #834). The hard invariant is that the common point is at or above the commit watermark (#822): nothing at or below the watermark is ever rolled back. If the common point is below the watermark, the coordinator refuses to roll back rather than destroy committed data.
  2. Preserve the tail. Read the divergent tail and persist it to a rollback file before anything is removed. Rollback is never silent: if the tail cannot be persisted, the recovery aborts and no data is dropped.
  3. Recover-to-LSN. Roll the live timeline back to the common point over the MVCC history store (ADR 0014), discarding the tail’s versions and restoring the pre-images visible at the common point.
  4. Surface a loud operator event so the discarded writes stay auditable and reconcilable.
  5. Rejoin as a replica of the new primary under the new term.
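
The boundary guard in step 1 can be sketched as a small pure function. This is only an illustration under stated assumptions, not the module's actual `RollbackPlan::compute`: the names `Lsn`, `PlanSketch`, and `PlanError`, the `u64` LSN representation, and the `log_end` parameter are all hypothetical.

```rust
/// Log sequence number. A plain `u64` is an assumption made for this sketch.
type Lsn = u64;

/// Hypothetical reasons a plan cannot be produced.
#[derive(Debug, PartialEq, Eq)]
enum PlanError {
    /// Rolling back to `common_point` would remove data at or below the
    /// commit watermark, so the plan is refused.
    CommonPointBelowWatermark { common_point: Lsn, commit_watermark: Lsn },
    /// The common point lies past the end of the local log, so there is
    /// nothing coherent to roll back to.
    CommonPointPastLogEnd { common_point: Lsn, log_end: Lsn },
}

/// Hypothetical stand-in for a side-effect-free rollback plan.
/// The divergent tail is the half-open range `(common_point, tail_end]`.
#[derive(Debug, PartialEq, Eq)]
struct PlanSketch {
    common_point: Lsn,
    tail_end: Lsn,
}

impl PlanSketch {
    /// Checks the boundary invariant and returns the range to discard.
    fn compute(
        common_point: Lsn,
        commit_watermark: Lsn,
        log_end: Lsn,
    ) -> Result<Self, PlanError> {
        // Hard invariant: nothing at or below the watermark is rolled back.
        if common_point < commit_watermark {
            return Err(PlanError::CommonPointBelowWatermark {
                common_point,
                commit_watermark,
            });
        }
        if common_point > log_end {
            return Err(PlanError::CommonPointPastLogEnd { common_point, log_end });
        }
        Ok(PlanSketch { common_point, tail_end: log_end })
    }

    /// Number of LSNs in the divergent tail, assuming dense LSNs.
    fn tail_len(&self) -> u64 {
        self.tail_end - self.common_point
    }

    /// True when the log already ends at the common point.
    fn is_noop(&self) -> bool {
        self.tail_len() == 0
    }
}
```

With a common point of 100, a watermark of 90, and a local log ending at 130, the sketch plans to discard the range (100, 130]. With a common point of 80 and the same watermark, it returns an error instead of a plan. Keeping this check free of I/O is what lets the invariant be tested without any transport.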

Module shape

RollbackCoordinator::run is a pure state machine. The boundary math (RollbackPlan::compute) is separated out so the invariant can be asserted in isolation. Every side effect — reading the tail, writing the rollback file, the MVCC recover-to-LSN, the operator event, the role swap — is injected behind RollbackTransport, so the whole flow runs deterministically against a scripted fake with no engine, disk, clock, or network dependency. Wiring the transport onto the real MVCC history store and the gRPC role-swap belongs to the transport layer once the election (#834) and stale-term fencing (#835) are live; this slice builds and proves the mechanism in isolation.
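
To make the injection pattern concrete, here is a minimal sketch of a coordinator driven through a trait and exercised against a scripted fake. It is not the real `RollbackTransport` or `RollbackCoordinator::run`: apart from the method name `emit_rollback_event`, which the module docs mention, every name and signature below (`TransportSketch`, `run_sketch`, `FakeTransport`, `SketchError`, and the other method names) is a hypothetical chosen for illustration.

```rust
type Lsn = u64; // assumption: LSNs are dense u64 values

#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    BelowWatermark,
    PersistFailed,
}

/// Hypothetical stand-in for the injected side effects.
trait TransportSketch {
    fn read_tail(&mut self, from_exclusive: Lsn, to_inclusive: Lsn) -> Vec<Lsn>;
    fn persist_tail(&mut self, tail: &[Lsn]) -> Result<(), SketchError>;
    fn recover_to_lsn(&mut self, lsn: Lsn);
    fn emit_rollback_event(&mut self, discarded: usize);
    fn rejoin_as_replica(&mut self, new_term: u64);
}

/// Drives the five steps in order. Returns the number of discarded records.
fn run_sketch<T: TransportSketch>(
    transport: &mut T,
    common_point: Lsn,
    commit_watermark: Lsn,
    log_end: Lsn,
    new_term: u64,
) -> Result<usize, SketchError> {
    // Step 1: refuse before touching the transport at all.
    if common_point < commit_watermark {
        return Err(SketchError::BelowWatermark);
    }
    // Step 2: preserve the tail; `?` aborts before anything is dropped.
    let tail = transport.read_tail(common_point, log_end);
    transport.persist_tail(&tail)?;
    // Steps 3 to 5: only reached once the tail is safely on record.
    transport.recover_to_lsn(common_point);
    transport.emit_rollback_event(tail.len());
    transport.rejoin_as_replica(new_term);
    Ok(tail.len())
}

/// Scripted fake: records the call order and can be told to fail persistence.
#[derive(Default)]
struct FakeTransport {
    fail_persist: bool,
    calls: Vec<&'static str>,
}

impl TransportSketch for FakeTransport {
    fn read_tail(&mut self, from_exclusive: Lsn, to_inclusive: Lsn) -> Vec<Lsn> {
        self.calls.push("read_tail");
        (from_exclusive + 1..=to_inclusive).collect()
    }
    fn persist_tail(&mut self, _tail: &[Lsn]) -> Result<(), SketchError> {
        self.calls.push("persist_tail");
        if self.fail_persist {
            Err(SketchError::PersistFailed)
        } else {
            Ok(())
        }
    }
    fn recover_to_lsn(&mut self, _lsn: Lsn) {
        self.calls.push("recover_to_lsn");
    }
    fn emit_rollback_event(&mut self, _discarded: usize) {
        self.calls.push("emit_rollback_event");
    }
    fn rejoin_as_replica(&mut self, _new_term: u64) {
        self.calls.push("rejoin_as_replica");
    }
}
```

Because the fake records every call, a test can assert both the ordering of the steps and the abort behaviour: with `fail_persist` set, the recorded calls stop at `persist_tail` and `recover_to_lsn` is never reached.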

Structs

DivergentTail
The divergent tail removed from the live timeline: the records in (common_point_lsn, to_lsn] that never reached quorum.
RollbackCoordinator
The deposed-primary auto-rollback state machine.
RollbackEvent
The loud operator event payload describing a completed rollback, handed to RollbackTransport::emit_rollback_event. Mirrors crate::telemetry::operator_event::OperatorEvent::DeposedPrimaryRollback so the production transport can forward it verbatim while a test transport can capture it.
RollbackOutcome
The result of a completed rejoin.
RollbackPlan
The computed, side-effect-free rollback plan. Splitting this out lets the boundary invariant be asserted without driving any transport.
RollbackRequest
A request to auto-rollback a deposed primary to the common point and rejoin it as a replica.
TailRecord
A single record from the divergent tail that is about to be discarded.

Enums

RollbackError
Why an auto-rollback could not complete.

Traits

RollbackTransport
Side effects the rollback coordinator drives, injected so the state machine stays pure and deterministically testable.