Module move_range


Split-and-move planning and the move-range cutover state machine (issue #1004, PRD #987, ADR 0037).

The WeightedPlacementPlanner decides that a range should move; this module decides how it moves and drives the move safely to completion. It is the glossary’s split-and-move (“rebalancing transition that first divides a large or hot shard/range, then moves only the selected subrange to a different writer. Small ranges may move whole without splitting”), riding the glossary’s move range cutover (“the old owner continues serving writes while the target first copies a physical checkpoint/snapshot of the range directory, then catches up through the logical range-indexed stream; only after catch-up does the catalog epoch move write authority to the target”).

§Whole-range vs split-and-move

classify_move is the small/large-or-hot decision: a range whose bytes sit at or under the SplitPolicy byte ceiling and whose traffic sits under the hot threshold moves whole (MoveKind::Whole); a range over the byte ceiling, or at or over the hot traffic threshold, is split first so the move sheds only part of the load (MoveKind::Split). split_range then carves the range at a chosen key into a retained child (the keys the owner keeps) and a moved child (a fresh range id the move hands off), tiling the original keyspace with no gap or overlap.
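The classification and the tiling property can be sketched in a few lines. This is a minimal illustration, not the module's actual API: the field names (`max_whole_bytes`, `hot_writes_per_sec`), the `KeyRange` type, the `KeyOutsideRange` variant, and both function signatures are assumptions. It also assumes half-open key intervals, that the retained child keeps the original id, and that the upper half is the one that moves (the real module selects this with SplitSide).

```rust
/// Hypothetical thresholds; the real `SplitPolicy` fields are not shown here.
#[derive(Debug, Clone, Copy)]
pub struct SplitPolicy {
    pub max_whole_bytes: u64,
    pub hot_writes_per_sec: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MoveKind {
    Whole,
    Split,
}

/// Split when the range is over the byte ceiling or at/over the hot threshold.
pub fn classify_move(bytes: u64, writes_per_sec: u64, policy: &SplitPolicy) -> MoveKind {
    if bytes > policy.max_whole_bytes || writes_per_sec >= policy.hot_writes_per_sec {
        MoveKind::Split
    } else {
        MoveKind::Whole
    }
}

/// A half-open key interval `[start, end)` with a range id (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeyRange {
    pub id: u64,
    pub start: Vec<u8>,
    pub end: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SplitError {
    /// The split key must fall strictly inside the range (hypothetical variant).
    KeyOutsideRange,
}

/// Returns `(retained, moved)`. The retained child keeps the original id and
/// `[start, split_key)`; the moved child gets `fresh_id` and `[split_key, end)`.
pub fn split_range(
    range: &KeyRange,
    split_key: &[u8],
    fresh_id: u64,
) -> Result<(KeyRange, KeyRange), SplitError> {
    // A key at or outside either bound would leave one child empty.
    if split_key <= range.start.as_slice() || split_key >= range.end.as_slice() {
        return Err(SplitError::KeyOutsideRange);
    }
    let retained = KeyRange {
        id: range.id,
        start: range.start.clone(),
        end: split_key.to_vec(),
    };
    let moved = KeyRange {
        id: fresh_id,
        start: split_key.to_vec(),
        end: range.end.clone(),
    };
    Ok((retained, moved))
}
```

Because the retained child ends exactly where the moved child starts, the two children cover the parent's keyspace with no gap and no overlap.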

§The cutover, fenced and gated

MoveRange is the state machine for one move. It encodes the move-range invariant directly:

  1. CopyingSnapshot — the target copies a consistent physical snapshot of the range. Throughout, the catalog still names the old owner, so the old owner keeps serving writes.
  2. CatchingUp — the snapshot is installed at a consistent CommitWatermark; the target replays range-indexed WAL records (issue #992) from that point to close the gap to the live commit watermark, which keeps advancing because the old owner is still writing.
  3. cut_over — only when the target’s applied log covers the live commit watermark does the fenced Handoff transition move the catalog epoch. The epoch bump fences the old owner (its writes now carry a stale epoch and admit_public_write rejects them) and makes the target authoritative. The target accepts no public write until this instant: before the handoff it is only a replica, and the ownership gate rejects any public write addressed to it.
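The three steps above can be modelled as a small state machine over an in-memory catalog. This is a sketch under stated assumptions, not the module's implementation: the `Catalog` shape, the method names `begin`, `snapshot_installed`, and `apply_through`, the `MoveError` variants, the `CutOver` phase name, and the signature of `admit_public_write` are all hypothetical. Watermarks are plain `u64` log positions here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MovePhase {
    CopyingSnapshot,
    CatchingUp,
    CutOver,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MoveError {
    /// The step is not valid in the current phase (hypothetical variant).
    WrongPhase(MovePhase),
    /// The target's applied log does not yet cover the live watermark.
    NotCaughtUp { applied: u64, live: u64 },
}

/// Assumed catalog entry for one range: the current epoch and its owner.
#[derive(Debug)]
pub struct Catalog {
    pub epoch: u64,
    pub owner: u32,
}

impl Catalog {
    /// Ownership gate: only the named owner, at the current epoch, may write.
    pub fn admit_public_write(&self, writer: u32, writer_epoch: u64) -> bool {
        writer == self.owner && writer_epoch == self.epoch
    }
}

#[derive(Debug)]
pub struct MoveRange {
    pub phase: MovePhase,
    pub target: u32,
    pub applied: u64,
}

impl MoveRange {
    pub fn begin(target: u32) -> Self {
        MoveRange { phase: MovePhase::CopyingSnapshot, target, applied: 0 }
    }

    /// Step 1 -> 2: the snapshot is installed at a consistent watermark.
    pub fn snapshot_installed(&mut self, watermark: u64) -> Result<(), MoveError> {
        if self.phase != MovePhase::CopyingSnapshot {
            return Err(MoveError::WrongPhase(self.phase));
        }
        self.applied = watermark;
        self.phase = MovePhase::CatchingUp;
        Ok(())
    }

    /// Step 2: replay log records up to `position`; never moves backwards.
    pub fn apply_through(&mut self, position: u64) -> Result<(), MoveError> {
        if self.phase != MovePhase::CatchingUp {
            return Err(MoveError::WrongPhase(self.phase));
        }
        self.applied = self.applied.max(position);
        Ok(())
    }

    /// Step 3: fenced handoff. Every error path returns before the catalog
    /// is touched, so a failed cutover leaves the old owner in charge.
    pub fn cut_over(&mut self, catalog: &mut Catalog, live: u64) -> Result<(), MoveError> {
        if self.phase != MovePhase::CatchingUp {
            return Err(MoveError::WrongPhase(self.phase));
        }
        if self.applied < live {
            return Err(MoveError::NotCaughtUp { applied: self.applied, live });
        }
        catalog.epoch += 1; // fences the old owner's stale-epoch writes
        catalog.owner = self.target;
        self.phase = MovePhase::CutOver;
        Ok(())
    }
}
```

In this model a cutover attempted while the target is behind returns `NotCaughtUp` with the epoch unchanged. After a successful cutover, the old owner's writes at the previous epoch fail the gate and the target's writes at the new epoch pass it.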

§Interrupted moves fail safe

A move can be interrupted at any point — a supervisor restart, a crashed target. recover_interrupted_move resumes from the target’s persisted catch-up position and promotes the target only if it covers the range commit watermark; otherwise it leaves the catalog untouched and the old owner keeps authority. A half-copied target is never promoted, so an interrupted move can lose no committed write.
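The fail-safe rule reduces to one comparison against the persisted catch-up position. The sketch below is illustrative only: the `Catalog` shape, the `MoveRecovery` variant names, and the function signature are assumptions, and `None` stands for a target that crashed before persisting any catch-up position.

```rust
/// Assumed catalog entry for one range: the current epoch and its owner.
#[derive(Debug)]
pub struct Catalog {
    pub epoch: u64,
    pub owner: u32,
}

/// Hypothetical outcome variants; the real `MoveRecovery` is not shown here.
#[derive(Debug, PartialEq, Eq)]
pub enum MoveRecovery {
    /// The target covered the watermark and now holds write authority.
    Promoted { new_epoch: u64 },
    /// The target was behind (or had nothing persisted); nothing changed.
    OldOwnerRetained,
}

pub fn recover_interrupted_move(
    catalog: &mut Catalog,
    target: u32,
    persisted_applied: Option<u64>,
    range_commit_watermark: u64,
) -> MoveRecovery {
    match persisted_applied {
        // Promote only when the target provably holds every committed write.
        Some(applied) if applied >= range_commit_watermark => {
            catalog.epoch += 1;
            catalog.owner = target;
            MoveRecovery::Promoted { new_epoch: catalog.epoch }
        }
        // Otherwise leave the catalog untouched: the old owner keeps authority.
        _ => MoveRecovery::OldOwnerRetained,
    }
}
```

The decision reads only the persisted position and the range commit watermark, so the same inputs always produce the same outcome regardless of where the move was interrupted.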

Everything here is a pure data model over the catalog plus the range-indexed WAL contract — no disk, no clock, no network — so the split arithmetic, the catch-up gate, the fencing, and the interrupted-move safety are all exercised deterministically.

Structs

MoveRange
One in-flight move-range: the bookkeeping that carries authority for one range from its current owner to a target without losing a write or letting the target serve early.
RangeSplit
The two entries a split_range produces: the child the owner keeps and the child the move will hand off.
SplitPolicy
The thresholds that decide whether a range is small enough to move whole or must be split first.

Enums

MoveError
Why a move-range step failed. Every variant that can be returned before the fenced handoff leaves the catalog untouched.
MoveKind
How a planned move should be carried out: relocate the whole range, or split it first and move only a subrange.
MovePhase
Where a move-range is in its copy → catch-up → cutover lifecycle.
MoveRecovery
The outcome of recovering an interrupted move.
SplitError
Why a range split could not be planned.
SplitSide
Which child of a split moves to the target.

Functions

classify_move
Decide whether a range moves whole or is split first, from its live load and the SplitPolicy. A range over the byte ceiling, or at or over the hot traffic threshold, is split; otherwise it moves whole.
recover_interrupted_move
Resume an interrupted move and decide its fate from the target’s persisted catch-up position alone — the recovery path after a supervisor restart or a crash mid-move.
split_range
Divide range at split_key into a retained child and a moved child, with target enlisted as a replica of the moved child so a later MoveRange can hand authority to it.