//! # Migration
//!
//! This module defines how compaction of segments interacts with the [`StorageStrategy`], enabling
//! the strategy implementor to collect metadata about segment contents and to lazily rewrite key
//! suffixes and values during compaction. It also defines the types that make up the migration API
//! surface.
//!
//! ## Motivation
//!
//! The KV layer is deliberately agnostic to the semantics of keys and values. It does not know
//! about HLC timestamps, schema versions, or any other engine-level concern. However, the engine
//! built on top of the KV layer has requirements that depend on understanding what is inside
//! segments:
//!
//! - **Schema version eviction.** TQL types are versioned. When a type's schema changes, old rows
//! encoded under the previous schema version remain on disk until they are rewritten. A schema
//! version is only safe to evict from the catalog once no segment contains a row encoded under
//! it. To determine this, the engine needs to know the oldest HLC timestamp present across all
//! live segments - the HLC watermark. Schema versions whose validity interval ends before the
//! watermark can be safely dropped.
//!
//! - **Lazy value migration.** Rewriting all rows eagerly on a schema change would be
//! prohibitively expensive. Instead, rows are migrated lazily: during compaction, each row is
//! brought up to the current schema version before being written to the output segment. Over
//! time, as compaction covers all key ranges, the HLC watermark advances and old schema versions
//! become evictable. A forced full compaction pass can be triggered by the operator to
//! deliberately advance the watermark.
//!
//! - **Key suffix rewriting.** Each key consists of a prefix (the logical row identity) and a
//! suffix (typically an HLC timestamp encoding the row's version). When a row is migrated to a
//! newer schema version, its suffix should be updated to reflect the HLC at migration time. This
//! is safe during compaction because by the time `migrate` is called, the merging iterator has
//! already deduplicated entries: exactly one entry per logical key (prefix) is passed to
//! `migrate`. Since no other entry in the output shares that prefix, the suffix can be freely
//! replaced without violating the segment's physical sort order, which is determined by prefix
//! alone.
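//!
//! The eviction rule from the first bullet can be sketched as follows. This is a minimal
//! illustration, not the engine's catalog: `Hlc`, `SchemaVersion`, `valid_until` and
//! `evict_before` are hypothetical names, and an HLC is assumed to fit in a `u64`.
//!
//! ```rust
//! /// Hypothetical HLC representation; the real timestamp type is not specified here.
//! type Hlc = u64;
//!
//! /// Hypothetical catalog entry: a schema version and the end of its validity interval.
//! /// `valid_until == None` means the version is still the current one.
//! #[derive(Debug, Clone, PartialEq)]
//! struct SchemaVersion {
//!     version: u32,
//!     valid_until: Option<Hlc>,
//! }
//!
//! /// Drops every version whose validity interval ended strictly before the watermark.
//! /// No live segment can still contain a row encoded under such a version.
//! fn evict_before(versions: &mut Vec<SchemaVersion>, watermark: Hlc) {
//!     versions.retain(|v| match v.valid_until {
//!         Some(end) => end >= watermark,
//!         None => true,
//!     });
//! }
//! ```
//!
//! With validity ends of 10, 20 and "still current", a watermark of 15 evicts only the first
//! version.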
//!
//! ## API Design
//!
//! The migration API is expressed as extensions to [`StorageStrategy`]:
//!
//! ### Segment summaries
//!
//! Each segment has an associated [`Summary`] that is built incrementally inside `migrate` during
//! flush and compaction. The summary is stored in the manifest alongside the segment and survives
//! restarts, so the engine can reconstruct the HLC watermark on startup without re-scanning
//! segment files.
//!
//! `SegmentId` is an internal manifest concern only - it correlates summaries to files on disk and
//! handles deduplication during compaction. The engine never sees it. Instead, the manifest
//! exposes:
//!
//! ```text
//! fn summaries() -> &[Summary]
//! ```
//!
//! reflecting the current set of live segment summaries. The engine folds over this slice whenever
//! it needs the watermark:
//!
//! ```text
//! watermark = min(summary.min_hlc for all summaries)
//! ```
//!
//! When segments are removed (because they were merged during compaction and no snapshots or
//! pending reads still hold them), the manifest updates its live set. The engine recomputes the
//! watermark from `summaries()` on demand, so no deletion events or per-segment bookkeeping are
//! required.
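//!
//! A minimal sketch of the watermark fold, assuming a `Summary` that carries nothing but a
//! `min_hlc` field and an HLC that fits in a `u64` (both are assumptions; the real summary is
//! defined by the strategy implementor):
//!
//! ```rust
//! /// Hypothetical HLC representation.
//! type Hlc = u64;
//!
//! /// Assumed shape; only the `min_hlc` field name is taken from the pseudocode above.
//! struct Summary {
//!     min_hlc: Hlc,
//! }
//!
//! /// Folds over the live summaries. Returns `None` when there are no live segments,
//! /// in which case no row constrains schema eviction.
//! fn watermark(summaries: &[Summary]) -> Option<Hlc> {
//!     summaries.iter().map(|s| s.min_hlc).min()
//! }
//! ```
//!
//! Returning an `Option` makes the empty-manifest case explicit instead of inventing a sentinel
//! value for it.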
//!
//! ### Migration context
//!
//! Compaction is driven by a [`MigrationContext`], which a [`MigrationContextSource`] produces at
//! the start of each compaction job via its `context()` method. The context holds engine-level
//! state - typically an `Rc` snapshot of the current catalog and the current HLC - and is passed
//! to every `migrate` call for the duration of that compaction. The KV layer never inspects the
//! context's contents; it only passes it through.
//!
//! The context source may be updated between compaction ticks to advance the HLC and pick up
//! catalog changes, allowing long-running compactions to migrate rows to progressively newer
//! schema versions without stalling.
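//!
//! One possible shape for the source and its context is sketched below. Only the method name
//! `context` comes from the description above; the trait signature, the `Catalog` and
//! `EngineSource` types, their fields, and the `update` method are assumptions made for
//! illustration.
//!
//! ```rust
//! use std::rc::Rc;
//!
//! /// Hypothetical HLC representation.
//! type Hlc = u64;
//!
//! /// Hypothetical catalog snapshot.
//! struct Catalog {
//!     current_schema_version: u32,
//! }
//!
//! /// Assumed context contents: a shared catalog snapshot and the HLC at creation time.
//! struct MigrationContext {
//!     catalog: Rc<Catalog>,
//!     now: Hlc,
//! }
//!
//! /// Assumed trait shape for the context source.
//! trait MigrationContextSource {
//!     fn context(&self) -> MigrationContext;
//! }
//!
//! /// Hypothetical engine-side source that the engine refreshes between compaction ticks.
//! struct EngineSource {
//!     catalog: Rc<Catalog>,
//!     now: Hlc,
//! }
//!
//! impl MigrationContextSource for EngineSource {
//!     fn context(&self) -> MigrationContext {
//!         // Cloning the `Rc` is cheap and pins the snapshot for the whole compaction job.
//!         MigrationContext { catalog: Rc::clone(&self.catalog), now: self.now }
//!     }
//! }
//!
//! impl EngineSource {
//!     /// Installs a newer catalog and HLC; contexts already handed out are unaffected.
//!     fn update(&mut self, catalog: Catalog, now: Hlc) {
//!         self.catalog = Rc::new(catalog);
//!         self.now = now;
//!     }
//! }
//! ```
//!
//! Because each context owns its own `Rc`, updating the source never changes the snapshot seen
//! by a compaction that is already running.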
//!
//! ### The `migrate` function
//!
//! ```text
//! fn migrate(
//! ctx: &mut MigrationContext, summary: &mut Summary,
//! prefix: &[u8], suffix: &[u8], value: &[u8]
//! ) -> Migrate
//! ```
//!
//! Called once per logical key (post-deduplication, post-filter) during compaction. The
//! implementor inspects the entry, updates `summary` to reflect the output entry, and returns
//! one of:
//!
//! - `Migrate::Unchanged` - write the entry as-is. The common case; no allocation.
//! - `Migrate::Modified { value, suffix }` - replace the value and/or suffix. Either field may be
//! `None` to indicate "keep existing." The KV layer splices the new suffix onto the existing
//! prefix when writing the output entry.
//!
//! The summary is updated inside `migrate` rather than in a separate `observe` pass. This keeps
//! the implementor in full control: they know exactly what was output and can update the summary
//! accordingly in a single pass. Strategies that need summaries but no rewriting can update
//! `summary` and always return `Migrate::Unchanged`, as the default implementation does.
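//!
//! The sketch below shows one way a strategy might implement `migrate`, together with a
//! hypothetical `output_entry` helper that mimics how the KV layer applies the result. Every
//! encoding here is an assumption made for illustration: the suffix is taken to be an 8-byte
//! big-endian HLC, the first byte of a (non-empty) value is taken to be its schema version, and
//! "migrating" a row only rewrites that byte. The field layouts of `MigrationContext` and
//! `Summary` are likewise invented.
//!
//! ```rust
//! type Hlc = u64; // hypothetical HLC representation
//!
//! struct MigrationContext { current_version: u8, now: Hlc } // assumed fields
//! struct Summary { min_hlc: Hlc }                           // assumed fields
//!
//! enum Migrate {
//!     Unchanged,
//!     Modified { value: Option<Vec<u8>>, suffix: Option<Vec<u8>> },
//! }
//!
//! /// Decodes the assumed suffix encoding; panics if the suffix is not exactly 8 bytes.
//! fn decode_hlc(suffix: &[u8]) -> Hlc {
//!     let mut buf = [0u8; 8];
//!     buf.copy_from_slice(suffix);
//!     u64::from_be_bytes(buf)
//! }
//!
//! fn migrate(
//!     ctx: &mut MigrationContext, summary: &mut Summary,
//!     _prefix: &[u8], suffix: &[u8], value: &[u8],
//! ) -> Migrate {
//!     if value.first() == Some(&ctx.current_version) {
//!         // Already current: the output keeps its original HLC, so record that one.
//!         summary.min_hlc = summary.min_hlc.min(decode_hlc(suffix));
//!         return Migrate::Unchanged;
//!     }
//!     // Stale row: re-tag the value and stamp it with the HLC at migration time.
//!     let mut new_value = value.to_vec();
//!     if let Some(tag) = new_value.first_mut() {
//!         *tag = ctx.current_version;
//!     }
//!     summary.min_hlc = summary.min_hlc.min(ctx.now);
//!     Migrate::Modified {
//!         value: Some(new_value),
//!         suffix: Some(ctx.now.to_be_bytes().to_vec()),
//!     }
//! }
//!
//! /// Hypothetical stand-in for the KV layer: splices the chosen suffix onto the prefix and
//! /// picks the chosen value. It allocates for clarity, unlike a real `Unchanged` path.
//! fn output_entry(
//!     prefix: &[u8], suffix: &[u8], value: &[u8], outcome: Migrate,
//! ) -> (Vec<u8>, Vec<u8>) {
//!     let (new_value, new_suffix) = match outcome {
//!         Migrate::Unchanged => (None, None),
//!         Migrate::Modified { value, suffix } => (value, suffix),
//!     };
//!     let mut key = prefix.to_vec();
//!     key.extend_from_slice(new_suffix.as_deref().unwrap_or(suffix));
//!     (key, new_value.unwrap_or_else(|| value.to_vec()))
//! }
//! ```
//!
//! Note that the summary records the HLC of the entry that is written, not the one that was
//! read: a migrated row contributes `ctx.now`, which is what lets the watermark advance.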