1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
// SPDX-License-Identifier: Apache-2.0
//! Deterministic fault-injection points for crash-recovery tests.
//!
//! W2b shipped the rollback machinery — atomic mapping persistence,
//! mirror Drop guard, HEAD/index restore on failure — but until we
//! actually crash the process between the load-bearing writes, the
//! rollback paths only have unit-test coverage of the *helpers*, not
//! of the *recovery contract itself*. The integration story
//! ("crashing here doesn't corrupt the bridge mapping") was unverified.
//!
//! This module exposes a single `maybe_panic_at(name)` checkpoint
//! that production code threads at the points where a crash would
//! exercise a recovery path. Tests opt in by setting the
//! `HEDDLE_FAULT_INJECT` environment variable to a comma-separated
//! list of checkpoint names — e.g.
//! `HEDDLE_FAULT_INJECT=mapping_after_tmp_before_commit` — and the
//! next process to hit that checkpoint panics with a stable message.
//!
//! The next CLI invocation (a separate process, no inherited env)
//! must recover cleanly. That's the contract under test.
//!
//! ## Why an env var instead of a build-time `#[cfg(test)]` gate
//!
//! The crash points sit in `objects` and `cli` paths that get spawned
//! as separate child processes during integration tests. A child
//! process can't see the parent test's `cfg(test)` flag, but it does
//! inherit env vars by default. An env var lets the parent test set
//! the crash point, spawn the child, observe the child crash, then
//! spawn a fresh child (without the env var) and verify recovery.
//!
//! ## Performance
//!
//! `maybe_panic_at` is a single env lookup + string split + linear
//! search. The env var is read once on first call and cached. With no
//! `HEDDLE_FAULT_INJECT` set (the production default), the cached
//! `None` short-circuits in well under a microsecond.
use OnceLock;
/// Cached parse of the `HEDDLE_FAULT_INJECT` env var. `None` means
/// the env var was not set; an empty `Vec` means it was set to an
/// empty string (treated as no checkpoints active).
static FAULT_POINTS: = new;
/// Crash the current process if `name` is listed in `HEDDLE_FAULT_INJECT`.
///
/// Production callers thread this at points where a crash would
/// exercise a recovery path. Tests set the env var on a child
/// process to deterministically trigger the crash, then verify the
/// next clean process recovers.
///
/// The panic message includes the checkpoint name so test logs can
/// distinguish an intentional fault from a real bug.
/// Like [`maybe_panic_at`], but returns an `io::Error` instead of
/// panicking — for exercising *in-process* error-recovery paths (a
/// graceful failure that drives a rollback) rather than crash recovery.
///
/// Production callers thread this where a returned error must unwind a
/// partially-applied operation; tests opt in by listing the checkpoint
/// name in `HEDDLE_FAULT_INJECT` and assert the rollback left no
/// partial state. With the env var unset the cached `None`
/// short-circuits, exactly like [`maybe_panic_at`].
/// Test-only helper: clear the cached env-var read so a single
/// process can re-parse `HEDDLE_FAULT_INJECT` between phases. Not
/// for production use — the cache is what makes the production
/// hot-path free.