moonpool_sim/chaos/mod.rs
//! Chaos testing infrastructure for deterministic fault injection.
//!
//! This module implements FoundationDB's buggify approach for finding bugs
//! through comprehensive chaos testing.
//!
//! # Philosophy
//!
//! FoundationDB's insight: **bugs hide in error paths**. Production code
//! rarely exercises timeout handlers, retry logic, or failure recovery.
//! Chaos testing finds these bugs before production does.
//!
//! Key principles:
//! - **Deterministic**: Same seed produces same faults for reproducible debugging
//! - **Comprehensive**: Test all failure modes (network, timing, corruption)
//! - **Low probability**: Faults rare enough for progress, frequent enough to find bugs
//!
//! # Components
//!
//! | Component | Purpose |
//! |-----------|---------|
//! | [`buggify!`] | Probabilistic fault injection at code locations |
//! | [`always_assert!`] | Invariants that must never fail |
//! | [`sometimes_assert!`] | Behaviors that should occur under chaos |
//! | [`InvariantCheck`] | Cross-actor properties validated after events |
//!
//! # The Buggify System
//!
//! Each buggify location is randomly **activated** (or left inactive) once per
//! simulation run; an active location then fires probabilistically on each call.
//!
//! ```ignore
//! // 25% probability when activated
//! if buggify!() {
//!     return Err(SimulatedFailure);
//! }
//!
//! // Custom probability
//! if buggify_with_prob!(0.02) {
//!     corrupt_data();
//! }
//! ```
//!
//! ## Two-Phase Activation
//!
//! 1. **Activation** (once per location per seed): `random() < activation_prob`
//! 2. **Firing** (each call): If active, `random() < firing_prob`
//!
//! This ensures consistent behavior within a run while varying which
//! locations are active across different seeds.
//!
//! | Parameter | Default | FDB Reference |
//! |-----------|---------|---------------|
//! | `activation_prob` | 25% | `Buggify.h:79-88` |
//! | `firing_prob` | 25% | `P_GENERAL_BUGGIFIED_SECTION_FIRES` |
//!
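//! The two phases can be sketched in a few lines. This is a minimal
//! illustration, not this crate's implementation: the `Buggify` struct, its
//! `fires` method, and the `SplitMix64` generator are hypothetical stand-ins,
//! and locations are keyed by a plain string label.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Small deterministic PRNG (SplitMix64), standing in for a seeded simulator RNG.
//! struct SplitMix64(u64);
//!
//! impl SplitMix64 {
//!     fn next_u64(&mut self) -> u64 {
//!         self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
//!         let mut z = self.0;
//!         z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
//!         z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
//!         z ^ (z >> 31)
//!     }
//!
//!     /// Uniform float in [0, 1).
//!     fn next_f64(&mut self) -> f64 {
//!         (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
//!     }
//! }
//!
//! /// Hypothetical two-phase buggify state (assumption, not the real type).
//! struct Buggify {
//!     rng: SplitMix64,
//!     activation_prob: f64,
//!     active: HashMap<&'static str, bool>,
//! }
//!
//! impl Buggify {
//!     fn new(seed: u64, activation_prob: f64) -> Self {
//!         Self { rng: SplitMix64(seed), activation_prob, active: HashMap::new() }
//!     }
//!
//!     fn fires(&mut self, location: &'static str, firing_prob: f64) -> bool {
//!         // Phase 1: decide activation once per location, then reuse the decision.
//!         let cached = self.active.get(location).copied();
//!         let is_active = match cached {
//!             Some(decision) => decision,
//!             None => {
//!                 let decision = self.rng.next_f64() < self.activation_prob;
//!                 self.active.insert(location, decision);
//!                 decision
//!             }
//!         };
//!         // Phase 2: an active location rolls again on every call.
//!         is_active && self.rng.next_f64() < firing_prob
//!     }
//! }
//!
//! fn main() {
//!     // The same seed yields the same sequence of fault decisions.
//!     let mut a = Buggify::new(42, 0.25);
//!     let mut b = Buggify::new(42, 0.25);
//!     let run_a: Vec<bool> = (0..100).map(|_| a.fires("net.rs:10", 0.25)).collect();
//!     let run_b: Vec<bool> = (0..100).map(|_| b.fires("net.rs:10", 0.25)).collect();
//!     assert_eq!(run_a, run_b);
//! }
//! ```
//!
//! Because the activation decision is cached per location and every roll comes
//! from the seeded generator, replaying a seed replays the same fault sequence.
//!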
//! # Fault Injection Mechanisms
//!
//! ## Network Faults
//!
//! | Mechanism | Default | What it tests |
//! |-----------|---------|---------------|
//! | Random connection close | 0.001% | Reconnection, message redelivery |
//! | Bit flip corruption | 0.01% | CRC32C checksum validation |
//! | Connect failure | 50% probabilistic | Timeout handling, retries |
//! | Partial/short writes | 1000 bytes max | Message fragmentation |
//! | Packet loss | disabled | At-least-once delivery |
//! | Network partitions | disabled | Split-brain handling |
//! | Half-open connections | manual | Peer crash detection |
//!
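//! To illustrate why bit flip corruption exercises checksum validation, the
//! sketch below computes CRC32C with a plain bitwise loop and shows that a
//! single flipped bit changes the checksum. The `crc32c` and `flip_bit`
//! helpers are written for this example only; how this crate frames messages
//! or computes its checksums is not shown here.
//!
//! ```rust
//! /// Bitwise CRC32C (Castagnoli), using the reflected polynomial 0x82F63B78.
//! fn crc32c(data: &[u8]) -> u32 {
//!     let mut crc = !0u32;
//!     for &byte in data {
//!         crc ^= u32::from(byte);
//!         for _ in 0..8 {
//!             crc = if crc & 1 == 1 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
//!         }
//!     }
//!     !crc
//! }
//!
//! /// Flips one bit of the frame, as an injected corruption fault might.
//! fn flip_bit(frame: &mut [u8], bit_index: usize) {
//!     frame[bit_index / 8] ^= 1u8 << (bit_index % 8);
//! }
//!
//! fn main() {
//!     // Standard CRC32C check value for the ASCII string "123456789".
//!     assert_eq!(crc32c(b"123456789"), 0xE306_9283);
//!
//!     let mut frame = b"hello, peer".to_vec();
//!     let checksum = crc32c(&frame);
//!     flip_bit(&mut frame, 13);
//!     // The receiver recomputes the checksum and sees the mismatch.
//!     assert_ne!(crc32c(&frame), checksum);
//! }
//! ```
//!
//! A receiver that skips this comparison would accept the corrupted frame,
//! which is the class of bug this fault is meant to surface.
//!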
//! ## Timing Faults
//!
//! | Mechanism | Default | What it tests |
//! |-----------|---------|---------------|
//! | TCP operation latencies | 1-11ms connect | Async scheduling |
//! | Clock drift | 100ms max | Leases, heartbeats, leader election |
//! | Buggified delays | 25% probability | Race conditions |
//! | Per-connection asymmetric delays | optional | Satellite links, geographic distance |
//!
//! # Assertions
//!
//! ## always_assert!
//!
//! Guards invariants that must **never** fail:
//!
//! ```ignore
//! always_assert!(
//!     sent_count >= received_count,
//!     "message_ordering",
//!     "received more than sent: {} > {}", received_count, sent_count
//! );
//! ```
//!
//! ## sometimes_assert!
//!
//! Validates that error paths **do** execute under chaos:
//!
//! ```ignore
//! if buggify!() {
//!     sometimes_assert!("timeout_triggered");
//!     return Err(Timeout);
//! }
//! ```
//!
//! Multi-seed testing with `UntilAllSometimesReached(1000)` ensures all
//! `sometimes_assert!` statements fire across the seed space.
//!
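//! The idea behind running until every `sometimes_assert!` has fired can be
//! sketched as a loop over seeds. This is an illustration only: the
//! `run_until_all_reached` function and its closure signature are
//! hypothetical, and reading the `1000` above as a cap on the number of seeds
//! is an assumption.
//!
//! ```rust
//! use std::collections::HashSet;
//!
//! /// Runs `simulate` with seeds 0, 1, 2, ... until every label in `expected`
//! /// has been reported at least once, or `max_seeds` runs have completed.
//! /// Returns the number of runs needed, or the labels that were never reached.
//! fn run_until_all_reached<F>(
//!     expected: &[&'static str],
//!     max_seeds: u64,
//!     mut simulate: F,
//! ) -> Result<u64, Vec<&'static str>>
//! where
//!     F: FnMut(u64, &mut HashSet<&'static str>),
//! {
//!     let mut reached = HashSet::new();
//!     for seed in 0..max_seeds {
//!         simulate(seed, &mut reached);
//!         if expected.iter().all(|label| reached.contains(label)) {
//!             return Ok(seed + 1);
//!         }
//!     }
//!     let mut missing: Vec<&'static str> = expected
//!         .iter()
//!         .copied()
//!         .filter(|label| !reached.contains(label))
//!         .collect();
//!     missing.sort();
//!     Err(missing)
//! }
//!
//! fn main() {
//!     // Toy workload: which error path runs depends on the seed.
//!     let result = run_until_all_reached(
//!         &["timeout_triggered", "retry_exhausted"],
//!         1000,
//!         |seed, reached| {
//!             if seed % 3 == 1 {
//!                 reached.insert("timeout_triggered");
//!             }
//!             if seed % 5 == 4 {
//!                 reached.insert("retry_exhausted");
//!             }
//!         },
//!     );
//!     // Seed 1 reaches the timeout path; seed 4 reaches the retry path.
//!     assert_eq!(result, Ok(5));
//! }
//! ```
//!
//! A label that is still missing when the cap is hit points at an error path
//! that no seed exercised, which usually means a missing or unreachable
//! injection point.
//!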
//! # Strategic Placement
//!
//! Place `buggify!()` calls at:
//! - Error handling paths
//! - Timeout boundaries
//! - Retry logic entry points
//! - Resource limit checks
//! - State transitions
//!
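//! As one example of the placements listed above, the sketch below puts an
//! injection point at the entry of a retry loop. To stay self-contained it
//! takes the fault decision as a closure (`inject_fault`) where real code
//! would call `buggify!()`; `send_with_retry` and `SendError` are
//! hypothetical names used only for this illustration.
//!
//! ```rust
//! #[derive(Debug, PartialEq)]
//! enum SendError {
//!     Timeout,
//!     RetriesExhausted,
//! }
//!
//! /// Tries `send` up to `max_attempts` times and returns the attempt that succeeded.
//! fn send_with_retry(
//!     max_attempts: u32,
//!     mut inject_fault: impl FnMut() -> bool,
//!     mut send: impl FnMut() -> Result<(), SendError>,
//! ) -> Result<u32, SendError> {
//!     for attempt in 1..=max_attempts {
//!         // Injection point at the retry entry: an injected fault takes the
//!         // same failure branch that a real timeout would.
//!         let outcome = if inject_fault() { Err(SendError::Timeout) } else { send() };
//!         if outcome.is_ok() {
//!             return Ok(attempt);
//!         }
//!     }
//!     Err(SendError::RetriesExhausted)
//! }
//!
//! fn main() {
//!     // Faults on the first two attempts force the retry path to run.
//!     let mut calls = 0;
//!     let result = send_with_retry(3, || { calls += 1; calls <= 2 }, || Ok(()));
//!     assert_eq!(result, Ok(3));
//!
//!     // A fault on every attempt exercises the exhaustion path.
//!     let result = send_with_retry(3, || true, || Ok(()));
//!     assert_eq!(result, Err(SendError::RetriesExhausted));
//! }
//! ```
//!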
//! # Configuration
//!
//! ```ignore
//! use moonpool_sim::{ChaosConfiguration, NetworkConfiguration};
//!
//! // Full chaos (default)
//! let chaos = ChaosConfiguration::default();
//!
//! // No chaos (fast tests)
//! let chaos = ChaosConfiguration::disabled();
//!
//! // Randomized per seed
//! let chaos = ChaosConfiguration::random_for_seed();
//! ```

pub mod assertions;
pub mod buggify;
pub mod invariants;
pub mod state_registry;

// Re-export main types at module level
pub use assertions::{
    AssertionStats, get_assertion_results, panic_on_assertion_violations, record_assertion,
    reset_assertion_results, validate_assertion_contracts,
};
pub use buggify::{buggify_init, buggify_internal, buggify_reset};
pub use invariants::InvariantCheck;
pub use state_registry::StateRegistry;