// Copyright (c) 2026 Hamze Ghalebi. All rights reserved.
// Licensed under the Rustlift Non-Commercial Licence v1.0.
//! Exponential-backoff retry wrapper for transient cloud operations.
//!
//! Cloud APIs are inherently unreliable: Azure throttles requests,
//! DNS resolution hiccups, TLS sessions time out. This module provides
//! a single function — [`reliable_op`] — that wraps any async operation
//! with automatic retry logic.
//!
//! # Design Rationale
//!
//! Without centralised retry logic, each call site in the pipeline would
//! need its own loop, sleep, and error classification. This creates
//! duplication and inconsistency. By routing every fallible operation
//! through `reliable_op`, the retry *policy* is defined once and the
//! error *classification* lives in one `match` block.
//!
//! # Retry Policy
//!
//! | Parameter        | Value | Reasoning                                        |
//! |------------------|-------|--------------------------------------------------|
//! | Max elapsed time | 5 min | Enough for Azure cold-starts and DNS propagation |
//! | Max interval     | 30 s  | Prevents overwhelming the API during outages     |
//! | Multiplier       | 1.5×  | Gentler growth than the default 2×               |
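//!
//! To make the table concrete, the sketch below computes the nominal delay
//! before each retry under this policy. The 500 ms initial interval and the
//! `delay_schedule_ms` helper are assumptions made for illustration only
//! (the table does not state an initial interval), and jitter is ignored,
//! so real delays will likely differ.
//!
//! ```rust
//! /// Nominal delay in milliseconds before each of the first `attempts` retries.
//! /// Hypothetical helper, not part of this module.
//! fn delay_schedule_ms(initial_ms: f64, multiplier: f64, max_ms: f64, attempts: usize) -> Vec<u64> {
//!     let mut delays = Vec::with_capacity(attempts);
//!     let mut current = initial_ms;
//!     for _ in 0..attempts {
//!         // Never wait longer than the configured maximum interval.
//!         delays.push(current.min(max_ms) as u64);
//!         current = (current * multiplier).min(max_ms);
//!     }
//!     delays
//! }
//!
//! fn main() {
//!     // Assumed 500 ms start; the 1.5x multiplier and 30 s cap come from the table.
//!     let delays = delay_schedule_ms(500.0, 1.5, 30_000.0, 14);
//!     for (i, delay) in delays.iter().enumerate() {
//!         println!("retry {:>2}: wait {} ms", i + 1, delay);
//!     }
//! }
//! ```
//!
//! Under these assumptions the delay grows geometrically until it reaches
//! the 30 s cap and then stays there until the elapsed-time budget runs out.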
//!
//! # Learning: Higher-Order Functions in Rust
//!
//! `reliable_op` is a **higher-order function** — it accepts a *closure*
//! as an argument. The closure is called on every retry attempt. In Rust,
//! closures are expressed through the `Fn`, `FnMut`, and `FnOnce` traits.
//!
//! Here, the closure must implement `Fn()` (callable multiple times),
//! because the retry loop may call it repeatedly. If it were `FnOnce`,
//! only one attempt would be possible.
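//!
//! The same idea can be sketched synchronously. `retry_sync` below is a
//! hypothetical helper written for this explanation, not part of this
//! module: it has no backoff and no error classification, and only shows
//! why the `Fn` bound lets the loop call the closure more than once.
//!
//! ```rust
//! use std::cell::Cell;
//!
//! /// Calls `f` at least once and at most `max_attempts` times, returning
//! /// the first `Ok` or, failing that, the last `Err`.
//! fn retry_sync<F, T, E>(max_attempts: usize, f: F) -> Result<T, E>
//! where
//!     F: Fn() -> Result<T, E>,
//! {
//!     let mut last = f();
//!     for _ in 1..max_attempts {
//!         if last.is_ok() {
//!             break;
//!         }
//!         // `Fn` allows calling `f` again; an `FnOnce` bound would not compile here.
//!         last = f();
//!     }
//!     last
//! }
//!
//! fn main() {
//!     // `Cell` provides interior mutability, so the closure still implements `Fn`.
//!     let calls = Cell::new(0u32);
//!     let result: Result<&str, String> = retry_sync(3, || {
//!         calls.set(calls.get() + 1);
//!         if calls.get() < 2 {
//!             Err("transient".to_string())
//!         } else {
//!             Ok("ok")
//!         }
//!     });
//!     assert_eq!(result, Ok("ok"));
//!     assert_eq!(calls.get(), 2);
//! }
//! ```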
use retry;
use ExponentialBackoff;
use std::time::Duration;
use crate::errors::DeployError;
/// Executes an async closure with exponential backoff, classifying errors
/// as fatal (abort) or transient (retry) based on the [`DeployError`]
/// variant.
///
/// # Arguments
///
/// * `op_name` — A human-readable label emitted in retry and failure log
///   lines (e.g. `"Auth Handshake"`, `"CLI ZipDeploy"`).
/// * `f` — A closure that returns a `Future<Output = Result<T, DeployError>>`.
///   The closure is called on *every* attempt, so it must be safe to call
///   repeatedly: it **must not** rely on captured state that an earlier,
///   failed attempt may have consumed or left partially modified.
///
/// # Error Classification
///
/// The classification lives in the `match` block inside this function:
///
/// - **Fatal** (`Config`, `Dependency`, `Build`, `PathEncoding`):
/// User mistakes — retrying cannot fix them.
/// - **Transient** (everything else):
/// External systems — they often self-heal.
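///
/// As a rough sketch, that classification could look like the following.
/// `DemoError` is a stand-in defined only for this example: the variant
/// names come from the list above and from the example in this doc comment,
/// while the `String` payloads and the `is_fatal` helper are assumptions,
/// not the real [`DeployError`] API.
///
/// ```rust
/// /// Hypothetical mirror of the error variants named in this section.
/// #[allow(dead_code)]
/// #[derive(Debug)]
/// enum DemoError {
///     Config(String),
///     Dependency(String),
///     Build(String),
///     PathEncoding(String),
///     Infra(String),
/// }
///
/// /// Returns `true` for errors that retrying cannot fix.
/// fn is_fatal(err: &DemoError) -> bool {
///     matches!(
///         err,
///         DemoError::Config(_)
///             | DemoError::Dependency(_)
///             | DemoError::Build(_)
///             | DemoError::PathEncoding(_)
///     )
/// }
///
/// fn main() {
///     // Fatal: abort immediately instead of retrying.
///     assert!(is_fatal(&DemoError::Config("missing app name".into())));
///     // Transient: everything else is worth another attempt.
///     assert!(!is_fatal(&DemoError::Infra("temporary failure".into())));
/// }
/// ```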
///
/// # Errors
///
/// Returns the **last** [`DeployError`] if:
/// - A fatal variant is encountered (immediate abort), or
/// - The operation never succeeds within the 5-minute window.
///
/// # Examples
///
/// ```
/// use rustlift::resilience::reliable_op;
/// use rustlift::errors::DeployError;
/// use std::sync::{
///     atomic::{AtomicUsize, Ordering},
///     Arc,
/// };
///
/// let attempts = Arc::new(AtomicUsize::new(0));
/// let attempts_for_retry = Arc::clone(&attempts);
///
/// let output = tokio::runtime::Runtime::new()
///     .unwrap()
///     .block_on(async move {
///         reliable_op("Transient Demo", || {
///             let attempts_for_try = Arc::clone(&attempts_for_retry);
///             async move {
///                 let seen = attempts_for_try.fetch_add(1, Ordering::SeqCst);
///                 if seen == 0 {
///                     Err(DeployError::Infra("temporary failure".into()))
///                 } else {
///                     Ok("ok")
///                 }
///             }
///         })
///         .await
///     })
///     .unwrap();
///
/// assert_eq!(output, "ok");
/// assert_eq!(attempts.load(Ordering::SeqCst), 2);
/// ```
///
/// # Panics
///
/// This function does not panic.
///
/// # Safety
///
/// Safe to call. It uses safe async primitives and does not require unsafe
/// caller guarantees.
///
/// # Learning: Generic Bounds
///
/// The signature `<F, Fut, T>` uses three generic parameters:
///
/// - `F: Fn() -> Fut` — the closure type.
/// - `Fut: Future<Output = Result<T, DeployError>>` — the future it returns.
/// - `T` — the success value.
///
/// This pattern relies on **static dispatch**: the compiler generates
/// specialised code for each concrete `F`, so calls to the closure are
/// resolved at compile time and avoid the indirect call that a function
/// pointer or trait object (`dyn Fn`) would require.
pub async
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------