//! Reference distributed [`DirectiveStore`] backed by etcd v3.
//!
//! A fleet of proxy instances must all see the *same* diagnostics directives, and
//! a control-plane flip must reach every instance with **no restart** (`docs/05`
//! §3, NFR-T3, ADR-013). This adapter realizes that over etcd's watch API using
//! the **watch-and-cache** model: a background task subscribes to one etcd key and
//! keeps a locally-cached [`DirectiveSet`] snapshot fresh, so [`DirectiveStore::load`]
//! on the request hot path is a cheap `Arc` clone, never per-request network I/O.
//!
//! It deliberately backs **only** the directive (observability) control plane.
//! The migration/placement store (`osproxy-control::MigrationStore`) needs a
//! linearizable compare-and-swap and a fallible, async seam; wiring it over etcd
//! is a separate step gated on that seam refactor.
//!
//! Posture:
//! - **Fail-fast at startup**: [`EtcdDirectiveStore::connect`] does an initial
//!   read, so an unreachable/misconfigured etcd is a loud construction error, not
//!   a silent empty directive set.
//! - **Fail-safe while running**: a transient etcd outage or a *malformed* publish
//!   keeps the **last good** snapshot rather than blanking diagnostics; the watch
//!   task reconnects with a bounded delay.
//! - **One fail-closed decoder**: directives are decoded with
//!   [`osproxy_observe::decode_directive_set`], the same decoder the admin
//!   `POST /admin/directives` endpoint uses, so a directive means the same thing
//!   however it is published, and a typo'd key can never widen its blast radius.
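//!
//! A minimal, illustrative sketch of the watch-and-cache and last-good behavior
//! described above. It is not this crate's implementation: `Snapshot`, `decode`,
//! and `CachedStore` are hypothetical stand-ins, the `key=value` line format is
//! an assumption made only for the example, and a std `RwLock<Arc<_>>` stands in
//! for the snapshot swap so the example needs no external crates.
//!
//! ```rust
//! use std::sync::{Arc, RwLock};
//!
//! // Hypothetical stand-in for the decoded directive set.
//! #[derive(Debug, PartialEq)]
//! struct Snapshot(Vec<(String, String)>);
//!
//! // Hypothetical fail-closed decoder: any line that is not `key=value`
//! // (with a non-empty key) rejects the whole payload.
//! fn decode(raw: &str) -> Option<Snapshot> {
//!     let mut entries = Vec::new();
//!     for line in raw.lines().filter(|l| !l.trim().is_empty()) {
//!         let (key, value) = line.split_once('=')?;
//!         if key.trim().is_empty() {
//!             return None;
//!         }
//!         entries.push((key.trim().to_string(), value.trim().to_string()));
//!     }
//!     Some(Snapshot(entries))
//! }
//!
//! // Holds the current snapshot; readers only ever clone the `Arc`.
//! struct CachedStore {
//!     current: RwLock<Arc<Snapshot>>,
//! }
//!
//! impl CachedStore {
//!     fn new(initial: Snapshot) -> Self {
//!         Self { current: RwLock::new(Arc::new(initial)) }
//!     }
//!
//!     // Hot path: a cheap `Arc` clone, no network I/O.
//!     fn load(&self) -> Arc<Snapshot> {
//!         Arc::clone(&self.current.read().unwrap())
//!     }
//!
//!     // Watch path: swap in a good value, keep the last good one otherwise.
//!     // Returns whether the snapshot was replaced.
//!     fn apply(&self, raw: &str) -> bool {
//!         match decode(raw) {
//!             Some(fresh) => {
//!                 *self.current.write().unwrap() = Arc::new(fresh);
//!                 true
//!             }
//!             None => false,
//!         }
//!     }
//! }
//! ```
//!
//! In this sketch a background watch task would call `apply` once per received
//! value, while request handlers call only `load`; a payload that fails to
//! decode leaves the snapshot returned by `load` unchanged.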
use std::sync::Arc;
use arc_swap::ArcSwap;
use Clock;
use ;
/// Errors constructing the store. Only startup is fallible to the caller; once
/// running, the watch task absorbs transient failures (keeping the last snapshot).
/// A [`DirectiveStore`] whose snapshot is kept fresh by an etcd watch.
///
/// Construct with [`EtcdDirectiveStore::connect`] inside a Tokio runtime; it loads
/// the initial set and spawns the background watch. Clone is cheap (shared
/// snapshot) so the same store can be handed to the pipeline and an admin surface.
/// Swaps in a freshly decoded set, or **keeps the last good snapshot** if the
/// value does not parse: a malformed publish must never blank fleet diagnostics.