//! OBSERVE + VERIFY interpretation of a session's pane state (§3.4 phases 4–5).
//!
//! Why: phases 4 (OBSERVE) and 5 (VERIFY) read a session's pane/record state and
//! decide (a) its current verification [`SessionTaskState`] and (b) whether the
//! pane carries gate-satisfying EVIDENCE (a PR URL, captured test output, a
//! diff/write confirmation — §3.5). The spec says the SM can interpret raw panes
//! WITHOUT provider inference ("the session-manager-driver skill's inference
//! applies"), so this interpretation is DETERMINISTIC heuristics over the session
//! JSON — no LLM call, fully unit-testable. Keeping it here (separate from the
//! orchestrator) keeps each file under the SLOC cap and makes the gate logic
//! auditable in one place.
//! What: [`ObservedState`] (the interpreted state + optional captured evidence),
//! [`interpret_session`] (session JSON → `ObservedState`), and the evidence
//! scanner [`scan_evidence`]. The orchestrator turns an [`ObservedState`] into a
//! goal-store [`SessionUpdate`](crate::core::sm::SessionUpdate).
//! Test: `observe_tests.rs` covers running/failed/verified interpretation and
//! evidence extraction (PR URL, test-pass output) vs. no-evidence.
use serde_json::Value;
use crate::core::sm::SessionTaskState;
/// The interpreted outcome of observing one session (§3.4 phases 4–5).
///
/// Why: OBSERVE/VERIFY need to convey BOTH the interpreted verification state and
/// any captured evidence in one value so the orchestrator can build a single
/// goal-store update. Crucially, `Verified` is only ever reached WITH evidence —
/// the verification gate (§3.5) — so this type couples them: a `Verified` state
/// always carries `Some(evidence)`.
/// What: `state` is the interpreted [`SessionTaskState`]; `evidence` is the
/// gate-satisfying snippet (PR URL / test output / diff) when one was observed,
/// else `None`.
/// Test: `observe_tests.rs`.
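
// Illustrative sketch only, not the crate's actual definition: one way the
// state/evidence coupling described above could look. `SessionTaskState` here is
// a local stand-in whose four variants are assumed from the names used in these
// docs, and the `verified` / `unverified` constructors are hypothetical helpers
// added to show how `Verified` can be kept paired with `Some(evidence)`.

```rust
/// Stand-in for the crate's `SessionTaskState` (variants assumed from the docs).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionTaskState {
    Launched,
    Running,
    Failed,
    Verified,
}

/// Interpreted state plus the optional gate-satisfying evidence snippet.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ObservedState {
    pub state: SessionTaskState,
    pub evidence: Option<String>,
}

impl ObservedState {
    /// Hypothetical constructor: builds a `Verified` value, which therefore
    /// always carries evidence.
    pub fn verified(evidence: impl Into<String>) -> Self {
        Self {
            state: SessionTaskState::Verified,
            evidence: Some(evidence.into()),
        }
    }

    /// Hypothetical constructor for states without evidence. A `Verified`
    /// request is downgraded to `Running`, since no evidence is attached.
    pub fn unverified(state: SessionTaskState) -> Self {
        let state = if state == SessionTaskState::Verified {
            SessionTaskState::Running
        } else {
            state
        };
        Self { state, evidence: None }
    }
}
```

// Routing construction through two constructors is one possible design; the
// real type may enforce the same invariant differently.
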
/// Interpret a session's record/pane JSON into an [`ObservedState`] (deterministic).
///
/// Why: phase 4 turns the raw session JSON (the `SessionControl::get` body) into a
/// verification state WITHOUT an LLM (§3.4 — the pane heuristic applies). The
/// interpretation is conservative: a session is only `Verified` when the pane
/// carries observed EVIDENCE (§3.5), never merely because it reports "done".
/// What: reads the nested `session` object (the control surface wraps the record
/// as `{ "session": { … } }`); FIRST scans the RAW pane/output string value for
/// evidence (so JSON escaping never garbles a captured PR URL or test line), and
/// only falls back to scanning the whole compact JSON when no pane/output field is
/// present. Evidence present → `Verified` with that evidence. Otherwise, a record
/// `state` naming a terminal failure (`errored`/`dead`/`failed`/`killed`) →
/// `Failed`. Otherwise, a session that is still active, or that is gone/stopped
/// without evidence → `Running` (in flight, not yet verified). A brand-new record
/// with no activity stays `Launched`.
/// Test: `observe_tests.rs::interpret_running`, `interpret_failed`,
/// `interpret_verified_with_pr_url`, `interpret_no_evidence_stays_unverified`,
/// `interpret_evidence_from_raw_pane`.
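
// A minimal sketch of the decision order described above, under stated
// assumptions. To stay dependency-free it takes the already-extracted record
// `state` string and the already-scanned evidence instead of a JSON value, and
// returns a `(state, evidence)` pair standing in for `ObservedState`. The name
// `classify`, the stand-in enum, and the rule "missing or empty state means a
// brand-new record" are assumptions, not the crate's code. The failure
// vocabulary follows the terminal-failure helper documented further down.

```rust
/// Stand-in for the crate's `SessionTaskState` (variants assumed from the docs).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionTaskState {
    Launched,
    Running,
    Failed,
    Verified,
}

/// Hypothetical classifier: evidence first, then terminal failure, then the
/// conservative defaults.
pub fn classify(
    record_state: Option<&str>,
    evidence: Option<String>,
) -> (SessionTaskState, Option<String>) {
    // 1. Observed evidence is the only route to `Verified`.
    if let Some(found) = evidence {
        return (SessionTaskState::Verified, Some(found));
    }
    // 2. Normalise the record state before comparing.
    let lowered = record_state.unwrap_or("").trim().to_ascii_lowercase();
    match lowered.as_str() {
        // Terminal failure vocabulary.
        "errored" | "dead" | "failed" | "killed" => (SessionTaskState::Failed, None),
        // Assumption: no state at all means nothing has happened yet.
        "" => (SessionTaskState::Launched, None),
        // Active, or gone/stopped without evidence: still unverified.
        _ => (SessionTaskState::Running, None),
    }
}
```

// Note that a session reporting "done" or "stopped" without evidence lands in
// `Running`, which mirrors the conservative rule stated above.
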
/// The ordered field names that may hold a session's raw pane / captured output.
///
/// Why: different control surfaces (the tmux-backed session manager, the test mock)
/// name the captured pane text differently; checking a small ordered set finds the
/// raw string regardless of which the surface used, most-specific first.
const PANE_FIELDS: & = &;
/// Extract the RAW pane/output string from a session record, if present.
///
/// Why: the evidence scanner must run over the UNESCAPED pane text, not the
/// compact JSON serialization (which adds `\"`/`}}` framing and escapes newlines,
/// garbling a captured PR URL or test line — issue #1311 review). Pulling the raw
/// string value out lets the scanner see exactly what the session printed.
/// What: returns the first present, non-empty string value among [`PANE_FIELDS`]
/// on the record; `None` when the record carries no recognised pane field (the
/// caller then falls back to the whole compact JSON so evidence elsewhere in the
/// payload is still found).
/// Test: `observe_tests.rs::interpret_evidence_from_raw_pane`,
/// `scan_pr_url_strips_json_framing` (the fallback path stays green).
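
// A hedged sketch of the "first present, non-empty field wins" lookup described
// above. A `HashMap<String, String>` stands in for the JSON record so the
// example needs no external crate, the function name `raw_pane_text` is
// hypothetical, and the two field names are placeholders: the real ordered list
// is not reproduced here.

```rust
use std::collections::HashMap;

/// Placeholder field names (assumed); checked in this order.
const PANE_FIELDS: &[&str] = &["pane", "output"];

/// Return the first present, non-empty pane string in `PANE_FIELDS` order,
/// or `None` when the record has no usable pane field.
pub fn raw_pane_text(record: &HashMap<String, String>) -> Option<&str> {
    PANE_FIELDS
        .iter()
        // Skip fields the record does not carry.
        .filter_map(|field| record.get(*field))
        .map(String::as_str)
        // An empty string counts as "not present".
        .find(|text| !text.is_empty())
}
```

// A `None` result is the caller's cue to fall back to scanning the whole
// compact JSON, as described above.
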
/// Whether a record `state` string indicates a terminal FAILURE.
///
/// Why: a session that errored/died is a `Failed` task — the goal cannot close on
/// it, and the operator must be told. Centralising the terminal-failure vocabulary
/// keeps `interpret_session` readable.
/// What: returns `true` for `errored`/`dead`/`failed`/`killed` (case already
/// lowered by the caller).
/// Test: `observe_tests.rs::interpret_failed`.
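
// A minimal sketch of the vocabulary check described above. The function name
// `is_terminal_failure` is an assumption; the four strings come from the doc
// comment. The input is expected to be lowercased already.

```rust
/// Whether an already-lowercased record `state` names a terminal failure.
pub fn is_terminal_failure(state: &str) -> bool {
    matches!(state, "errored" | "dead" | "failed" | "killed")
}
```

// Because the caller lowercases first, a mixed-case input such as "FAILED" does
// not match in this sketch.
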
/// Scan observed pane/record text for gate-satisfying evidence (§3.5).
///
/// Why: the verification gate forbids `Verified` without OBSERVED evidence (a PR
/// URL, a captured test-pass count, a diff/write confirmation). A deterministic
/// scanner means a test can pin exactly what counts as evidence, and the SM can
/// never "claim done" without it.
/// What: returns the FIRST matching evidence snippet found, in priority order:
/// (1) a GitHub/GitLab PR/MR URL; (2) a test-pass summary (`N passed`, `N tests
/// passed`, `test result: ok`); (3) a diff/write confirmation marker. Returns
/// `None` when no evidence pattern matches.
/// Test: `observe_tests.rs::scan_finds_pr_url`, `scan_finds_test_pass`,
/// `scan_finds_diff`, `scan_finds_nothing`.
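
// A sketch of the priority ordering described above, using `Option::or_else` so
// later finders run only when earlier ones find nothing. The three finders here
// are deliberately simplified stand-ins (plain substring checks) so the block is
// self-contained; their names and bodies are assumptions, not the crate's code.

```rust
/// Helper for the stand-in finders: first line containing `needle`, trimmed.
fn first_line_containing(text: &str, needle: &str) -> Option<String> {
    text.lines()
        .find(|line| line.contains(needle))
        .map(|line| line.trim().to_string())
}

// Simplified stand-ins for the three evidence finders.
fn find_pr_url(text: &str) -> Option<String> {
    first_line_containing(text, "/pull/")
}
fn find_test_pass(text: &str) -> Option<String> {
    first_line_containing(text, "test result: ok")
}
fn find_diff(text: &str) -> Option<String> {
    first_line_containing(text, "diff --git")
}

/// Return the highest-priority evidence snippet: PR URL, then test-pass
/// summary, then diff/write marker.
pub fn scan_evidence(text: &str) -> Option<String> {
    find_pr_url(text)
        .or_else(|| find_test_pass(text))
        .or_else(|| find_diff(text))
}
```

// Priority is by evidence kind, not by position in the pane: a PR URL printed
// after a diff still wins.
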
/// Find a GitHub/GitLab pull/merge-request URL in `text`.
///
/// Why: "PR opened" evidence is a printed PR URL (§3.5). A token scan keeps this
/// dependency-free (no regex crate) and predictable. When no raw pane field is
/// present, evidence is scanned over the JSON-serialized session payload, where a
/// URL at the END of a JSON string value is followed by `"`/`}`/`]` framing
/// punctuation; the trim must strip ALL of that so the captured URL is clean
/// (not e.g. `…/pull/9"}}`).
/// What: finds the first `http`-scheme span (scanning char-by-char so it works
/// even when the URL has NO surrounding whitespace — e.g. embedded in compact
/// JSON like `"pane":"…/pull/9","state":…`), bounding the URL at the first
/// non-URL character (whitespace OR JSON/sentence framing: quotes, braces,
/// brackets, parens, comma, trailing dot), then returns it only if it is a
/// `github.com` URL containing `/pull/` or a `gitlab` URL containing
/// `/merge_requests/`.
/// Test: `observe_tests.rs::scan_finds_pr_url`, `scan_pr_url_strips_json_framing`.
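
// A minimal sketch of the token scan described above, assuming only the
// standard library. The names `find_pr_url` and `is_url_boundary` and the exact
// boundary set are assumptions drawn from the description (whitespace, quotes,
// braces, brackets, parens, comma, plus a trailing-dot trim); the real
// implementation may differ.

```rust
/// Characters that end a URL span: whitespace or JSON/sentence framing.
fn is_url_boundary(c: char) -> bool {
    c.is_whitespace()
        || matches!(c, '"' | '\'' | '{' | '}' | '[' | ']' | '(' | ')' | ',')
}

/// Find the first `http` span and return it if it looks like a PR/MR URL.
pub fn find_pr_url(text: &str) -> Option<String> {
    // Locate the scheme without relying on surrounding whitespace.
    let start = text.find("http")?;
    let candidate = &text[start..];
    // Bound the URL at the first framing character, or at end of text.
    let end = candidate.find(is_url_boundary).unwrap_or(candidate.len());
    // Drop a sentence-ending dot left on the span.
    let url = candidate[..end].trim_end_matches('.');
    let is_pr = (url.contains("github.com") && url.contains("/pull/"))
        || (url.contains("gitlab") && url.contains("/merge_requests/"));
    is_pr.then(|| url.to_string())
}
```

// Like the description, this sketch inspects only the first `http` span; a
// non-PR link printed before the PR URL would yield `None`.
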
/// Find a test-pass summary in `text`.
///
/// Why: "tests pass" evidence is a captured run with a pass count (§3.5).
/// What: scans for the cargo `test result: ok.` marker (returning the line) or a
/// `N passed` / `N tests passed` phrase; returns the matched snippet.
/// Test: `observe_tests.rs::scan_finds_test_pass`.
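
// A hedged sketch of the two checks described above: the cargo marker first
// (returning its whole line), then a word-by-word scan for `N passed` or
// `N tests passed`. The function and helper names are assumptions, and the
// snippet returned for the phrase case is a normalised form rather than the
// exact pane bytes.

```rust
/// True when `word` is a non-empty run of ASCII digits.
fn is_count(word: &str) -> bool {
    !word.is_empty() && word.chars().all(|c| c.is_ascii_digit())
}

/// True when `word` is "passed", ignoring trailing punctuation such as `;`.
fn is_passed(word: &str) -> bool {
    word.trim_end_matches(|c: char| !c.is_alphanumeric()) == "passed"
}

/// Find a test-pass summary: cargo's marker line, or an `N passed` phrase.
pub fn find_test_pass(text: &str) -> Option<String> {
    // 1. cargo's summary marker: return the full line it appears on.
    if let Some(line) = text.lines().find(|l| l.contains("test result: ok.")) {
        return Some(line.trim().to_string());
    }
    // 2. Otherwise look for a count followed by "passed" or "tests passed".
    let words: Vec<&str> = text.split_whitespace().collect();
    for (i, word) in words.iter().enumerate() {
        if !is_count(word) {
            continue;
        }
        let next = words.get(i + 1).copied();
        let after = words.get(i + 2).copied();
        if next.is_some_and(is_passed) {
            return Some(format!("{word} passed"));
        }
        if next == Some("tests") && after.is_some_and(is_passed) {
            return Some(format!("{word} tests passed"));
        }
    }
    None
}
```

// This sketch does not reject a zero count, so `0 passed` would also match;
// whether the real scanner guards against that is not stated above.
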
/// Find a diff / file-write confirmation marker in `text`.
///
/// Why: "edit made" evidence is a diff or write confirmation in the pane (§3.5).
/// What: returns the first line containing a unified-diff header (`diff --git`),
/// or a write-confirmation phrase (`wrote `/`updated `/`created file`); else
/// `None`.
/// Test: `observe_tests.rs::scan_finds_diff`.
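
// A minimal sketch of the line scan described above. The function name
// `find_diff_or_write` is hypothetical, and matching is case-sensitive here,
// which is an assumption: the description does not say whether case is folded.

```rust
/// Return the first line carrying a diff header or a write-confirmation phrase.
pub fn find_diff_or_write(text: &str) -> Option<String> {
    // Markers taken from the description above.
    const MARKERS: &[&str] = &["diff --git", "wrote ", "updated ", "created file"];
    text.lines()
        .find(|line| MARKERS.iter().any(|marker| line.contains(*marker)))
        .map(|line| line.trim().to_string())
}
```

// Returning the whole line keeps the file path next to the marker, which is
// usually the useful part of the evidence.
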