//! # WebSocket Operator Callback IF
//!
//! Entry path that seats an external HTTP/WS caller in the **Operator role**
//! inside the Engine. One WS connection = one session = three traits co-hosted
//! (`Operator` / `SeniorBridge` / `SpawnHook`); a single sid is registered into
//! all three registries simultaneously.
//!
//! ## Architecture overview
//!
//! ```text
//! ┌─────────────── External Operator (Human / Agent / other process) ────────┐
//! │                  WS /v1/operators/:sid/ws  (Bearer required)             │
//! │     S→C: Ask{req_id,task_id,question}                                   │
//! │     S→C: HookBefore{req_id,task_id,agent,attempt}                       │
//! │     S→C: HookAfter{req_id,task_id,agent,attempt,result}  (fire-and-forget)│
//! │     S→C: Spawn{req_id,task_id,agent,attempt,capability_token}          │
//! │     C→S: Answer{req_id,value}              (SeniorBridge.ask reply)     │
//! │     C→S: HookAck{req_id,ok,reason?}        (SpawnHook.before reply)     │
//! │     C→S: SpawnAck{req_id,value,ok,error?}  (Operator.execute reply)     │
//! └────────────────────────────────┬────────────────────────────────────────┘
//!                                  │ axum WebSocket
//! ┌────────────────────────────────▼────────────────────────────────────────┐
//! │ login.rs                                                                │
//! │   operators_ws_connect  - `GET /v1/operators/:sid/ws` upgrade,          │
//! │                            Bearer token check against minted sid        │
//! │   handle_operator_socket - write task / read task / disconnect path     │
//! └────────────────────────────────┬────────────────────────────────────────┘
//!                                  │
//! ┌────────────────────────────────▼────────────────────────────────────────┐
//! │ session.rs : WSOperatorSession                                          │
//! │   sid + auth_token + tx (Mutex<Option<>>) + pending (Mutex<HashMap>)    │
//! │   impl SeniorBridge { ask → send Ask + wait Answer }                    │
//! │   impl SpawnHook    { before → send HookBefore + wait HookAck /         │
//! │                       after  → send HookAfter fire-and-forget }         │
//! │   impl Operator     { execute → send Spawn + wait SpawnAck,             │
//! │                       thin-forward capability_token to MainAI }         │
//! └────────────────────────────────┬────────────────────────────────────────┘
//!                                  │ same sid → registered into 3 registries at once
//!                                  ▼
//!         engine.senior_bridges / spawn_hooks / operators (SoT)
//!                                  │ dispatch_attempt → resolve_operator_info
//!                                  │ looks up session.bridge_id / hook_id / operator_backend_id
//!                                  ▼
//!         Ctx.operator (= read by SeniorEscalationMiddleware / MainAIMiddleware /
//!                           OperatorDelegateMiddleware)
//!
//! protocol.rs : ServerMsg / ClientMsg / PendingReply (= wire format + internal reply IR)
//! ```
//!
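/*!
The message list in the diagram can be read as two enums plus a rule for which
reply each server message waits on. The sketch below is only an illustration
under assumptions: the variant and field names follow the diagram, but every
field type, and the helpers `expected_reply` and `reply_req_id`, are
hypothetical. The real definitions live in `protocol.rs`.

```rust
// Hypothetical mirror of the S→C messages in the diagram (field types assumed).
#[derive(Debug, Clone, PartialEq)]
enum ServerMsg {
    Ask { req_id: String, task_id: String, question: String },
    HookBefore { req_id: String, task_id: String, agent: String, attempt: u32 },
    HookAfter { req_id: String, task_id: String, agent: String, attempt: u32, result: String },
    Spawn { req_id: String, task_id: String, agent: String, attempt: u32, capability_token: String },
}

// Hypothetical mirror of the C→S messages in the diagram (field types assumed).
#[derive(Debug, Clone, PartialEq)]
enum ClientMsg {
    Answer { req_id: String, value: String },
    HookAck { req_id: String, ok: bool, reason: Option<String> },
    SpawnAck { req_id: String, value: String, ok: bool, error: Option<String> },
}

/// Name of the client reply the server waits for after sending `msg`,
/// or `None` for the fire-and-forget `HookAfter`.
fn expected_reply(msg: &ServerMsg) -> Option<&'static str> {
    match msg {
        ServerMsg::Ask { .. } => Some("Answer"),
        ServerMsg::HookBefore { .. } => Some("HookAck"),
        ServerMsg::HookAfter { .. } => None,
        ServerMsg::Spawn { .. } => Some("SpawnAck"),
    }
}

/// Every client reply carries the `req_id` of the request it answers.
fn reply_req_id(msg: &ClientMsg) -> &str {
    match msg {
        ClientMsg::Answer { req_id, .. }
        | ClientMsg::HookAck { req_id, .. }
        | ClientMsg::SpawnAck { req_id, .. } => req_id.as_str(),
    }
}
```

Correlating on `req_id` alone is what lets one connection interleave replies
for all three traits.
*/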
//! ## Thin-control discipline for Spawn
//!
//! The server sends only `Spawn{capability_token}` over the WS. The MainAI (the
//! WS client) forwards the token to the SubAgent, and the SubAgent calls
//! `/v1/worker/prompt` and `/v1/worker/result` itself with
//! `Authorization: Bearer <capability_token>`. Heavy payloads therefore travel
//! over HTTP, and the WS carries thin control messages only. See
//! `protocol::ServerMsg::Spawn` and `mlua_swarm::Operator::execute`
//! for details.
//!
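/*!
From the SubAgent's side, the thin-control split amounts to attaching the
forwarded token to the two worker calls. The sketch below is a minimal
illustration, not the crate's client code: the two paths and the header format
come from the text above, while `bearer_header`, `WORKER_ENDPOINTS`, and
`worker_requests` are hypothetical names, and no HTTP method or body is
assumed.

```rust
/// The two worker endpoints named in this module doc.
const WORKER_ENDPOINTS: [&str; 2] = ["/v1/worker/prompt", "/v1/worker/result"];

/// `Authorization` header value for a forwarded capability token.
fn bearer_header(capability_token: &str) -> String {
    format!("Bearer {capability_token}")
}

/// Hypothetical helper: one (path, header name, header value) triple per
/// worker endpoint, all carrying the same token.
fn worker_requests(capability_token: &str) -> Vec<(&'static str, &'static str, String)> {
    WORKER_ENDPOINTS
        .iter()
        .map(|&path| (path, "Authorization", bearer_header(capability_token)))
        .collect()
}
```

The WS `Spawn` message only has to deliver the token; the payloads themselves
never touch the socket.
*/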
//! ## Design rationale (for anyone rebuilding this later)
//!
//! - **3 traits co-hosted**: Holding all 3 faces of the Operator role
//!   (judgment = `SeniorBridge` / observation = `SpawnHook` / execution =
//!   `Operator`) in a single session gives the natural shape: 1 WS connection =
//!   1 Operator that answers ask/before/after/spawn. Registering the same sid
//!   into three registries preserves "same Operator" semantics on the Registry
//!   axis as well.
//! - **`Mutex<Option<Sender>>` for tx swap-in**: `None` on disconnect,
//!   `Some(new_tx)` on reconnect. The pending `HashMap` persists on the session
//!   side, so a client that held answer/ack values during a disconnect can
//!   reconnect and resend them. (In v1.5, sends during a disconnect fail
//!   immediately; the client is responsible for remembering its own pending
//!   requests.)
//! - **req_id naming**: `<sid>-<ask|hb|ha|spawn>-<uuid>` encodes the trait axis
//!   and provides uniqueness. Clients can identify the trait from the req_id.
//! - **`parent_req_id` field**: Schema for representing nesting (e.g. a hook
//!   firing inside an ask). In v1.5 the engine-side middleware does not fire
//!   nested calls, so this is always `None`; v2 will re-introduce nesting via
//!   `task_local`.
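/*!
The tx swap-in from the list above can be sketched with the standard library
alone. This is an illustration under assumptions, not the session code:
`Outbound` is a hypothetical stand-in for the sending half of
`WSOperatorSession`, `std::sync::mpsc` replaces whatever channel type the
session really uses, and messages are plain strings.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical stand-in for the session's outbound half.
struct Outbound {
    tx: Mutex<Option<Sender<String>>>,
}

impl Outbound {
    /// A new session starts disconnected.
    fn new() -> Self {
        Outbound { tx: Mutex::new(None) }
    }

    /// (Re)connect: swap in a fresh sender and return the receiving end,
    /// which the socket write task would drain.
    fn connect(&self) -> Receiver<String> {
        let (tx, rx) = channel();
        *self.tx.lock().unwrap() = Some(tx);
        rx
    }

    /// Disconnect: clear the slot so later sends fail fast.
    fn disconnect(&self) {
        *self.tx.lock().unwrap() = None;
    }

    /// Fails immediately while disconnected; nothing is buffered.
    fn send(&self, msg: &str) -> Result<(), String> {
        let guard = self.tx.lock().unwrap();
        match guard.as_ref() {
            Some(tx) => tx.send(msg.to_string()).map_err(|e| e.to_string()),
            None => Err("operator disconnected".to_string()),
        }
    }
}
```

Because only the `Option` inside the mutex changes, the same session object
survives a reconnect.
*/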
//!
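/*!
The req_id scheme can be illustrated with a small build/parse pair. This is a
sketch under assumptions: `ReqKind`, `make_req_id`, and `parse_kind` are
hypothetical names, and reading `hb` / `ha` as hook-before / hook-after is an
inference from the pattern `<sid>-<ask|hb|ha|spawn>-<uuid>`. The parser strips
a known sid prefix rather than splitting on `-`, since the sid and the uuid
may themselves contain hyphens.

```rust
/// Trait axis encoded in a req_id (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReqKind {
    Ask,
    HookBefore,
    HookAfter,
    Spawn,
}

impl ReqKind {
    /// Middle segment of the req_id for this kind.
    fn tag(self) -> &'static str {
        match self {
            ReqKind::Ask => "ask",
            ReqKind::HookBefore => "hb",
            ReqKind::HookAfter => "ha",
            ReqKind::Spawn => "spawn",
        }
    }
}

/// Build `<sid>-<tag>-<unique>`; `unique` would be a UUID in practice.
fn make_req_id(sid: &str, kind: ReqKind, unique: &str) -> String {
    format!("{sid}-{}-{unique}", kind.tag())
}

/// Recover the kind from a req_id, given the sid of the current session.
fn parse_kind(sid: &str, req_id: &str) -> Option<ReqKind> {
    let rest = req_id.strip_prefix(sid)?.strip_prefix('-')?;
    let (tag, unique) = rest.split_once('-')?;
    if unique.is_empty() {
        return None;
    }
    match tag {
        "ask" => Some(ReqKind::Ask),
        "hb" => Some(ReqKind::HookBefore),
        "ha" => Some(ReqKind::HookAfter),
        "spawn" => Some(ReqKind::Spawn),
        _ => None,
    }
}
```

*/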
//! ## Out of scope for v1.5 (carried forward)
//!
//! - Buffering / replay of ask/spawn/hook_before during a disconnect (sends
//!   currently just return `Err` on failure).
//! - Automatic session-TTL cleanup (sessions leaked after a disconnect remain
//!   until they are removed via the admin `DELETE` endpoint).
//! - True nested ask (depends on a middleware extension; the `parent_req_id`
//!   schema is already in place).
//! - Multi-Blueprint scope separation (a single WS Operator currently serves
//!   as the Operator for all tasks).
//! - `CapToken` consistency between the Operator session and the engine attach session.
//!
//! ## REST-like login flow (`login.rs`): sole Operator session entry point
//!
//! `POST/GET/DELETE /v1/operators` + `WS /v1/operators/:sid/ws` (`login.rs`) is
//! the only Operator session route. The login flow mints the sid server-side,
//! requires Bearer auth (no empty-string default), and enforces a
//! roles-exclusivity 409 at mint time. See the `login` module doc for details.

/// REST-like Operator session resource (`POST/GET/DELETE /v1/operators` + WS upgrade).
pub mod login;
/// Wire format (`ServerMsg` / `ClientMsg`) for `WS /v1/operators/:sid/ws`.
pub mod protocol;
/// `WSOperatorSession`: the 3-trait (`SeniorBridge`/`SpawnHook`/`Operator`) WS session object.
pub mod session;

pub use login::{
    operators_create, operators_delete, operators_info, operators_ws_connect, OperatorSessionEntry,
    OperatorsCreateReq, OperatorsCreateResp, OperatorsInfoResp,
};
pub use protocol::{ClientMsg, ServerMsg};
pub use session::WSOperatorSession;