studio-worker 0.4.7

Pull-based image-generation worker for the minis.gg studio.
# studio-worker: architecture overview

`studio-worker` is a single self-contained Rust binary that pulls
**image**, **LLM**, **audio (STT/TTS)**, and **video** generation
jobs from the [minis.gg studio](https://studio.minis.gg), runs them
locally, and posts the results back.  It's deliberately one process:
no helper daemons, no shared secrets, no out-of-band setup.  An
operator clicks Approve in the studio dashboard once per machine,
and the worker takes over from there.

This page is the canonical "how does the whole thing work" reference.
For install / register / day-one instructions see the top-level
[README](../../README.md); for plans-in-flight see
[`plans/`](../../plans).

## Table of contents

1. [Two-binary big picture](#two-binary-big-picture)
2. [Process lifecycle](#process-lifecycle)
3. [Source-tree map](#source-tree-map)
4. [Registration (auto-register-with-approval)](#registration-auto-register-with-approval)
5. [The WebSocket session](#the-websocket-session)
6. [Engine abstraction](#engine-abstraction)
7. [Job lifecycle (one claim end-to-end)](#job-lifecycle-one-claim-end-to-end)
8. [Config + persisted state](#config--persisted-state)
9. [Optional desktop UI](#optional-desktop-ui)
10. [Auto-update](#auto-update)
11. [Observability](#observability)
12. [Service / autostart](#service--autostart)
13. [Failure modes + reconnect policy](#failure-modes--reconnect-policy)
14. [Security model](#security-model)
15. [Studio side (minigames repo)](#studio-side-minigames-repo)

---

## Two-binary big picture

```
  +-----------------+         WebSocket session (long-lived)
  |  studio-worker  | <----+---------------------------------+
  |   (Rust, this   |      |                                 |
  |     repo)       |      | + heartbeats every 5s           |
  |                 |      | + claim/offer/accept frames     |
  |                 |      | + completeJson / fail frames    |
  |                 |      | + log batches (1Hz)             |
  +-----------------+      |                                 |
       ^   ^               v                                 |
       |   |    +-------------------+                        |
       |   |    | studio Worker     |                        |
       |   |    | (Cloudflare,      |                        |
       |   |    | minigames repo)   |                        |
       |   |    +-------------------+                        |
       |   |          ^   ^                                  |
       |   |          |   |                                  |
       |   |          |   +--- D1: studioWorkers /           |
       |   |          |        workerRegistrationRequests /  |
       |   |          |        graphicsJobs / workerLogs     |
       |   |          |                                      |
       |   |          +--- React dashboard at                |
       |   |               studio.minis.gg                   |
       |   |                                                 |
       |   |    Bytes upload (HTTP multipart):               |
       |   +--- POST /workers/:id/jobs/:jobId/complete       |
       |                                                     |
       |    Auto-register + poll (HTTP):                     |
       +--- POST /workers/register-request                   |
            GET  /workers/register-requests/:id              |
                                                             |
            (operator approves in dashboard) ----------------+
```

The worker talks to the studio over **three** different channels:

| Channel | Lifetime | Carries |
|---|---|---|
| `POST /workers/register-request` + `GET /workers/register-requests/:id` | One-shot at install + 30s polling until approved | Operator-gated registration; mints `worker_id` + `auth_token` |
| WebSocket at `GET /workers/:id/connect` | Long-lived, reconnect on disconnect | Heartbeats, claim offers (carrying the [`ModelSource`](../runtime/model-source.md) the worker needs to download + run the model), accept/reject, complete-json, fail, log batches |
| `POST /workers/:id/jobs/:jobId/complete` (multipart) | Per finished job with binary output | Image / audio / video bytes → R2 |

Everything else (heartbeat ack, accept, fail, log shipping, etc.) is
WebSocket frames — the legacy `/heartbeat`, `/claim`,
`/complete-json`, `/fail`, `/logs` HTTP routes are gone.

---

## Process lifecycle

```
main.rs (process entry)
   |
   v
tokio runtime + tracing/sentry init  (telemetry.rs)
   |
   v
cli.rs::Cli::parse   ->   lib.rs::run_cli  ->  match on Command
   |
   v
runtime.rs::run       (or ui::run, or one-shot helpers)
   |
   +--> 1. config::load              (config.rs)
   +--> 2. ensure_registered         (calls auto_register::tick in a loop until Approved)
   +--> 3. run_loops                 (spawns the WS session + auto-updater)
              |
              +--> ws::session::spawn_ws_session  (heartbeats, claim, complete, fail, logs)
              +--> runtime::spawn_auto_updater    (release-feed poll + re-exec)
```

The CLI surface from [`src/cli.rs`](../../src/cli.rs):

| Subcommand | What it does |
|---|---|
| `run` | Start the runtime: ensure registered, then the WS session + auto-updater |
| `ui` (feature `ui`) | Same as `run` but with the egui window + tray + notifications |
| `register` | Persist api-base-url / clear state (`--reset`).  **No HTTP** — the next `run`/`ui` actually auto-registers |
| `status` | Print config path, registration state, threshold, auto-update toggle |
| `set-threshold <gb>` | Update `vram_threshold_gb` |
| `install-service` / `uninstall-service` | Per-OS service file (systemd / launchd / scheduled task) |
| `config` | Dump the resolved config |
| `check-update` | One-shot release-feed poll, doesn't install |

---

## Source-tree map

```
src/
├── main.rs           Thin process entry; sets up tokio + sentry + tracing, dispatches to lib::run_cli.
├── lib.rs            Module re-exports + run_cli dispatch table.
├── cli.rs            clap definitions.  Tested in-module.
├── config.rs         Config struct + load/save (~/.config/minis-studio-worker/config.toml).
├── runtime.rs        run/run_loops/register/status/format_status, the auto-update tick,
│                     the ensure_registered helper, WorkerObservers, JobOutcome.
├── auto_register.rs  State machine (Pristine/Pending/Approved/Rejected) + tick().
│                     install_id + registration_secret generation; SHA-256 hashing.
├── http.rs           Thin reqwest::blocking wrapper.  Three methods left now:
│                     register_request + poll_register_status + complete (multipart).
├── types.rs          Wire types shared with the studio: WorkerCapabilities, Task*,
│                     TaskResult, JobClaim, LogEntry, AutoRegisterRequest, RegisterStatus.
├── sys.rs            hostname/username/VRAM probe.
├── service.rs        Per-OS service file writers (systemd --user / launchd / schtasks XML).
├── autostart.rs      Cross-OS "run in tray on login" toggle (logged; desktop UI calls it).
├── update.rs         GitHub release feed poll + installer script download + re-exec on success.
├── telemetry.rs      Sentry init (opt-in via SENTRY_DSN env var) + tracing-subscriber layer.
├── test_support.rs   #[doc(hidden)] tracing capture helper for integration tests.
│
├── engine/           Pluggable inference backends.
│   ├── mod.rs        Engine trait + dispatch / dispatch_with_source.  Always-on SyntheticEngine.
│   ├── multi.rs      MultiEngine; routes strictly by ModelSource.engine (no fallback).
│   ├── sdcpp.rs      Real image inference via stable-diffusion.cpp subprocess.
│   ├── llama.rs      (feature `llama`) llama-cpp-2 wrapper for LLM tasks.
│   ├── whisper.rs    (feature `whisper`) whisper-rs wrapper for STT.
│   ├── candle_image.rs (feature `image-candle`) candle-transformers SD pipeline.
│   ├── video.rs      (feature `video`) animated-GIF video stand-in (no ffmpeg).
│   └── tts.rs        (feature `tts`) pure-Rust formant-synth TTS stand-in.
│
├── ws/               Replaces the four old polling loops with one WS session.
│   ├── mod.rs        Re-exports.
│   ├── types.rs      WorkerInbound / WorkerOutbound frame enums (mirror TS contract).
│   ├── client.rs     tokio-tungstenite wrapper; connect/send/recv; WsClientError.
│   └── session.rs    spawn_ws_session: connect, hello, heartbeat, offer-handler,
│                     log-flush, reconnect with exponential backoff.
│
└── ui/               (feature `ui`) Native egui desktop window.
    ├── mod.rs        ui::run: load config, spawn auto-register + run_loops on tokio,
    │                 hand main thread to eframe.  Tray install (Linux ksni on tokio).
    ├── app.rs        eframe App impl: tab dispatch, shared state, hide-to-tray, quit.
    ├── tab.rs        Tab enum + STUDIO_WORKER_UI_TAB env override for screenshots.
    ├── tabs/
    │   ├── status.rs Initialising / Pending / Rejected / Registered view models.
    │   ├── jobs.rs   Current card + bounded recent-jobs ring.
    │   ├── config.rs Every Config field as a widget; Save writes through.
    │   ├── logs.rs   Level filter + free-text search + auto-scroll, windowed.
    │   └── about.rs  Version / sentry release / config path / Check for updates.
    ├── tray.rs       3-variant icon (idle/busy/disconnected), menu factory.
    └── notifier.rs   Trait + DesktopNotifier + per-event NotificationPrefs gate.
```

Pluggable engine backends are gated behind cargo features so the
default build stays small and CI fast.  See
[`plans/real-engines.md`](../../plans/real-engines.md) for the
per-feature build matrix.

---

## Registration (auto-register-with-approval)

**No shared secret ever leaves the studio.**  Every worker auto-registers
on first launch and waits for the operator to click Approve in the
studio dashboard.  Implemented across
[`src/auto_register.rs`](../../src/auto_register.rs),
[`src/types.rs`](../../src/types.rs), and
[`src/http.rs`](../../src/http.rs); orchestration in
[`src/runtime.rs::ensure_registered`](../../src/runtime.rs).

### State machine

```
                       +-----------------+
                       |    Pristine     |  ← first launch, between requests, or
                       +-----------------+    after `register --reset`
                                |
                                | tick: POST /workers/register-request
                                | (body: installId, registrationSecretHash,
                                |        capabilities, label?, userAgent)
                                v
                       +-----------------+
                       |    Pending      |  ← config now has request_id +
                       |  { request_id,  |    registration_secret; UI shows
                       |    since }      |    "Waiting for approval"
                       +-----------------+
                                |
                  tick: GET /workers/register-requests/:id
                  bearer = registration_secret
                                |
              +--------+--------+--------+
              |        |        |        |
              v        v        v        v
       (pending)  (approved) (rejected) (404)
              |        |        |        |
              |        v        v        +-> Pristine (stale id, recreate)
              |  Approved   Rejected
              |  + writes   { reason }
              |  worker_id  --> loop exits; UI shows reason
              |  + auth_token  --> user runs `register --reset`
              |  to disk
              v
       (next tick — no HTTP, fast-path returns Approved)
```
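
The polling half of the diagram can be read as a pure transition function. The sketch below is illustrative only: `RegState`, `PollOutcome`, and `after_poll` are hypothetical names, the `since` timestamp is omitted, and the real `auto_register::tick` also performs the HTTP calls and writes the config to disk.

```rust
/// Local registration state, mirroring the four boxes in the diagram.
/// (Hypothetical type; the `since` field of Pending is left out.)
#[derive(Debug, Clone, PartialEq)]
pub enum RegState {
    Pristine,
    Pending { request_id: String },
    Approved { worker_id: String },
    Rejected { reason: String },
}

/// What one poll of GET /workers/register-requests/:id reported.
/// (Hypothetical shape, not the actual wire type.)
#[derive(Debug, Clone, PartialEq)]
pub enum PollOutcome {
    StillPending,
    Approved { worker_id: String },
    Rejected { reason: String },
    /// HTTP 404: the studio no longer knows this request id.
    NotFound,
}

/// Apply one poll result.  Only a Pending worker polls, so every
/// other state is returned unchanged.
pub fn after_poll(state: RegState, outcome: PollOutcome) -> RegState {
    match (state, outcome) {
        (RegState::Pending { request_id }, PollOutcome::StillPending) => {
            RegState::Pending { request_id }
        }
        (RegState::Pending { .. }, PollOutcome::Approved { worker_id }) => {
            RegState::Approved { worker_id }
        }
        (RegState::Pending { .. }, PollOutcome::Rejected { reason }) => {
            RegState::Rejected { reason }
        }
        // Stale id: fall back to Pristine so the next tick re-posts.
        (RegState::Pending { .. }, PollOutcome::NotFound) => RegState::Pristine,
        (other, _) => other,
    }
}
```

Keeping the transition free of I/O makes each arrow in the diagram checkable in a unit test without a studio to talk to.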

### Per-install identity

- `install_id` — UUIDv4 generated on first launch, persisted in
  `config.toml`.  Stable across worker restarts so the studio can dedup
  re-submissions (operator hasn't decided yet → re-post returns the
  existing `requestId`).
- `registration_secret` — 256 bits of randomness from `/dev/urandom`
  on unix.  Hex-encoded.  Stored locally.  The initial POST carries
  **only the SHA-256 hash**; the raw secret is sent later, and only as
  the Bearer token when polling that request.
- `registration_request_id` (`rr-<uuid>`), returned by the studio.
  Both this and the secret are cleared on Approved / Rejected.
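
As a rough illustration of the secret described above, the following std-only sketch reads 32 bytes from `/dev/urandom` and hex-encodes them. The function names are hypothetical, the non-unix path is not shown, and the SHA-256 step is left out because it needs a hashing crate outside the standard library.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Lowercase hex encoding, two digits per byte.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// 256 bits of OS randomness, hex-encoded (64 characters).
/// Hypothetical helper; unix only, as in the description above.
#[cfg(unix)]
pub fn new_registration_secret() -> io::Result<String> {
    let mut buf = [0u8; 32];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(to_hex(&buf))
}
```

The hash of this string is what the initial `register-request` would carry; the string itself stays in `config.toml` until the request is decided.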

### Capabilities snapshot

Each `register-request` carries a full
[`WorkerCapabilities`](../../src/types.rs):

- `machineName`, `username` (host identity from `whoami`)
- `agentVersion` (from `Cargo.toml`)
- `engine` (`multi` — the dispatcher wrapping every compiled-in backend)
- `vramTotalGb` (probed from `/proc/driver/nvidia/gpus` on Linux; 0 elsewhere)
- `vramThresholdGb` (operator-set max GB per claim)
- `autoEnabled`, `autoStart` (operator toggles)
- `supportedModels` (flat list across all task kinds)
- `taskKinds` (image / llm / audio_stt / audio_tts / video)
- `supportedModelsPerKind` (per-kind breakdown)

The operator sees all of this in the dashboard's Pending Workers row
before deciding.

### Operator override

There is no operator override.  Even the studio owner registers via
the same Pending → Approve flow.  This is intentional:

- Removes the chicken-and-egg of "how does Webber bootstrap his own
  worker without distributing a token to himself".
- Single source of truth for `studioWorkers` rows; no
  bootstrap-token-minted-out-of-band hidden path.
- Auditable: `workerRegistrationRequests.decided_by` records the
  approving studio user.

---

## The WebSocket session

After auto-register succeeds, [`ws::session::spawn_ws_session`](../../src/ws/session.rs)
opens a single long-lived WebSocket at `GET /workers/:id/connect` and
the heartbeat / claim / complete / fail / log pipelines all flow over
it as JSON frames.

Wire format mirrors `apps/studio/src/shared/types/workerWs.ts`.
Defined in [`src/ws/types.rs`](../../src/ws/types.rs) as two enums:

| Direction | Frame | Carries |
|---|---|---|
| → server | `Hello` | `authToken` + capabilities (sent immediately after upgrade) |
| → server | `Heartbeat` | capabilities + current_job_id (every 5s) |
| → server | `Accept` | `jobId` (responding to an Offer) |
| → server | `Reject` | `jobId` + `reason` (engine can't serve this model/kind) |
| → server | `CompleteJson` | `jobId` + `result` JSON (LLM, STT) |
| → server | `Fail` | `jobId` + `error` + `retryable` |
| → server | `LogBatch` | drained log entries (every 1s) |
| → server | `ReadyForMore` | hint that backpressure has cleared (the worker no longer sends it; see [Job lifecycle](#job-lifecycle-one-claim-end-to-end)) |
| server → | `Welcome` | `workerId` + server time (post-Hello ack) |
| server → | `Offer` | `JobOfferClaim` (worker chooses Accept or Reject) |
| server → | `HeartbeatAck` | (per heartbeat) |
| server → | `CompleteAck` | `jobId` (post-CompleteJson) |
| server → | `FailAck` | `jobId` (post-Fail) |
| server → | `Error` | `code` + `message` (auth, protocol, duplicate, deleted) |

The `complete` route for image / audio / video bytes is a separate
HTTP multipart upload — R2 doesn't fit cleanly into WS frames.
Everything else stays on the session.

### Session loop

[`spawn_ws_session`](../../src/ws/session.rs) wraps
`run_one_session` in a reconnect loop:

```
attempt = 0
loop:
   if stop: return Stopped
   match run_one_session():
     Stopped       → return
     AuthFailed    → return (do not reconnect; user must --reset)
     Fatal(msg)    → return (e.g. duplicate worker, missing creds)
     Disconnected  → back off BASE_BACKOFF_MS * 2^attempt, capped at
                     MAX_BACKOFF_MS (30s).  attempt += 1.
                     Out of attempts (default 5) → return Err so the
                     service manager restarts the binary.
```

Constants live at the top of `ws/session.rs`:

| Constant | Value |
|---|---|
| `HEARTBEAT_INTERVAL` | 5s |
| `LOG_FLUSH_INTERVAL` | 1s |
| `SHUTDOWN_TICK` | 250ms |
| `BASE_BACKOFF_MS` | 1 000 |
| `MAX_BACKOFF_MS` | 30 000 |
| `DEFAULT_RECONNECT_ATTEMPTS` | 5 |

`cfg.ws_reconnect_attempts` overrides the default.
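
The backoff rule in the loop above works out to a capped doubling. A minimal sketch, assuming a 0-based `attempt` counter and reusing the two constants from the table (`backoff_delay` itself is a hypothetical name):

```rust
use std::time::Duration;

const BASE_BACKOFF_MS: u64 = 1_000;
const MAX_BACKOFF_MS: u64 = 30_000;

/// Delay before the next reconnect: BASE * 2^attempt, capped at MAX.
pub fn backoff_delay(attempt: u32) -> Duration {
    // checked_shl guards against shifts of 64 or more; saturating_mul
    // guards against the product overflowing before the cap applies.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let ms = BASE_BACKOFF_MS.saturating_mul(factor).min(MAX_BACKOFF_MS);
    Duration::from_millis(ms)
}
```

Under these assumptions the first five delays are 1s, 2s, 4s, 8s and 16s, so the 30s cap only comes into play when `ws_reconnect_attempts` is raised above the default.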

---

## Engine abstraction

[`src/engine/mod.rs`](../../src/engine/mod.rs) defines:

```rust
pub trait Engine: Send + Sync {
    fn name(&self) -> &'static str;
    fn capabilities(&self) -> EngineCapabilities;
    fn dispatch(&self, model: &str, task: Task) -> Result<TaskResult>;

    // Dispatch with the offer's ModelSource attached.  Engines that
    // need the download spec / CLI defaults (sdcpp) override it;
    // engines that don't (synthetic) inherit this default.
    fn dispatch_with_source(
        &self,
        model: &str,
        task: Task,
        _source: &ModelSource,
    ) -> Result<TaskResult> {
        self.dispatch(model, task)
    }
}
```

`TaskResult` is tagged by kind:

- `Image { bytes, ext }` (webp / png)
- `Llm { json }` (OpenAI-shape `chat.completion`)
- `AudioStt { json }` (whisper-shape segments)
- `AudioTts { bytes, ext }` (wav)
- `Video { bytes, ext }` (animated webp from synthetic, gif from the `video` feature)

Engines are no longer config-selectable.  `engine::build()` always
returns a `MultiEngine` populated with every backend compiled into
this binary; per-offer routing happens inside the MultiEngine and is
driven by the offer's `ModelSource.engine` field (see [Job
lifecycle](#job-lifecycle-one-claim-end-to-end)).

Built-in:

- **`synthetic`** — deterministic real bytes for every kind,
  keyed by SHA-256 of the prompt.  Real WEBP, real WAV, real animated
  WEBP, real OpenAI-shaped JSON.  No GPU, no model downloads, ~0ms
  per task.  Powers CI + smoke-tests.  Advertises only `synthetic*`
  model names so it never claims a real-model job (it would happily
  upload placeholder bytes for a real manifest, which is destructive
  on a live queue).
- **`sdcpp`** — real image inference via `stable-diffusion.cpp` as a
  subprocess.  Reads the `ModelSource` off every offer, downloads
  any missing files into `cfg.models_root`, invokes `sd-cli` with
  the right `--diffusion-model` / `--llm` / `--vae` flags + CLI
  defaults from the source.  Image kind only today.  Deep dive in
  [`docs/engines/sdcpp.md`](../engines/sdcpp.md).

The legacy `gradio` engine is gone (operators run a Gradio app via
an external service if they need it).  Feature-gated heavyweights
(`llama`, `whisper`, `image-candle`, `video`, `tts`) still drop in
via the same trait when their cargo features are enabled — see
[`plans/real-engines.md`](../../plans/real-engines.md).
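
To make "routes strictly by `ModelSource.engine`, no fallback" concrete, here is a cut-down sketch. It is not the real `MultiEngine`: the trait is reduced to two methods, results are plain strings, and the error message is invented for illustration.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the real trait: just enough to show routing.
pub trait Engine {
    fn name(&self) -> &'static str;
    fn dispatch(&self, model: &str) -> Result<String, String>;
}

pub struct Synthetic;

impl Engine for Synthetic {
    fn name(&self) -> &'static str {
        "synthetic"
    }
    fn dispatch(&self, model: &str) -> Result<String, String> {
        Ok(format!("synthetic output for {model}"))
    }
}

/// Holds every compiled-in backend, keyed by its engine name.
pub struct MultiEngine {
    backends: HashMap<&'static str, Box<dyn Engine>>,
}

impl MultiEngine {
    pub fn new(backends: Vec<Box<dyn Engine>>) -> Self {
        let backends = backends.into_iter().map(|b| (b.name(), b)).collect();
        Self { backends }
    }

    /// Route by the engine named in the offer's source.  An unknown
    /// engine is an error; nothing falls back to another backend.
    pub fn dispatch(&self, source_engine: &str, model: &str) -> Result<String, String> {
        match self.backends.get(source_engine) {
            Some(backend) => backend.dispatch(model),
            None => Err(format!("engine '{source_engine}' is not available in this build")),
        }
    }
}
```

Refusing to fall back matters for the reason given for `synthetic` above: a job meant for a real engine should fail loudly rather than be served by the synthetic backend.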

---

## Job lifecycle (one claim end-to-end)

```
1. Studio queues a graphicsJobs row (status=queued, model=X, vram=Y)

2. Server picks a worker whose:
     - capabilities.taskKinds contains the job's kind
     - vramThresholdGb >= Y
     - last heartbeat fresh (< 30s)
   Selection does not match on model names: the studio attaches the
   download spec and the worker simply runs what it is handed.  Server
   pushes an Offer frame down the WS session with the model +
   ModelSource included.

3. Worker receives Offer:
     - Sends Accept frame; sets busy flag; populates
       `observers.current_job` for the Jobs tab.
     - Hands the task to `engine.dispatch_with_source(model, task,
       source)` on a blocking thread.
     - The MultiEngine routes by `source.engine`; the sdcpp engine
       ensures every file in `source.files` is cached under
       `cfg.models_root` (downloading any missing ones), then runs
       `sd-cli` with the CLI defaults.
     - If the engine bails: sends `Fail { error, retryable }`.

4. Engine produces a TaskResult:
     - Image / AudioTts / Video → HTTP POST multipart to
       `/workers/:id/jobs/:jobId/complete` (R2 upload), then
       success log entry.
     - Llm / AudioStt → WS frame CompleteJson with the JSON payload.

5. Server marks job done, sends CompleteAck, populates
   graphicsJobs.completedAt + R2 key.

6. Worker:
     - Clears busy flag.
     - Pushes CurrentJob → RecentJob in the observers ring (UI Status
       + Jobs tabs surface this).

Server-driven offer pipeline: the next Offer comes from the studio's
`notifyJobCompleted` (deferred via the multipart route's `waitUntil`),
not from the worker.  The worker no longer sends `ReadyForMore` —
the dual trigger raced the studio's `commitOffer` and produced
`protocol_violation: accept for unknown jobId` errors that killed
sessions.

If engine returns Err:
     - Worker sends Fail { error, retryable }.
     - Server requeues (retryable) or marks failed (terminal).
```
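
Step 4 splits results across two channels by kind. The sketch below shows that split as a single `match`. The `Delivery` enum and `delivery_for` are hypothetical, and the JSON payloads are simplified to `String`.

```rust
/// Simplified copy of the result kinds listed under "Engine abstraction".
pub enum TaskResult {
    Image { bytes: Vec<u8>, ext: String },
    Llm { json: String },
    AudioStt { json: String },
    AudioTts { bytes: Vec<u8>, ext: String },
    Video { bytes: Vec<u8>, ext: String },
}

/// Which channel carries the finished result back to the studio.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    /// POST /workers/:id/jobs/:jobId/complete (multipart, bytes to R2).
    MultipartUpload,
    /// CompleteJson frame on the WebSocket session.
    WsCompleteJson,
}

pub fn delivery_for(result: &TaskResult) -> Delivery {
    match result {
        // Binary outputs go over HTTP.
        TaskResult::Image { .. }
        | TaskResult::AudioTts { .. }
        | TaskResult::Video { .. } => Delivery::MultipartUpload,
        // JSON outputs stay on the session.
        TaskResult::Llm { .. } | TaskResult::AudioStt { .. } => Delivery::WsCompleteJson,
    }
}
```

Because the `match` is exhaustive, adding a new result kind forces a decision about its channel at compile time.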

Rules worth pinning explicitly:

- **Selection is kind-based, not model-name-based.**  The studio's
  `pickWorkerForJob` and `findQueuedJobForWorker` filter on the
  worker's `taskKinds`.  Model-name whitelisting on the worker is
  gone (a brief `'*'` wildcard sentinel shipped + got reverted in
  the same session as the model registry; the registry approach is
  cleaner because the studio already knows everything about the
  model).
- **Only one Offer in flight per worker.**  Server-driven offer
  cadence as above; no worker-side `ReadyForMore`.
- **Hello waits for Welcome before starting heartbeat / log-shipper.**
  `tokio::interval()` ticks at t=0, so the first heartbeat used to
  race the studio's async Hello-auth flow and trip
  `protocol_violation: session not authenticated`.  The session
  loop now blocks on the Welcome reply before spawning the
  background pumps.
- **Worker waits for credentials before opening a session.**  The
  UI's parallel auto-register + WS-session flow used to race; the
  WS session now polls the shared config every second until
  `worker_id` + `auth_token` are populated, rather than
  fatal-bailing on first attempt.

The runtime tracks all three observable slots in
[`runtime::WorkerObservers`](../../src/runtime.rs):

- `current_job: Option<CurrentJob>` — set during dispatch
- `recent_jobs: VecDeque<RecentJob>` (cap 50, newest-first)
- `last_heartbeat: Option<HeartbeatStatus>` — written after every
  WS heartbeat ack / failure

These are `Arc<Mutex<…>>` and read directly by the UI for live state.
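
A bounded, newest-first ring like `recent_jobs` can be kept with a `VecDeque` and two calls. This is a sketch under assumptions: the `RecentJob` fields and the `push_recent` helper are hypothetical, and only the cap of 50 and the ordering come from the list above.

```rust
use std::collections::VecDeque;

const RECENT_JOBS_CAP: usize = 50;

/// Hypothetical, trimmed-down record of a finished job.
#[derive(Debug, Clone, PartialEq)]
pub struct RecentJob {
    pub job_id: String,
    pub ok: bool,
}

/// Insert at the front (newest first) and drop the oldest entries
/// from the back once the cap is exceeded.
pub fn push_recent(ring: &mut VecDeque<RecentJob>, job: RecentJob) {
    ring.push_front(job);
    ring.truncate(RECENT_JOBS_CAP);
}
```

In the worker the ring sits behind a mutex, so the UI would lock, clone what it needs, and release rather than hold the lock while drawing.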

---

## Config + persisted state

[`src/config.rs`](../../src/config.rs) defines the persisted
`Config` struct.

**File location** (via the `directories` crate):

- Linux / macOS: `~/.config/minis-studio-worker/config.toml`
- Windows: `%APPDATA%\minis-studio-worker\config.toml`

**Operator-facing fields** (exposed in the UI's Config tab):

| Field | Default | Purpose |
|---|---|---|
| `api_base_url` | `https://studio.minis.gg/` | Studio API root |
| `vram_threshold_gb` | `12.0` | Max VRAM per claim |
| `auto_start` | `true` | OS service auto-start at boot |
| `auto_update_enabled` | `true` | Check the GitHub release feed |
| `auto_update_interval_secs` | `1800` | How often (default 30 min) |
| `auto_update_feed` | release URL | GitHub feed to poll |
| `auto_update_prerelease` | `false` | Track pre-releases |
| `models_root` | `~/models` (resolved at load) | Where downloaded model files live |

**Internal state** (persisted but not exposed in the UI; the
auto-register flow owns it end-to-end):

| Field | Purpose |
|---|---|
| `worker_id` | Filled on operator approval; presented in the WS URL path |
| `auth_token` | Filled on operator approval; presented in WS Hello + the multipart `complete` Bearer |
| `ws_reconnect_attempts` | WS session reconnect budget (defaults to `5` when unset) |
| `install_id` | Per-install UUID generated on first launch |
| `registration_request_id` | Set during Pending, cleared on Approved/Rejected |
| `registration_secret` | Same |

**Runtime-only** (not in the file at all):

| Flag | Where it lives | Purpose |
|---|---|---|
| `paused: Arc<AtomicBool>` | Top-level state passed into `runtime::run_loops` | Operator pause toggle.  When true, heartbeats advertise `autoEnabled = false` and incoming offers are rejected.  Restarts come up unpaused.  See [`docs/runtime/pause-resume.md`](../runtime/pause-resume.md). |
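
The two effects of the pause flag can be sketched as two small reads of the same atomic. The names `auto_enabled`, `decide_offer`, and `OfferDecision`, and the reject reason text, are assumptions for illustration rather than the worker's actual API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical result of looking at an incoming Offer.
#[derive(Debug, PartialEq)]
pub enum OfferDecision {
    Accept,
    Reject { reason: &'static str },
}

/// Value advertised as `autoEnabled` in each heartbeat.
pub fn auto_enabled(paused: &AtomicBool) -> bool {
    !paused.load(Ordering::Relaxed)
}

/// A paused worker rejects every offer; otherwise it accepts.
/// (Engine-level rejections are out of scope for this sketch.)
pub fn decide_offer(paused: &AtomicBool) -> OfferDecision {
    if paused.load(Ordering::Relaxed) {
        OfferDecision::Reject { reason: "worker paused" }
    } else {
        OfferDecision::Accept
    }
}
```

Since the flag lives only in memory and starts out `false`, a restart comes up unpaused, as the table notes.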

The legacy fields `engine`, `engines`, `gradio_endpoint_url`,
`supported_models_override`, `auto_enabled` and `label` are gone:
engine selection is automatic ([Engine abstraction](#engine-abstraction)),
the runtime pause flag replaces `auto_enabled`, and the studio's
Pending Workers panel no longer surfaces a label.

Every load + save emits a structured `tracing` event on the
`studio_worker::config` target with the resolved path — makes
"why is the worker reading the wrong config" trivially debuggable from
`journalctl`.  `auth_token` and `registration_secret` are
**deliberately omitted** from these events so logs ship off-box
without leaking credentials.

A regression test in
[`tests/config_tracing.rs`](../../tests/config_tracing.rs) pins this contract.

---

## Optional desktop UI

Built behind the `ui` cargo feature; brings in `egui` + `eframe` +
`notify-rust`, plus the platform tray backend: `tray-icon` on
macOS / Windows, `ksni` (pure-Rust StatusNotifierItem) on Linux, so the
build needs no GTK.  Off by default so the headless server install
stays lean.

### Tab structure

| Tab | What it shows |
|---|---|
| **Status** | Worker id, API URL, VRAM total / threshold, IDLE / BUSY / PAUSED badge, last heartbeat freshness, **Pause / Resume button** (flips the runtime `paused` flag).  When unregistered: Initialising / Pending (with request id + copy button) / Rejected (with reason + `--reset` hint) state. |
| **Jobs** | Current job card (kind, model, prompt preview, elapsed) + last 50 finished jobs with outcome / duration. |
| **Config** | The operator-facing subset of `Config` as widgets, grouped into Connection (API base URL) / Worker (VRAM threshold + Auto-start) / Auto-update / Models (folder picker for `models_root`) / Notifications / Background mode.  Save writes through; Reset reverts.  Internal state (`worker_id`, `auth_token`, `install_id`, registration ids) is deliberately not shown — the auto-register flow owns it. |
| **Logs** | Level filter (all/info/warn/error), free-text search across category/message/job id, auto-scroll toggle.  Reads from `WorkerObservers.recent_logs` (bounded 1000-entry ring) so it doesn't blank out when the WS log-shipper drains every second. |
| **About** | Version, Sentry release name, config path, manual "Check for updates" button. |

Screenshots in [`docs/screenshots/`](../screenshots/).

### Tray icon

Three coloured variants derived from `(busy, last_heartbeat)`:

- **Idle** — green; not busy + heartbeat fresh + ok
- **Busy** — amber; busy flag set
- **Disconnected** — red; heartbeat stale (> 3 × interval), missing,
  or returned an error
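
The three variants can be derived with one small function. In this sketch the heartbeat is reduced to an `(age, ok)` pair, the function name is hypothetical, and the precedence when a job is running on a stale connection (Busy wins here) is an assumption the list above does not settle.

```rust
use std::time::Duration;

const HEARTBEAT_INTERVAL: Duration = Duration::from_secs(5);

#[derive(Debug, PartialEq)]
pub enum TrayVariant {
    Idle,
    Busy,
    Disconnected,
}

/// `last_heartbeat` is None when no heartbeat has been recorded yet,
/// otherwise (time since the last one, whether it succeeded).
pub fn tray_variant(busy: bool, last_heartbeat: Option<(Duration, bool)>) -> TrayVariant {
    // Assumption: the busy flag takes precedence over heartbeat state.
    if busy {
        return TrayVariant::Busy;
    }
    // Fresh means at most 3 x the heartbeat interval old.
    let fresh_and_ok = matches!(
        last_heartbeat,
        Some((age, ok)) if ok && age <= HEARTBEAT_INTERVAL * 3
    );
    if fresh_and_ok {
        TrayVariant::Idle
    } else {
        TrayVariant::Disconnected
    }
}
```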

Menu: **Open Window** / **Pause / Resume** / **Quit**.  The label
flips between Pause and Resume based on the runtime `paused` flag.

Closing the window hides to the tray; loops keep running.  Quit comes
from the tray menu (signals `stop`, awaits in-flight job up to ~5s,
then exits).

**Per-OS backends** ([`src/ui/tray_host.rs`](../../src/ui/tray_host.rs)):
Linux uses **ksni** (pure-Rust StatusNotifierItem over zbus) so the
build needs no GTK; the tray runs on the tokio runtime and the menu
`activate` callbacks drive the shared `paused` / `quit` flags + an
egui repaint.  macOS / Windows use **tray-icon** (native APIs), built
on the eframe main thread, with menu events arriving through muda's
global `MenuEvent::receiver()` channel.  Either backend is
best-effort — the window UI works without a tray.

### Notifications

OS-native desktop notifications via `notify-rust`, gated behind a
`Notifier` trait so tests inject a `CapturingNotifier` and assert
what would have been shown.  Both completion and failure
notifications are off by default, opt-in per-event from the Config
tab.

---

## Auto-update

Implemented in [`src/update.rs`](../../src/update.rs) plus the
`spawn_auto_updater` loop in `runtime.rs`.

Every `auto_update_interval_secs` (default 30 min):

1. Confirm no job is in flight (the shared `busy: AtomicBool` from
   the WS session).
2. GET the configured `auto_update_feed` (GitHub Releases API by
   default).
3. Compare highest published semver to `AGENT_VERSION`.
4. If newer:
   - Download the per-platform cargo-dist installer script.
   - On Windows only: **park** the running exe first (rename to
     `<exe>.old` — NTFS allows renaming a running binary but not
     overwriting it, so without this the installer's `Copy-Item`
     fails with "file in use" every time).  After the installer
     runs, confirm a new binary landed at the original path; roll
     the rename back otherwise.  The parked file is removed on the
     next start (`update::cleanup_parked_artifact`).
   - Run the installer (overwrites the binary in place).
   - On unix: `execvp` the new binary, replacing this process.
   - On Windows: spawn the successor + exit, since `execvp` isn't
     a clean fit.
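
Step 3 above amounts to a numeric, component-wise version comparison. The sketch below is a simplification with hypothetical helper names: it only understands plain `major.minor.patch` tags (with an optional leading `v`) and skips anything else, including pre-release tags, which a full semver parser would handle properly.

```rust
/// Parse "1.2.3" or "v1.2.3" into a comparable tuple; None if malformed.
pub fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let core = s.trim().trim_start_matches('v');
    let mut parts = core.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // extra components, e.g. "0.5.0-rc.1"
    }
    Some((major, minor, patch))
}

/// Highest published tag that is strictly newer than `current`, if any.
pub fn newest_update<'a>(current: &str, published: &[&'a str]) -> Option<&'a str> {
    let current = parse_version(current)?;
    published
        .iter()
        .copied()
        .filter_map(|tag| parse_version(tag).map(|v| (v, tag)))
        .filter(|(v, _)| *v > current)
        .max_by_key(|(v, _)| *v)
        .map(|(_, tag)| tag)
}
```

Comparing tuples of integers rather than strings is what makes `0.10.0` sort above `0.9.9`.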

The flow short-circuits when `auto_update_enabled = false` or when
the worker is mid-job.  Between checks the idle wait is stop-aware: it
re-polls the shared `stop` flag every `AUTO_UPDATE_SHUTDOWN_TICK`
(default 250 ms) via `wait_with_stop`, so a SIGTERM / SIGINT during the
idle window stops the worker promptly instead of blocking
`run_loops`' join for a whole `auto_update_tick`.  The real
`RealRunner::{download, run_installer}` and `restart_self` paths are
excluded from the 90% coverage gate (`.cargo-llvm-cov.toml`); the
update flow around them is tested through a fake implementation of the
`UpdateRunner` trait.
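
The stop-aware idle wait described above can be sketched as a blocking loop. The name `wait_with_stop` comes from this document, but the signature and the synchronous (non-async) form below are assumptions made for the sake of a self-contained example.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Wait for up to `total`, waking every `tick` to re-check `stop`.
/// Returns `true` if the stop flag was seen, `false` if the wait ran out.
fn wait_with_stop(stop: &AtomicBool, total: Duration, tick: Duration) -> bool {
    let deadline = Instant::now() + total;
    loop {
        if stop.load(Ordering::Relaxed) {
            return true;
        }
        let now = Instant::now();
        if now >= deadline {
            return false;
        }
        // Never sleep past the deadline, and never longer than one tick.
        thread::sleep(tick.min(deadline - now));
    }
}
```

With a 250 ms tick, a stop request during a 30-minute idle window is noticed within roughly a quarter of a second rather than at the end of the interval.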

---

## Observability

- **Local logs**: every `tracing` event is rendered through
  `tracing-subscriber::fmt` to stderr.  Filter via
  `RUST_LOG=studio_worker=debug` (or any of the per-target filters
  documented per module: `studio_worker::http`,
  `studio_worker::config`, `studio_worker::runtime`,
  `studio_worker::ws::session`, `studio_worker::ws::client`, etc.).
  The `studio_worker::ws::client` target carries transport-boundary
  breadcrumbs (connect / recv / send / close) so a dropped frame or a
  dead studio is never silent, even though the session discards recv
  errors and fires `let _ = sender.send(...)`.
- **Studio-side logs**: every tick of the worker pushes its log
  buffer over the WS LogBatch frame.  The studio drops them into the
  `workerLogs` D1 table; the dashboard's LogViewer renders them.
- **In-UI logs tab**: same buffer, virtualised view, level filter +
  search.
- **Sentry (opt-in)**: set `SENTRY_DSN` (and optionally
  `SENTRY_ENVIRONMENT`) before launch.  Captures panics, forwards
  `tracing::error!` events, attaches preceding `warn!` events as
  breadcrumbs.  Tags with `release = studio-worker@<version>` and
  `server_name = <hostname>`.  Performance tracing intentionally off.

---

## Service / autostart

Two distinct mechanisms:

### `studio-worker install-service` (headless background)

[`src/service.rs`](../../src/service.rs).  Writes a per-OS unit
file:

- Linux: `systemd --user` unit at
  `~/.config/systemd/user/minis-studio-worker.service`
- macOS: LaunchAgent plist at
  `~/Library/LaunchAgents/gg.minis.studio-worker.plist`
- Windows: `schtasks /Create` XML template (`%APPDATA%\minis-studio-worker\minis-studio-worker.task.xml`)
  — written but **not registered**, since CreateTrigger needs
  the operator to confirm.

`uninstall-service` removes them.  Tested in
[`tests/runtime_helpers.rs`](../../tests/runtime_helpers.rs)
under an `XDG_CONFIG_HOME` override.

### "Run in tray on login" (UI mode)

[`src/autostart.rs`](../../src/autostart.rs) (always compiled, like
`service.rs`; the desktop UI's Config tab is the only caller).  Toggle
in the Config tab's "Background mode" group.  Each enable/disable emits
a structured `tracing` event on target `studio_worker::autostart`.
Writes:

- Linux: `~/.config/autostart/studio-worker-ui.desktop`
- macOS: `~/Library/LaunchAgents/gg.minis.studio-worker-ui.plist`
- Windows: an `HKCU\Software\Microsoft\Windows\CurrentVersion\Run`
  registry value `studio-worker-ui` = `"<exe>" ui` (via `winreg`).
  The standard per-user autostart mechanism: no console flash, no admin
  rights, no COM.

The two mechanisms coexist; they install different artefacts.  Use
the service for headless rigs, the autostart toggle for desktop
contributors.
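
To make the "different artefacts" point concrete, here is a small sketch that maps each mechanism to its file on Linux and macOS. The paths are the ones listed above; the `Os` enum and both function names are hypothetical, and the Linux paths assume `XDG_CONFIG_HOME` is unset so that `~/.config` applies. Windows is left out because its autostart artefact is a registry value rather than a file.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical platform selector for this sketch.
#[derive(Clone, Copy)]
enum Os {
    Linux,
    MacOs,
}

/// File written by `studio-worker install-service` (headless background).
fn service_artifact(os: Os, home: &Path) -> PathBuf {
    match os {
        Os::Linux => home.join(".config/systemd/user/minis-studio-worker.service"),
        Os::MacOs => home.join("Library/LaunchAgents/gg.minis.studio-worker.plist"),
    }
}

/// File written by the "Run in tray on login" toggle (UI mode).
fn autostart_artifact(os: Os, home: &Path) -> PathBuf {
    match os {
        Os::Linux => home.join(".config/autostart/studio-worker-ui.desktop"),
        Os::MacOs => home.join("Library/LaunchAgents/gg.minis.studio-worker-ui.plist"),
    }
}
```

Because the two functions never return the same path for a given platform, enabling one mechanism cannot overwrite the other's file.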

---

## Failure modes + reconnect policy

| Failure | Detection | Behaviour |
|---|---|---|
| `register-request` HTTP 5xx | `auto_register::tick` | Stay Pristine, log warn, retry on next tick |
| `register-request` rate-limited (429) | studio binding | Same as 5xx; the 30s poll cadence already respects backoff implicitly |
| `register-requests/:id` 404 | poll response | Drop stale `request_id` + secret from config, recreate on next tick |
| `register-requests/:id` 401 | poll response | Same as 404; the worker's secret doesn't match the row — only happens if config was tampered |
| WS connect refused / TLS error | `WsClientError::Transport` | Back off + reconnect, up to `ws_reconnect_attempts` |
| WS close code `4001 AuthFailed` | session loop | Stop reconnecting; user must `register --reset` |
| WS close code `4003 DuplicateWorker` | session loop | Stop reconnecting (another instance is connected with the same id) |
| WS close code `4004 WorkerDeleted` | session loop | Stop; the studio operator deleted us |
| WS protocol violation | session loop | Server sends `Error { code: ProtocolViolation }` then closes |
| Engine `dispatch` returns `UnsupportedKind` | runtime job-runner | `Fail { retryable: false }` — server moves the job to terminal failed |
| Engine `dispatch` returns generic `Err` | runtime job-runner | `Fail { retryable: true }` — server requeues |
| `complete` multipart 5xx | runtime job-runner | `Fail` so the server can retry |
| Auto-update download / install failure | `update::apply` | Log + leave worker running on the old version; try again next interval |
| Auto-update `execvp` failure (unix) | `update::restart_self` | Should never happen; if it does, exit 0 and let systemd restart |
| Offer without `ModelSource` to sdcpp engine | engine `dispatch_with_source` | `Fail { retryable: false }` with "requires a ModelSource on the offer" |
| Model file download fails | sdcpp `ensure_files` | `Fail { retryable: true }`; the next claim of the same job retries the download |
| `sd-cli` non-zero exit | sdcpp `dispatch_image` | `Fail { retryable: true }` with the last stderr line included so operators can spot OOM / driver issues quickly |
| `sd-cli` binary missing | sdcpp `ensure_sd_cli` (first image job) | The engine always registers and advertises `image`; on the first image job it resolves `sd-cli` or auto-provisions the prebuilt into `cfg.models_root/bin`.  If no prebuilt exists for the target or the download fails, the job `Fail`s with the install remedy |
| Vulkan loader (`libvulkan.so.1` / `vulkan-1.dll`) missing | sdcpp dispatch preflight | `Fail { retryable: true }` with the exact remedy (install `libvulkan1` + a GPU driver) instead of a cryptic `sd-cli` crash.  macOS uses Metal, so no Vulkan loader is involved |
| rustls 0.23+ CryptoProvider missing | first WSS handshake | Process panics on `crypto/mod.rs:249`.  Fix is `rustls::crypto::ring::default_provider().install_default()` once at startup; see [`src/main.rs`](../../src/main.rs) |
| `worker_id` / `auth_token` missing at WS connect | `has_credentials` check | Session loop waits (polling cfg every 1s) instead of fatal-bailing.  Lets the UI's parallel auto-register + WS flow work. |
| Hello-without-Welcome race | `wait_for_welcome` gate | Block heartbeat + log-shipper spawn until the studio's Welcome reply arrives, so `tokio::interval()`'s t=0 first tick doesn't ship a heartbeat into an unauthenticated session |

All worker-side failures emit a structured `tracing::warn!` or
`error!` event before they're handled, so logs ship and Sentry
captures them.
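
The WS close-code rows of the table can be read as a small decision function. The sketch below is an illustration, not the session loop itself: `AfterClose` and `after_close` are hypothetical names, and treating every other close code like a transport error (reconnect until `ws_reconnect_attempts` is exhausted) is an assumption.

```rust
/// What the session loop does once a connection has ended.
#[derive(Debug, PartialEq, Eq)]
enum AfterClose {
    /// Give up; the reason is meant for the log line.
    Stop(&'static str),
    /// Back off, then try again.
    Reconnect,
}

/// Map a WS close code plus the attempt counter to a decision.
fn after_close(code: u16, attempts_so_far: u32, max_attempts: u32) -> AfterClose {
    match code {
        // Terminal codes from the table: reconnecting cannot help.
        4001 => AfterClose::Stop("AuthFailed"),
        4003 => AfterClose::Stop("DuplicateWorker"),
        4004 => AfterClose::Stop("WorkerDeleted"),
        // Assumption: anything else is handled like a transport error.
        _ if attempts_so_far >= max_attempts => AfterClose::Stop("attempts exhausted"),
        _ => AfterClose::Reconnect,
    }
}
```

Keeping the terminal codes in one `match` makes it hard to add a new close code without deciding explicitly whether it should stop the worker.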

---

## Security model

- **No shared secret distributed.**  Every worker generates its own
  256-bit `registration_secret`; only the SHA-256 hash leaves the
  box.  The studio operator gates each registration manually.
- **Per-worker auth tokens** minted server-side on approval (32 bytes
  hex, stored hashed in `studioWorkers`).  Worker presents the raw
  token in WS Hello + as Bearer on the multipart complete route.
- **No tokens logged**: `tracing` events at `studio_worker::config`
  redact `auth_token` and `registration_secret` (regression-tested
  in [`tests/config_tracing.rs`](../../tests/config_tracing.rs)).
- **Rate limited at the edge**: the studio binds
  `REGISTER_REQUEST_RATE_LIMIT` (Cloudflare native rate limiter,
  10 req / 60s / source IP) to `POST /workers/register-request`.
- **Idempotent register-request dedup**: same `installId` from the
  same source IP returns the existing `requestId` instead of piling
  up rows.
- **Approve / reject is admin-only**: studio's Firebase auth +
  allowlist guards the dashboard.
- **Worker side reads `/dev/urandom` directly** on unix for the
  install_id + secret — no `rand` dep, smaller surface area.
- **Auto-update binary swap** runs the cargo-dist installer the same
  way the user did on first install — same HTTPS + checksum
  verification (cargo-dist's own).
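
As an illustration of the `/dev/urandom` bullet, the following unix-only sketch draws 256 bits straight from the kernel without a `rand` dependency. The function names and the lowercase-hex encoding of the secret are assumptions. The SHA-256 step that produces the value sent to the studio is omitted, since it needs a hashing crate.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Read exactly `N` random bytes from the kernel CSPRNG (unix only).
fn random_bytes<const N: usize>() -> io::Result<[u8; N]> {
    let mut buf = [0u8; N];
    File::open("/dev/urandom")?.read_exact(&mut buf)?;
    Ok(buf)
}

/// Lowercase hex encoding, two characters per byte.
fn to_hex(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        out.push_str(&format!("{b:02x}"));
    }
    out
}

/// A fresh 256-bit secret as 64 hex characters (the encoding is an assumption).
fn generate_registration_secret() -> io::Result<String> {
    Ok(to_hex(&random_bytes::<32>()?))
}
```

`read_exact` matters here: a plain `read` may return fewer bytes than requested, which would silently leave part of the buffer zeroed.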

---

## Studio side (minigames repo)

This repo is the worker.  The other half lives in
`webbertakken/minigames` under
`apps/studio/src/worker/modules/graphics`:

| Path | Role |
|---|---|
| `routes/workers.ts` | Mounts `workerAdminRoutes` (Firebase-auth'd dashboard) + `workerAgentRoutes` (unauth'd register-request + secret-auth'd poll) |
| `WorkerConnections/` | Cloudflare Durable Object that owns every connected worker's WS session.  Receives offers from the queue, fans them out by capability fit |
| `routes/queue.ts` | Job CRUD + the "promote pending to queued" admin flow |
| `workerAuth.ts` | `hashToken` / `mintToken` / `requireRegistrationSecret` / `requireWorkerToken` middlewares |
| `apps/studio/migrations/graphics/0013_worker_registration_requests.sql` | D1 schema for the pending queue |
| `apps/studio/src/client/modules/graphics/components/PendingWorkersPanel.tsx` | The dashboard panel where the operator clicks Approve / Reject |

Wire-format contract is mirrored on both sides; the TypeScript
declarations in `apps/studio/src/shared/types/{worker,workerWs}.ts`
are the source of truth, and [`src/types.rs`](../../src/types.rs) +
[`src/ws/types.rs`](../../src/ws/types.rs) are hand-written
mirrors with regression tests in
[`tests/ws_wire.rs`](../../tests/ws_wire.rs).