studio-worker
A single self-contained Rust binary that pulls image, LLM, audio (STT/TTS), and video jobs from the minis.gg studio API, runs them locally, and posts the results back.
Install the worker on any PC, register once, and it will hold a
hibernatable WebSocket session to the studio API's
WorkerConnections Durable Object. The studio pushes job offers over
the socket as soon as they're queued; the worker accepts, runs the
engine, and posts the result back the same way (or via a single HTTP
multipart route for image / audio / video bytes). The worker also
auto-updates itself between jobs.

    studio-worker binary <----- WebSocket -----> WorkerConnections DO <-> D1
            ^                                          ^
            | HTTP multipart /complete                 |
            +------------------------------------------+ (binary outputs only)

Replaces the previous push-based studio-proxy + cloudflared topology
and the intermediate pull-based polling pipeline. All five legacy
worker HTTP routes (heartbeat, claim, complete-json, fail,
logs) are now WS frame types.
Tasks supported
| Kind | Wire kind | Synthetic engine (default) | Real engine (planned) |
|---|---|---|---|
| Image | image | real WEBP / PNG via the image crate | image-candle / gradio |
| LLM | llm | OpenAI-shape JSON (chat.completion) | llama (llama.cpp) |
| Audio STT | audio_stt | Whisper-shape JSON | whisper (whisper.cpp) |
| Audio TTS | audio_tts | real WAV (sine wave keyed by hash(text)) | tts-piper |
| Video | video | real WebP image (single-frame stand-in) | video-ffmpeg |
The synthetic engine is the default and exercises the full pipeline end-to-end with no GPU, no model downloads, and ~0 ms per task — exactly what the unattended CI suite uses. Real high-performance backends (llama.cpp, whisper.cpp, candle, Piper, ffmpeg) are wired in via feature flags and are deferred to a follow-up iteration (the trait, contract, and dispatch are already in place).
Desktop UI (optional)
The worker also ships a native desktop window built on egui/eframe
that surfaces every config knob, the live job in flight, the
recent-jobs history, the rolling log tail, and a system-tray icon
with Open / Pause-Resume / Quit. Disabled by default so the headless
cargo install + the systemd / launchd service path stay free of GL
/ winit / dbus / GTK deps.
Enable with the ui cargo feature:
Five tabs:
| Tab | What it shows |
|---|---|
| Status | Worker id, API URL, VRAM total + threshold, busy / idle / paused badge, last heartbeat age + outcome. When the worker isn't registered, an in-window Register form. |
| Jobs | Current job in flight (kind, model, prompt, elapsed time) + bounded ring of the last 50 finished jobs with completed / failed badges. |
| Config | Every config.toml field as an editable widget grouped into Connection / Worker / Engine / Auto-update / Models / Notifications / Background mode. Save writes through config::save and the runtime picks up new values on the next tick. Engine swaps surface a "restart required" banner. |
| Logs | Level filter (info / warn / error), free-text search across category / message / job id, auto-scroll toggle, windowed at the last 500 entries. |
| About | Version, Sentry release name, resolved config path, "Check for updates" button. |
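
The bounded history on the Jobs tab (and the 500-entry window on the Logs tab) can be pictured as a fixed-capacity ring that evicts its oldest entry. The sketch below is illustrative only: `FinishedJob`, `JobHistory`, and their fields are hypothetical names, not the worker's actual types.

```rust
use std::collections::VecDeque;

/// Hypothetical record of a finished job; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct FinishedJob {
    pub id: String,
    pub succeeded: bool,
}

/// Fixed-capacity history that drops the oldest entry once it is full.
pub struct JobHistory {
    capacity: usize,
    entries: VecDeque<FinishedJob>,
}

impl JobHistory {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, entries: VecDeque::with_capacity(capacity) }
    }

    /// Append a job, evicting the oldest one when the ring is at capacity.
    pub fn push(&mut self, job: FinishedJob) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(job);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Oldest-to-newest view of the retained jobs.
    pub fn iter(&self) -> impl Iterator<Item = &FinishedJob> {
        self.entries.iter()
    }
}
```

With a capacity of 50, pushing 60 jobs leaves only the 50 most recent ones, so memory use stays constant however long the worker runs.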

The tray icon reflects state (idle = green, busy = amber, disconnected = red) and exposes:
- Open Window: re-show the window after hide-to-tray.
- Pause / Resume claiming: toggles auto_enabled, persisted to config.toml.
- Quit: signals the runtime loops to stop, awaits any in-flight job briefly, then exits.
Closing the window hides it to the tray; the worker keeps running.
For an autostart-on-login workflow, tick the Run in tray on login
toggle on the Config tab (writes ~/.config/autostart/studio-worker-ui.desktop
on Linux, a LaunchAgent plist on macOS, a marker file on Windows).
Linux build-time deps
The tray + notifications stack pulls in GTK + D-Bus. On a fresh Ubuntu / Debian box install:
For the unattended ui builds in CI the same packages are installed
by .github/workflows/checks.yml before cargo test --features ui.
No extra deps are required on macOS / Windows.
Quick install
Linux / macOS
|
Windows (PowerShell)
irm https://github.com/webbertakken/studio-worker/releases/latest/download/studio-worker-installer.ps1 | iex
From cargo
Each release ships pre-built binaries for:
- x86_64-pc-windows-msvc
- x86_64-unknown-linux-gnu
- aarch64-unknown-linux-gnu
- aarch64-apple-darwin
- x86_64-apple-darwin
First run
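# 1. Register with the studio API.
studio-worker register --bootstrap-token <TOKEN> --api-base-url <URL>
# 2. Install the auto-start service (systemd --user on Linux, launchd
# on macOS, scheduled task on Windows).
studio-worker install-service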
CLI subcommands
| Subcommand | Purpose |
|---|---|
| run | Hold the WS session + auto-update loop. |
| register | One-shot register with the API. Idempotent. |
| status | Print the local config + heartbeat info. |
| install-service | Install the auto-start OS service. |
| uninstall-service | Remove the auto-start OS service. |
| enable | Set auto_enabled = true (resume claiming). |
| disable | Set auto_enabled = false (worker online but doesn't claim). |
| set-threshold <gb> | Set the max VRAM (GB) the worker is willing to claim per job. |
| config | Print the resolved config + its on-disk path. |
| check-update | Check the release feed for a newer version (does not install). |
Configuration
Config lives at:
- Linux/macOS: ~/.config/minis-studio-worker/config.toml
- Windows: %APPDATA%\minis-studio-worker\config.toml
= "https://studio.example.com"
= "<used only at register>"
= "<filled by register>"
= "<filled by register>"
= 12.0 # max GB per claim
= true
= true
engine = "synthetic" # or "gradio"
# Only used when engine = "gradio":
gradio_endpoint_url = "http://127.0.0.1:7860"
# Optional: only declare these models to the studio.
supported_models_override = []
# Auto-update — checks the release feed on the cadence below, applies
# updates only when no job is running, then re-execs the new binary.
auto_update_enabled = true
auto_update_interval_secs = 1800
= "https://api.github.com/repos/webbertakken/studio-worker/releases"
auto_update_prerelease = false
# WebSocket reconnect cap. When the session drops the worker tries
# to reconnect with exponential backoff up to this many times before
# exiting non-zero (and letting systemd/launchd/Task-Scheduler
# restart it). `0` = infinite. Omit to use the default of 5.
= 5
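
The reconnect policy described in the comment above (exponential backoff, a cap on attempts, `0` meaning infinite) could be sketched roughly as follows. The base delay, the delay ceiling, and the function names are assumptions made for illustration; the README does not document the worker's real values.

```rust
use std::time::Duration;

/// Assumed values for illustration only; the worker's real base delay and
/// ceiling are not documented in this README.
const BASE_DELAY: Duration = Duration::from_secs(1);
const MAX_DELAY: Duration = Duration::from_secs(60);

/// Delay before reconnect attempt `attempt` (1-based):
/// BASE_DELAY * 2^(attempt - 1), clamped to MAX_DELAY.
pub fn backoff_delay(attempt: u32) -> Duration {
    // Clamp the exponent so the shift below can never overflow.
    let exp = attempt.saturating_sub(1).min(16);
    BASE_DELAY.saturating_mul(1u32 << exp).min(MAX_DELAY)
}

/// A cap of 0 means "retry forever"; otherwise allow at most `cap` attempts.
pub fn should_retry(attempt: u32, cap: u32) -> bool {
    cap == 0 || attempt <= cap
}
```

Under these assumptions, with the default cap of 5 the worker would wait 1, 2, 4, 8, and 16 seconds before giving up and exiting non-zero so the service manager can restart it.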
Troubleshooting
- Worker exits with `ws auth failed: ...`. The studio API rejected the auth token on the upgrade (HTTP 401) or via a close-code 4001 after a successful upgrade. Either the token was revoked, the worker was deleted from the studio admin UI, or config.toml carries a stale token. Re-register: `studio-worker register --bootstrap-token <TOKEN> --api-base-url <URL>`.
- Worker exits with `ws reconnect cap reached`. Every reconnect attempt failed (DNS, TLS, or the API is down). The service manager will restart the worker; if it keeps happening, check that the API is reachable from the worker host.
Engines
- synthetic (default): produces deterministic, real WEBP/PNG/WAV/JSON outputs keyed by SHA-256 of the prompt/text/input. No GPU required. Use for smoke-tests, CI, and end-to-end verification of every modality.
- gradio: talks to a Gradio app running on 127.0.0.1 (image only). Drops the cloudflared tunnel entirely. Supply the local Gradio URL in gradio_endpoint_url and the models you've verified in supported_models_override.
Adding a real engine
Implement the Engine trait in src/engine.rs (see SyntheticEngine
and GradioEngine for examples). An engine declares its capabilities
(per-kind supported models) and a dispatch(model, task) -> TaskResult
function. Wire it into engine::build() behind a cargo feature, e.g.:
[features]
= ["dep:llama-cpp-2"]
The trait is already kind-aware so a single binary can host multiple engines (one per modality).
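
To make the shape of such an engine concrete, here is a minimal sketch of a kind-aware trait with one toy implementation. Everything in it is an assumption for illustration: the real `Engine`, `TaskKind`, and `TaskResult` definitions live in src/engine.rs and may differ, and `Task`, `EchoEngine`, and the `echo-1` model name are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical task kinds mirroring the wire kinds listed in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum TaskKind {
    Image,
    Llm,
    AudioStt,
    AudioTts,
    Video,
}

/// Hypothetical task payload and result types.
pub struct Task {
    pub kind: TaskKind,
    pub input: String,
}
pub type TaskResult = Result<Vec<u8>, String>;

/// Assumed shape of the trait; the real definition is in src/engine.rs.
pub trait Engine {
    /// Models this engine supports, grouped per task kind.
    fn capabilities(&self) -> HashMap<TaskKind, Vec<String>>;
    /// Run one task against the named model.
    fn dispatch(&self, model: &str, task: &Task) -> TaskResult;
}

/// Toy engine that echoes LLM input back as bytes.
pub struct EchoEngine;

impl Engine for EchoEngine {
    fn capabilities(&self) -> HashMap<TaskKind, Vec<String>> {
        HashMap::from([(TaskKind::Llm, vec!["echo-1".to_string()])])
    }

    fn dispatch(&self, model: &str, task: &Task) -> TaskResult {
        // Reject any (kind, model) pair that was not declared as a capability.
        let supported = self
            .capabilities()
            .get(&task.kind)
            .map_or(false, |models| models.iter().any(|m| m == model));
        if !supported {
            return Err(format!("unsupported: {:?} / {model}", task.kind));
        }
        Ok(task.input.clone().into_bytes())
    }
}
```

Keying capabilities by kind is what would let a dispatcher route an image job and an LLM job to different engines inside the same binary.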
VRAM threshold
The worker reports two numbers to the API:
- vramTotalGb: physical VRAM on the host (probed from /proc/driver/nvidia on Linux; 0 when no NVIDIA GPU is present).
- vramThresholdGb: the max estimated VRAM per claim, controlled by the operator via set-threshold or by editing config.toml.
The studio API only hands a job to a worker if job.vramGbEstimate ≤ worker.vramThresholdGb and job.model ∈ worker.supportedModels.
Jobs that no worker can take stay queued until either a suitable worker
appears or the operator cancels.
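
The eligibility rule above amounts to a two-part predicate. The sketch below restates it in Rust for clarity; the check itself runs in the studio API, and the struct and function names here are hypothetical mirrors of the fields named above.

```rust
/// Hypothetical mirror of the job fields used by the eligibility rule.
pub struct Job {
    pub model: String,
    pub vram_gb_estimate: f64,
}

/// Hypothetical mirror of the worker fields used by the eligibility rule.
pub struct Worker {
    pub vram_threshold_gb: f64,
    pub supported_models: Vec<String>,
}

/// True when the job fits under the worker's VRAM threshold and the worker
/// declares support for the job's model.
pub fn can_take(worker: &Worker, job: &Job) -> bool {
    job.vram_gb_estimate <= worker.vram_threshold_gb
        && worker.supported_models.iter().any(|m| m == &job.model)
}
```

Note that the comparison uses the operator-set threshold, not the physical VRAM total, so lowering the threshold is a way to keep large jobs off a shared machine.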
Auto-update
A dedicated background task polls the GitHub Releases feed every
auto_update_interval_secs (default 30 min). When a higher semver is
available the worker:
- Confirms no job is currently in flight (per a shared busy flag).
- Downloads the cargo-dist installer for the current platform.
- Runs it (it overwrites the binary in place).
- Re-execs itself so the new code takes over.
Set auto_update_enabled = false to opt out. Set
auto_update_prerelease = true to track pre-releases.
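
The first gate in that flow (a higher semver exists and no job is in flight) could look roughly like the sketch below. It is a simplification under stated assumptions: the function names are hypothetical, the parser accepts only plain MAJOR.MINOR.PATCH versions with an optional leading `v`, and pre-release handling is left out entirely.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Parse "MAJOR.MINOR.PATCH" with an optional leading 'v'.
/// Pre-release or build suffixes are rejected in this simplified sketch.
pub fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.strip_prefix('v').unwrap_or(s).split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Apply an update only when `latest` is strictly newer than `current`
/// and the shared busy flag says no job is running.
pub fn should_apply(current: &str, latest: &str, busy: &AtomicBool) -> bool {
    match (parse_version(current), parse_version(latest)) {
        (Some(cur), Some(new)) => new > cur && !busy.load(Ordering::SeqCst),
        _ => false,
    }
}
```

Comparing parsed tuples rather than raw strings matters here: as text, "1.10.0" sorts before "1.2.3", but as numbers it is the newer release.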
Observability
The worker batches log entries every second and pushes them as a
logBatch frame over the WS session. The DO ingests them into the
workerLogs D1 table; the studio LogViewer reads them from there.
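
The once-per-second batching can be pictured as a small buffer that releases its contents when the interval has elapsed. This is a sketch under assumptions: `LogEntry`, `LogBatcher`, and `take_due` are hypothetical names, and the real worker's entry fields and the logBatch frame encoding are not shown here.

```rust
use std::time::{Duration, Instant};

/// Hypothetical log entry; the real entry carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct LogEntry {
    pub level: String,
    pub message: String,
}

/// Buffers entries and hands them out as one batch per interval.
pub struct LogBatcher {
    interval: Duration,
    last_flush: Instant,
    pending: Vec<LogEntry>,
}

impl LogBatcher {
    pub fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_flush: now, pending: Vec::new() }
    }

    pub fn push(&mut self, entry: LogEntry) {
        self.pending.push(entry);
    }

    /// Return the pending batch if the interval has elapsed and there is
    /// something to send; otherwise return None and keep buffering.
    pub fn take_due(&mut self, now: Instant) -> Option<Vec<LogEntry>> {
        if self.pending.is_empty() || now.duration_since(self.last_flush) < self.interval {
            return None;
        }
        self.last_flush = now;
        Some(std::mem::take(&mut self.pending))
    }
}
```

Passing `now` in explicitly keeps the batcher testable without sleeping; a caller would serialize each returned batch into a single frame.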
Sentry (opt-in)
The worker integrates with Sentry for crash + error reporting. Disabled by default — set the following env vars before launching to enable it:
| Env var | Purpose |
|---|---|
| SENTRY_DSN | The project DSN. Telemetry stays off when unset. |
| SENTRY_ENVIRONMENT | Optional environment tag (defaults to production). |
When enabled the worker:
- captures panics automatically (sentry's default panic handler);
- forwards tracing::error! events as Sentry events;
- attaches preceding tracing::warn! events as breadcrumbs;
- tags every event with the worker's release (= studio-worker@<crate version>, the Sentry-conventional namespaced form) and hostname (server_name).
No DSN is baked into the binary, so the public repo never carries credentials. Performance tracing is intentionally off — Sentry is used purely for error/crash visibility.
Development
Coverage CI enforces ≥ 90% line coverage; current is 93.45%. Truly-untestable bits excluded from the gate:
- src/main.rs: the CLI bootstrap (all logic lives in lib.rs).
- update::RealRunner::{download, run_installer}: real network + process spawn (tested through the UpdateRunner trait with a fake).
- update::restart_self: calls execvp, never returns.
- sys::detect_vram_gb NVIDIA-specific branch: requires NVIDIA hardware.
Integration tests live under tests/:
- tests/ws_wire.rs: round-trip tests for every WorkerInbound / WorkerOutbound frame against the TS contract.
- tests/ws_client_contract.rs: the WS client against a live tokio-tungstenite server (upgrade headers, hello roundtrip, 401 → AuthFailed, close 4001 → AuthFailed, binary-frame rejection, close idempotency).
- tests/ws_session_full_loop.rs: end-to-end walk: hello → welcome → LLM offer → accept + completeJson → STT offer → accept + completeJson → clean close.
- tests/http_contract.rs: register + multipart complete (image + audio) against wiremock.
- tests/http_errors.rs: error-status paths for register + multipart complete plus the tracing-emission contract.
- tests/gradio_engine.rs: GradioEngine code paths against a fake Gradio (incl. data-URL / relative-URL / object-with-url responses).
- tests/multi_modal.rs: every TaskKind round-trips through the synthetic engine + decoders.
- tests/auto_update.rs: release feed parsing + apply_with full flow.
- tests/runtime_helpers.rs: one-shot CLI helpers via wiremock.
- tests/runtime_ticks.rs: auto-update ticks + run_returns_when_aborted smoke test that exercises the AuthFailed exit path.
Release process
- PRs merge to main with conventional-commit titles (feat:, fix:, docs:, etc.; enforced by the Commit lint workflow).
- release-please opens a release PR that bumps the version and updates the changelog.
- Merging the release PR creates a git tag.
- The tag triggers the release.yml workflow (cargo-dist), which builds binaries for all supported targets and uploads them to the GitHub release alongside installer.sh + installer.ps1 one-liners.
Licence
MIT. See LICENSE.