studio-worker

A single self-contained Rust binary that pulls image, LLM, audio (STT/TTS), and video jobs from the minis.gg studio API, runs them locally, and posts the results back.

Install the worker on any PC, register once, and it will hold a hibernatable WebSocket session to the studio API's WorkerConnections Durable Object. The studio pushes job offers over the socket as soon as they're queued; the worker accepts, runs the engine, and posts the result back the same way (or via a single HTTP multipart route for image / audio / video bytes). The worker also auto-updates itself between jobs.

  studio-worker binary <----- WebSocket -----> WorkerConnections DO <-> D1
         ^                                          ^
         |     HTTP multipart /complete             |
         +------------------------------------------+ (binary outputs only)

Replaces the previous push-based studio-proxy + cloudflared topology and the intermediate pull-based polling pipeline. All five legacy worker HTTP routes (heartbeat, claim, complete-json, fail, logs) are now WS frame types.
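
As a rough illustration of that mapping, the sketch below models the worker-to-studio frames as a Rust enum. The tag strings `accept`, `completeJson`, and `logBatch` appear elsewhere in this README; the `heartbeat` and `fail` tags, the enum name, and the field layout are assumptions and may not match the actual wire contract.

```rust
/// Hypothetical model of the worker-to-studio frames that replaced the
/// five legacy HTTP routes. Names and fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum OutboundFrame {
    Heartbeat,
    Accept { job_id: String },
    CompleteJson { job_id: String, body: String },
    Fail { job_id: String, reason: String },
    LogBatch { entries: Vec<String> },
}

impl OutboundFrame {
    /// The type tag a JSON encoding of this frame might carry (assumed).
    fn frame_type(&self) -> &'static str {
        match self {
            OutboundFrame::Heartbeat => "heartbeat",
            OutboundFrame::Accept { .. } => "accept",
            OutboundFrame::CompleteJson { .. } => "completeJson",
            OutboundFrame::Fail { .. } => "fail",
            OutboundFrame::LogBatch { .. } => "logBatch",
        }
    }

    /// The legacy HTTP route this frame stands in for.
    fn legacy_route(&self) -> &'static str {
        match self {
            OutboundFrame::Heartbeat => "heartbeat",
            OutboundFrame::Accept { .. } => "claim",
            OutboundFrame::CompleteJson { .. } => "complete-json",
            OutboundFrame::Fail { .. } => "fail",
            OutboundFrame::LogBatch { .. } => "logs",
        }
    }
}
```

Binary outputs (image / audio / video bytes) are not modelled here because they go over the HTTP multipart route rather than a frame.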

Tasks supported

Kind	Wire `kind`	Synthetic engine (default)	Real engine (planned)
Image	`image`	real WEBP / PNG via the `image` crate	`image-candle` / `gradio`
LLM	`llm`	OpenAI-shape JSON (`chat.completion`)	`llama` (llama.cpp)
Audio STT	`audio_stt`	Whisper-shape JSON	`whisper` (whisper.cpp)
Audio TTS	`audio_tts`	real WAV (sine wave keyed by hash(text))	`tts-piper`
Video	`video`	real WebP image (single-frame stand-in)	`video-ffmpeg`

The synthetic engine is the default and exercises the full pipeline end-to-end with no GPU, no model downloads, and ~0 ms per task, which is exactly what the unattended CI suite uses. Real high-performance backends (llama.cpp, whisper.cpp, candle, Piper, ffmpeg) are deferred to a follow-up iteration and will be wired in behind feature flags; the trait, contract, and dispatch are already in place.

Desktop UI (optional)

The worker also ships a native desktop window built on egui/eframe that surfaces every config knob, the live job in flight, the recent-jobs history, the rolling log tail, and a system-tray icon with Open / Pause-Resume / Quit. Disabled by default so the headless cargo install + the systemd / launchd service path stay free of GL / winit / dbus / GTK deps.

Enable with the ui cargo feature:

cargo install studio-worker --features ui
studio-worker ui

Five tabs:

Tab	What it shows
Status	Worker id, API URL, VRAM total + threshold, busy / idle / paused badge, last heartbeat age + outcome. When the worker isn't registered, an in-window Register form.
Jobs	Current job in flight (kind, model, prompt, elapsed time) + bounded ring of the last 50 finished jobs with completed / failed badges.
Config	Every `config.toml` field as an editable widget grouped into Connection / Worker / Engine / Auto-update / Models / Notifications / Background mode. Save writes through `config::save` and the runtime picks up new values on the next tick. Engine swaps surface a "restart required" banner.
Logs	Level filter (info / warn / error), free-text search across category / message / job id, auto-scroll toggle, windowed at the last 500 entries.
About	Version, Sentry release name, resolved config path, "Check for updates" button.
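
The bounded histories in the Jobs and Logs tabs (last 50 jobs, last 500 log entries) can be pictured as a fixed-capacity ring. The following is a minimal sketch using only the standard library; `BoundedRing` is a hypothetical name and the worker's real data structure may differ.

```rust
use std::collections::VecDeque;

/// Fixed-capacity history: pushing onto a full ring evicts the oldest entry.
struct BoundedRing<T> {
    cap: usize,
    items: VecDeque<T>,
}

impl<T> BoundedRing<T> {
    fn new(cap: usize) -> Self {
        Self { cap, items: VecDeque::with_capacity(cap) }
    }

    fn push(&mut self, item: T) {
        if self.cap == 0 {
            return; // a zero-capacity ring keeps nothing
        }
        if self.items.len() == self.cap {
            self.items.pop_front(); // drop the oldest entry
        }
        self.items.push_back(item);
    }

    fn len(&self) -> usize {
        self.items.len()
    }

    /// Iterates from oldest to newest.
    fn iter(&self) -> std::collections::vec_deque::Iter<'_, T> {
        self.items.iter()
    }
}
```

With a capacity of 50, pushing 60 entries leaves entries 10 through 59 in the ring.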


The tray icon reflects state (idle = green, busy = amber, disconnected = red) and exposes:

Open Window — re-show the window after hide-to-tray.
Pause / Resume claiming — toggles auto_enabled, persisted to config.toml.
Quit — signals the runtime loops to stop, awaits any in-flight job briefly, then exits.

Closing the window hides it to the tray; the worker keeps running. For an autostart-on-login workflow, tick the Run in tray on login toggle on the Config tab (writes ~/.config/autostart/studio-worker-ui.desktop on Linux, a LaunchAgent plist on macOS, a marker file on Windows).

Linux build-time deps

The tray + notifications stack pulls in GTK + D-Bus. On a fresh Ubuntu / Debian box install:

sudo apt-get install -y \
  libgtk-3-dev \
  libdbus-1-dev \
  libxdo-dev \
  libayatana-appindicator3-dev

For the unattended ui builds in CI the same packages are installed by .github/workflows/checks.yml before cargo test --features ui. No extra deps are required on macOS / Windows.

Quick install

Linux / macOS

curl --proto '=https' --tlsv1.2 -LsSf \
  https://github.com/webbertakken/studio-worker/releases/latest/download/studio-worker-installer.sh | sh

Windows (PowerShell)

irm https://github.com/webbertakken/studio-worker/releases/latest/download/studio-worker-installer.ps1 | iex

From cargo

cargo install studio-worker

Each release ships pre-built binaries for:

x86_64-pc-windows-msvc
x86_64-unknown-linux-gnu
aarch64-unknown-linux-gnu
aarch64-apple-darwin
x86_64-apple-darwin

First run

# 1. Register with the studio API.
studio-worker register \
  --bootstrap-token <TOKEN> \
  --api-base-url https://studio.example.com

# 2. Install the auto-start service (systemd --user on Linux, launchd
#    on macOS, scheduled task on Windows).
studio-worker install-service

CLI subcommands

Subcommand	Purpose
`run`	Hold the WS session + auto-update loop.
`register`	One-shot register with the API. Idempotent.
`status`	Print the local config + heartbeat info.
`install-service`	Install the auto-start OS service.
`uninstall-service`	Remove the auto-start OS service.
`enable`	Set `auto_enabled = true` (resume claiming).
`disable`	Set `auto_enabled = false` (worker online but doesn't claim).
`set-threshold <gb>`	Set the max VRAM (GB) the worker is willing to claim per job.
`config`	Print the resolved config + its on-disk path.
`check-update`	Check the release feed for a newer version (does not install).

Configuration

Config lives at:

Linux/macOS — ~/.config/minis-studio-worker/config.toml
Windows — %APPDATA%\minis-studio-worker\config.toml

api_base_url        = "https://studio.example.com"
bootstrap_token     = "<used only at register>"
worker_id           = "<filled by register>"
auth_token          = "<filled by register>"
vram_threshold_gb   = 12.0                       # max GB per claim
auto_start          = true
auto_enabled        = true
engine              = "synthetic"                # or "gradio"

# Only used when engine = "gradio":
gradio_endpoint_url = "http://127.0.0.1:7860"

# Optional: only declare these models to the studio.
supported_models_override = []

# Auto-update — checks the release feed on the cadence below, applies
# updates only when no job is running, then re-execs the new binary.
auto_update_enabled       = true
auto_update_interval_secs = 1800
auto_update_feed          = "https://api.github.com/repos/webbertakken/studio-worker/releases"
auto_update_prerelease    = false

# WebSocket reconnect cap.  When the session drops the worker tries
# to reconnect with exponential backoff up to this many times before
# exiting non-zero (and letting systemd/launchd/Task-Scheduler
# restart it).  `0` = infinite.  Omit to use the default of 5.
ws_reconnect_attempts     = 5
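
The reconnect cap described in the comment above can be sketched as follows. This is an illustration only: the function names, the 1-based attempt numbering, and the base / maximum delays passed in by the caller are assumptions, since the README does not state the actual backoff parameters.

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (1-based): the base
/// delay doubles on each attempt and is clamped to `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let shift = attempt.saturating_sub(1).min(31);
    let factor = 1u32 << shift;
    // On overflow, fall back to the maximum delay.
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Whether the worker should stop retrying and exit non-zero.
/// A cap of `0` means "retry forever".
fn cap_reached(failed_attempts: u32, ws_reconnect_attempts: u32) -> bool {
    ws_reconnect_attempts != 0 && failed_attempts >= ws_reconnect_attempts
}
```

With a hypothetical 1 s base and 30 s ceiling, the delays run 1 s, 2 s, 4 s, 8 s, 16 s, and then stay at 30 s.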

Troubleshooting

Worker exits with ws auth failed: ... — the studio API rejected the auth token, either on the upgrade (HTTP 401) or via close code 4001 after a successful upgrade. Either the token was revoked, the worker was deleted from the studio admin UI, or config.toml carries a stale token. Re-register: studio-worker register --bootstrap-token <TOKEN> --api-base-url <URL>.
Worker exits with ws reconnect cap reached — every reconnect attempt failed (DNS, TLS, or the API is down). The service manager will restart the worker; if the exit keeps recurring, check that the API is reachable from the worker host.
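
The two exit paths differ in whether retrying can help. The sketch below shows one way to classify a dropped session; `SessionEnd` and `classify` are hypothetical names, and only the 401 and 4001 signals are taken from this README.

```rust
/// Why a WebSocket session ended, reduced to what the worker does next.
#[derive(Debug, PartialEq)]
enum SessionEnd {
    /// The token was rejected: re-register instead of retrying.
    AuthFailed,
    /// Anything else: eligible for reconnect with backoff.
    Retryable,
}

/// `upgrade_status` is the HTTP status of a failed upgrade response, if any;
/// `close_code` is the WebSocket close code when the upgrade succeeded.
fn classify(upgrade_status: Option<u16>, close_code: Option<u16>) -> SessionEnd {
    match (upgrade_status, close_code) {
        (Some(401), _) | (_, Some(4001)) => SessionEnd::AuthFailed,
        _ => SessionEnd::Retryable,
    }
}
```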

Engines

synthetic (default) — produces deterministic, real WEBP/PNG/WAV/JSON outputs keyed by SHA-256 of the prompt/text/input. No GPU required. Use for smoke-tests, CI, and end-to-end verification of every modality.
gradio — talks to a Gradio app running on 127.0.0.1 (image only). Drops the cloudflared tunnel entirely. Supply the local Gradio URL in gradio_endpoint_url and the models you've verified in supported_models_override.
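
To make "deterministic, real outputs keyed by a hash" concrete, here is a minimal sketch of a hash-keyed sine-wave WAV generator. It is not the worker's implementation: FNV-1a stands in for SHA-256 so the example needs no dependencies, and the function names, the frequency range, and the mono 16-bit PCM layout are all assumptions.

```rust
/// FNV-1a, used here only as a dependency-free stand-in for SHA-256.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Build a mono 16-bit PCM WAV whose pitch is derived from a hash of `text`.
fn synthetic_wav(text: &str, sample_rate: u32, millis: u32) -> Vec<u8> {
    // Map the hash onto 220..=879 Hz so the same text always gives the same tone.
    let freq = 220.0 + (fnv1a(text.as_bytes()) % 660) as f64;
    let n_samples = (u64::from(sample_rate) * u64::from(millis) / 1000) as u32;
    let data_len = n_samples * 2; // 2 bytes per sample

    let mut wav = Vec::with_capacity(44 + data_len as usize);
    wav.extend_from_slice(b"RIFF");
    wav.extend_from_slice(&(36 + data_len).to_le_bytes());
    wav.extend_from_slice(b"WAVEfmt ");
    wav.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    wav.extend_from_slice(&1u16.to_le_bytes()); // format tag: PCM
    wav.extend_from_slice(&1u16.to_le_bytes()); // channels: mono
    wav.extend_from_slice(&sample_rate.to_le_bytes());
    wav.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // byte rate
    wav.extend_from_slice(&2u16.to_le_bytes()); // block align
    wav.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    wav.extend_from_slice(b"data");
    wav.extend_from_slice(&data_len.to_le_bytes());

    for i in 0..n_samples {
        let t = f64::from(i) / f64::from(sample_rate);
        let s = (2.0 * std::f64::consts::PI * freq * t).sin();
        wav.extend_from_slice(&((s * f64::from(i16::MAX)) as i16).to_le_bytes());
    }
    wav
}
```

Because the pitch depends only on the input text, two calls with the same arguments return byte-identical files, which is the property that makes such outputs usable as CI fixtures.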

Adding a real engine

Implement the Engine trait in src/engine.rs (see SyntheticEngine and GradioEngine for examples). An engine declares its capabilities (per-kind supported models) and a dispatch(model, task) -> TaskResult function. Wire it into engine::build() behind a cargo feature, e.g.:

[features]
llama = ["dep:llama-cpp-2"]

The trait is already kind-aware so a single binary can host multiple engines (one per modality).
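
For orientation, the shape described above might look roughly like the sketch below. This is not the definition in src/engine.rs: the type layouts, the synchronous signature, the `TaskResult` variants, and `EchoEngine` are all assumptions made for illustration (the real trait may well be async and carry richer types).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum TaskKind {
    Image,
    Llm,
    AudioStt,
    AudioTts,
    Video,
}

struct Task {
    kind: TaskKind,
    input: String,
}

#[derive(Debug, PartialEq)]
enum TaskResult {
    Text(String),
    Bytes(Vec<u8>),
    Failed(String),
}

/// Hypothetical shape of the engine contract: capabilities plus dispatch.
trait Engine {
    /// Supported model names, per task kind.
    fn capabilities(&self) -> HashMap<TaskKind, Vec<String>>;
    fn dispatch(&self, model: &str, task: &Task) -> TaskResult;
}

/// Toy engine that only handles LLM tasks and echoes the input back.
struct EchoEngine;

impl Engine for EchoEngine {
    fn capabilities(&self) -> HashMap<TaskKind, Vec<String>> {
        HashMap::from([(TaskKind::Llm, vec!["echo-1".to_string()])])
    }

    fn dispatch(&self, model: &str, task: &Task) -> TaskResult {
        let supported = self
            .capabilities()
            .get(&task.kind)
            .map_or(false, |models| models.iter().any(|m| m == model));
        if !supported {
            return TaskResult::Failed(format!("unsupported model {model}"));
        }
        TaskResult::Text(format!("echo: {}", task.input))
    }
}
```

Keying capabilities by `TaskKind` is what lets one binary route each modality to a different engine.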

VRAM threshold

The worker reports two numbers to the API:

vramTotalGb — physical VRAM on the host (probed from /proc/driver/nvidia on Linux; 0 when no NVIDIA GPU is present).
vramThresholdGb — the max estimated VRAM per claim, controlled by the operator via set-threshold or by editing config.toml.

The studio API only hands a job to a worker if job.vramGbEstimate ≤ worker.vramThresholdGb and job.model ∈ worker.supportedModels. Jobs that no worker can take stay queued until either a suitable worker appears or the operator cancels.
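
The matching rule can be written as a small predicate. The sketch below mirrors the two conditions stated above; the struct and function names are hypothetical, and the studio API's real matcher lives server-side and is not part of this crate.

```rust
struct Job {
    model: String,
    vram_gb_estimate: f64,
}

struct Worker {
    vram_threshold_gb: f64,
    supported_models: Vec<String>,
}

/// A worker may take a job only if the job fits under its VRAM threshold
/// and the worker has declared support for the job's model.
fn can_take(worker: &Worker, job: &Job) -> bool {
    job.vram_gb_estimate <= worker.vram_threshold_gb
        && worker.supported_models.iter().any(|m| m == &job.model)
}
```

For example, a worker with a 12.0 GB threshold takes an 8 GB job for a supported model but not a 16 GB one; the larger job stays queued.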

Auto-update

A dedicated background task polls the GitHub Releases feed every auto_update_interval_secs (default 30 min). When a higher semver is available the worker:

Confirms no job is currently in flight (per a shared busy flag).
Downloads the cargo-dist installer for the current platform.
Runs it (it overwrites the binary in place).
Re-execs itself so the new code takes over.

Set auto_update_enabled = false to opt out. Set auto_update_prerelease = true to track pre-releases.
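
The decision that gates the steps above (newer version, no job in flight, pre-release opt-in) might be sketched like this. The function names are hypothetical, and the version parser is a deliberate simplification: it compares only major.minor.patch and does not implement semver pre-release ordering.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Parse "1.2.3" or "v1.2.3", ignoring any pre-release / build suffix.
fn parse_semver(v: &str) -> Option<(u64, u64, u64)> {
    let v = v.strip_prefix('v').unwrap_or(v);
    let core = v.split(|c: char| c == '-' || c == '+').next()?;
    let mut parts = core.split('.').map(|p| p.parse::<u64>().ok());
    let triple = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three numeric components
    }
    Some(triple)
}

/// Whether an update to `candidate` should be applied right now.
fn should_apply(
    current: &str,
    candidate: &str,
    allow_prerelease: bool,
    busy: &AtomicBool,
) -> bool {
    if busy.load(Ordering::SeqCst) {
        return false; // never update while a job is in flight
    }
    if candidate.contains('-') && !allow_prerelease {
        return false; // pre-release tags are opt-in
    }
    match (parse_semver(current), parse_semver(candidate)) {
        (Some(cur), Some(new)) => new > cur, // numeric, component-wise
        _ => false,
    }
}
```

Comparing parsed tuples rather than strings matters: `0.10.0` is newer than `0.9.0` numerically, although it sorts earlier as text.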

Observability

The worker batches log entries every second and pushes them as a logBatch frame over the WS session. The DO ingests them into the workerLogs D1 table; the studio LogViewer reads them from there.
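
The batching behaviour can be approximated with a small buffer that releases its contents once the interval has elapsed. This is a sketch under assumptions: `LogBatcher` is a hypothetical type, entries are plain strings, and the caller supplies the clock so the logic stays testable.

```rust
use std::time::{Duration, Instant};

/// Accumulates log entries and hands them out at most once per interval.
struct LogBatcher {
    interval: Duration,
    last_flush: Instant,
    pending: Vec<String>,
}

impl LogBatcher {
    fn new(interval: Duration, now: Instant) -> Self {
        Self { interval, last_flush: now, pending: Vec::new() }
    }

    fn push(&mut self, entry: impl Into<String>) {
        self.pending.push(entry.into());
    }

    /// Returns the batch to send if the interval has elapsed and there is
    /// something to send; otherwise returns `None` and keeps buffering.
    fn take_due(&mut self, now: Instant) -> Option<Vec<String>> {
        if self.pending.is_empty() || now.duration_since(self.last_flush) < self.interval {
            return None;
        }
        self.last_flush = now;
        Some(std::mem::take(&mut self.pending))
    }
}
```

A runtime loop would call `take_due` on each tick and wrap any returned batch in a `logBatch` frame.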

Sentry (opt-in)

The worker integrates with Sentry for crash + error reporting. Disabled by default — set the following env vars before launching to enable it:

Env var	Purpose
`SENTRY_DSN`	The project DSN. Telemetry stays off when unset.
`SENTRY_ENVIRONMENT`	Optional environment tag (defaults to `production`).

When enabled the worker:

captures panics automatically (sentry's default panic handler);
forwards tracing::error! events as Sentry events;
attaches preceding tracing::warn! events as breadcrumbs;
tags every event with the worker's release (= studio-worker@<crate version>, the Sentry-conventional namespaced form) and hostname (server_name).

No DSN is baked into the binary, so the public repo never carries credentials. Performance tracing is intentionally off — Sentry is used purely for error/crash visibility.

Development

cargo test           # 169 unit + integration tests
cargo clippy --tests -- -D warnings
cargo fmt --check
cargo llvm-cov --workspace \
  --ignore-filename-regex 'src/main\.rs$' \
  --summary-only

Coverage CI enforces ≥ 90% line coverage; current is 93.45%. Truly-untestable bits excluded from the gate:

src/main.rs — the CLI bootstrap (all logic lives in lib.rs).
update::RealRunner::{download, run_installer} — real network + process spawn (tested through the UpdateRunner trait with a fake).
update::restart_self — calls execvp, never returns.
sys::detect_vram_gb NVIDIA-specific branch — requires NVIDIA hardware.

Integration tests live under tests/:

tests/ws_wire.rs — round-trip tests for every WorkerInbound / WorkerOutbound frame against the TS contract.
tests/ws_client_contract.rs — the WS client against a live tokio-tungstenite server (upgrade headers, hello roundtrip, 401 → AuthFailed, close 4001 → AuthFailed, binary-frame rejection, close idempotency).
tests/ws_session_full_loop.rs — end-to-end walk: hello → welcome → LLM offer → accept + completeJson → STT offer → accept + completeJson → clean close.
tests/http_contract.rs — register + multipart complete (image + audio) against wiremock.
tests/http_errors.rs — error-status paths for register + multipart complete plus the tracing-emission contract.
tests/gradio_engine.rs — GradioEngine code paths against a fake Gradio (incl. data-URL / relative-URL / object-with-url responses).
tests/multi_modal.rs — every TaskKind round-trips through the synthetic engine + decoders.
tests/auto_update.rs — release feed parsing + apply_with full flow.
tests/runtime_helpers.rs — one-shot CLI helpers via wiremock.
tests/runtime_ticks.rs — auto-update ticks + run_returns_when_aborted smoke test that exercises the AuthFailed exit path.

Release process

PRs merge to main with conventional-commit titles (feat:, fix:, docs:, etc. — enforced by the Commit lint workflow).
release-please opens a release PR that bumps the version and updates the changelog.
Merging the release PR creates a git tag.
The tag triggers the release.yml workflow (cargo-dist), which builds binaries for all supported targets and uploads them to the GitHub release alongside installer.sh + installer.ps1 one-liners.

Licence

MIT. See LICENSE.

studio-worker 0.2.0