cellos-server
The CellOS HTTP control plane — a thin projection over JetStream CloudEvents that admits formations, lists state, and streams events to clients.
What it is
cellos-server is the operator-facing API. It exposes a small REST surface
(POST /v1/formations, GET /v1/formations[/{id}], DELETE /v1/formations/{id}, GET /v1/cells[/{id}]) and a single WebSocket
endpoint (GET /ws/events) that streams CloudEvents in real time. The
server is built with axum 0.7 + tower-http 0.5 on top of async-nats and
its async_nats::jetstream module.
It sits at L7 of the layer model — above the supervisor and the event log,
below cellctl. The architectural contract is from CHATROOM.md Session 16
and ADR-0011: cellos-server is a pure-state-machine projection over
the JetStream CELLOS_EVENTS stream. The in-memory registry
(AppState::formations, AppState::cells) is a cache for query latency
only — it MUST be rebuildable by replaying cellos.events.> from
sequence 1. HTTP is the query interface; WebSocket is the live projection
feed; NATS is the source of truth.
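The replay contract can be illustrated with a minimal sketch. Everything in it is hypothetical (the `Event` enum, the `Projection` struct, and the rule that a delete removes the row); the real server applies CloudEvents to `AppState`. The point is only that a projection built by a pure `apply` step gives the same result whether events arrive live or are replayed from sequence 1.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified event type; the real server consumes CloudEvents.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    FormationCreated { id: String },
    FormationStatusChanged { id: String, status: String },
    FormationDeleted { id: String },
}

/// Derived cache: formation id -> last known status, plus the last applied sequence.
#[derive(Debug, Default, PartialEq)]
pub struct Projection {
    pub formations: BTreeMap<String, String>,
    pub cursor: u64,
}

impl Projection {
    /// Apply one event. The result depends only on the prior state and the event.
    pub fn apply(&mut self, seq: u64, event: &Event) {
        match event {
            Event::FormationCreated { id } => {
                self.formations.insert(id.clone(), "PENDING".to_string());
            }
            Event::FormationStatusChanged { id, status } => {
                // Status changes for unknown formations are ignored in this sketch.
                if let Some(slot) = self.formations.get_mut(id) {
                    *slot = status.clone();
                }
            }
            Event::FormationDeleted { id } => {
                // Assumption: deletion drops the row from the cache.
                self.formations.remove(id);
            }
        }
        self.cursor = self.cursor.max(seq);
    }

    /// Rebuild from scratch by folding the log in order, numbering from sequence 1.
    pub fn replay(log: &[Event]) -> Self {
        let mut projection = Projection::default();
        for (i, event) in log.iter().enumerate() {
            projection.apply(i as u64 + 1, event);
        }
        projection
    }
}
```

Because `apply` has no other inputs, discarding the cache and calling `replay` on the same log yields an identical `Projection`, which is the property the cache-only rule relies on.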
What cellos-server deliberately does NOT do:
- It does not run cells (that is cellos-supervisor).
- It does not own state of its own: the registry is a derived projection.
- It does not serve a UI bundle. Per ADR-0017, the web view is served by cellctl webui as a localhost reverse proxy; the ServeDir fallback that lived here in early drafts is gone, and unmatched paths return 404 (src/lib.rs:30).
- It does not authorise browser writes. ADR-0016 enforces the read-only browser boundary structurally: the CORS layer (src/lib.rs:59) only advertises GET and OPTIONS, so even a compromised in-page script that slipped past the cellctl-webui proxy is refused at preflight by any compliant browser.
Public API surface
The crate is mostly an axum binary; the library surface is the seam used by integration tests and future embedders.
- router(state: AppState) -> Router: assemble the full axum router with all canonical routes mounted. src/lib.rs:39.
- AppState: clonable per-request state (NATS client, JetStream context, formations/cells registries, API token, applied cursor). src/state.rs:26.
- AppState::new(nats, api_token): constructor used by both main.rs and the test harness. src/state.rs:58.
- AppState::with_jetstream(ctx): attach the JetStream context after ensure_stream succeeds. src/state.rs:72.
- AppState::cursor() / bump_cursor(seq): the ADR-0015 §D2 cursor. src/state.rs:78.
- state::CellRecord: the per-cell projection row. src/state.rs:233.
- state::FormationRecord: the per-formation projection row. src/state.rs:222.
- state::FormationStatus: the formation state-machine enum (PENDING, LAUNCHING, RUNNING, DEGRADED, COMPLETED, FAILED). src/state.rs:210.
- state::ApplyOutcome: the result of applying a CloudEvent to the projection. src/state.rs:195.
- jetstream::STREAM_NAME / STREAM_SUBJECT: the CELLOS_EVENTS stream binding (cellos.events.>). src/jetstream.rs:65.
- jetstream::ensure_stream(&Client): best-effort create-or-attach of the durable JetStream stream. src/jetstream.rs:94.
- jetstream::replay_projection(&AppState, &Context): replay events from sequence 1 to rebuild the projection cache. src/jetstream.rs:185.
- jetstream::open_ws_message_stream(...): open the per-connection message stream that backs /ws/events. src/jetstream.rs:272.
- ws::ws_events: WebSocket handler. src/ws.rs:73.
- ws::WsParams: the ?subject= + ?since= query parameters. src/ws.rs:60.
- routes::formations::*, routes::cells::*: the HTTP handlers; not intended for direct re-use but documented here for reference.
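As an illustration of the cursor()/bump_cursor(seq) pair, here is one way a clonable, monotonic cursor could be built on the standard library. This is a sketch under assumptions: the `Cursor` type is hypothetical, and it assumes the cursor tracks the highest applied JetStream sequence and never moves backwards. The actual fields and semantics live in src/state.rs and ADR-0015 §D2.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical shared cursor: clones observe the same underlying counter.
#[derive(Clone, Default)]
pub struct Cursor(Arc<AtomicU64>);

impl Cursor {
    /// Current value; 0 means nothing has been applied yet.
    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Acquire)
    }

    /// Advance to `seq` if it is larger than the current value.
    /// Returns the cursor value after the call; stale sequences are no-ops.
    pub fn bump(&self, seq: u64) -> u64 {
        let previous = self.0.fetch_max(seq, Ordering::AcqRel);
        previous.max(seq)
    }
}
```

Using `fetch_max` rather than a plain store means an out-of-order or redelivered event cannot rewind the cursor, which matters when the same value is handed to clients as a resume point.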
The bearer-token contract (Authorization: Bearer <api_token>) is
enforced in src/auth.rs and called from every handler before any state
access.
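The shape of such a check can be sketched with the standard library alone. The function names below are hypothetical and this is not the code in src/auth.rs; in particular, the constant-time comparison is an assumption about good practice, not something the source states.

```rust
/// Compare two byte strings without returning early on the first mismatch,
/// so response timing does not reveal how many leading bytes matched.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Returns true only when `header` is exactly `Bearer <expected_token>`.
/// `header` is the raw Authorization header value, if the request carried one.
pub fn is_authorized(header: Option<&str>, expected_token: &str) -> bool {
    let Some(value) = header else { return false };
    let Some(presented) = value.strip_prefix("Bearer ") else { return false };
    // An empty configured token never authorizes anything (fail-closed).
    !expected_token.is_empty()
        && constant_time_eq(presented.as_bytes(), expected_token.as_bytes())
}
```

A handler would call a check like this first and return 401 before touching the registries, which matches the "before any state access" ordering described above.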
Architecture / how it works
┌──────────────┐
│    cellos    │ ──► HTTP/WS over loopback
└──────┬───────┘
       │
       ▼
┌──────────────┐       ┌────────────────────────────┐
│ cellos-server│◄──────│ JetStream (CELLOS_EVENTS)  │
│    (axum)    │──────►│ subject: cellos.events.>   │
└──────┬───────┘       └────────────────────────────┘
       │                             ▲
       │                             │
in-memory ▼                          │
projection cache         cellos-supervisor publishes
(BTreeMap, RwLock)       every lifecycle / observability /
                         identity / policy CloudEvent here
Startup flow (src/main.rs):
- Read CELLOS_SERVER_BIND, CELLOS_NATS_URL, and the required CELLOS_SERVER_API_TOKEN (fail-closed: empty/unset → refuse to start).
- Best-effort connect to NATS. A broker outage at startup is not fatal: the HTTP query interface MUST serve cached state precisely when the event log is unhealthy, so operators can inspect the system. WebSocket clients see an immediate close until the broker returns.
- Call ensure_stream to bind the CELLOS_EVENTS durable stream, then call replay_projection to rebuild AppState.formations and AppState.cells from sequence 1 (ADR-0011 §Consequences). The CELLOS_SERVER_SKIP_REPLAY env var bypasses this for tests.
- Bind the listener, mount router(state), run axum::serve with graceful shutdown on SIGTERM/SIGINT.
The WebSocket bridge (src/ws.rs) accepts ?since=<seq> per ADR-0015
§D3 and emits a JSON envelope {"seq": N, "event": {...}} per frame
(src/ws.rs:1). A 25-second Ping heartbeat (ADR-0015 §D6) keeps the
connection alive across NAT timeouts and lets the client detect a dead
upstream within the 45s budget the web view tolerates
(src/ws.rs:HEARTBEAT).
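The frame envelope, resume filter, and heartbeat budget can be sketched as follows. The function and constant names are hypothetical, and two points are assumptions rather than facts from the source: that ?since=<seq> is exclusive (the client passes the last seq it saw), and that the client measures the 45s budget from the last frame or Ping it received.

```rust
use std::time::Duration;

/// Server-side Ping interval and client-side tolerance, taken from the text above.
pub const HEARTBEAT_INTERVAL: Duration = Duration::from_secs(25);
pub const CLIENT_DEAD_AFTER: Duration = Duration::from_secs(45);

/// Wrap an already-serialized CloudEvent JSON object in the per-frame envelope.
pub fn envelope(seq: u64, event_json: &str) -> String {
    format!("{{\"seq\":{seq},\"event\":{event_json}}}")
}

/// Frames a reconnecting client would receive for `?since=<since>`.
/// Assumption: `since` is exclusive, so only events with seq > since are sent.
pub fn frames_after(log: &[(u64, &str)], since: u64) -> Vec<String> {
    log.iter()
        .filter(|(seq, _)| *seq > since)
        .map(|(seq, event)| envelope(*seq, event))
        .collect()
}

/// Client side: treat the upstream as dead once nothing has arrived within the budget.
pub fn upstream_is_dead(since_last_frame: Duration) -> bool {
    since_last_frame > CLIENT_DEAD_AFTER
}
```

With a 25-second interval and a 45-second budget, the budget is shorter than two intervals, so one missed Ping followed by continued silence is enough for the client to give up and reconnect with its last seen seq.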
The CORS layer in router() (src/lib.rs:59) advertises only GET and
OPTIONS. A unit test (cors_preflight_for_post_does_not_allow_post,
src/lib.rs:87) asserts the ADR-0016 structural enforcement.
Configuration
| Env var | Default | Effect |
|---|---|---|
| CELLOS_SERVER_BIND | 127.0.0.1:8080 | TCP listen address. |
| CELLOS_NATS_URL | nats://127.0.0.1:4222 | Broker URL. Outage at startup is non-fatal. |
| CELLOS_SERVER_API_TOKEN | (required) | Bearer token for every route. Server refuses to start when unset or empty. |
| CELLOS_SERVER_SKIP_REPLAY | unset | When 1/true, skip the ADR-0011 replay-on-boot. |
| RUST_LOG / EnvFilter | info | tracing-subscriber filter. JSON output is on by default. |
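The defaults and the fail-closed token rule in the table can be expressed as a small loader. This is a sketch, not the code in src/main.rs: the `Config` and `ConfigError` types and the `load_config` function are hypothetical, and it assumes only the exact values `1` and `true` enable CELLOS_SERVER_SKIP_REPLAY. The lookup is passed in as a closure so the logic can be tested without mutating the process environment.

```rust
#[derive(Debug, PartialEq)]
pub struct Config {
    pub bind: String,
    pub nats_url: String,
    pub api_token: String,
    pub skip_replay: bool,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    /// CELLOS_SERVER_API_TOKEN was unset or empty.
    MissingApiToken,
}

/// Build the configuration from a variable lookup function.
pub fn load_config(get: impl Fn(&str) -> Option<String>) -> Result<Config, ConfigError> {
    // Fail closed: an unset or empty token is an error, never a default.
    let api_token = get("CELLOS_SERVER_API_TOKEN")
        .filter(|token| !token.is_empty())
        .ok_or(ConfigError::MissingApiToken)?;

    let skip_replay = matches!(
        get("CELLOS_SERVER_SKIP_REPLAY").as_deref(),
        Some("1") | Some("true")
    );

    Ok(Config {
        bind: get("CELLOS_SERVER_BIND").unwrap_or_else(|| "127.0.0.1:8080".to_string()),
        nats_url: get("CELLOS_NATS_URL").unwrap_or_else(|| "nats://127.0.0.1:4222".to_string()),
        api_token,
        skip_replay,
    })
}
```

In a binary this would be called as `load_config(|key| std::env::var(key).ok())`, exiting with a non-zero status on `Err(ConfigError::MissingApiToken)`.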
Examples
Mount the router into a test:
use Body;
use ;
use ;
use ServiceExt;
async
Run the binary against a local broker:
CELLOS_SERVER_BIND=127.0.0.1:8080 \
CELLOS_NATS_URL=nats://127.0.0.1:4222 \
CELLOS_SERVER_API_TOKEN= \
RUST_LOG=info \
Stream events:
# or directly:
Testing
Most tests drive the axum router via tower::ServiceExt::oneshot and
need no broker. The integration tests under crates/cellos-server/tests/
include formation_authority_invariant.rs (ADR-0010 §Enforcement —
exercises all four rejection paths through POST /v1/formations) and
signed_envelope_round_trip.rs. The JetStream-dependent paths
(replay_projection, the WS bridge with a live consumer) are exercised
in workspace-level integration tests; running them requires a local
NATS with JetStream enabled (nats-server -js).
Related crates
- cellos-core: owns CloudEventV1, formation/lifecycle event builders, and the spec validators consumed by routes::formations.
- cellos-supervisor: the producer of every CloudEvent this server projects.
- cellos-projector: the offline equivalent of replay_projection for audit work.
- cellos-ctl: the operator client and the read-only browser proxy in front of this server.
ADRs
- ADR-0001: NATS JetStream as the proprietary host substrate.
- ADR-0010: formation admission invariant; enforced in routes::formations.
- ADR-0011: this crate; defines the projection-cache + replay-on-boot contract.
- ADR-0014: the formation event family the server emits and projects.
- ADR-0015: WebSocket cursor (seq), ?since= resume, and heartbeat.
- ADR-0016: read-only browser boundary, enforced by the CORS allow-methods list.
- ADR-0017: the static-bundle responsibility moved to cellctl webui; this server is API-only.