# rust-job-queue-api-worker-system
[Open in Colab](https://colab.research.google.com/github/infinityabundance/Rust-Job-Queue-API-Worker-System/blob/main/notebooks/rust_job_queue_api_worker_system_colab.ipynb)
Copyright 2026 infinityabundance
A compact, production-shaped Rust job queue: Axum HTTP API, Tokio workers, Postgres `SELECT ... FOR UPDATE SKIP LOCKED` dequeue, retries with decorrelated jitter, idempotency, cooperative cancellation, OpenAPI, Prometheus metrics, Docker Compose, and reproducible tests/benchmarks.
Single Cargo package, two feature flags, two service binaries, and a one-shot migration runner.
## What this demonstrates
| Area | What it shows |
| --- | --- |
| Concurrent worker correctness | A 200-job / 8-worker integration test validates no duplicate processing in that bounded SKIP LOCKED scenario, using a separate transactional `processed_log` table. |
| API design | Axum routes, typed DTOs, idempotency-key handling, status filters, OpenAPI JSON, and Swagger UI. |
| Reliability semantics | Retry scheduling, stale-lock recovery, cooperative cancellation, DB constraints, and documented trade-offs. |
| Operations | Docker Compose, migration binary, structured tracing, API metrics, worker metrics, health check, and runbook. |
| Release hygiene | Feature-gating probes, MSRV check, docs build with warnings denied, RustSec audit, and `cargo publish --dry-run` in CI. |
## Evidence
- `cargo test --all-features --locked`: 46 tests in the current suite (12 lib unit tests, 33 integration tests, 1 doctest). Integration tests require Docker because they use testcontainers Postgres.
- CI workflow: [`ci.yml`](https://github.com/infinityabundance/Rust-Job-Queue-API-Worker-System/blob/main/.github/workflows/ci.yml).
- Architecture notes: [docs/architecture.md](docs/architecture.md).
- API reference: [docs/api.md](docs/api.md).
- Operational runbook: [docs/runbook.md](docs/runbook.md).
- Trade-offs and omissions: [docs/tradeoffs.md](docs/tradeoffs.md).
- Colab smoke/repro notebook: [notebooks/rust_job_queue_api_worker_system_colab.ipynb](notebooks/rust_job_queue_api_worker_system_colab.ipynb).
## Quickstart
```bash
docker compose up -d # boots postgres + migrations + api + worker in the background
open http://localhost:8080/docs # Swagger UI (macOS; use xdg-open on Linux)
curl http://localhost:8080/metrics # API Prometheus metrics
curl http://localhost:9091/metrics # worker Prometheus metrics
cargo test --all-features --locked # full test suite; needs Docker
```
Environment variables are documented in [`.env.example`](https://github.com/infinityabundance/Rust-Job-Queue-API-Worker-System/blob/main/.env.example).
## Endpoints
| Method | Path | Port | Description |
| --- | --- | --- | --- |
| POST | `/jobs` | 8080 | Enqueue. Supports `Idempotency-Key` header. |
| GET | `/jobs/{id}` | 8080 | Fetch one. |
| GET | `/jobs` | 8080 | Paginated list, filterable by `status` and `kind`. |
| POST | `/jobs/{id}/cancel` | 8080 | 200 if the job was cancelled immediately, 202 if it is running and cancellation is pending, 409 if it is already terminal. |
| GET | `/health` | 8080 | DB-level liveness. |
| GET | `/metrics` | 8080 | API Prometheus metrics. |
| GET | `/metrics` | 9091 | Worker Prometheus metrics on the worker process. |
| GET | `/docs` | 8080 | Swagger UI. |
| GET | `/api-docs/openapi.json` | 8080 | OpenAPI 3 spec. |

Full endpoint reference: [docs/api.md](docs/api.md).
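The cancel route's status codes in the table above can be read as a function of the job's current state. The following is a minimal sketch of that mapping, not the crate's actual handler: the `JobStatus` enum, the `Failed` variant name, and the choice to treat an already-cancelled job as terminal are assumptions made for illustration.

```rust
/// Job states mentioned in this README. The variant names (in particular
/// `Failed`) are assumptions; the crate's real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobStatus {
    Queued,
    Retrying,
    Running,
    Succeeded,
    Failed,
    Cancelled,
}

/// Maps the state a job is in when `POST /jobs/{id}/cancel` arrives
/// to the HTTP status code listed in the endpoint table.
fn cancel_status_code(status: JobStatus) -> u16 {
    match status {
        // Not yet picked up by a worker: cancelled atomically.
        JobStatus::Queued | JobStatus::Retrying => 200,
        // A worker holds the job: cancellation is requested and observed later.
        JobStatus::Running => 202,
        // Terminal states cannot be cancelled. Treating `Cancelled` as
        // terminal here is an assumption.
        JobStatus::Succeeded | JobStatus::Failed | JobStatus::Cancelled => 409,
    }
}

fn main() {
    for status in [
        JobStatus::Queued,
        JobStatus::Retrying,
        JobStatus::Running,
        JobStatus::Succeeded,
        JobStatus::Failed,
        JobStatus::Cancelled,
    ] {
        println!("{:?} -> {}", status, cancel_status_code(status));
    }
}
```

See [docs/api.md](docs/api.md) for the authoritative response bodies and error shapes.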
## Architecture in one paragraph
The core modules (`domain`, `payload`, `db`, `queue`, `retry`, `ids`, `error`) are always compiled; `api` and `worker` are gated behind features of the same names. The API and worker binaries use `required-features`, so they cannot be built without their dependency sets. A third binary, `job-queue-migrate`, runs embedded migrations and exits; Docker Compose uses it before starting the API and worker.
The queue lives in one Postgres table (`jobs`) with a partial index on `(run_at)` filtered by `status IN ('queued','retrying')`. Retries are scheduled by setting `run_at` into the future; the same dequeue query picks them up when `run_at <= now()`. Cancellation is atomic for queued/retrying jobs and cooperative for running jobs.
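To make the dequeue description concrete, here is a sketch of what the partial index and the claim query could look like, held as Rust string constants. Only the `jobs` table, the `run_at` and `status` columns, and the `'queued'`/`'retrying'` values come from this README; the index name, the `id` column, and the `'running'` literal are assumptions, and the crate's real SQL lives in its `queue` module and migrations.

```rust
/// Hypothetical partial index: only rows that are eligible to run are indexed.
const CREATE_INDEX_SQL: &str = r#"
CREATE INDEX IF NOT EXISTS jobs_ready_idx
    ON jobs (run_at)
    WHERE status IN ('queued', 'retrying')
"#;

/// Hypothetical claim query. The inner SELECT locks one due row and skips
/// rows already locked by other workers, so concurrent workers do not
/// block on or double-claim the same job.
const DEQUEUE_SQL: &str = r#"
UPDATE jobs
   SET status = 'running'
 WHERE id = (
        SELECT id
          FROM jobs
         WHERE status IN ('queued', 'retrying')
           AND run_at <= now()
         ORDER BY run_at
         LIMIT 1
           FOR UPDATE SKIP LOCKED
       )
RETURNING id
"#;

fn main() {
    // Print both statements so they can be pasted into psql for inspection.
    println!("{}", CREATE_INDEX_SQL.trim());
    println!("{}", DEQUEUE_SQL.trim());
}
```

Because a retry is just a row whose `run_at` lies in the future, the same statement serves first attempts and retries alike.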
Full design, including state and sequence diagrams: [docs/architecture.md](docs/architecture.md).
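Retry scheduling with decorrelated jitter can be sketched with the standard library alone. The version below uses the commonly cited form `min(cap, random(base, prev * 3))`; whether the crate's `retry` module uses this exact formula, these constants, or this signature is not stated here, so treat `next_delay` and the small `XorShift64` generator as hypothetical stand-ins.

```rust
use std::time::Duration;

/// Tiny xorshift64 generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Roughly uniform value in `lo..=hi`; modulo bias is ignored for brevity.
    fn range(&mut self, lo: u64, hi: u64) -> u64 {
        let span = hi - lo;
        if span == u64::MAX {
            return self.next_u64();
        }
        lo + self.next_u64() % (span + 1)
    }
}

/// Decorrelated jitter: the next delay is drawn between `base` and three
/// times the previous delay, then clamped to `cap`.
fn next_delay(rng: &mut XorShift64, base: Duration, cap: Duration, prev: Duration) -> Duration {
    let base_ms = base.as_millis() as u64;
    let cap_ms = cap.as_millis() as u64;
    // Never let the previous delay fall below the base, so the range is valid.
    let prev_ms = (prev.as_millis() as u64).max(base_ms);
    let upper = prev_ms.saturating_mul(3);
    Duration::from_millis(rng.range(base_ms, upper).min(cap_ms))
}

fn main() {
    let mut rng = XorShift64(0x9E37_79B9_7F4A_7C15);
    let base = Duration::from_millis(100);
    let cap = Duration::from_secs(30);
    let mut prev = base;
    for attempt in 1..=5 {
        prev = next_delay(&mut rng, base, cap, prev);
        // A worker would store `now() + prev` in `run_at` for the retry.
        println!("attempt {attempt}: retry in {prev:?}");
    }
}
```

Each delay depends on the previous one rather than on the attempt number, which spreads out retries from jobs that failed at the same moment.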
## Feature flags
| Feature | Default | Provides |
| --- | --- | --- |
| core | always | Domain types, payload validation, queue SQL, retry math, migrations, `utoipa` schema derives. |
| `api` | yes | `axum`, `tower`, `tower-http`, `utoipa-swagger-ui`, `metrics-exporter-prometheus`. |
| `worker` | yes | `tokio-util` and `metrics-exporter-prometheus`. |

Consumers who only want the queue types and SQL helpers can use:
```bash
cargo add rust-job-queue-api-worker-system --no-default-features
```
CI checks core-only, API-only, and worker-only builds.
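The gating itself follows the usual `#[cfg(feature = "...")]` pattern. The sketch below is a self-contained illustration of that pattern, not the crate's source: the module bodies and the `compiled_modules` helper are hypothetical, and only the feature names `api` and `worker` are taken from the table above.

```rust
// Compiled only when the `api` feature is enabled.
#[cfg(feature = "api")]
mod api {
    pub const NAME: &str = "api";
}

// Compiled only when the `worker` feature is enabled.
#[cfg(feature = "worker")]
mod worker {
    pub const NAME: &str = "worker";
}

/// Lists which parts of the crate made it into this build.
/// "core" is always present; the others depend on the enabled features.
fn compiled_modules() -> Vec<&'static str> {
    #[allow(unused_mut)] // stays unmodified in a core-only build
    let mut modules = vec!["core"];
    #[cfg(feature = "api")]
    modules.push(api::NAME);
    #[cfg(feature = "worker")]
    modules.push(worker::NAME);
    modules
}

fn main() {
    println!("{:?}", compiled_modules());
}
```

Built with no features, only the core entry is present, which mirrors what `--no-default-features` gives a downstream consumer.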
## Tests
```bash
cargo test --all-features --locked
```
The suite currently has 46 tests across lib unit tests, one doctest, and seven integration binaries:
- `queue_ops` (19): SKIP LOCKED dequeue, retry math, cancel paths, recovery sweep, DB-layer idempotency.
- `api_jobs` (5): create/get/list, 404, 422, `/health`, OpenAPI rendering.
- `idempotency` (4): two-concurrent same-key, 10-concurrent same-key race, different keys, header precedence over body.
- `retry_and_cancel` (2): fail-once-then-succeed, always-fail-to-permanent.
- `worker_dequeue` (1): enqueue -> worker processes -> succeeded.
- `cancel_running` (1): cancellation observed at a safe point.
- `concurrency_two_workers` (1): 200 jobs across 8 workers, no duplicate processing observed by the test harness.
- lib unit tests (12): payload validation, retry jitter bounds, deterministic simulator hash.
- doctest (1): crate-level example in [src/lib.rs](src/lib.rs).
All integration tests use testcontainers; each test gets an isolated ephemeral database.
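The idempotency behaviours exercised above (same key returns the same job, different keys create different jobs, header wins over body) can be modelled in a few lines. This is an in-memory stand-in for illustration only: the real queue enforces idempotency at the database layer, and `effective_key`, `InMemoryQueue`, and the integer job ids are hypothetical names and types.

```rust
use std::collections::HashMap;

/// Picks the idempotency key to use: the `Idempotency-Key` header takes
/// precedence over a key supplied in the request body.
fn effective_key<'a>(header: Option<&'a str>, body: Option<&'a str>) -> Option<&'a str> {
    header.or(body)
}

/// In-memory model of idempotent enqueue. A database would use a unique
/// constraint instead of this map.
#[derive(Default)]
struct InMemoryQueue {
    next_id: u64,
    by_key: HashMap<String, u64>,
}

impl InMemoryQueue {
    /// Returns `(job_id, created)`. A repeated key yields the existing job
    /// id with `created == false`; requests without a key always create.
    fn enqueue(&mut self, key: Option<&str>) -> (u64, bool) {
        if let Some(k) = key {
            if let Some(&existing) = self.by_key.get(k) {
                return (existing, false);
            }
        }
        self.next_id += 1;
        let id = self.next_id;
        if let Some(k) = key {
            self.by_key.insert(k.to_string(), id);
        }
        (id, true)
    }
}

fn main() {
    let mut queue = InMemoryQueue::default();
    let key = effective_key(Some("from-header"), Some("from-body"));
    let first = queue.enqueue(key);
    let second = queue.enqueue(key);
    println!("first = {:?}, second = {:?}", first, second);
}
```

The concurrent same-key tests in `idempotency` check the part this single-threaded model cannot: that the guarantee holds when requests race.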
## Benchmarks
Two benchmarks ship with the release; both are reproducible against a fresh Postgres container in Docker.
```bash
# Single-worker queue-overhead microbench (Criterion).
cargo bench --bench dequeue --features worker
# Concurrent throughput across {1,2,4,8,16} workers and 3 Postgres configs.
./bench/run.sh
```
Measured results are in [bench/RESULTS.md](bench/RESULTS.md). The numbers are scoped to the recorded hardware and methodology. They are not presented as production throughput claims.
The exploratory head-to-head comparison benchmark is intentionally excluded from the v0.1 crates.io package until its methodology and teardown behavior are corrected.
## Example: embed the worker in your own application
```bash
DATABASE_URL=postgres://localhost/jobs \
cargo run --example embed_worker --features worker
```
See [examples/embed_worker.rs](examples/embed_worker.rs) for a small example of plugging a custom `Executor` into `WorkerRuntime`.
## CI
[`ci.yml`](https://github.com/infinityabundance/Rust-Job-Queue-API-Worker-System/blob/main/.github/workflows/ci.yml) runs:
1. `cargo fmt --check`, clippy with warnings denied, feature-decoupling checks, full tests, release build, docs with warnings denied, `cargo package --list --locked`, and `cargo publish --dry-run --locked`.
2. `cargo +1.88.0 check --all-features --locked` for the declared MSRV.
3. RustSec audit.
## Trade-offs and intentional omissions
This artifact is deliberately small. The decisions and "what would change in production" notes live in [docs/tradeoffs.md](docs/tradeoffs.md). Start there if you are evaluating engineering judgment.
## Tech stack
Rust (MSRV 1.88), Tokio 1, Axum 0.8, sqlx 0.8, Postgres 16, tower-http 0.6, tracing 0.1, utoipa 5 + utoipa-swagger-ui 9, metrics + metrics-exporter-prometheus 0.18, thiserror 2, anyhow 1, testcontainers 0.27, Criterion 0.5.
## License
Apache-2.0