# Tradeoffs
This document records what was deliberately *not* built and the reasoning behind each omission. The goal is to make the scope and trade-offs explicit, not to apologise for incompleteness.
The brief was "small in scope, exquisite in execution." Every line below could have been a feature; each was a decision to stop.
---
## 1. No auth, no RBAC, no rate limits
The API is open. Any client can POST a job or cancel one. This is fine for a demo but obviously wrong for production.
*What would change in production:* a thin auth middleware (JWT via `tower-http::auth` or a custom layer) checking either a service-to-service signed token or a user JWT. RBAC layered on top: who can list whose jobs. Rate limits via `tower-governor`. None of that changes the queue's design, so it would be additive.
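A minimal sketch of the middleware shape, assuming axum 0.7; `require_auth` and `verify_token` are hypothetical names, and the token check is a placeholder rather than anything in the codebase:

```rust
use axum::{extract::Request, http::StatusCode, middleware::Next, response::Response};

// Bearer-token gate; would be wired with
// `Router::layer(axum::middleware::from_fn(require_auth))`.
async fn require_auth(req: Request, next: Next) -> Result<Response, StatusCode> {
    let authorized = req
        .headers()
        .get("authorization")
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .map(verify_token)
        .unwrap_or(false);

    if authorized {
        Ok(next.run(req).await)
    } else {
        Err(StatusCode::UNAUTHORIZED)
    }
}

// Placeholder: real JWT or signed service-token verification goes here.
fn verify_token(_token: &str) -> bool {
    false
}
```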
## 2. No multi-tenancy
One tenant. The `jobs` table has no `tenant_id` column and no row-level security.
*What would change in production:* add `tenant_id UUID NOT NULL`, a partial index `(tenant_id, run_at) WHERE status IN ('queued','retrying')`, and a `WHERE tenant_id = $1` clause in every API and worker query. The SKIP LOCKED pattern still works unchanged. Choosing between RLS (Postgres-enforced) and application-level filtering is the only real design call here.
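Because the migrations are embedded in the migration runner (item 9 below), this would land as one more forward-only step. A sketch, with illustrative names only:

```rust
// Hypothetical migration: names and the backfill strategy are assumptions,
// not part of the current schema.
const ADD_TENANT_SCOPING: &str = r#"
ALTER TABLE jobs ADD COLUMN tenant_id UUID;
-- backfill existing rows with their tenant here, then tighten the constraint:
ALTER TABLE jobs ALTER COLUMN tenant_id SET NOT NULL;

CREATE INDEX jobs_tenant_ready_idx
    ON jobs (tenant_id, run_at)
    WHERE status IN ('queued', 'retrying');
"#;
```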
## 3. No job priorities; no schedule-at-arbitrary-time
All jobs are FIFO by `run_at`. There is no priority column. The only way `run_at` is set into the future is by the retry path.
*What would change in production:* add a `priority SMALLINT NOT NULL DEFAULT 100` column and a `(status, priority, run_at)` partial index. The dequeue's `ORDER BY` becomes `ORDER BY priority, run_at`. For schedule-at-arbitrary-time, accept a `scheduled_for` field in `POST /jobs` and set `run_at = scheduled_for`. Both are small, additive changes; neither has been worth adding to a demo.
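For context, the priority-aware dequeue is the standard SKIP LOCKED shape with one extra `ORDER BY` key. A sketch, in which every column other than `priority`, `run_at`, and `status` is an assumption about the schema:

```rust
// Illustrative only: the returned columns and the 'running' transition are
// assumptions about the current dequeue, not copied from it.
const DEQUEUE_NEXT: &str = r#"
UPDATE jobs
   SET status = 'running'
 WHERE id = (
     SELECT id
       FROM jobs
      WHERE status IN ('queued', 'retrying')
        AND run_at <= now()
      ORDER BY priority, run_at
      LIMIT 1
      FOR UPDATE SKIP LOCKED
 )
RETURNING id, kind, payload;
"#;
```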
## 4. No queue partitioning / sharding
One Postgres, one `jobs` table. The 200-job / 8-worker integration test gives bounded evidence for the SKIP LOCKED dequeue path, but a single Postgres caps out somewhere in the low thousands of jobs/sec depending on hardware.
*What would change in production:* either (a) shard by `tenant_id % N` across multiple databases — easy because the dequeue is stateless — or (b) move hot kinds to dedicated tables with their own indices. Both are common and well-understood. The codebase would need a routing layer in `core::queue` but no change in invariants.
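A sketch of option (a)'s routing layer, assuming one sqlx `PgPool` per shard (the actual database plumbing may differ):

```rust
use sqlx::PgPool;
use uuid::Uuid;

struct ShardedQueue {
    // One pool per physical database; the index is the shard number.
    shards: Vec<PgPool>,
}

impl ShardedQueue {
    /// Stateless routing: every API node and worker computes the same shard
    /// for a given tenant, so the dequeue itself needs no coordination.
    fn shard_for(&self, tenant_id: Uuid) -> &PgPool {
        let idx = (tenant_id.as_u128() % self.shards.len() as u128) as usize;
        &self.shards[idx]
    }
}
```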
## 5. No dead-letter table
`failed_permanent` rows stay in the `jobs` table forever. There is no separate `dead_letter_queue` and no automatic eviction.
*What would change in production:* either a periodic sweeper job that moves `failed_permanent` rows older than N days to a `dead_letter_queue` table (preserving them for forensic analysis), or a TTL-based partitioning scheme. For a demo, having one fewer table is the right call.
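The sweeper's core can be a single statement, so a crash mid-sweep can neither lose nor duplicate rows. A sketch (`dead_letter_queue` and `updated_at` are assumed names, not current schema):

```rust
const SWEEP_DEAD_LETTERS: &str = r#"
WITH moved AS (
    DELETE FROM jobs
     WHERE status = 'failed_permanent'
       AND updated_at < now() - interval '30 days'
 RETURNING *
)
INSERT INTO dead_letter_queue
SELECT * FROM moved;
"#;
```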
## 6. No LISTEN/NOTIFY
Workers poll the queue at 500 ms (backing off to 2 s when idle). `pg_notify` could shave off the polling latency on otherwise-empty queues.
*Why we didn't bother:* it doubles the code paths in the worker (a notify channel plus the dequeue loop) for a latency saving bounded by the poll interval on an otherwise-empty queue: roughly 250 ms on average at the 500 ms cadence, and up to 2 s once the idle back-off has kicked in. Under load it makes no difference — workers are always behind the queue. The complexity is not justified by the gain. Adding NOTIFY later is a non-breaking change.
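For scale, here is roughly the second code path NOTIFY would introduce, sketched assuming sqlx's `PgListener`; the poll arm has to stay because notifications are best-effort:

```rust
use std::time::Duration;
use sqlx::postgres::{PgListener, PgPool};

async fn worker_loop(pool: PgPool) -> Result<(), sqlx::Error> {
    let mut listener = PgListener::connect_with(&pool).await?;
    listener.listen("jobs_ready").await?;
    let mut poll = tokio::time::interval(Duration::from_millis(500));

    loop {
        tokio::select! {
            // Woken by `pg_notify('jobs_ready', ...)` from the enqueue path.
            notification = listener.recv() => { let _ = notification?; }
            // Fallback poll: notifications can be missed or dropped.
            _ = poll.tick() => {}
        }
        // ...same dequeue-and-run body as today...
    }
}
```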
## 7. `last_error TEXT` instead of an error history table
`last_error` is a single column. When a job retries, the previous error is overwritten.
*What would change in production:* a `job_errors (job_id, attempt, error, recorded_at)` table, appended to inside the same transaction that calls `mark_failed_or_retry`. Useful for SLA-breach forensics. For a demo, the single column is the honest minimum and reviewers will understand.
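A sketch of both pieces, with illustrative column names:

```rust
// Hypothetical error-history table; the append runs in the same transaction
// as mark_failed_or_retry so a retry and its error record commit together.
const CREATE_JOB_ERRORS: &str = r#"
CREATE TABLE job_errors (
    job_id      UUID        NOT NULL REFERENCES jobs (id),
    attempt     INT         NOT NULL,
    error       TEXT        NOT NULL,
    recorded_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (job_id, attempt)
);
"#;

const APPEND_JOB_ERROR: &str =
    "INSERT INTO job_errors (job_id, attempt, error) VALUES ($1, $2, $3)";
```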
## 8. No queue-depth gauge
Both the API and the worker expose Prometheus endpoints. The worker emits `worker_jobs_started_total{kind}`, `worker_jobs_completed_total{kind,outcome}`, and `worker_job_duration_seconds{kind,outcome}`. What is *not* exposed is a real-time queue-depth gauge — to read the count of `queued` / `retrying` rows you have to query Postgres directly.
*What would change in production:* a small periodic task that runs `SELECT status, COUNT(*) FROM jobs GROUP BY status` every 10 s and updates a gauge via `metrics::gauge!` would give operators a directly alertable signal. The query cost is trivial because of the partial indices, but committing to it adds a polling cadence and a metric the operator has to understand. Skipped here to keep the surface small.
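A sketch of the skipped task, assuming sqlx for the query and a recent `metrics` crate where `gauge!` returns a handle with `.set()`; the metric name is illustrative:

```rust
use std::time::Duration;
use sqlx::PgPool;

async fn queue_depth_task(pool: PgPool) {
    let mut tick = tokio::time::interval(Duration::from_secs(10));
    loop {
        tick.tick().await;
        // Cheap sample; a transient DB error just skips one cycle.
        let rows: Vec<(String, i64)> =
            match sqlx::query_as("SELECT status, COUNT(*) FROM jobs GROUP BY status")
                .fetch_all(&pool)
                .await
            {
                Ok(rows) => rows,
                Err(_) => continue,
            };
        for (status, count) in rows {
            metrics::gauge!("queue_depth_jobs", "status" => status).set(count as f64);
        }
    }
}
```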
## 9. Migration tooling minimal
Migrations are forward-only and embedded into the migration runner. There is no down-migration path and no `db reset` helper.
*What would change in production:* either commit to forward-only (the norm for production schemas — most teams don't run down-migrations) and add a separate `clean.sql` for local dev, or adopt a dedicated migration tool such as `refinery`. For a demo, forward-only matches industry practice.
## 10. The deterministic simulator's PRNG is non-cryptographic
`SimulatedExecutor::execute` uses `DefaultHasher` to derive a pseudo-random `[0, 1)` value from `(job_id, attempt, step)`. This is *not* a cryptographic RNG. It is more than sufficient for the demo (the goal is reproducibility, not unpredictability) and the choice is documented in [src/worker/executor.rs](../src/worker/executor.rs).
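A sketch of the technique; the real code lives in the file linked above, and the names here are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use uuid::Uuid;

/// Deterministically map (job_id, attempt, step) onto [0, 1).
/// The same inputs always give the same value, which is the property the
/// simulator needs; unpredictability is explicitly not a goal.
fn deterministic_unit(job_id: Uuid, attempt: u32, step: u32) -> f64 {
    let mut hasher = DefaultHasher::new();
    (job_id, attempt, step).hash(&mut hasher);
    (hasher.finish() as f64) / (u64::MAX as f64 + 1.0)
}
```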
## 11. Throughput numbers are measured but reported with their context, not as headline claims
Two release-supported benches exist:
- [`benches/dequeue.rs`](../benches/dequeue.rs) — single-worker Criterion microbench of `fetch_next + mark_succeeded` overhead.
- [`benches/throughput.rs`](../benches/throughput.rs) — concurrent dequeue across {1, 2, 4, 8, 16} workers × {default, tuned, async_commit} Postgres configs. Results in [`bench/RESULTS.md`](../bench/RESULTS.md).
The exploratory head-to-head comparison benchmark is excluded from the v0.1 crates.io package until its methodology and teardown behavior are corrected.
All three results pages carry full hardware fingerprint, methodology, caveats, and interpretation. They are not promoted into the README as headline numbers, on principle: a number divorced from its environment outlives the conditions that produced it and starts misinforming readers. The committed results pages are where someone evaluating the system goes for numbers; the README points them there.
The 200-jobs-in-1.4s measurement from [`tests/concurrency_two_workers.rs`](../tests/concurrency_two_workers.rs) is correct but is not a fair throughput claim because its recording executor only sleeps 3 ms per job.
## 12. No graceful shutdown of the API's in-flight requests beyond axum's default
`axum::serve(...).with_graceful_shutdown(...)` is wired, but the request handlers themselves don't observe the cancellation token mid-handler. If a long-running query is in flight when SIGTERM arrives, axum waits for it to complete (default behavior) but doesn't pre-empt it. For a queue API where every handler is O(ms), this has never mattered.
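For reference, the wiring described above follows the stock axum 0.7 pattern; the sketch below is illustrative, not the repo's exact code:

```rust
use axum::Router;
use tokio::net::TcpListener;

async fn serve(app: Router) -> std::io::Result<()> {
    let listener = TcpListener::bind("0.0.0.0:8080").await?; // address illustrative
    axum::serve(listener, app)
        // On ctrl-c/SIGTERM: stop accepting new connections, let in-flight
        // requests run to completion, then return.
        .with_graceful_shutdown(async {
            let _ = tokio::signal::ctrl_c().await;
        })
        .await
}
```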
---
## What was kept
The omissions above are the cost side of the ledger. On the other side, the artifact includes the requested queue shape plus a few production-oriented extras:
- **Cargo workspace → single-crate retrofit** with feature flags so consumers can pull in only what they need
- **Three-binary single-crate layout** with `required-features` gating, mirroring real service crates
- **Full crates.io metadata** (description, keywords, categories, repository, homepage, documentation, readme); `cargo package --list` verified in CI
- **CHECK constraints** in the migration so DB-level invariants are enforced
- **OpenAPI + Swagger UI** generated from utoipa derives + path annotations
- **Prometheus exporter** at `/metrics`
- **Tracing with request_id → job_id correlation** via `tower-http::request_id` plus a custom span builder
- **A startup recovery sweep** for stale-locked rows from previous shutdowns
- **A genuinely deterministic simulator** so the "fail then succeed on retry" path is reproducible in tests
- **Three feature-decoupling probes in CI** (`cargo check --no-default-features --features X`) so the publish-ready feature gating is checked
- **A 200-job / 8-worker concurrency test** that validates no duplicate processing in that harness by writing to a separate transactional log table
Each of those is a small line item on its own; together they're the difference between "compiles" and "production-shaped."