API Support Debug Lab
A reproducible developer-support debugging lab. The repository contains intentionally-failing API request fixtures (request, response, server log) and a Rust diagnostic CLI that classifies the failure, recomputes the relevant evidence, and emits next steps and a draft escalation note.
The artefact answers a single question: given the same fixtures a real support engineer might receive, how does the candidate reproduce, classify, and communicate?
At a glance
$ api-debug-lab list-cases | wc -l # bundled positive fixtures
14
$ cargo test --tests 2>&1 | grep "test result: ok" | wc -l
9 # nine independent test groups, all green
$ cargo llvm-cov --summary-only | tail -1
TOTAL ... 92.38 % regions covered
$ cargo mutants --in-place --file src/rules.rs --no-shuffle | tail -1
183 mutants tested in 3m: 15 missed, 154 caught, 14 unviable # 91 % kill rate
Eight rules, fourteen bundled positive fixtures, eleven bundled negative fixtures, three real-API webhook envelopes (Stripe v1, Slack v0, GitHub HMAC), a 36-case Brier-calibrated confidence corpus with a regression canary, oracle HMAC tests pinned to externally-computed reference vectors, single-digit-microsecond per-case latency, 93 % line coverage, 91 % mutation kill rate, ~90 tests across nine groups, three ADRs documenting the design choices.
Money shot
$ api-debug-lab diagnose auth_missing
CASE: auth_missing
SEVERITY: medium
LIKELY CAUSE: Missing Authorization header
CONFIDENCE: 0.95
RULE: auth_missing
EVIDENCE:
- Authorization header absent in request
- Endpoint POST https://api.acme-co.example/v1/events flagged auth_required=true
- Response status 401 Unauthorized
REPRODUCTION:
curl -X POST https://api.acme-co.example/v1/events \
-H "content-type: application/json" \
-H "user-agent: acme-client/0.4.1" \
--data-raw '{"event":"order.created","order_id":"ord_8KZ"}'
NEXT STEPS:
1. Add an Authorization: Bearer <token> header to the request.
2. Confirm the token has not expired.
3. Verify the token's scope covers the requested operation.
ESCALATION NOTE:
Customer request failed because the Authorization header was absent.
The API rejected the request before payload processing. Ask the customer
to retry with a valid bearer token and confirm the token's scope.
Arbitration in action
When more than one rule fires, the highest-confidence diagnosis is
reported as primary and the rest as also_considered. Tie-breaks are
alphabetical on rule_id so output is byte-stable.
$ api-debug-lab diagnose webhook_signature_invalid_stale
LIKELY CAUSE: Webhook signature does not match recomputed HMAC
CONFIDENCE: 0.92
RULE: webhook_signature_mismatch
...
ALSO CONSIDERED:
- webhook_timestamp_stale (confidence 0.90): Webhook timestamp outside tolerance window
HMAC mismatch is rated higher than timestamp drift because a wrong digest is dispositive (secret, body, or timestamp prefix differs) while clock skew has benign causes. The confidence rubric is documented in docs/confidence_model.md.
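The arbitration rule above can be sketched in a few lines. This is a minimal illustration, not the lab's actual code: the `Diagnosis` struct and the `arbitrate` function are hypothetical stand-ins that only carry the two fields the ordering needs.

```rust
/// Hypothetical, simplified stand-in for a fired rule's result.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnosis {
    pub rule_id: String,
    pub confidence: f64,
}

/// Sorts by confidence (descending), breaks ties alphabetically on
/// `rule_id`, and splits the result into (primary, also_considered).
/// Returns `None` when no rule fired.
pub fn arbitrate(mut fired: Vec<Diagnosis>) -> Option<(Diagnosis, Vec<Diagnosis>)> {
    fired.sort_by(|a, b| {
        b.confidence
            .total_cmp(&a.confidence) // higher confidence first
            .then_with(|| a.rule_id.cmp(&b.rule_id)) // stable tie-break
    });
    let mut ordered = fired.into_iter();
    let primary = ordered.next()?;
    Some((primary, ordered.collect()))
}
```

Because the comparison never consults insertion order, two runs that evaluate the rules in a different sequence still produce the same primary and the same also_considered list.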
What this demonstrates
- API troubleshooting against fixtures that look like real support tickets.
- Log + header inspection with structural checks (HMAC recompute, JSON parse, rate-limit header math, hostname Hamming distance, idempotency body-hash comparison) — not just status-code grep.
- Confidence-ranked arbitration when multiple rules fire, with a documented confidence model and a Brier-score calibration test.
- Production realism: Stripe-style multi-version webhook envelopes, JSON-lines log auto-detection, RFC3339 timestamp derivation, partial-outage interleaved request streams.
- A Rust CLI with clap derive, snapshot tests via insta, property-based tests via proptest, end-to-end CLI tests via assert_cmd, and criterion benchmarks.
- Honest scope: rule-based, eight failure modes, no machine learning, no network calls, no telemetry.
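As an illustration of one structural check from the list above, a hostname near-miss test based on Hamming distance might look like the sketch below. The function names, the list of known hosts, and the distance threshold are all assumptions made for this example; the lab's own rule may differ.

```rust
/// Hamming distance between two equal-length hostnames, compared bytewise.
/// Returns `None` when the lengths differ, where the distance is undefined.
pub fn hamming(a: &str, b: &str) -> Option<usize> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.bytes().zip(b.bytes()).filter(|(x, y)| x != y).count())
}

/// Returns the first known host that differs from `host` by at least one
/// and at most `max_distance` substitutions (case-insensitive).
pub fn near_miss<'a>(host: &str, known: &[&'a str], max_distance: usize) -> Option<&'a str> {
    let host = host.to_ascii_lowercase();
    known.iter().copied().find(|k| {
        matches!(
            hamming(&host, &k.to_ascii_lowercase()),
            Some(d) if d > 0 && d <= max_distance
        )
    })
}
```

Hamming distance only covers substitutions, so an inserted or dropped character is not detected by this sketch; an exact match is deliberately not reported as a near-miss.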
Quick start
Requires a stable Rust toolchain (1.78+). No service to start, no Docker.
The bundled fixtures are embedded in the installed binary; pass
--fixtures <dir> to diagnose a local fixture directory instead.
From a source checkout:
Failure cases
| Case | Structural signal beyond status code |
|---|---|
| auth_missing | Authorization header literally absent on an auth_required route |
| bad_json_payload | serde_json parser error and byte offset on the request body |
| rate_limited | Retry-After, X-RateLimit-Remaining, X-RateLimit-Reset parsed |
| webhook_signature_invalid | HMAC-SHA256 recomputed over "{ts}.{body}" and compared |
| webhook_signature_invalid_stale | Ambiguous: signature mismatch and timestamp drift; both rules fire |
| webhook_stripe_v1 | Stripe envelope t=,v1=,v0= parsed; multi-version HMAC compare |
| webhook_slack_v0 | Slack envelope v0=; HMAC over "v0:{ts}:{body}" |
| webhook_github_hmac | GitHub sha256=; HMAC over the raw body (no timestamp) |
| timeout_retry | Log lines grouped by request id; elapsed derived from RFC3339 stamps |
| timeout_retry_jsonl | Same shape but server log is JSON-lines (per-line auto-detect) |
| timeout_retry_partial_outage | Two request_ids interleaved; rule isolates the worst offender |
| timeout_retry_midnight_rollover | Log spans midnight UTC; elapsed derived correctly across day boundary |
| config_dns_error | URL host parsed; near-miss (Hamming-distance) and TLD checks |
| idempotency_collision | Recomputed body SHA-256 compared against the stored hash |
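The webhook rows differ mainly in what bytes are fed to HMAC-SHA256. The sketch below builds those signing inputs and splits a Stripe-style header into its parts; it stops short of the HMAC itself so that it needs no external crate. `Envelope`, `signing_input`, and `parse_stripe_header` are hypothetical names chosen for this example, not the lab's API.

```rust
/// Signing-input styles named in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Envelope {
    /// HMAC over "{ts}.{body}".
    TimestampDotBody,
    /// Slack v0: HMAC over "v0:{ts}:{body}".
    SlackV0,
    /// GitHub: HMAC over the raw body; the timestamp is ignored.
    GitHubRawBody,
}

/// Builds the exact byte string that would be passed to HMAC-SHA256.
pub fn signing_input(envelope: Envelope, timestamp: &str, body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(timestamp.len() + body.len() + 4);
    match envelope {
        Envelope::TimestampDotBody => {
            out.extend_from_slice(timestamp.as_bytes());
            out.push(b'.');
        }
        Envelope::SlackV0 => {
            out.extend_from_slice(b"v0:");
            out.extend_from_slice(timestamp.as_bytes());
            out.push(b':');
        }
        Envelope::GitHubRawBody => {}
    }
    out.extend_from_slice(body);
    out
}

/// Splits a Stripe-style "t=...,v1=...,v0=..." header into the timestamp
/// and the (scheme, signature) pairs, in header order. Returns `None` if
/// any part lacks '=' or the `t` field is missing.
pub fn parse_stripe_header(header: &str) -> Option<(&str, Vec<(&str, &str)>)> {
    let mut timestamp = None;
    let mut signatures = Vec::new();
    for part in header.split(',') {
        let (key, value) = part.trim().split_once('=')?;
        match key {
            "t" => timestamp = Some(value),
            scheme => signatures.push((scheme, value)),
        }
    }
    Some((timestamp?, signatures))
}
```

A multi-version compare would then recompute the digest once per signing input and check it against each signature whose scheme the verifier supports.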
Negatives that look similar but should not classify live under
fixtures/cases/_negatives/, at least one per rule (eleven in total),
including a webhook_clean with a valid HMAC, a webhook_stripe_v1_clean
with a valid Stripe v1 envelope, a 401 from an unrelated upstream call,
and an idempotency-clean retry with byte-identical body.
CLI
api-debug-lab list-cases # bundled fixtures
api-debug-lab diagnose <name|path> # human report
api-debug-lab diagnose <name> --format json # machine-readable
api-debug-lab diagnose <name> --trace # per-rule timing on stderr
api-debug-lab explain <name> # rule + evidence pointers
api-debug-lab replay <name> # curl repro + diagnosis
api-debug-lab report <name> # alias for human diagnose
api-debug-lab corpus <dir> # sweep an arbitrary dir
api-debug-lab corpus <dir> --ndjson # one JSON object per line
Exit codes: 0 diagnosed (confidence ≥ 0.60), 1 unclassified or
low-confidence, 2 bad input.
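The exit-code contract can be written down as a small mapping. This is a sketch under assumptions: the `Outcome` enum and the `exit_code` function are hypothetical, and only the 0.60 threshold and the three codes come from the text above.

```rust
/// Threshold from the exit-code description: at or above this, exit 0.
pub const CONFIDENCE_THRESHOLD: f64 = 0.60;

/// Hypothetical summary of what a `diagnose` run produced.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Diagnosed { confidence: f64 },
    Unclassified,
    BadInput,
}

/// Maps an outcome to the documented process exit code.
pub fn exit_code(outcome: &Outcome) -> i32 {
    match outcome {
        Outcome::Diagnosed { confidence } if *confidence >= CONFIDENCE_THRESHOLD => 0,
        // A diagnosis below the threshold is treated like no diagnosis.
        Outcome::Diagnosed { .. } | Outcome::Unclassified => 1,
        Outcome::BadInput => 2,
    }
}
```

Keeping low-confidence results on exit code 1 lets a calling script treat "diagnosed" as a single success condition without parsing the report.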
Architecture
fixtures/cases/<name>/case.json   → Case (serde, schema-validated)   ┐
fixtures/cases/<name>/server.log  → &str (lazy load, JSONL or text)  ├→ Rule[] → Diagnosis[] → Report
fixtures/cases/<name>/secret.txt  → Vec<u8>                          ┘   (sorted by confidence desc)
- src/cases.rs: Case, loader, schema-validated by fixtures/cases.schema.json (JSON Schema Draft 2020-12).
- src/rules.rs: Rule trait, eight rule impls, diagnose, diagnose_traced, per-line LogLine JSONL/text autodetect, RFC3339 timestamp derivation.
- src/evidence.rs: Evidence + Pointer (source + optional log line).
- src/report.rs: human / JSON formatters; deterministic curl reproduction.
- src/main.rs: clap subcommands, exit codes, --trace, corpus sweep.
All output is byte-stable across machines: header iteration uses
BTreeMap, reproductions inline the body (no absolute paths), no
system clock or RNG.
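A minimal sketch of why BTreeMap gives byte-stable reproductions: it iterates in key order, so the rendered curl command does not depend on the order in which headers were inserted. The `curl_repro` function below is a hypothetical simplification that does no shell escaping and assumes the body contains no single quotes.

```rust
use std::collections::BTreeMap;

/// Renders a curl reproduction with headers in sorted (BTreeMap) order
/// and the body inlined, so no file paths appear in the output.
/// Simplification: values are not shell-escaped.
pub fn curl_repro(
    method: &str,
    url: &str,
    headers: &BTreeMap<String, String>,
    body: &str,
) -> String {
    let mut lines = vec![format!("curl -X {method} {url}")];
    for (name, value) in headers {
        lines.push(format!("  -H \"{name}: {value}\""));
    }
    lines.push(format!("  --data-raw '{body}'"));
    // Join with a trailing backslash so the command can be pasted as-is.
    lines.join(" \\\n")
}
```

With a HashMap in place of the BTreeMap, the header lines could come out in a different order from run to run, which would break snapshot tests that pin the exact output.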
Tests and benchmarks
Test coverage:
- Per-rule unit tests (tests/rules.rs): positive + paired negative for each rule.
- Snapshot tests (tests/snapshots.rs): human and JSON renders of every fixture, pinned via insta.
- Property-based tests (tests/properties.rs): proptest invariants (no rule panics on any schema-valid case; diagnose is idempotent; confidence is finite and in [0, 1]; also_considered is sorted descending and below the primary), plus hand-written adversarial fixtures (1 MiB body, NUL byte, far-future timestamp, extreme URL).
- Schema validation (tests/schema.rs): every bundled case.json validates against fixtures/cases.schema.json.
- CLI integration (tests/diagnose_cli.rs): assert_cmd end-to-end for each subcommand, including a tempdir test for corpus over a copied fixture.
- Confidence calibration (tests/calibration.rs): five distinct properties checked over the labelled corpus (expected_rule_id labels embedded in each enrolled case.json, 36 cases × 8 rules = 288 (case, rule) pairs), namely aggregate Brier ≤ 0.05, per-rule Brier ≤ 0.08, ECE ≤ 0.05, 100 % primary-classification accuracy, and unclassified cases below threshold. Rubric in docs/confidence_model.md.
- Calibration regression canary (tests/calibration_regression.rs): a feature-gated test (--features calibration_canary) that simulates a deliberately-miscalibrated rule and asserts the production Brier check would have caught it. Runs in a separate CI job. Proves the calibration framework is load-bearing rather than ceremonial.
- Oracle / differential HMAC tests (tests/oracle.rs): three reference signatures hand-computed via openssl and pinned in source. Catches future signing-input drift that self-consistent round-trip tests would miss.
- Latency-budget regression test (tests/latency_budget.rs): asserts per-rule median wall-clock evaluation is below 100 µs across 200 iterations per fixture.
- Mutation testing report (docs/mutation_report.md): 91 % kill rate over 169 viable mutants in src/rules.rs, surviving mutants classified (gap / benign / equivalent).
- Code coverage snapshot (docs/coverage.md): 92.4 % regions, 91.7 % functions, 93.0 % lines via cargo-llvm-cov, date- and rustc-version-stamped.
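For readers unfamiliar with the Brier score used by the calibration tests, here is a minimal sketch. It assumes each (case, rule) pair is scored as the squared gap between the confidence the rule reported (0.0 when it did not fire) and a 0/1 label saying whether that rule is the case's expected_rule_id; the `Pair` struct and `brier` function are hypothetical, and the lab's exact scoring may differ.

```rust
/// One (case, rule) pair from a labelled corpus (hypothetical shape).
pub struct Pair {
    /// Confidence the rule reported; 0.0 if the rule did not fire.
    pub confidence: f64,
    /// True when this rule is the case's labelled expected rule.
    pub is_expected: bool,
}

/// Mean squared difference between forecast confidence and the 0/1
/// outcome. Lower is better; `None` for an empty corpus.
pub fn brier(pairs: &[Pair]) -> Option<f64> {
    if pairs.is_empty() {
        return None;
    }
    let sum: f64 = pairs
        .iter()
        .map(|p| {
            let outcome = if p.is_expected { 1.0 } else { 0.0 };
            (p.confidence - outcome).powi(2)
        })
        .sum();
    Some(sum / pairs.len() as f64)
}
```

Under this scoring, a rule that fires at 0.95 on a case it should not match contributes 0.9025 for that pair, which is the kind of error a deliberately-miscalibrated canary rule would introduce.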
Snapshot updates after intentional output changes:
Benchmarks finish in under a second in --quick mode and in under ten
seconds in full mode. Baseline numbers and methodology are in
docs/benchmarks.md.
Limitations and scope
This is a demo / portfolio artefact. Honest limits:
- Rule-based, not learned. Eight hand-written rules; no ML, no model. The confidence rubric is documented in docs/confidence_model.md and held accountable by aggregate Brier, per-rule Brier, ECE, and a feature-gated regression canary in tests/calibration*.rs.
- Eight failure modes. Real support traffic has many more; the rule set is illustrative.
- Synthetic fixtures. Modelled on the shape of production logs and the documented envelope formats of Stripe v1, Slack v0, and GitHub HMAC, but invented; no real customer data.
- No network. Reproductions print curl for the reviewer to run by hand; the binary itself never opens a socket.
- No background service. The lab is the binary plus the fixtures.
License
Apache-2.0. See LICENSE.