api-debug-lab 0.4.0

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "intro",
   "metadata": {},
   "source": [
    "# API Support Debug Lab — Colab walkthrough\n",
    "\n",
    "A reproducible developer-support debugging lab written in Rust. This notebook walks the crate end-to-end on a fresh Colab VM in roughly **two minutes** after the Rust toolchain installs.\n",
    "\n",
    "**What you'll see.** Eight diagnostic rules. Fourteen bundled positive fixtures plus eleven negatives. Three real-API webhook envelopes (Stripe v1, Slack v0, GitHub HMAC). A Brier-calibrated confidence model with a regression canary. Ninety-plus tests across the library, CLI, schema, calibration, snapshot, property, oracle, and latency suites, all green.\n",
    "\n",
    "**What this is not.** A web service. A live API. A learned classifier. The lab is local, offline, rule-based, and honest about every claim it makes.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "code-map-and-run-guide",
   "metadata": {},
   "source": [
    "## Code map and run guide\n",
    "\n",
    "The repository has two layers: a small diagnostic library and a thin CLI wrapper.\n",
    "\n",
    "- `src/cases.rs` defines the `Case` data model, loads `case.json`, discovers sibling `server.log` / `secret.txt` files, validates the on-disk layout through tests, and falls back to embedded fixtures when the binary is installed without a repository checkout.\n",
    "- `src/rules.rs` is the rule engine. It registers eight deterministic rules, parses text or JSON-lines logs, recomputes HMAC-SHA256 signatures, derives retry elapsed time from RFC3339 timestamps, compares idempotency body hashes, and sorts diagnoses by confidence with deterministic tie-breaking.\n",
    "- `src/evidence.rs` keeps each evidence item tied to a source pointer such as `request.headers.authorization` or a log line number.\n",
    "- `src/report.rs` renders the same diagnosis as human text or JSON and builds byte-stable curl reproductions.\n",
    "- `src/main.rs` is the CLI: `list-cases`, `diagnose`, `explain`, `replay`, `report`, and `corpus`.\n",
    "- `fixtures/cases/` contains positive fixtures. `_negatives/` contains lookalike cases that must remain unclassified. `_calibration/` adds labelled edge cases for confidence calibration.\n",
    "- `tests/` covers rule behavior, schema validation, CLI behavior, snapshots, property tests, HMAC oracle vectors, calibration metrics, and latency budget.\n",
    "\n",
    "Installed from crates.io, the bundled fixtures are embedded in the binary:\n",
    "\n",
    "```bash\n",
    "cargo install api-debug-lab\n",
    "api-debug-lab list-cases\n",
    "api-debug-lab diagnose auth_missing\n",
    "api-debug-lab diagnose webhook_signature_invalid_stale --format json | jq\n",
    "```\n",
    "\n",
    "From a source checkout, use Cargo directly and point `corpus` at the fixture tree:\n",
    "\n",
    "```bash\n",
    "cargo run -- list-cases\n",
    "cargo run -- diagnose auth_missing\n",
    "cargo run -- corpus fixtures/cases | tail -25\n",
    "cargo test\n",
    "```\n",
    "\n",
    "To diagnose your own captures, create the same directory shape (`case.json`, optional `server.log`, optional `secret.txt`) and pass either a fixture root or a direct path:\n",
    "\n",
    "```bash\n",
    "api-debug-lab --fixtures /path/to/fixtures diagnose my_case\n",
    "api-debug-lab diagnose /path/to/my_case/case.json\n",
    "api-debug-lab corpus /path/to/case-directory\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "setup-rust",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install a minimal Rust toolchain. ~90 s on Colab the first time.\n",
    "# `--profile minimal` skips docs and the rust-src component we don't need here.\n",
    "!curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile minimal --default-toolchain stable\n",
    "import os\n",
    "os.environ['PATH'] = f\"{os.environ['HOME']}/.cargo/bin:{os.environ['PATH']}\"\n",
    "!rustc --version && cargo --version"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "clone-repo",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Clone the repo. Requires the repo to be reachable at this URL.\n",
    "!git clone --depth 1 https://github.com/infinityabundance/api-support-debug-lab.git\n",
    "%cd api-support-debug-lab"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "build-release",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Release build so the demo runs fast (~30 s wall-clock on Colab).\n",
    "# `tail -5` keeps the output cell tight; the build itself is uneventful.\n",
    "!cargo build --release 2>&1 | tail -5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "list-cases",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Fourteen bundled positive fixtures, one per directory under fixtures/cases/.\n",
    "# Negatives and the _calibration set are hidden from this listing by convention.\n",
    "!./target/release/api-debug-lab list-cases\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "diagnose-auth-missing",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The money shot: the human report a support engineer would paste into a ticket.\n",
    "# Notice the EVIDENCE block (three structural signals, not just '401'), the\n",
    "# REPRODUCTION block (a deterministic curl command), the NEXT STEPS block,\n",
    "# and the ESCALATION NOTE block that names the divergence space.\n",
    "!./target/release/api-debug-lab diagnose auth_missing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "diagnose-arbitration",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Two rules fire on this case: webhook_signature_mismatch (HMAC differs) AND\n",
    "# webhook_timestamp_stale (drift outside tolerance). The orchestrator ranks\n",
    "# them by confidence — HMAC mismatch is dispositive evidence, timestamp\n",
    "# drift has benign causes — and surfaces the loser as ALSO CONSIDERED.\n",
    "!./target/release/api-debug-lab diagnose webhook_signature_invalid_stale"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "diagnose-slack-envelope",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Slack-style envelope: `X-Slack-Signature: v0=<hex>` over the signing input\n",
    "# `\"v0:{timestamp}:{body}\"`. The rule recomputes the HMAC against the\n",
    "# bundled secret and reports the mismatch. The two other supported envelopes\n",
    "# (Stripe v1, GitHub HMAC) are exercised by their own fixtures.\n",
    "!./target/release/api-debug-lab diagnose webhook_slack_v0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "corpus-sweep",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sweep every case.json under fixtures/cases/ (positives, negatives, and\n",
    "# the _calibration edge cases). Exit code is non-zero if any case is\n",
    "# unclassified — useful as a regression check when adding rules.\n",
    "!./target/release/api-debug-lab corpus fixtures/cases | tail -25"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "tests",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Full test suite: per-rule unit tests, schema validation, CLI integration\n",
    "# via assert_cmd, snapshot tests via insta, property-based tests via proptest,\n",
    "# calibration (aggregate Brier, per-rule Brier, ECE), oracle differential\n",
    "# tests against externally-computed HMAC references, and a per-rule latency\n",
    "# budget. ~5 s on Colab.\n",
    "!cargo test --tests 2>&1 | grep 'test result'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "calibration-detail",
   "metadata": {},
   "outputs": [],
   "source": [
    "# The calibration test enforces five empirical properties over the 36-case\n",
    "# labelled corpus: 100% primary-classification accuracy, aggregate Brier\n",
    "# ≤ 0.05, per-rule Brier ≤ 0.08, ECE ≤ 0.05, and zero confidence on\n",
    "# unclassified cases. The rubric is documented in docs/confidence_model.md.\n",
    "!cargo test --test calibration 2>&1 | tail -15"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "deeper-rigour-intro",
   "metadata": {},
   "source": [
    "## Deeper rigour cells\n",
    "\n",
    "These are normal code cells, so Colab **Run all** executes them. They add several minutes because they install extra Cargo tools and run heavier empirical checks. Measured baseline numbers live in `docs/mutation_report.md`, `docs/coverage.md`, and `docs/benchmarks.md`; the confidence rubric is in `docs/confidence_model.md`; architecture decisions live under `docs/adr/`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "mutation-testing",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Mutation testing: validates that tests catch rule-level behavioral changes.\n",
    "# Baseline report: 91% kill rate over 169 viable mutants in src/rules.rs.\n",
    "!cargo install --locked cargo-mutants\n",
    "!cargo mutants --in-place --file src/rules.rs --no-shuffle --timeout-multiplier=2\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "coverage-summary",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Coverage summary over the full default test suite.\n",
    "# Baseline report: ~92% regions and ~93% lines.\n",
    "!cargo install --locked cargo-llvm-cov\n",
    "!rustup component add llvm-tools-preview\n",
    "!cargo llvm-cov --summary-only\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "benchmark-quick",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Microsecond per-case benchmark smoke.\n",
    "!cargo bench --bench diagnose -- --quick 2>&1 | grep 'time:'\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "calibration-canary",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Calibration regression canary: injects a deliberately miscalibrated rule\n",
    "# and asserts the production Brier check would catch it.\n",
    "!cargo test --features calibration_canary --test calibration_regression\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}