atproto-devtool 0.1.0

A multitool for the atproto developer ecosystem
# test labeler

Last verified: 2026-04-15

## Purpose

Implements `atproto-devtool test labeler <target>`, a conformance suite that
validates an atproto labeler across four stages (identity, HTTP,
subscription, and crypto) and produces a structured report plus an exit
code (`0` if no check fails, `1` if any spec violation is recorded, `2` if
the only failures are network errors). Each stage is built around an
injected I/O seam so integration tests can replay fixtures instead of
talking to real servers.

## Contracts

- **Public entry points**:
  - `LabelerCmd::run(no_color) -> Result<ExitCode, miette::Report>` (in
    `labeler.rs`) — constructs the shared reqwest client and calls
    `pipeline::run_pipeline`.
  - `pipeline::parse_target(raw, explicit_did) -> LabelerTarget` — the
    accepted target grammar is frozen: handle, `did:*`, or `https://` URL
    (plain `http://` URLs are rejected with a helpful error; raw endpoints
    with no DID simply skip identity/crypto).
  - `pipeline::run_pipeline(target, LabelerOptions) -> LabelerReport` — the
    one orchestrator that every test hits.
- **Per-stage entry points**: `identity::run`, `http::run`,
  `subscription::run`, `crypto::run`. Each returns a `*StageOutput` with an
  `Option<*Facts>` (populated only when the stage succeeds enough to let
  downstream stages run) plus a `Vec<CheckResult>`.
- **Report shape**: `report::{LabelerReport, CheckResult, CheckStatus,
  Stage, SummaryCounts, ReportHeader, RenderConfig}`. Five-way
  `CheckStatus`: `Pass`, `SpecViolation`, `NetworkError`, `Advisory`,
  `Skipped`. Exit code semantics: `1` if any `SpecViolation` is
  recorded; else `2` if any `NetworkError` is recorded; else `0`.
  `SpecViolation` takes precedence over `NetworkError` so that a
  conformance bug is never masked by an unrelated reachability
  failure. `Advisory` and `Skipped` never influence the exit code.
- **Check IDs are stable strings** (e.g. `"identity::target_resolved"`,
  `"http::first_page_decodes"`, `"crypto::rollup"`). They appear verbatim
  in insta snapshots under `tests/snapshots/`; renaming one is a breaking
  change to the CLI output contract.
- **Diagnostic codes are stable strings** (e.g.
  `"labeler::identity::labeler_endpoint_parseable"`). Same deal — snapshots
  pin them.
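
The exit-code precedence described above can be sketched as a small pure function. This is a minimal illustration, not the crate's code: the free function `exit_code` and its slice argument are assumptions, and the real logic lives on `LabelerReport::exit_code`.

```rust
/// Five-way status, mirroring the variants listed in the report shape.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CheckStatus {
    Pass,
    SpecViolation,
    NetworkError,
    Advisory,
    Skipped,
}

/// Hypothetical helper: derives the process exit code from check statuses.
/// `SpecViolation` is tested first so a reachability failure cannot mask it.
pub fn exit_code(statuses: &[CheckStatus]) -> u8 {
    if statuses.contains(&CheckStatus::SpecViolation) {
        1
    } else if statuses.contains(&CheckStatus::NetworkError) {
        2
    } else {
        // Only `Pass`, `Advisory`, and `Skipped` remain; none fail the run.
        0
    }
}
```

Under this sketch, a run that records both a `NetworkError` and a `SpecViolation` exits with `1`, while a run containing only `Pass`, `Advisory`, and `Skipped` rows exits with `0`.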

## Dependencies

- **Uses**: `crate::common::identity` for every network hop and DID
  primitive. `atrium-api` for labeler record + queryLabels types (we go
  through `serde_json` + atrium types, never through `atrium-xrpc-client`).
  `reqwest` and `tokio-tungstenite` only via the `RealHttpTee` and
  `RealWebSocketClient` seams.
- **Used by**: `crate::cli` wires this into the clap command tree; nothing
  else depends on it.
- **Boundary**: Stage modules talk to each other only through `*Facts`
  structs passed by `pipeline::run_pipeline`. A stage must not import
  another stage's internals.
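
As a rough illustration of the boundary rule, the sketch below shows an orchestrator that is the only place where one stage's facts reach another stage. All types and function bodies here are simplified assumptions (the real `*StageOutput` and `*Facts` types carry more data); only the check IDs and the `<stage>::not_run` row naming come from this document.

```rust
/// Simplified status for the sketch; the real `CheckStatus` has five variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Pass,
    SpecViolation,
    Skipped(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct CheckRow {
    pub id: &'static str,
    pub status: Status,
}

/// Facts the identity stage hands downstream (field is illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct IdentityFacts {
    pub did: String,
}

/// Generic stand-in for the per-stage `*StageOutput` types.
pub struct StageOutput<F> {
    pub facts: Option<F>,
    pub results: Vec<CheckRow>,
}

/// Pretend identity stage: facts are populated only on success,
/// but a check row is emitted either way.
pub fn identity_stage(resolved_did: Option<&str>) -> StageOutput<IdentityFacts> {
    let status = if resolved_did.is_some() { Status::Pass } else { Status::SpecViolation };
    StageOutput {
        facts: resolved_did.map(|did| IdentityFacts { did: did.to_string() }),
        results: vec![CheckRow { id: "identity::target_resolved", status }],
    }
}

/// Pretend crypto stage: it sees identity only through `IdentityFacts`.
pub fn crypto_stage(identity: &IdentityFacts) -> Vec<CheckRow> {
    let status = if identity.did.starts_with("did:") { Status::Pass } else { Status::SpecViolation };
    vec![CheckRow { id: "crypto::rollup", status }]
}

/// Orchestrator sketch: threads facts between stages and records a
/// `not_run` row when the upstream stage produced none.
pub fn run_pipeline_sketch(resolved_did: Option<&str>) -> Vec<CheckRow> {
    let identity = identity_stage(resolved_did);
    let mut rows = identity.results;
    match identity.facts {
        Some(facts) => rows.extend(crypto_stage(&facts)),
        None => rows.push(CheckRow {
            id: "crypto::not_run",
            status: Status::Skipped("identity stage produced no facts".to_string()),
        }),
    }
    rows
}
```

Because the stages only share plain data, each one can be tested in isolation by constructing its input facts by hand.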

## Key decisions

- **Every I/O boundary is a trait**: `HttpClient` + `DnsResolver` from
  `common::identity`, plus stage-local `RawHttpTee` (HTTP stage) and
  `WebSocketClient` / `FrameStream` (subscription stage). All four are
  injectable through `LabelerOptions`. The CLI passes real clients; tests
  pass fakes from `tests/common/mod.rs`.
- **Shared reqwest client**: `LabelerCmd::run` builds one reqwest client
  with rustls + 10s timeout + user-agent and threads it through every
  stage. Do not construct fresh clients inside stages.
- **Two-connection subscription strategy**: the subscription stage tries
  to observe backfill via an idle-gap heuristic, then on `ExceededBudget`
  or `StreamClosedDuringBackfill` reconnects live-tail to distinguish a
  healthy long backfill from a stuck stream. Outcome is captured in the
  `BackfillOutcome` / `LiveTailOutcome` enums.
- **Crypto stage falls back to PLC history**: if the current signing key
  fails to verify a label, `did:plc` targets retry against historic keys
  from the PLC audit log (`plc_history_for_fragment`). `did:web` has no
  history, so a failure there is a hard `SpecViolation`. Verification that
  only succeeds against a historic key still passes the stage but emits an
  `Advisory`.
- **DRISL-CBOR canonicalization for label signing**: crypto stage
  implements the deterministic CBOR canonicalization in
  `canonicalize_label_for_signing` rather than pulling a library — label
  signing uses a specific sort order and tag encoding that no existing
  crate matches. See `crypto.rs` for the spec.
- **Every check always emits a result**: stages never short-circuit on the
  first failure. When a prerequisite fails, downstream checks in the same
  stage still emit `Skipped` rows with a reason. When a
  whole stage is blocked upstream, the pipeline emits one
  `<stage>::not_run` row per downstream stage with a reason string.
- **Identity facts gate downstream stages**: for handle and DID targets,
  HTTP and crypto only run when identity populated `IdentityFacts`. With
  an explicit endpoint URL and no DID (the
  `LabelerTarget::Endpoint { did: None }` path), HTTP and subscription run
  without identity, and crypto is skipped.
- **Crypto pulls labels from HTTP and/or subscription**: the crypto stage
  runs when identity succeeded and *either* the HTTP stage produced
  `HttpFacts` *or* the subscription stage collected at least one
  `sample_labels` entry. Labels from both sources are concatenated before
  verification so a JSON-decoded `queryLabels` page and a CBOR-decoded
  `subscribeLabels` frame are both exercised. Subscription samples are
  capped at `subscription::SAMPLE_LABEL_CAP` to bound memory on noisy
  streams.
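
The trait-seam decision can be illustrated with a fake that replays canned responses. This is a sketch under stated assumptions: the trait name `RawHttpTee` and the `query_labels(cursor)` method appear in this document, but the synchronous signature, the `String` body and error types, `FixtureTee`, and `first_page_looks_like_json` are all hypothetical simplifications.

```rust
use std::collections::VecDeque;

/// Synchronous stand-in for the HTTP-stage seam. The real trait is
/// probably async and returns richer types; this signature is assumed.
pub trait RawHttpTee {
    fn query_labels(&mut self, cursor: Option<&str>) -> Result<String, String>;
}

/// Hypothetical fake that replays canned response bodies in order.
pub struct FixtureTee {
    pages: VecDeque<String>,
    /// Cursors seen so far, so tests can assert on the request count.
    pub calls: Vec<Option<String>>,
}

impl FixtureTee {
    pub fn new(pages: Vec<String>) -> Self {
        Self { pages: pages.into(), calls: Vec::new() }
    }
}

impl RawHttpTee for FixtureTee {
    fn query_labels(&mut self, cursor: Option<&str>) -> Result<String, String> {
        self.calls.push(cursor.map(str::to_string));
        self.pages
            .pop_front()
            .ok_or_else(|| "fixture exhausted".to_string())
    }
}

/// Crude stand-in for a first-page check: exactly one request, followed
/// by a shape test on the body (the real check decodes the page).
pub fn first_page_looks_like_json(tee: &mut dyn RawHttpTee) -> bool {
    match tee.query_labels(None) {
        Ok(body) => body.trim_start().starts_with('{'),
        Err(_) => false,
    }
}
```

Recording the calls on the fake also gives tests a cheap way to assert that a check issues a single first-page request rather than a reachability ping plus the real one.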

## Invariants

- Every `CheckResult` with `status == SpecViolation` carries a `diagnostic`
  with a non-empty `#[source_code]` — the report renderer uses miette's
  `GraphicalReportHandler` and a missing source span degrades output.
- `LabelerReport::exit_code` returns `1` if any `SpecViolation` is
  recorded, `2` if not but at least one `NetworkError` is recorded,
  and `0` otherwise. Advisories and skipped checks never fail the run.
- Snapshot tests under `tests/snapshots/` are part of the contract. Any
  change to a check ID, diagnostic code, or rendered line must be
  reviewed with `cargo insta review`.
- The pipeline never calls `reqwest::Client::new()` or constructs a
  tokio-tungstenite connection outside of `Real*` seam structs.
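
One way to guard the first invariant is a helper that lists violations lacking a usable diagnostic. The sketch below is an assumption-laden simplification: `diagnostic_source` stands in for the real boxed `Diagnostic` and its `#[source_code]`, and `violations_missing_source` is a hypothetical name.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CheckStatus {
    Pass,
    SpecViolation,
    NetworkError,
    Advisory,
    Skipped,
}

/// Simplified check result for the sketch.
pub struct CheckResult {
    pub id: &'static str,
    pub status: CheckStatus,
    /// Stand-in for the real boxed diagnostic: just its source text.
    pub diagnostic_source: Option<String>,
}

/// Returns the IDs of spec violations whose diagnostic is missing or
/// carries empty source text. An empty result means the invariant holds.
pub fn violations_missing_source(results: &[CheckResult]) -> Vec<&'static str> {
    results
        .iter()
        .filter(|r| r.status == CheckStatus::SpecViolation)
        .filter(|r| r.diagnostic_source.as_deref().map_or(true, str::is_empty))
        .map(|r| r.id)
        .collect()
}
```

A test could run a helper like this over every fixture-driven report and assert that the returned list is empty.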

## Key files

- `labeler.rs` — clap args, `LabelerCmd::run`, CLI bootstrap.
- `pipeline.rs` — `LabelerTarget`, `LabelerOptions`, `parse_target`,
  `run_pipeline` orchestration.
- `report.rs` — `CheckStatus`, `CheckResult`, `LabelerReport`,
  `RenderConfig`, rendering via `miette::GraphicalReportHandler`.
- `identity.rs` — identity stage: DID resolution, labeler record fetch
  (through `atrium-api` types over the `HttpClient` seam), policy
  validation.
- `http.rs` — HTTP stage: `RawHttpTee` trait, `RealHttpTee` reqwest
  implementation, first-page / pagination / cursor checks against
  `com.atproto.label.queryLabels`.
- `subscription.rs` — WebSocket stage: `WebSocketClient` / `FrameStream`
  traits, CBOR frame decoder, two-connection backfill / live-tail logic.
- `crypto.rs` — label canonicalization, signature verification, PLC key
  history fallback.

## Gotchas

- `LabelerTarget::Endpoint { did: None }` runs HTTP and subscription but
  skips identity and crypto. Emitting those as "blocked" rather than
  "skipped — no DID supplied" is a regression.
- `RawHttpTee::query_labels(cursor)` must NOT duplicate the first-page
  request for reachability — the stage previously pinged before the real
  request, doubling traffic against real servers.
- `CheckResult::diagnostic` is `Option<Box<dyn Diagnostic + Send + Sync>>`
  — when you add a new failure case, wire the diagnostic through the
  whole way. Snapshots will expose "diagnostic: None" as a rendered
  blank block if you forget.
- The CLI pipes through `tracing` with `EnvFilter`; `--verbose` toggles
  `DEBUG`. Per-stage instrumentation is load-bearing for the
  `verbose_flag_accepted` CLI test.
- Fixture layout under `tests/fixtures/labeler/<stage>/<case>/` is
  referenced by test helper `gen_fixtures` anchored to
  `CARGO_MANIFEST_DIR`. Empty case directories need a `.gitkeep`.
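
The fixture layout can be resolved with a small path helper. This is a sketch, not the actual `gen_fixtures` helper: `fixture_dir` and `manifest_dir` are hypothetical names, and the runtime lookup with a current-directory fallback is an assumption made so the snippet also works outside Cargo.

```rust
use std::path::{Path, PathBuf};

/// Builds `tests/fixtures/labeler/<stage>/<case>` under a manifest dir.
pub fn fixture_dir(manifest_dir: &Path, stage: &str, case: &str) -> PathBuf {
    manifest_dir
        .join("tests")
        .join("fixtures")
        .join("labeler")
        .join(stage)
        .join(case)
}

/// Resolves the manifest dir at runtime. Cargo sets this variable when it
/// runs tests; fall back to the current directory when it is absent.
pub fn manifest_dir() -> PathBuf {
    std::env::var_os("CARGO_MANIFEST_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("."))
}
```

Anchoring to the manifest directory keeps fixture lookups independent of the working directory the test binary happens to start in.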