atproto-devtool 0.1.1

# common

Last verified: 2026-04-19

## Purpose

Narrow, mockable primitives shared by every labeler conformance stage.
`common::identity` exists so that every network hop — DNS, HTTPS, PLC
directory lookups, DID document fetches, labeler record fetches — can be
swapped with a recorded fixture in integration tests. It also owns the
signing/verifying key newtypes (`AnySigningKey`, `AnyVerifyingKey`,
`AnySignature`) that every curve-generic operation routes through.
`common::jwt` holds a minimal hand-rolled compact JWS encoder/decoder
used by the report stage to self-mint service-auth tokens without
pulling in a JWT library. `common::diagnostics` holds the miette
`NamedSource`/`SourceSpan` helpers used when attaching JSON source
context to check failures.

## Contracts

- **Exposes from `identity`**:
  - Traits: `HttpClient` (`get_bytes(&Url) -> (u16, Vec<u8>)`),
    `DnsResolver` (`txt_lookup(&str) -> Vec<String>`). These are the only
    sanctioned seams for network I/O in the whole crate.
  - Real implementations: `RealHttpClient` (wraps `reqwest::Client`,
    constructible from a shared client via `from_client` so stages can reuse
    one TLS pool) and `RealDnsResolver` (hickory).
  - Types: `Did`, `DidMethod`, `DidDocument`, `RawDidDocument`, `Service`,
    `VerificationMethod`, `Curve`, `AnyVerifyingKey`, `AnySigningKey`,
    `AnySignature`, `ParsedMultikey`, `PlcHistoricKey`.
  - Resolvers: `resolve_handle`, `resolve_did`, `find_service`,
    `parse_multikey`, `encode_multikey`, `plc_history_for_fragment`.
  - Classification: `is_local_labeler_hostname(&Url) -> bool` — returns
    `true` for loopback, `.local`, and RFC 1918 IPv4 addresses. Drives
    the report stage's self-mint viability check.
  - `IdentityError` — single error enum covering every resolution failure.
    Variants are matched on by the identity stage to emit distinct check
    results, so adding or removing variants is a contract change.
- **Signing API**:
  - `AnySigningKey::{K256, P256}` mirror `AnyVerifyingKey` for the
    signing side. `sign(msg)` and `sign_prehash(&[u8; 32])` return
    `AnySignature`. All signatures are low-s normalized: the k256
    backend does this automatically, while the p256 path normalizes
    explicitly because p256's `sign_prehash` can return high-s.
  - `AnySigningKey::verifying_key()` returns the paired `AnyVerifyingKey`.
  - `AnySigningKey::jwt_alg()` returns `"ES256K"` or `"ES256"`.
  - `AnySignature::to_jws_bytes() -> [u8; 64]` serializes `r || s`
    big-endian (JWS raw-signature form, NOT DER).
  - `encode_multikey(&AnyVerifyingKey) -> String` is the exact inverse
    of `parse_multikey`: base58btc multibase with the varint-encoded
    multicodec curve prefix (bytes `0xe7 0x01` for secp256k1,
    `0x80 0x24` for P-256) followed by the compressed SEC1 point.
- **Exposes from `jwt`**:
  - Types: `JwtHeader`, `JwtClaims`, `JwtError`. Field names
    (`alg`, `typ`, `iss`, `aud`, `exp`, `iat`, `lxm`, `jti`) are the
    exact JSON keys atproto labelers expect — do NOT rename without
    adding `#[serde(rename = "...")]`.
  - Functions: `encode_compact(&header, &claims, &AnySigningKey)` for
    producing compact-form tokens, `decode_compact(token)` for parsing,
    `verify_compact(token, &AnyVerifyingKey)` for end-to-end verify.
  - Only ES256 and ES256K are supported. `nbf` is deliberately omitted
    — the atproto spec does not require it and some servers reject
    unexpected claims.
  - `JwtError` does NOT derive `miette::Diagnostic` with stable codes.
    It is surfaced only inside the report stage, which wraps any failure
    in a stage-local diagnostic with a `labeler::report::*` code before
    rendering.
- **Guarantees**:
  - `resolve_did` returns `RawDidDocument` with the original bytes retained
    in an `Arc<[u8]>` so downstream stages can build `NamedSource`
    diagnostics that point back at the unmodified server response.
  - `parse_multikey` accepts only `did:key` / multibase-`z` prefixed inputs
    and rejects unknown codec prefixes, wrong lengths, and mismatched
    curves. Never panics on malformed input.
  - `find_service` matches on the trailing `#fragment` of `Service::id`
    anchored to the end of the string (not a substring search).
  - `plc_history_for_fragment` returns the set of distinct keys from the
    PLC audit log for a given fragment, deduplicated by multikey string
    (keeping the earliest introduction). Order is chronological
    (oldest-first, matching the PLC API wire order), but the crypto
    stage treats the result as a set.
- **Expects**: Callers supply a `&dyn HttpClient` / `&dyn DnsResolver`
  rather than constructing their own reqwest/hickory instances. The CLI
  wires the real clients once in `LabelerCmd::run` and passes them through
  `LabelerOptions`.
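
The "callers supply a `&dyn HttpClient`" expectation above can be illustrated with a minimal sketch of a fixture-backed fake. This is not the crate's actual trait: the signature here is simplified (synchronous, `&str` URL, `String` error) and `FixtureHttpClient`, `with_response`, and `fetch_status` are hypothetical names introduced only for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the `HttpClient` seam.
/// Assumption: synchronous, `&str` URL, `String` error. The real trait takes `&Url`.
pub trait HttpClient {
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String>;
}

/// Hypothetical fake that serves recorded responses keyed by exact URL.
#[derive(Default)]
pub struct FixtureHttpClient {
    responses: HashMap<String, (u16, Vec<u8>)>,
}

impl FixtureHttpClient {
    /// Record one canned response; chainable so tests read top to bottom.
    pub fn with_response(mut self, url: &str, status: u16, body: &[u8]) -> Self {
        self.responses.insert(url.to_string(), (status, body.to_vec()));
        self
    }
}

impl HttpClient for FixtureHttpClient {
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String> {
        // Non-2xx statuses are returned as data, not converted to errors.
        self.responses
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no fixture recorded for {url}"))
    }
}

/// Code under test depends only on the trait object, never on a concrete client.
pub fn fetch_status(http: &dyn HttpClient, url: &str) -> Result<u16, String> {
    http.get_bytes(url).map(|(status, _body)| status)
}
```

A test would build the fake with one or two `with_response` calls and pass `&fake` wherever production code passes the real client.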

## Dependencies

- **Uses**: `reqwest` (rustls, json, gzip), `hickory-resolver`, `k256`,
  `p256`, `multibase`, `sha2`, `serde_json`, `miette`, `thiserror`,
  `url`. `common::jwt` additionally uses `base64` (URL_SAFE_NO_PAD
  engine) and `serde`.
- **Used by**: every stage in `commands/test/labeler/`, plus integration
  tests under `tests/`. `common::jwt` is currently only used by the
  report stage.
- **Boundary**: `common::identity` and `common::jwt` must not depend on
  anything under `commands/`. Stage-specific types (`IdentityFacts`,
  `HttpFacts`, etc.) live next to their stage, not here.

## Key decisions

- **Narrow trait seams, not `reqwest::Client` everywhere**: Every previous
  refactor that tried to pass `reqwest::Client` directly eventually broke
  integration tests. The trait surface is deliberately small so fakes
  are trivial to write.
- **`Arc<[u8]>` for source bytes**: miette `NamedSource` wants owned bytes
  and we fan the same payload out to multiple diagnostics, so every raw
  response is stored as `Arc<[u8]>`.
- **Single `IdentityError` enum**: We tried split-per-stage errors and it
  forced lossy conversions. One enum, matched exhaustively by the identity
  stage, is less code and produces better diagnostics.
- **`did:plc` percent-encoding**: `resolve_did` percent-encodes the DID
  before building the PLC directory URL. Do not regress.
- **No `#[serde(flatten)]` / `#[serde(untagged)]`**: Required by project
  conventions; all DID document types use explicit `#[serde(rename)]`.
- **All signatures low-s normalized**: `AnySigningKey::sign` guarantees
  low-s form for both curves, so a sign/`AnyVerifyingKey::verify_prehash`
  round-trip always succeeds. atproto requires low-s; p256's backend does
  not enforce it, so we explicitly call `normalize_s`.
- **Hand-rolled JWT instead of a library**: We only need compact JWS with
  ES256/ES256K for a handful of tightly-scoped report-stage tokens. A full
  JWT library would pull RSA, HMAC, JWE, and a JSON Schema validator we do
  not want. The module is <500 lines and fully covered by round-trip
  tests.
- **`is_local_labeler_hostname` is deliberately conservative**: IPv6
  private ranges (`fc00::/7`, link-local) are NOT classified as local in
  v1. Operators running labelers on IPv6 ULA must pass `--force-self-mint`.
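
The conservative classification described in the last bullet could look roughly like the following. This is a simplified sketch, not the crate's `is_local_labeler_hostname`: it takes a bare host string instead of a `Url`, the name `is_local_host` is hypothetical, and treating the literal name `localhost` as loopback is an assumption. IPv6 is skipped entirely here, which matches the stated v1 behaviour for ULA and link-local ranges.

```rust
use std::net::Ipv4Addr;

/// Hypothetical, simplified analogue of `is_local_labeler_hostname`.
/// Takes a bare host string rather than a `Url` (assumption for illustration).
pub fn is_local_host(host: &str) -> bool {
    // Normalize case and drop a trailing root dot ("labeler.local.").
    let host = host.trim_end_matches('.').to_ascii_lowercase();

    // Assumption: the name `localhost` counts as loopback.
    if host == "localhost" || host.ends_with(".local") {
        return true;
    }

    // Only IPv4 literals are classified. IPv6 literals (including
    // `fc00::/7` ULA and link-local) fail to parse and fall through to `false`.
    match host.parse::<Ipv4Addr>() {
        // `is_private` covers the RFC 1918 ranges 10/8, 172.16/12, 192.168/16.
        Ok(ip) => ip.is_loopback() || ip.is_private(),
        Err(_) => false,
    }
}
```

Under this sketch an operator on `fd00::1` gets `false`, which is the case the `--force-self-mint` flag exists for.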

## Invariants

- A `RawDidDocument` always has `source_bytes` matching exactly what the
  server returned — never pretty-printed, never re-serialized.
- `AnyVerifyingKey::verify_prehash` rejects curve mismatches rather than
  silently returning `Ok(())`.
- `HttpClient::get_bytes` returns the HTTP status even for non-2xx responses
  rather than converting them to errors; callers decide what a non-200
  means in context.
- `AnySignature::to_jws_bytes()` is always exactly 64 bytes, for both
  curves.
- `encode_multikey(parse_multikey(s).verifying_key) == s` for every
  well-formed atproto multikey — round-tripping is pinned by unit tests.
- `jwt::verify_compact` accepts exactly three `.`-separated segments.
  Any other segment count (for example, five-segment compact JWE) or
  otherwise malformed input returns `JwtError::MalformedCompact`.
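
The three-segment invariant can be sketched as a small splitter. This is an illustration, not the code in `jwt.rs`: `split_compact` and `SplitError` are hypothetical names, and rejecting empty segments is an assumption about what "malformed" covers.

```rust
/// Hypothetical error type standing in for `JwtError::MalformedCompact`.
#[derive(Debug, PartialEq, Eq)]
pub enum SplitError {
    MalformedCompact,
}

/// Split a compact JWS into (header, payload, signature) segments.
/// Anything other than exactly three non-empty segments is rejected.
pub fn split_compact(token: &str) -> Result<(&str, &str, &str), SplitError> {
    let mut parts = token.split('.');
    // Pull at most four items: three expected segments plus a probe for extras.
    match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(h), Some(p), Some(s), None)
            if !h.is_empty() && !p.is_empty() && !s.is_empty() =>
        {
            Ok((h, p, s))
        }
        _ => Err(SplitError::MalformedCompact),
    }
}
```

Checking the segment count before any base64url decoding keeps the failure mode uniform: a five-segment JWE and a truncated token produce the same error.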

## Key files

- `identity.rs` — all resolvers, DID types, signing/verifying key
  newtypes, `encode_multikey` / `parse_multikey`,
  `is_local_labeler_hostname`, plus extensive unit tests at the bottom.
- `jwt.rs` — compact JWS encoder/decoder for ES256 and ES256K:
  `JwtHeader`, `JwtClaims`, `JwtError`, `encode_compact`,
  `decode_compact`, `verify_compact`. Segments use unpadded base64url;
  signatures are raw `r || s` (not DER) per RFC 7518 §3.4.
- `diagnostics.rs` — `install_miette_handler`, `named_source_from_bytes` /
  `named_source_from_str`, plus the JSON display helpers:
  `pretty_json_for_display` (re-serialize a JSON body so miette's caret
  rendering has newlines to land on), `span_at_line_column` (convert a
  `serde_json::Error` `(line, column)` into a `SourceSpan`, clamped to the
  matched line), and `span_for_quoted_literal` (find the span of a quoted
  JSON key or string value). Every stage that attaches JSON source context
  to a diagnostic should pretty-print the body once up front and compute
  spans against that same pretty body — never mix raw and pretty.
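
The line/column clamping that `span_at_line_column` is described as doing can be sketched with the standard library alone. The function below is a hypothetical stand-in: it returns a plain `(offset, length)` pair rather than a miette `SourceSpan`, treats columns as 1-based byte counts, and assumes `\n` line endings.

```rust
/// Convert a 1-based (line, column) into a byte `(offset, length)` pair,
/// clamping the column so the span never leaves the matched line.
/// Hypothetical stand-in; the real helper returns a miette `SourceSpan`.
pub fn offset_at_line_column(src: &str, line: usize, column: usize) -> Option<(usize, usize)> {
    let mut line_start = 0usize;
    for (idx, text) in src.split('\n').enumerate() {
        if idx + 1 == line {
            // Column 0 and column 1 both map to the first byte of the line.
            let col = column.saturating_sub(1).min(text.len());
            // Highlight one byte, or nothing when clamped to end of line.
            let len = if col < text.len() { 1 } else { 0 };
            return Some((line_start + col, len));
        }
        // +1 accounts for the '\n' that `split` consumed.
        line_start += text.len() + 1;
    }
    // The requested line does not exist in `src`.
    None
}
```

The offsets are only meaningful against the same text the parse error came from, which is why the pretty-printed body must be used for both parsing and span computation.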

## Gotchas

- `RealHttpClient::new()` builds a fresh `reqwest::Client` with a 10s
  timeout and a User-Agent header. Production code should prefer
  `RealHttpClient::from_client` so identity, HTTP, and crypto stages all
  share one connection pool.
- `resolve_handle` has a DNS-first / HTTPS-fallback order and the HTTPS
  fallback must send a User-Agent. Tests exercise both paths.
- `plc_history_for_fragment` traverses the PLC audit log in wire order
  (oldest-first) and dedupes by multikey string, not by position — the
  same key appearing across multiple rotations collapses to a single
  `PlcHistoricKey` (keeping the earliest introduction's metadata).
- `AnySigningKey` nonces (`jti` in JWTs, run-id in sentinels) come from
  `getrandom::getrandom`, not from `rand`. `getrandom` is a direct
  dependency because the transitive `rand_core` in `elliptic-curve` is
  built without the `getrandom` feature.
- p256 `sign_prehash` returns signatures that may be high-s; always go
  through `AnySigningKey::sign` / `sign_prehash` rather than calling the
  backend trait directly, or low-s normalization will be skipped and
  atproto servers will reject the signature.
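
The dedupe rule noted above for `plc_history_for_fragment` (wire order, keyed by multikey string, earliest introduction wins) can be sketched as follows. `HistoricKey` and `dedupe_oldest_first` are hypothetical, trimmed-down stand-ins; the real `PlcHistoricKey` fields and the audit-log parsing are not shown in this document.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down stand-in for `PlcHistoricKey`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HistoricKey {
    pub multikey: String,
    /// Timestamp of the audit-log entry that first introduced the key.
    pub introduced_at: String,
}

/// Dedupe `(multikey, created_at)` entries that are already oldest-first,
/// keeping the first (earliest) occurrence of each multikey string.
pub fn dedupe_oldest_first(entries: &[(&str, &str)]) -> Vec<HistoricKey> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for (multikey, created_at) in entries {
        // `insert` returns false when the key was already present.
        if seen.insert(*multikey) {
            out.push(HistoricKey {
                multikey: multikey.to_string(),
                introduced_at: created_at.to_string(),
            });
        }
    }
    out
}
```

Because the output preserves first-seen order, it stays chronological even though consumers such as the crypto stage treat it as a set.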