atproto-devtool 0.1.0

A multitool for the atproto developer ecosystem
# common

Last verified: 2026-04-15

## Purpose

Narrow, mockable primitives shared by every labeler conformance stage.
`common::identity` exists so that every network hop — DNS, HTTPS, PLC
directory lookups, DID document fetches, labeler record fetches — can be
swapped with a recorded fixture in integration tests. `common::diagnostics`
holds the miette `NamedSource`/`SourceSpan` helpers used when attaching JSON
source context to check failures.
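
The fixture-swapping idea can be sketched with std-only stand-ins for the two seams. This is a minimal illustration, not the crate's code: the `&str` URL parameter (the real trait takes a `&Url`), the `Result<_, String>` wrapper, and the `FixtureHttp`, `FixtureDns`, and `fetch_status` names are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the `HttpClient` seam. A `&str` URL and a
/// `String` error keep the sketch free of external crates.
pub trait HttpClient {
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String>;
}

/// Hypothetical stand-in for the `DnsResolver` seam.
pub trait DnsResolver {
    fn txt_lookup(&self, name: &str) -> Result<Vec<String>, String>;
}

/// Test double that replays recorded HTTP responses keyed by URL.
#[derive(Default)]
pub struct FixtureHttp {
    responses: HashMap<String, (u16, Vec<u8>)>,
}

impl FixtureHttp {
    pub fn with(mut self, url: &str, status: u16, body: &[u8]) -> Self {
        self.responses.insert(url.to_string(), (status, body.to_vec()));
        self
    }
}

impl HttpClient for FixtureHttp {
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String> {
        self.responses
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no fixture recorded for {url}"))
    }
}

/// Test double that replays recorded TXT records keyed by name.
#[derive(Default)]
pub struct FixtureDns {
    records: HashMap<String, Vec<String>>,
}

impl FixtureDns {
    pub fn with(mut self, name: &str, txt: &str) -> Self {
        self.records.entry(name.to_string()).or_default().push(txt.to_string());
        self
    }
}

impl DnsResolver for FixtureDns {
    fn txt_lookup(&self, name: &str) -> Result<Vec<String>, String> {
        // An unknown name yields an empty record set rather than an error.
        Ok(self.records.get(name).cloned().unwrap_or_default())
    }
}

/// Code under test only ever sees the trait object, never a concrete client.
pub fn fetch_status(http: &dyn HttpClient, url: &str) -> Option<u16> {
    http.get_bytes(url).ok().map(|(status, _)| status)
}
```

Because `fetch_status` accepts `&dyn HttpClient`, a test can hand it a `FixtureHttp` while production code passes a real client, with no change to the function itself.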

## Contracts

- **Exposes from `identity`**:
  - Traits: `HttpClient` (`get_bytes(&Url) -> (u16, Vec<u8>)`),
    `DnsResolver` (`txt_lookup(&str) -> Vec<String>`). These are the only
    sanctioned seams for network I/O in the whole crate.
  - Real implementations: `RealHttpClient` (wraps `reqwest::Client`,
    constructible from a shared client via `from_client` so stages can reuse
    one TLS pool) and `RealDnsResolver` (hickory).
  - Types: `Did`, `DidMethod`, `DidDocument`, `RawDidDocument`, `Service`,
    `VerificationMethod`, `Curve`, `AnyVerifyingKey`, `AnySignature`,
    `ParsedMultikey`, `PlcHistoricKey`.
  - Resolvers: `resolve_handle`, `resolve_did`, `find_service`,
    `parse_multikey`, `plc_history_for_fragment`.
  - `IdentityError` — single error enum covering every resolution failure.
    Variants are matched on by the identity stage to emit distinct check
    results, so adding or removing variants is a contract change.
- **Guarantees**:
  - `resolve_did` returns `RawDidDocument` with the original bytes retained
    in an `Arc<[u8]>` so downstream stages can build `NamedSource`
    diagnostics that point back at the unmodified server response.
  - `parse_multikey` accepts only `did:key` / multibase-`z` prefixed inputs
    and rejects unknown codec prefixes, wrong lengths, and mismatched
    curves. Never panics on malformed input.
  - `find_service` matches on the trailing `#fragment` of `Service::id`
    anchored to the end of the string (not a substring search).
  - `plc_history_for_fragment` returns the set of distinct keys from the
    PLC audit log for a given fragment, deduplicated by multikey string
    (keeping the earliest introduction). Order is chronological
    (oldest-first, matching the PLC API wire order), but the crypto
    stage treats the result as a set.
- **Expects**: Callers supply a `&dyn HttpClient` / `&dyn DnsResolver`
  rather than constructing their own reqwest/hickory instances. The CLI
  wires the real clients once in `LabelerCmd::run` and passes them through
  `LabelerOptions`.
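
The end-anchored fragment match guaranteed for `find_service` can be illustrated as follows. This is a sketch under assumptions: the `Service` struct here is a trimmed-down, hypothetical version with guessed field names, and the real function's signature may differ.

```rust
/// Hypothetical, trimmed-down `Service`; the real type may carry other fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Service {
    pub id: String,
    pub service_type: String,
    pub endpoint: String,
}

/// Return the first service whose `id` ends with `#<fragment>`.
/// A substring search would wrongly accept ids such as `#atproto_labeler_v2`.
pub fn find_service<'a>(services: &'a [Service], fragment: &str) -> Option<&'a Service> {
    let suffix = format!("#{fragment}");
    services.iter().find(|service| service.id.ends_with(&suffix))
}
```

Including the `#` in the suffix means both a bare `#atproto_labeler` id and a fully qualified `did:plc:...#atproto_labeler` id match, while a longer fragment that merely contains the name does not.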

## Dependencies

- **Uses**: `reqwest` (rustls, json, gzip), `hickory-resolver`, `k256`,
  `p256`, `multibase`, `sha2`, `serde_json`, `miette`, `thiserror`.
- **Used by**: every stage in `commands/test/labeler/`, plus integration
  tests under `tests/`.
- **Boundary**: `common::identity` must not depend on anything under
  `commands/`. Stage-specific types (`IdentityFacts`, `HttpFacts`, etc.)
  live next to their stage, not here.

## Key decisions

- **Narrow trait seams, not `reqwest::Client` everywhere**: Every previous
  refactor that tried to pass `reqwest::Client` directly eventually broke
  integration tests. The `HttpClient` trait's two-method surface is
  deliberately small so fakes are trivial to write.
- **`Arc<[u8]>` for source bytes**: miette `NamedSource` wants owned bytes
  and we fan the same payload out to multiple diagnostics, so every raw
  response is stored as `Arc<[u8]>`.
- **Single `IdentityError` enum**: We tried split-per-stage errors and it
  forced lossy conversions. One enum, matched exhaustively by the identity
  stage, is less code and produces better diagnostics.
- **`did:plc` percent-encoding**: `resolve_did` percent-encodes the DID
  before building the PLC directory URL. Do not regress.
- **No `#[serde(flatten)]` / `#[serde(untagged)]`**: Required by project
  conventions; all DID document types use explicit `#[serde(rename)]`.
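
The `Arc<[u8]>` decision can be seen in a small sketch: one allocation holds the response body and every diagnostic gets a cheap handle to it. The `from_response` constructor, the `Diagnostic` struct, and `diagnostics_for` are hypothetical names for illustration; only `RawDidDocument` and its `source_bytes` field come from the notes above.

```rust
use std::sync::Arc;

/// Hypothetical, trimmed-down `RawDidDocument` that retains the wire bytes.
pub struct RawDidDocument {
    pub source_bytes: Arc<[u8]>,
}

impl RawDidDocument {
    pub fn from_response(body: Vec<u8>) -> Self {
        // `Arc<[u8]>` implements `From<Vec<u8>>`; the bytes move into one
        // shared allocation and are never re-serialized.
        Self { source_bytes: Arc::from(body) }
    }
}

/// Hypothetical diagnostic that keeps its own handle to the source bytes.
pub struct Diagnostic {
    pub message: String,
    pub source: Arc<[u8]>,
}

pub fn diagnostics_for(doc: &RawDidDocument, messages: &[&str]) -> Vec<Diagnostic> {
    messages
        .iter()
        .map(|message| Diagnostic {
            message: message.to_string(),
            // Cloning the Arc bumps a reference count; the payload is not copied.
            source: Arc::clone(&doc.source_bytes),
        })
        .collect()
}
```

With a plain `Vec<u8>` each diagnostic would need its own copy of the body; with `Arc<[u8]>` the fan-out cost is independent of the payload size.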

## Invariants

- A `RawDidDocument` always has `source_bytes` matching exactly what the
  server returned — never pretty-printed, never re-serialized.
- `AnyVerifyingKey::verify_prehash` rejects curve mismatches rather than
  silently returning `Ok(())`.
- `HttpClient::get_bytes` returns the HTTP status even for non-2xx responses
  rather than converting them to errors; callers decide what a non-200
  means in context.
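
The status passthrough invariant leaves interpretation to the caller. The sketch below shows two hypothetical callers reading the same `(status, body)` pair differently; the `Fetch` enum and both function names are invented for the example and are not part of the crate.

```rust
/// Hypothetical outcome type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Fetch {
    Found(Vec<u8>),
    Absent,
    Failed(u16),
}

/// A caller for which a 404 is an ordinary "nothing here" answer.
pub fn interpret_optional(status: u16, body: Vec<u8>) -> Fetch {
    match status {
        200 => Fetch::Found(body),
        404 => Fetch::Absent,
        other => Fetch::Failed(other),
    }
}

/// A caller for which anything other than 200 is a failure worth reporting.
pub fn interpret_required(status: u16, body: Vec<u8>) -> Fetch {
    match status {
        200 => Fetch::Found(body),
        other => Fetch::Failed(other),
    }
}
```

If the client converted non-2xx responses into errors, the first caller would have to recover the status from an error value to tell "absent" from "broken".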

## Key files

- `identity.rs` — all of the above, plus extensive unit tests at the bottom.
- `diagnostics.rs` — `install_miette_handler`, `named_source_from_bytes` /
  `named_source_from_str`, plus the JSON display helpers:
  `pretty_json_for_display` (re-serialize a JSON body so miette's caret
  rendering has newlines to land on), `span_at_line_column` (convert a
  `serde_json::Error` `(line, column)` into a `SourceSpan`, clamped to the
  matched line), and `span_for_quoted_literal` (find the span of a quoted
  JSON key or string value). Every stage that attaches JSON source context
  to a diagnostic should pretty-print the body once up front and compute
  spans against that same pretty body — never mix raw and pretty.
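
The line/column conversion described for `span_at_line_column` might look roughly like this std-only sketch. It is an approximation under assumptions: it returns a plain `(byte_offset, length)` tuple instead of a miette `SourceSpan`, treats columns as 1-based byte columns, and always produces a one-byte span on non-empty lines.

```rust
/// Convert a 1-based `(line, column)` pair into a `(byte_offset, length)`
/// span within `source`, clamping the column to the matched line.
/// Returns `None` when `line` is 0 or past the last line.
pub fn span_at_line_column(source: &str, line: usize, column: usize) -> Option<(usize, usize)> {
    let mut line_start = 0usize;
    for (index, text) in source.split('\n').enumerate() {
        if index + 1 == line {
            // Column 0 and columns past the end of the line both stay on this line.
            let clamped = column.clamp(1, text.len().max(1));
            let offset = line_start + clamped - 1;
            let length = if text.is_empty() { 0 } else { 1 };
            return Some((offset, length));
        }
        // +1 accounts for the '\n' separator consumed by `split`.
        line_start += text.len() + 1;
    }
    None
}
```

The offsets are only meaningful against the exact string they were computed from, which is why spans should be computed against the same pretty-printed body that is attached to the diagnostic.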

## Gotchas

- `RealHttpClient::new()` builds a fresh `reqwest::Client` with a 10s
  timeout and a User-Agent header. Production code should prefer
  `RealHttpClient::from_client` so identity, HTTP, and crypto stages all
  share one connection pool.
- `resolve_handle` has a DNS-first / HTTPS-fallback order and the HTTPS
  fallback must send a User-Agent. Tests exercise both paths.
- `plc_history_for_fragment` traverses the PLC audit log in wire order
  (oldest-first) and dedupes by multikey string, not by position — the
  same key appearing across multiple rotations collapses to a single
  `PlcHistoricKey` (keeping the earliest introduction's metadata).
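
The dedupe-by-multikey behaviour can be sketched as a single oldest-first pass that keeps the first sighting of each key. The `PlcHistoricKey` fields and the `dedupe_keys` name are assumptions for the example; the real function also extracts keys from the audit log, which is omitted here.

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down `PlcHistoricKey`; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct PlcHistoricKey {
    pub multikey: String,
    pub introduced_at: String,
}

/// Walk entries oldest-first and keep only the earliest entry per multikey.
/// Input order is preserved, so the output stays chronological.
pub fn dedupe_keys(entries_oldest_first: &[PlcHistoricKey]) -> Vec<PlcHistoricKey> {
    let mut seen: HashSet<String> = HashSet::new();
    entries_oldest_first
        .iter()
        // `insert` returns false when the multikey was already recorded.
        .filter(|entry| seen.insert(entry.multikey.clone()))
        .cloned()
        .collect()
}
```

A key that is rotated out and later rotated back in therefore appears once, carrying the metadata from its first introduction.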