# common
Last verified: 2026-04-19
## Purpose
Narrow, mockable primitives shared by every labeler conformance stage.
`common::identity` exists so that every network hop — DNS, HTTPS, PLC
directory lookups, DID document fetches, labeler record fetches — can be
swapped with a recorded fixture in integration tests. It also owns the
signing/verifying key newtypes (`AnySigningKey`, `AnyVerifyingKey`,
`AnySignature`) that every curve-generic operation routes through.
`common::jwt` holds a minimal hand-rolled compact JWS encoder/decoder
used by the report stage to mint its own service-auth tokens (the
self-mint path) without pulling in a JWT library. `common::diagnostics` holds the miette
`NamedSource`/`SourceSpan` helpers used when attaching JSON source
context to check failures.
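
As an illustration of the fixture-swapping idea, here is a minimal sketch of a recorded-response fake. It is not the crate's code: the trait below is a simplified stand-in (the real `HttpClient` takes a `&Url` and may be async and fallible), and `FixtureHttpClient`, `with_response`, and `fetch_status` are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's `HttpClient` seam (assumption:
/// `&str` instead of `&Url`, synchronous, infallible).
pub trait HttpClient {
    fn get_bytes(&self, url: &str) -> (u16, Vec<u8>);
}

/// Hypothetical fake that replays recorded responses keyed by URL.
#[derive(Default)]
pub struct FixtureHttpClient {
    responses: HashMap<String, (u16, Vec<u8>)>,
}

impl FixtureHttpClient {
    /// Record the status and body to replay for `url`.
    pub fn with_response(mut self, url: &str, status: u16, body: &[u8]) -> Self {
        self.responses
            .insert(url.to_string(), (status, body.to_vec()));
        self
    }
}

impl HttpClient for FixtureHttpClient {
    fn get_bytes(&self, url: &str) -> (u16, Vec<u8>) {
        // Unrecorded URLs replay as 404 with an empty body instead of panicking.
        self.responses
            .get(url)
            .cloned()
            .unwrap_or((404, Vec::new()))
    }
}

/// Code under test depends only on the trait object, never on reqwest.
pub fn fetch_status(client: &dyn HttpClient, url: &str) -> u16 {
    client.get_bytes(url).0
}
```

A test would build the fake with one `with_response` call per recorded hop and pass `&client` wherever a `&dyn HttpClient` is expected.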
## Contracts
- **Exposes from `identity`**:
  - Traits: `HttpClient` (`get_bytes(&Url) -> (u16, Vec<u8>)`),
    `DnsResolver` (`txt_lookup(&str) -> Vec<String>`). These are the only
    sanctioned seams for network I/O in the whole crate.
  - Real implementations: `RealHttpClient` (wraps `reqwest::Client`,
    constructible from a shared client via `from_client` so stages can reuse
    one TLS pool) and `RealDnsResolver` (hickory).
  - Types: `Did`, `DidMethod`, `DidDocument`, `RawDidDocument`, `Service`,
    `VerificationMethod`, `Curve`, `AnyVerifyingKey`, `AnySigningKey`,
    `AnySignature`, `ParsedMultikey`, `PlcHistoricKey`.
  - Resolvers: `resolve_handle`, `resolve_did`, `find_service`,
    `parse_multikey`, `encode_multikey`, `plc_history_for_fragment`.
  - Classification: `is_local_labeler_hostname(&Url) -> bool` returns
    `true` for loopback, `.local`, and RFC 1918 IPv4 addresses. Drives
    the report stage's self-mint viability check.
  - `IdentityError`: single error enum covering every resolution failure.
    Variants are matched on by the identity stage to emit distinct check
    results, so adding or removing variants is a contract change.
- **Signing API**:
  - `AnySigningKey::{K256, P256}` mirror `AnyVerifyingKey` for the
    signing side. `sign(msg)` and `sign_prehash(&[u8; 32])` return
    `AnySignature`. All signatures are low-s normalized: the k256
    backend does this automatically, while the p256 path calls
    `normalize_s` explicitly because p256's `sign_prehash` can return
    high-s.
  - `AnySigningKey::verifying_key()` returns the paired `AnyVerifyingKey`.
  - `AnySigningKey::jwt_alg()` returns `"ES256K"` or `"ES256"`.
  - `AnySignature::to_jws_bytes() -> [u8; 64]` serializes `r || s`
    big-endian (JWS raw-signature form, NOT DER).
  - `encode_multikey(&AnyVerifyingKey) -> String` is the exact inverse
    of `parse_multikey`: base58btc multibase with the varint-encoded
    multicodec curve prefix (bytes `0xe7 0x01` for secp256k1,
    `0x80 0x24` for P-256) followed by the compressed SEC1 point.
- **Exposes from `jwt`**:
  - Types: `JwtHeader`, `JwtClaims`, `JwtError`. Field names
    (`alg`, `typ`, `iss`, `aud`, `exp`, `iat`, `lxm`, `jti`) are the
    exact JSON keys atproto labelers expect. Do NOT rename a field without
    adding `#[serde(rename = "...")]`.
  - Functions: `encode_compact(&header, &claims, &AnySigningKey)` for
    producing compact-form tokens, `decode_compact(token)` for parsing,
    `verify_compact(token, &AnyVerifyingKey)` for end-to-end verification.
  - Only ES256 and ES256K are supported. `nbf` is deliberately omitted:
    the atproto spec does not require it and some servers reject
    unexpected claims.
  - `JwtError` does NOT derive `miette::Diagnostic` with stable codes.
    It is surfaced only inside the report stage, which wraps any failure
    in a stage-local diagnostic with a `labeler::report::*` code before
    rendering.
- **Guarantees**:
  - `resolve_did` returns `RawDidDocument` with the original bytes retained
    in an `Arc<[u8]>` so downstream stages can build `NamedSource`
    diagnostics that point back at the unmodified server response.
  - `parse_multikey` accepts only `did:key` / multibase-`z` prefixed inputs
    and rejects unknown codec prefixes, wrong lengths, and mismatched
    curves. Never panics on malformed input.
  - `find_service` matches on the trailing `#fragment` of `Service::id`
    anchored to the end of the string (not a substring search).
  - `plc_history_for_fragment` returns the set of distinct keys from the
    PLC audit log for a given fragment, deduplicated by multikey string
    (keeping the earliest introduction). Order is chronological
    (oldest-first, matching the PLC API wire order), but the crypto
    stage treats the result as a set.
- **Expects**: Callers supply a `&dyn HttpClient` / `&dyn DnsResolver`
rather than constructing their own reqwest/hickory instances. The CLI
wires the real clients once in `LabelerCmd::run` and passes them through
`LabelerOptions`.
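
The end-anchored fragment match guaranteed for `find_service` can be sketched as follows. This is an illustration under assumptions, not the crate's implementation: `Service` is reduced to its `id` field, and the slice-based signature is hypothetical.

```rust
/// Simplified stand-in for the crate's `Service` type (only `id` is modeled).
pub struct Service {
    pub id: String,
}

/// Return the first service whose `id` ends with `#<fragment>`.
/// The signature is an assumption made for this sketch.
pub fn find_service<'a>(services: &'a [Service], fragment: &str) -> Option<&'a Service> {
    services.iter().find(|svc| {
        // `rsplit_once` splits at the last '#', so the comparison covers the
        // whole trailing fragment and never matches a mere substring.
        match svc.id.rsplit_once('#') {
            Some((_, trailing)) => trailing == fragment,
            None => false,
        }
    })
}
```

With this shape, looking up `atproto_labeler` skips an entry whose id ends in `#atproto_labeler_backup`, which a substring search would have matched.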
## Dependencies
- **Uses**: `reqwest` (rustls, json, gzip), `hickory-resolver`, `k256`,
`p256`, `multibase`, `sha2`, `serde_json`, `miette`, `thiserror`,
`url`. `common::jwt` additionally uses `base64` (URL_SAFE_NO_PAD
engine) and `serde`.
- **Used by**: every stage in `commands/test/labeler/`, plus integration
tests under `tests/`. `common::jwt` is currently only used by the
report stage.
- **Boundary**: `common::identity` and `common::jwt` must not depend on
anything under `commands/`. Stage-specific types (`IdentityFacts`,
`HttpFacts`, etc.) live next to their stage, not here.
## Key decisions
- **Narrow trait seams, not `reqwest::Client` everywhere**: Every previous
refactor that tried to pass `reqwest::Client` directly eventually broke
integration tests. The `HttpClient` trait's two-method surface is
deliberately small so fakes are trivial to write.
- **`Arc<[u8]>` for source bytes**: miette `NamedSource` wants owned bytes
and we fan the same payload out to multiple diagnostics, so every raw
response is stored as `Arc<[u8]>`.
- **Single `IdentityError` enum**: We tried split-per-stage errors and it
forced lossy conversions. One enum, matched exhaustively by the identity
stage, is less code and produces better diagnostics.
- **`did:plc` percent-encoding**: `resolve_did` percent-encodes the DID
before building the PLC directory URL. Do not regress.
- **No `#[serde(flatten)]` / `#[serde(untagged)]`**: Required by project
conventions; all DID document types use explicit `#[serde(rename)]`.
- **All signatures low-s normalized**: `AnySigningKey::sign` guarantees
low-s form for both curves so that an `AnyVerifyingKey::verify_prehash`
round trip always succeeds. atproto requires low-s; p256's backend does not
enforce it, so we explicitly call `normalize_s`.
- **Hand-rolled JWT instead of a library**: We only need compact JWS with
ES256/ES256K for a handful of tightly-scoped report-stage tokens. A full
JWT library would pull RSA, HMAC, JWE, and a JSON Schema validator we do
not want. The module is <500 lines and fully covered by round-trip
tests.
- **`is_local_labeler_hostname` is deliberately conservative**: IPv6
private ranges (`fc00::/7`, link-local) are NOT classified as local in
v1. Operators running labelers on IPv6 ULA must pass `--force-self-mint`.
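
The conservative classification described for `is_local_labeler_hostname` might look roughly like the sketch below. It is hedged in several ways: the real function takes a `&Url`, whereas this hypothetical `is_local_host` takes a bare host string; treating `localhost` and the IPv6 loopback `::1` as local is an assumption not stated in this document.

```rust
use std::net::IpAddr;

/// Hypothetical host-string variant of the classifier.
pub fn is_local_host(host: &str) -> bool {
    // URL hosts wrap IPv6 literals in brackets; strip them before parsing.
    let bare = host.trim_start_matches('[').trim_end_matches(']');
    match bare.parse::<IpAddr>() {
        // Loopback (127.0.0.0/8) or RFC 1918 (10/8, 172.16/12, 192.168/16).
        Ok(IpAddr::V4(v4)) => v4.is_loopback() || v4.is_private(),
        // Assumption: only ::1 counts. ULA (fc00::/7) and link-local
        // addresses are deliberately NOT treated as local.
        Ok(IpAddr::V6(v6)) => v6.is_loopback(),
        Err(_) => {
            let name = bare.to_ascii_lowercase();
            // Assumption: `localhost` is treated like loopback.
            name == "localhost" || name.ends_with(".local")
        }
    }
}
```

Under this sketch a labeler at `[fd00::1]` is classified as non-local, which is the case where an operator would need `--force-self-mint`.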
## Invariants
- A `RawDidDocument` always has `source_bytes` matching exactly what the
server returned — never pretty-printed, never re-serialized.
- `AnyVerifyingKey::verify_prehash` rejects curve mismatches rather than
silently returning `Ok(())`.
- `HttpClient::get_bytes` returns the HTTP status even for non-2xx responses
rather than converting them to errors; callers decide what a non-200
means in context.
- `AnySignature::to_jws_bytes()` is always exactly 64 bytes, for both
curves.
- `encode_multikey(parse_multikey(s).verifying_key) == s` for every
well-formed atproto multikey — round-tripping is pinned by unit tests.
- `jwt::verify_compact` accepts exactly three `.`-separated segments.
Inputs with any other segment count (for example, five-segment JWE compact
tokens) and otherwise malformed inputs return `JwtError::MalformedCompact`.
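
The three-segment rule for compact tokens can be sketched as below. `split_compact` is a hypothetical helper, the `JwtError` enum here models only the one variant named above, and rejecting empty segments is an assumption of this sketch rather than documented behavior.

```rust
/// Minimal stand-in for the crate's `JwtError` (only one variant modeled).
#[derive(Debug, PartialEq)]
pub enum JwtError {
    MalformedCompact,
}

/// Split a compact JWS into (header, payload, signature) segments.
pub fn split_compact(token: &str) -> Result<(&str, &str, &str), JwtError> {
    let mut parts = token.split('.');
    // Pull four items: the first three must exist and the fourth must not.
    match (parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some(h), Some(p), Some(s), None)
            if !h.is_empty() && !p.is_empty() && !s.is_empty() =>
        {
            Ok((h, p, s))
        }
        _ => Err(JwtError::MalformedCompact),
    }
}
```

Base64url decoding and signature verification would happen only after this structural check passes.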
## Key files
- `identity.rs` — all resolvers, DID types, signing/verifying key
newtypes, `encode_multikey` / `parse_multikey`,
`is_local_labeler_hostname`, plus extensive unit tests at the bottom.
- `jwt.rs` — compact JWS encoder/decoder for ES256 and ES256K:
`JwtHeader`, `JwtClaims`, `JwtError`, `encode_compact`,
`decode_compact`, `verify_compact`. Segments use unpadded base64url;
signatures are raw `r || s` (not DER) per RFC 7518 §3.4.
- `diagnostics.rs` — `install_miette_handler`, `named_source_from_bytes` /
`named_source_from_str`, plus the JSON display helpers:
`pretty_json_for_display` (re-serialize a JSON body so miette's caret
rendering has newlines to land on), `span_at_line_column` (convert a
`serde_json::Error` `(line, column)` into a `SourceSpan`, clamped to the
matched line), and `span_for_quoted_literal` (find the span of a quoted
JSON key or string value). Every stage that attaches JSON source context
to a diagnostic should pretty-print the body once up front and compute
spans against that same pretty body — never mix raw and pretty.
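
To make the line/column conversion concrete, here is a minimal sketch of mapping a 1-based `(line, column)` pair to a byte offset, clamped to the end of the matched line. It assumes columns count bytes and returns a plain `usize` rather than a miette `SourceSpan`; `offset_at_line_column` is a hypothetical name, not the crate's helper.

```rust
/// Convert a 1-based (line, column) into a byte offset within `source`.
/// Returns `None` when `line` is 0 or past the last line.
pub fn offset_at_line_column(source: &str, line: usize, column: usize) -> Option<usize> {
    let mut line_start = 0usize;
    for (idx, text) in source.split('\n').enumerate() {
        if idx + 1 == line {
            // Columns are 1-based; clamp so the offset never leaves this line.
            let col0 = column.saturating_sub(1);
            return Some(line_start + col0.min(text.len()));
        }
        // Skip this line plus the '\n' that terminated it.
        line_start += text.len() + 1;
    }
    None
}
```

The offset is only meaningful against the same text the parser saw, which is why spans must be computed against the pretty-printed body if that is what gets rendered.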
## Gotchas
- `RealHttpClient::new()` builds a fresh `reqwest::Client` with a 10s
timeout and a User-Agent header. Production code should prefer
`RealHttpClient::from_client` so identity, HTTP, and crypto stages all
share one connection pool.
- `resolve_handle` has a DNS-first / HTTPS-fallback order and the HTTPS
fallback must send a User-Agent. Tests exercise both paths.
- `plc_history_for_fragment` traverses the PLC audit log in wire order
(oldest-first) and dedupes by multikey string, not by position — the
same key appearing across multiple rotations collapses to a single
`PlcHistoricKey` (keeping the earliest introduction's metadata).
- `AnySigningKey` nonces (`jti` in JWTs, run-id in sentinels) come from
`getrandom::getrandom`, not from `rand`. `getrandom` is a direct
dependency because the transitive `rand_core` in `elliptic-curve` is
built without the `getrandom` feature.
- p256 `sign_prehash` returns signatures that may be high-s; always go
through `AnySigningKey::sign` / `sign_prehash` rather than calling the
backend trait directly, or low-s normalization will be skipped and
atproto servers will reject the signature.
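
The dedupe-by-multikey behavior noted for `plc_history_for_fragment` can be illustrated with a small sketch. `HistoricKey`, its field names, and `dedupe_keep_earliest` are hypothetical stand-ins; the input is assumed to already be in wire order (oldest-first).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `PlcHistoricKey`; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct HistoricKey {
    pub multikey: String,
    pub introduced_at: String,
}

/// Collapse repeated keys, keeping the earliest entry for each multikey
/// and preserving oldest-first order.
pub fn dedupe_keep_earliest(entries: &[HistoricKey]) -> Vec<HistoricKey> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for entry in entries {
        // `insert` returns false when the multikey was already seen, so
        // later reappearances of the same key are dropped.
        if seen.insert(entry.multikey.as_str()) {
            out.push(entry.clone());
        }
    }
    out
}
```

A key that is rotated out and later rotated back in therefore appears once, carrying the metadata of its first introduction.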