# common
Last verified: 2026-04-15
## Purpose
Narrow, mockable primitives shared by every labeler conformance stage.
`common::identity` exists so that every network hop — DNS, HTTPS, PLC
directory lookups, DID document fetches, labeler record fetches — can be
swapped with a recorded fixture in integration tests. `common::diagnostics`
holds the miette `NamedSource`/`SourceSpan` helpers used when attaching JSON
source context to check failures.
## Contracts
- **Exposes from `identity`**:
  - Traits: `HttpClient` (`get_bytes(&Url) -> (u16, Vec<u8>)`),
    `DnsResolver` (`txt_lookup(&str) -> Vec<String>`). These are the only
    sanctioned seams for network I/O in the whole crate.
  - Real implementations: `RealHttpClient` (wraps `reqwest::Client`,
    constructible from a shared client via `from_client` so stages can reuse
    one TLS pool) and `RealDnsResolver` (hickory).
  - Types: `Did`, `DidMethod`, `DidDocument`, `RawDidDocument`, `Service`,
    `VerificationMethod`, `Curve`, `AnyVerifyingKey`, `AnySignature`,
    `ParsedMultikey`, `PlcHistoricKey`.
  - Resolvers: `resolve_handle`, `resolve_did`, `find_service`,
    `parse_multikey`, `plc_history_for_fragment`.
  - `IdentityError`: single error enum covering every resolution failure.
    Variants are matched on by the identity stage to emit distinct check
    results, so adding or removing variants is a contract change.
- **Guarantees**:
  - `resolve_did` returns `RawDidDocument` with the original bytes retained
    in an `Arc<[u8]>` so downstream stages can build `NamedSource`
    diagnostics that point back at the unmodified server response.
  - `parse_multikey` accepts only `did:key` / multibase-`z` prefixed inputs
    and rejects unknown codec prefixes, wrong lengths, and mismatched
    curves. Never panics on malformed input.
  - `find_service` matches on the trailing `#fragment` of `Service::id`
    anchored to the end of the string (not a substring search).
  - `plc_history_for_fragment` returns the set of distinct keys from the
    PLC audit log for a given fragment, deduplicated by multikey string
    (keeping the earliest introduction). Order is chronological
    (oldest-first, matching the PLC API wire order), but the crypto
    stage treats the result as a set.
- **Expects**: Callers supply a `&dyn HttpClient` / `&dyn DnsResolver`
rather than constructing their own reqwest/hickory instances. The CLI
wires the real clients once in `LabelerCmd::run` and passes them through
`LabelerOptions`.
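
The end-anchored fragment match that `find_service` guarantees can be sketched as follows. This is a minimal illustration, not the crate's code: the `Service` struct here is a trimmed-down stand-in with assumed field names, and accepting the fragment with or without a leading `#` is an assumption.

```rust
/// Hypothetical, trimmed-down stand-in for the crate's `Service` type.
/// Field names are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Service {
    pub id: String,
    pub service_type: String,
    pub service_endpoint: String,
}

/// Return the first service whose `id` ends with `#<fragment>`.
///
/// The match is anchored to the end of the string, so looking up
/// `atproto_labeler` does not match `#atproto_labeler_old`, which a
/// substring search for `#atproto_labeler` would.
pub fn find_service<'a>(services: &'a [Service], fragment: &str) -> Option<&'a Service> {
    // Normalise the input so both "frag" and "#frag" work (an assumption).
    let wanted = format!("#{}", fragment.trim_start_matches('#'));
    services.iter().find(|s| s.id.ends_with(&wanted))
}
```

Including the `#` in the suffix is what keeps `labeler` from matching `#atproto_labeler`.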
## Dependencies
- **Uses**: `reqwest` (rustls, json, gzip), `hickory-resolver`, `k256`,
`p256`, `multibase`, `sha2`, `serde_json`, `miette`, `thiserror`.
- **Used by**: every stage in `commands/test/labeler/`, plus integration
tests under `tests/`.
- **Boundary**: `common::identity` must not depend on anything under
`commands/`. Stage-specific types (`IdentityFacts`, `HttpFacts`, etc.)
live next to their stage, not here.
## Key decisions
- **Narrow trait seams, not `reqwest::Client` everywhere**: Every previous
refactor that tried to pass `reqwest::Client` directly eventually broke
integration tests. The `HttpClient` trait exposes only `get_bytes`, a
deliberately small surface, so fakes are trivial to write.
- **`Arc<[u8]>` for source bytes**: miette `NamedSource` wants owned bytes
and we fan the same payload out to multiple diagnostics, so every raw
response is stored as `Arc<[u8]>`.
- **Single `IdentityError` enum**: We tried per-stage error types, and they
forced lossy conversions. One enum, matched exhaustively by the identity
stage, is less code and produces better diagnostics.
- **`did:plc` percent-encoding**: `resolve_did` percent-encodes the DID
before building the PLC directory URL. Do not regress.
- **No `#[serde(flatten)]` / `#[serde(untagged)]`**: Required by project
conventions; all DID document types use explicit `#[serde(rename)]`.
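
To illustrate why the narrow seam keeps fakes small, here is a minimal sketch of a fixture-backed fake. It is not the crate's trait: the real `get_bytes` takes a `&Url`, and this version is synchronous, takes a `&str`, and uses a `String` error purely so the example is self-contained. `FakeHttpClient` and `fetch_did_document` are hypothetical names.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the crate's `HttpClient` seam (assumed shape).
pub trait HttpClient {
    /// Returns the HTTP status and body; non-2xx statuses are NOT errors.
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String>;
}

/// Hypothetical fake that replays recorded responses keyed by URL.
pub struct FakeHttpClient {
    responses: HashMap<String, (u16, Vec<u8>)>,
}

impl FakeHttpClient {
    pub fn new() -> Self {
        Self { responses: HashMap::new() }
    }

    /// Register a recorded fixture for `url`.
    pub fn with_response(mut self, url: &str, status: u16, body: &[u8]) -> Self {
        self.responses.insert(url.to_owned(), (status, body.to_vec()));
        self
    }
}

impl HttpClient for FakeHttpClient {
    fn get_bytes(&self, url: &str) -> Result<(u16, Vec<u8>), String> {
        self.responses
            .get(url)
            .cloned()
            .ok_or_else(|| format!("no fixture recorded for {url}"))
    }
}

/// Hypothetical caller: takes the seam as `&dyn HttpClient` and decides
/// for itself what a non-200 status means.
pub fn fetch_did_document(http: &dyn HttpClient, url: &str) -> Result<Vec<u8>, String> {
    let (status, body) = http.get_bytes(url)?;
    match status {
        200 => Ok(body),
        404 => Err(format!("DID document not found at {url}")),
        other => Err(format!("unexpected HTTP status {other} from {url}")),
    }
}
```

Because the caller only sees `&dyn HttpClient`, a test can swap in the fake without touching `reqwest` at all.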
## Invariants
- A `RawDidDocument` always has `source_bytes` matching exactly what the
server returned — never pretty-printed, never re-serialized.
- `AnyVerifyingKey::verify_prehash` rejects curve mismatches rather than
silently returning `Ok(())`.
- `HttpClient::get_bytes` returns the HTTP status even for non-2xx responses
rather than converting them to errors; callers decide what a non-200
means in context.
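
The first invariant can be pictured with a small sketch of how raw bytes might be retained and shared. The struct below is a hypothetical stand-in for `RawDidDocument` with only the `source_bytes` field; `from_response` and `shared_source` are names invented for this example.

```rust
use std::sync::Arc;

/// Hypothetical, trimmed-down stand-in for `RawDidDocument`.
#[derive(Debug, Clone)]
pub struct RawDidDocument {
    /// Exactly the bytes the server returned, never re-serialized.
    pub source_bytes: Arc<[u8]>,
}

impl RawDidDocument {
    /// Take ownership of the response body as-is (no parsing, no reformatting).
    pub fn from_response(body: Vec<u8>) -> Self {
        Self { source_bytes: Arc::from(body) }
    }

    /// Hand out another handle to the same allocation. This bumps the
    /// reference count instead of copying the payload, which is what makes
    /// fanning one response out to several diagnostics cheap.
    pub fn shared_source(&self) -> Arc<[u8]> {
        Arc::clone(&self.source_bytes)
    }
}
```

Every handle points at the same unmodified bytes, including any odd whitespace the server sent.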
## Key files
- `identity.rs` — all of the above, plus extensive unit tests at the bottom.
- `diagnostics.rs` — `install_miette_handler`, `named_source_from_bytes` /
`named_source_from_str`, plus the JSON display helpers:
`pretty_json_for_display` (re-serialize a JSON body so miette's caret
rendering has newlines to land on), `span_at_line_column` (convert a
`serde_json::Error` `(line, column)` into a `SourceSpan`, clamped to the
matched line), and `span_for_quoted_literal` (find the span of a quoted
JSON key or string value). Every stage that attaches JSON source context
to a diagnostic should pretty-print the body once up front and compute
spans against that same pretty body — never mix raw and pretty.
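
The line/column conversion described for `span_at_line_column` can be sketched as below. This is an illustration under assumptions, not the helper's actual code: it returns a plain `(offset, length)` pair rather than a miette `SourceSpan`, treats line and column as 1-based, and counts columns in bytes.

```rust
/// Convert a 1-based `(line, column)` into a `(byte offset, length)` pair
/// within `source`, clamping the column so the span stays on the matched line.
pub fn span_at_line_column(source: &str, line: usize, column: usize) -> (usize, usize) {
    let mut line_start = 0usize;
    for (idx, text) in source.split('\n').enumerate() {
        if idx + 1 == line {
            // Clamp to the end of this line so the span never spills over.
            let col0 = column.saturating_sub(1).min(text.len());
            // Highlight one byte when there is one; otherwise an empty span.
            let len = if col0 < text.len() { 1 } else { 0 };
            return (line_start + col0, len);
        }
        // Advance past this line and its '\n' terminator.
        line_start += text.len() + 1;
    }
    // Line number past the end of the input: empty span at the very end.
    (source.len(), 0)
}
```

The offsets only make sense against the same text the error position was computed from, which is why raw and pretty bodies must not be mixed.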
## Gotchas
- `RealHttpClient::new()` builds a fresh `reqwest::Client` with a 10s
timeout and a User-Agent header. Production code should prefer
`RealHttpClient::from_client` so identity, HTTP, and crypto stages all
share one connection pool.
- `resolve_handle` tries DNS first and falls back to HTTPS; the HTTPS
fallback must send a User-Agent header. Tests exercise both paths.
- `plc_history_for_fragment` traverses the PLC audit log in wire order
(oldest-first) and dedupes by multikey string, not by position — the
same key appearing across multiple rotations collapses to a single
`PlcHistoricKey` (keeping the earliest introduction's metadata).
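
The dedupe rule for `plc_history_for_fragment` can be sketched as follows. The struct is a hypothetical stand-in for `PlcHistoricKey` with assumed field names, and `dedupe_by_multikey` is a name invented here; the input is assumed to already be in wire order (oldest-first).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for `PlcHistoricKey`; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct PlcHistoricKey {
    pub multikey: String,
    /// Timestamp of the audit-log entry that introduced the key.
    pub introduced_at: String,
}

/// Collapse repeated keys, keeping the earliest occurrence of each.
///
/// `entries` must be oldest-first. Output order follows first appearance,
/// so it stays chronological.
pub fn dedupe_by_multikey(entries: &[PlcHistoricKey]) -> Vec<PlcHistoricKey> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for entry in entries {
        // `insert` returns false if the multikey string was already seen,
        // so later rotations that reuse a key are skipped.
        if seen.insert(entry.multikey.as_str()) {
            out.push(entry.clone());
        }
    }
    out
}
```

Keying on the multikey string rather than the log position is what lets a key that reappears after a rotation collapse into its first entry.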