Stateless change-tracking diff engine for CRW monitors.
Pure, synchronous, no I/O, no LLM. Given the current scrape (markdown +
optionally extracted JSON) and a caller-supplied previous snapshot, it
classifies the page (same / changed), computes the requested diff
surfaces, and returns the current snapshot to persist as the next baseline.
Caller-supplied JSON invariant
current_json is the already-extracted structured JSON supplied by the
orchestration layer. This crate NEVER extracts JSON itself and does not
depend on crw-extract — the LLM/judge live upstream.
Mode-aware hashing
content_hash is the normalized-markdown hash in gitDiff/mixed mode, and
the canonicalized tracked-JSON hash in json-only mode. The SaaS store-skip
short-circuit keys off this hash.