URDNA2015 / RDNA 2015 core algorithm implementation.
This module provides a complete, spec-faithful implementation of the W3C RDF Dataset Normalization Algorithm (URDNA2015) as described in: https://www.w3.org/TR/rdf-canon/
§Algorithm Overview
URDNA2015 assigns deterministic canonical blank node identifiers to every blank node in an RDF dataset. The algorithm works in five stages:
- Collect blank nodes — enumerate every blank node appearing in any quad.
- Hash First-Degree Quads — for each blank node, hash the set of quads it appears in, using _:a for the node itself and _:z for all other blanks (unless they already have a canonical ID).
- Issue simple IDs — blank nodes with a unique first-degree hash receive their canonical identifier immediately (_:c14n0, _:c14n1, …) in hash order.
- Hash N-Degree Quads — for blank nodes that still share a first-degree hash, apply the full recursive N-degree hashing algorithm, which walks the RDF neighbourhood to break ties.
- Emit canonical N-Quads — replace all blank node identifiers with their canonical form, sort the resulting quads lexicographically, and join with \n.
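The "Issue simple IDs" and "Emit canonical N-Quads" stages can be sketched roughly as follows. This is an illustrative sketch only, not this module's code: the function names `issue_simple_ids` and `emit_canonical` are hypothetical, first-degree hashes are assumed to be precomputed strings, and quads are modelled as vectors of terms already written in N-Quads syntax. Hashing and the N-degree tie-breaking stage are omitted.

```rust
use std::collections::BTreeMap;

/// Stage 3 (sketch): issue canonical labels to blank nodes whose
/// first-degree hash is unique, visiting hashes in sorted order.
/// Input: (blank node label, precomputed first-degree hash) pairs.
/// Returns the issued mapping plus the labels that still need
/// N-degree hashing because they share a hash.
fn issue_simple_ids(hashes: &[(&str, &str)]) -> (BTreeMap<String, String>, Vec<String>) {
    // BTreeMap iterates its keys in sorted order, which gives "hash order".
    let mut by_hash: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(node, hash) in hashes {
        by_hash.entry(hash).or_default().push(node);
    }

    let mut issued = BTreeMap::new();
    let mut unresolved = Vec::new();
    let mut counter = 0usize;
    for nodes in by_hash.values() {
        if let [only] = nodes.as_slice() {
            // Unique hash: the node gets the next canonical label.
            issued.insert(only.to_string(), format!("_:c14n{counter}"));
            counter += 1;
        } else {
            // Shared hash: left for the N-degree stage.
            unresolved.extend(nodes.iter().map(|n| n.to_string()));
        }
    }
    (issued, unresolved)
}

/// Stage 5 (sketch): relabel blank nodes, sort the serialized quads
/// lexicographically, and join them with "\n".
fn emit_canonical(quads: &[Vec<&str>], issued: &BTreeMap<String, String>) -> String {
    let mut lines: Vec<String> = quads
        .iter()
        .map(|terms| {
            // Replace any term that has a canonical label; keep others as-is.
            let relabelled: Vec<&str> = terms
                .iter()
                .map(|t| issued.get(*t).map(String::as_str).unwrap_or(*t))
                .collect();
            format!("{} .", relabelled.join(" "))
        })
        .collect();
    lines.sort();
    lines.join("\n")
}
```

Using a `BTreeMap` keyed by hash is one simple way to get the sorted iteration that "in hash order" requires; a real implementation would also have to relabel the nodes resolved by N-degree hashing before emitting.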
§Reference
- W3C RDF Canonicalization 1.0: https://www.w3.org/TR/rdf-canon/
- URDNA2015 test suite: https://json-ld.github.io/rdf-dataset-normalization/tests/
Structs§
- Canonicalizer
- The main URDNA2015 canonicalizer.
- IdentifierIssuer
- Issues monotonically increasing canonical blank node identifiers.
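As a rough illustration of what "monotonically increasing" issuance means, the sketch below shows a minimal issuer. The type `SketchIssuer` and its methods are hypothetical and are not the actual `IdentifierIssuer` API, whose fields and signatures are not shown on this page. It assumes the prefix is passed in as a string and that asking twice for the same blank node returns the same label.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an identifier issuer (not the crate's type).
struct SketchIssuer {
    prefix: String,
    counter: usize,
    /// Existing blank node label -> issued canonical label.
    issued: HashMap<String, String>,
}

impl SketchIssuer {
    fn new(prefix: &str) -> Self {
        SketchIssuer {
            prefix: prefix.to_string(),
            counter: 0,
            issued: HashMap::new(),
        }
    }

    /// Return the label already issued for `existing`, or issue the next
    /// one (prefix followed by the current counter) and advance the counter.
    fn issue(&mut self, existing: &str) -> String {
        if let Some(label) = self.issued.get(existing) {
            return label.clone();
        }
        let label = format!("{}{}", self.prefix, self.counter);
        self.counter += 1;
        self.issued.insert(existing.to_string(), label.clone());
        label
    }
}
```

With a prefix of `_:c14n`, the first two distinct blank nodes would receive `_:c14n0` and `_:c14n1`, and a repeated request for the first node would return `_:c14n0` again without advancing the counter.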
Functions§
- canonicalize
- Canonicalize an RDF dataset (a slice of RdfQuads) to a canonical N-Quads string.