Module algorithm


URDNA2015 / RDNA 2015 core algorithm implementation.

This module provides a complete, spec-faithful implementation of the Universal RDF Dataset Normalization Algorithm 2015 (URDNA2015), which the W3C has since standardized under the name RDF Dataset Canonicalization: https://www.w3.org/TR/rdf-canon/

§Algorithm Overview

URDNA2015 assigns deterministic canonical blank node identifiers to every blank node in an RDF dataset. The algorithm works in five stages:

  1. Collect blank nodes — enumerate every blank node appearing in any quad.
  2. Hash First-Degree Quads — for each blank node, serialize the quads it appears in, writing _:a for the node itself and _:z for every other blank node, then sort the serialized quads and hash their concatenation.
  3. Issue simple IDs — blank nodes with a unique first-degree hash receive their canonical identifier immediately (_:c14n0, _:c14n1, …) in hash order.
  4. Hash N-Degree Quads — for blank nodes that still share a first-degree hash, apply the full recursive N-degree hashing algorithm, which walks each node's RDF neighbourhood to break ties.
  5. Emit canonical N-Quads — replace all blank node identifiers with their canonical form, sort the resulting quads lexicographically, and join with \n.
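
As a rough illustration of stages 1, 2 and 5, the sketch below collects blank node labels, builds the string that would be fed to the hash function for one blank node, and emits the relabelled, sorted output. The `Term` and `Quad` types and all function names are hypothetical stand-ins, not this module's `RdfQuad` or its API. The digest itself is left out so the example needs only the standard library, and `emit` assumes the canonical label map has already been computed by stages 3 and 4.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical term type (not the crate's own representation).
#[derive(Clone, Debug)]
pub enum Term {
    /// An already serialized IRI or literal, e.g. `<http://example.org/p>`.
    Other(String),
    /// A blank node label without the `_:` prefix.
    Blank(String),
}

/// Hypothetical simplified quad: subject, predicate, object, optional graph.
#[derive(Clone, Debug)]
pub struct Quad {
    pub s: Term,
    pub p: Term,
    pub o: Term,
    pub g: Option<Term>,
}

impl Quad {
    fn terms(&self) -> Vec<&Term> {
        let mut v = vec![&self.s, &self.p, &self.o];
        if let Some(g) = &self.g {
            v.push(g);
        }
        v
    }
}

/// Stage 1: every blank node label that appears in any quad.
pub fn blank_labels(quads: &[Quad]) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    for q in quads {
        for t in q.terms() {
            if let Term::Blank(label) = t {
                out.insert(label.clone());
            }
        }
    }
    out
}

/// Serialize one quad, passing each blank node label through `relabel`.
fn serialize<F: Fn(&str) -> String>(quad: &Quad, relabel: F) -> String {
    let mut parts: Vec<String> = Vec::new();
    for t in quad.terms() {
        parts.push(match t {
            Term::Other(text) => text.clone(),
            Term::Blank(label) => format!("_:{}", relabel(label.as_str())),
        });
    }
    format!("{} .", parts.join(" "))
}

fn mentions(quad: &Quad, label: &str) -> bool {
    quad.terms().into_iter().any(|t| match t {
        Term::Blank(l) => l.as_str() == label,
        Term::Other(_) => false,
    })
}

/// Stage 2 (hash input only): quads mentioning `reference`, written with
/// `_:a` for the reference node and `_:z` for any other blank node, then
/// sorted and concatenated. A real implementation would hash this string.
pub fn first_degree_input(quads: &[Quad], reference: &str) -> String {
    let mut lines: Vec<String> = quads
        .iter()
        .filter(|q| mentions(q, reference))
        .map(|q| {
            let line = serialize(q, |l| {
                if l == reference { "a".to_string() } else { "z".to_string() }
            });
            format!("{}\n", line)
        })
        .collect();
    lines.sort();
    lines.concat()
}

/// Stage 5: relabel, sort, and join with `\n`.
/// Panics if a blank node has no entry in `canonical`.
pub fn emit(quads: &[Quad], canonical: &HashMap<String, String>) -> String {
    let mut lines: Vec<String> = quads
        .iter()
        .map(|q| serialize(q, |l| canonical[l].clone()))
        .collect();
    lines.sort();
    lines.dedup(); // a dataset is a set, so identical quads collapse
    lines.join("\n")
}
```

Because the first-degree input depends only on the shape of the quads around a node, not on its original label, two datasets that differ only in blank node naming produce the same string for corresponding nodes.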

§Reference

Structs§

Canonicalizer
The main URDNA2015 canonicalizer.
IdentifierIssuer
Issues monotonically increasing canonical blank node identifiers.
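
To make the issuer's role concrete, here is a minimal sketch of an identifier issuer and of stage 3 from the overview (issuing IDs to blank nodes whose first-degree hash is unique, in hash order). The `Issuer` type, its methods, and `issue_unique` are hypothetical illustrations. The actual `IdentifierIssuer` in this module may expose a different interface.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical issuer: hands out `<prefix>0`, `<prefix>1`, ... and
/// remembers which existing label received which identifier.
pub struct Issuer {
    prefix: String,
    counter: usize,
    issued: HashMap<String, String>,
}

impl Issuer {
    pub fn new(prefix: &str) -> Self {
        Issuer {
            prefix: prefix.to_string(),
            counter: 0,
            issued: HashMap::new(),
        }
    }

    /// Return the identifier for `existing`, issuing a new one on first use.
    /// Asking again for the same label returns the same identifier.
    pub fn issue(&mut self, existing: &str) -> String {
        if let Some(id) = self.issued.get(existing) {
            return id.clone();
        }
        let id = format!("{}{}", self.prefix, self.counter);
        self.counter += 1;
        self.issued.insert(existing.to_string(), id.clone());
        id
    }

    /// Look up an identifier without issuing one.
    pub fn get(&self, existing: &str) -> Option<&str> {
        self.issued.get(existing).map(|s| s.as_str())
    }
}

/// Stage 3 sketch: `by_hash` maps a first-degree hash to the blank nodes
/// that produced it. Nodes with a unique hash get an ID immediately, in
/// hash order (a `BTreeMap` iterates in key order). The returned hashes
/// are shared by several nodes and still need N-degree hashing.
pub fn issue_unique(
    by_hash: &BTreeMap<String, Vec<String>>,
    issuer: &mut Issuer,
) -> Vec<String> {
    let mut shared = Vec::new();
    for (hash, nodes) in by_hash {
        if nodes.len() == 1 {
            issuer.issue(&nodes[0]);
        } else {
            shared.push(hash.clone());
        }
    }
    shared
}
```

Keeping the label-to-identifier map inside the issuer makes `issue` idempotent, so a node reached along several paths is never assigned a second identifier.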

Functions§

canonicalize
Canonicalize an RDF dataset (slice of RdfQuads) to a canonical N-Quads string.