Crate natural_xml_diff

Natural XML diff

This crate implements a natural XML diff algorithm. Given an original document (document A) and an edited version of that document (document B), the algorithm produces a document describing the differences between A and B.

Many possible descriptions of a difference exist; this algorithm attempts to produce a diff that is understandable to the reader and similar to what change tracking would have produced.

The diff result is described in a number of ways, but the end product is an XML document with annotations that describe the differences.

You can produce a diff by using the diff function:


use natural_xml_diff::diff;

let xml_a = r#"<doc><a/><b/></doc>"#;
let xml_b = r#"<doc><a/><b/><c/></doc>"#;

let diff = diff(xml_a, xml_b).unwrap();

assert_eq!(diff,
  r#"<doc xmlns:diff="http://paligo.net/nxd"><a/><b/><c diff:insert=""/></doc>"#);

The library also exposes a NaturalXmlDiff struct that offers functionality to access the details of the diffing algorithm as well as functionality to verify that the diff produced is correct.

The algorithm implemented by this library is based on the paper “Bridging the gap between tracking and detecting changes on XML”. It is also implemented by the Java-based jndiff library.

The algorithm in this library is different from the paper in various ways:

  • Text updates are detected by using a fast Levenshtein distance algorithm, and are then produced by the diff-match-patch algorithm as implemented by the dissimilar crate.

  • Attribute updates are also detected for elements without children, not just during the propagation phase.
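To make the text-update detection step more concrete, here is a minimal sketch of a Levenshtein distance check between two text values. It is not the crate's implementation: the function names `levenshtein` and `is_text_update`, the normalization by the longer string's length, and the `threshold` parameter are all assumptions made for illustration, and the crate's "fast" variant may work quite differently.

```rust
/// Levenshtein distance over Unicode scalar values, using two rolling rows.
/// Illustrative only; not taken from natural_xml_diff.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] holds the distance between a[..i-1] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];

    for i in 1..=a.len() {
        // Turning a[..i] into the empty string takes i deletions.
        curr[0] = i;
        for j in 1..=b.len() {
            let substitution_cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + substitution_cost); // substitution or match
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: treat two text values as an "update" of one another
/// when their distance, normalized by the longer length, is at most `threshold`.
/// The threshold is an assumption for this sketch.
fn is_text_update(a: &str, b: &str, threshold: f64) -> bool {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        // Two empty strings are trivially identical.
        return true;
    }
    (levenshtein(a, b) as f64 / max_len as f64) <= threshold
}
```

In a design like this, the distance is only used to decide whether two text nodes are similar enough to count as an update; the actual character-level changes would then come from a separate diff step, which the list above attributes to the dissimilar crate.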

Structs

The position where content is inserted.
A natural XML diff comparison of two documents.

Enums

Describes how an element node’s attributes should be updated.
An edit describes a change to an XML document. Nodes are addressed with usize, which is an index into the complete descendants in the tree (including the root node 0) in document order (pre-order). The indexing includes non-element nodes.
Describes what to insert.
The raw status of a node.
Describes how a text node should be updated.
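The node addressing described for edits above (a usize index into all descendants in pre-order, counting non-element nodes and starting with the root at 0) can be sketched with a toy tree. The `Node` type and the `label`, `preorder_labels`, and `sample_tree` functions below are hypothetical and exist only for this illustration; they are not the crate's types.

```rust
/// Toy XML-like tree; not the crate's data model.
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

/// Short printable label for a node.
fn label(node: &Node) -> String {
    match node {
        Node::Element { name, .. } => name.clone(),
        Node::Text(text) => format!("#text:{}", text),
    }
}

/// Returns node labels in document (pre-order) order.
/// The position of each label in the returned vector is the node's index,
/// with the root at index 0 and text nodes counted like any other node.
fn preorder_labels(root: &Node) -> Vec<String> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        out.push(label(node));
        if let Node::Element { children, .. } = node {
            // Push children in reverse so the first child is popped first.
            stack.extend(children.iter().rev());
        }
    }
    out
}

/// Builds the tree for `<doc><a>hi</a><b/></doc>`.
fn sample_tree() -> Node {
    Node::Element {
        name: "doc".to_string(),
        children: vec![
            Node::Element {
                name: "a".to_string(),
                children: vec![Node::Text("hi".to_string())],
            },
            Node::Element {
                name: "b".to_string(),
                children: vec![],
            },
        ],
    }
}
```

For `<doc><a>hi</a><b/></doc>` this yields index 0 for `doc`, 1 for `a`, 2 for the text node `hi`, and 3 for `b`. Because the text node is counted, `b` is addressed as 3 rather than 2.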

Functions

Given a diff document, apply it.
Given XML document A and XML document B produce a diff document.