This crate implements a natural XML diff algorithm: given an original document we will name document A, and an edited version of that document we will name document B, the diff algorithm can produce a document describing the differences between A and B.
Many possible descriptions of a difference exist; this algorithm attempts to make the diff understandable to the reader and similar to what change tracking would produce.
The diff result is described in a number of ways, but the end product is an XML document with annotations that describe the differences.
You can produce a diff by using the diff function:
use natural_xml_diff::diff;
let xml_a = r#"<doc><a/><b/></doc>"#;
let xml_b = r#"<doc><a/><b/><c/></doc>"#;
let diff = diff(xml_a, xml_b).unwrap();
assert_eq!(diff,
r#"<doc xmlns:diff="http://paligo.net/nxd"><a/><b/><c diff:insert=""/></doc>"#);

The library also exposes a NaturalXmlDiff struct that offers functionality to access the details of the diffing algorithm as well as functionality to verify that the diff produced is correct.
The algorithm implemented by this library is based on the paper “Bridging the gap between tracking and detecting changes on XML”. It is also implemented by the Java-based jndiff library.
The algorithm in this library is different from the paper in various ways:
- Text updates are detected by using a fast Levenshtein distance algorithm, and are then produced by the diff-match-patch algorithm as implemented by the dissimilar crate.
- Attribute updates are also detected for elements without children, not just during the propagation phase.
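
The first difference above relies on an edit-distance check between two pieces of text. As a rough illustration only, and not the crate's actual code, the following std-only sketch computes a plain Levenshtein distance and turns it into a similarity decision. The names `levenshtein` and `is_text_update`, and the idea of a caller-supplied threshold, are assumptions made for this example; the crate itself uses a faster algorithm and leaves producing the text update to dissimilar.

```rust
/// Levenshtein (edit) distance between two strings, counting
/// single-character insertions, deletions and substitutions.
/// Uses two rolling rows instead of a full matrix.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] is the distance between the first i-1 chars of `a`
    // and the first j chars of `b`; the initial row is for i = 0.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];

    for i in 1..=a.len() {
        curr[0] = i; // deleting the first i chars of `a`
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: treat `new` as an update of `old` when the
/// normalized similarity reaches `threshold` (a value in 0.0..=1.0).
/// The threshold is an assumption of this sketch, not a crate setting.
fn is_text_update(old: &str, new: &str, threshold: f64) -> bool {
    let max_len = old.chars().count().max(new.chars().count());
    if max_len == 0 {
        return true; // two empty texts are trivially identical
    }
    let similarity = 1.0 - levenshtein(old, new) as f64 / max_len as f64;
    similarity >= threshold
}
```

With a threshold of 0.5, "kitten" and "sitting" (distance 3 over a maximum length of 7) would count as an update, while "abc" and "xyz" would not.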
Structs
- InsertPosition - The position where content is inserted.
- NaturalXmlDiff - A natural XML diff comparison of two documents.
Enums
- AttributeChange - Describes how an element node’s attributes should be updated.
- Edit - An edit describes a change to an XML document. Nodes are addressed with usize, which is an index into all nodes of the tree (including the root node, 0) in document order (pre-order). The indexing includes non-element nodes.
- InsertContent - What to insert.
- Status - The raw status of a node.
- TextChange - Describes how a text node should be updated.
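
The node addressing described for Edit can be pictured with a small pre-order walk. The sketch below is an assumption-laden illustration: the `Node` enum, `preorder_labels` and `sample_tree` are hypothetical stand-ins and not types or functions from this crate, and the toy tree simply starts counting at its topmost node. It shows that text nodes receive an index of their own, just like elements.

```rust
/// Minimal stand-in for an XML node (hypothetical, not the crate's tree type).
enum Node {
    Element { name: String, children: Vec<Node> },
    Text(String),
}

/// Collect a label for every node in document (pre-order) order.
/// A node's position in the returned vector is its usize index.
fn preorder_labels(root: &Node) -> Vec<String> {
    fn visit(node: &Node, out: &mut Vec<String>) {
        match node {
            Node::Element { name, children } => {
                // A parent is numbered before any of its descendants.
                out.push(format!("<{}>", name));
                for child in children {
                    visit(child, out);
                }
            }
            // Non-element nodes are counted too.
            Node::Text(text) => out.push(format!("text {:?}", text)),
        }
    }
    let mut out = Vec::new();
    visit(root, &mut out);
    out
}

/// Build the toy tree for `<doc><a>hi</a><b/></doc>`.
fn sample_tree() -> Node {
    Node::Element {
        name: "doc".to_string(),
        children: vec![
            Node::Element {
                name: "a".to_string(),
                children: vec![Node::Text("hi".to_string())],
            },
            Node::Element {
                name: "b".to_string(),
                children: Vec::new(),
            },
        ],
    }
}
```

In this toy tree, `<doc>` sits at index 0, `<a>` at 1, the text "hi" at 2 and `<b>` at 3, so an edit that targets `<b>` would refer to index 3 rather than 2.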