Crate tree_edit_distance[][src]

Expand description

Overview

This crate provides an implementation of a recursive generalized version of the Levenshtein distance for arbitrary sequences that finds the smallest possible diff between two trees, according to a user-defined measure for the cost of inserting and removing nodes. The smallest possible diff is defined by the the lowest cost sequence of edits that transforms one tree into the other.

Example

use tree_edit_distance::{Edit, diff, Node};
use std::mem::{discriminant, Discriminant};

enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Map(Vec<(String, Json)>),
}

impl<'n> Node<'n> for Json {
    type Kind = Discriminant<Json>;
    fn kind(&self) -> Self::Kind {
        discriminant(self)
    }

    type Weight = u64;
    fn weight(&self) -> Self::Weight {
        1
    }

    type Child = &'n Self;
    type Children = Box<[Self::Child]>;
    fn children(&'n self) -> Self::Children {
        match self {
            Json::Array(a) => a.iter().collect(),
            Json::Map(m) => m.iter().map(|(_, v)| v).collect(),
            _ => Box::default(),
        }
    }
}

macro_rules! json {
    ($( $tokens:tt )*) => {
        // ...
    };
}

let john = json! {
    "name": "John Doe",
    "age": 43,
    "phones": [
        "+44 1234567",
        "+44 2345678"
    ]
};

let jane = json! {
    "name": "Jane Doe",
    "maiden name": "Smith",
    "age": 40,
    "phones": [
        "+44 7654321",
    ]
};

let (edits, cost) = diff(&john, &jane);

assert_eq!(cost, 2);

assert_eq!(&*edits, &[
    Edit::Replace(Box::new([
        Edit::Replace(Box::default()),      // "name"
        Edit::Insert,                       // "maiden name"
        Edit::Replace(Box::default()),      // "age"
        Edit::Replace(Box::new([            // "phones"
            Edit::Replace(Box::default()),
            Edit::Remove
        ])),
    ]))
]);

Enums

A single operation between two Nodes.

Traits

An abstraction for a generic tree.

Functions

Finds the lowest cost sequence of Edits that transforms one tree of Nodes into the other.