Expand description

Overview

This crate provides an implementation of a recursive generalized version of the Levenshtein distance for arbitrary sequences that finds the smallest possible diff between two trees, according to a user-defined measure for the cost of inserting and removing nodes. The smallest possible diff is defined by the the lowest cost sequence of edits that transforms one tree into the other.

Example

use tree_edit_distance::*;
use std::mem::{discriminant, Discriminant};
use std::iter::empty;

enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Map(Vec<(String, Json)>),
}

impl Node for Json {
    type Kind = Discriminant<Json>;
    fn kind(&self) -> Self::Kind {
        discriminant(self)
    }

    type Weight = u64;
    fn weight(&self) -> Self::Weight {
        1
    }
}

impl Tree for Json {
    type Children<'c> = Box<dyn Iterator<Item = &'c Self> + 'c>
    where
        Self: 'c;

    fn children(&self) -> Self::Children<'_> {
        match self {
            Json::Array(a) => Box::new(a.iter()),
            Json::Map(m) => Box::new(m.iter().map(|(_, v)| v)),
            _ => Box::new(empty()),
        }
    }
}

macro_rules! json {
    ($( $tokens:tt )*) => {
        // ...
    };
}

let john = json! {
    "name": "John Doe",
    "age": 43,
    "phones": [
        "+44 1234567",
        "+44 2345678"
    ]
};

let jane = json! {
    "name": "Jane Doe",
    "maiden name": "Smith",
    "age": 40,
    "phones": [
        "+44 7654321",
    ]
};

let (edits, cost) = diff(&john, &jane);

assert_eq!(cost, 2);

assert_eq!(&*edits, &[
    Edit::Replace(Box::new([
        Edit::Replace(Box::default()),      // "name"
        Edit::Insert,                       // "maiden name"
        Edit::Replace(Box::default()),      // "age"
        Edit::Replace(Box::new([            // "phones"
            Edit::Remove,
            Edit::Replace(Box::default()),
        ])),
    ]))
]);

Enums

A single operation between two Nodes.

Traits

An abstraction for a generic tree node.
An abstraction for a recursive tree.

Functions

Finds the lowest cost sequence of Edits that transforms one Tree into the other.