Eddie
Fast and well-tested implementations of edit distance/string similarity metrics:
- Levenshtein,
- Damerau-Levenshtein,
- Hamming,
- Jaro,
- Jaro-Winkler.
Documentation
See the API reference.
Installation
Add this to your Cargo.toml:
[dependencies]
eddie = "0.3"
Basic usage
Levenshtein:
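use eddie::Levenshtein;
let lev = Levenshtein::new();
// Two substitutions are needed: "t" -> "h" and "h" -> "t".
let dist = lev.distance("martha", "marhta");
assert_eq!(dist, 2);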
Damerau-Levenshtein:
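use eddie::DamerauLevenshtein;
let damlev = DamerauLevenshtein::new();
// A single transposition of the adjacent "th" is enough.
let dist = damlev.distance("martha", "marhta");
assert_eq!(dist, 1);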
Hamming:
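use eddie::Hamming;
let hamming = Hamming::new();
// The strings have equal length and differ in two positions.
let dist = hamming.distance("martha", "marhta");
assert_eq!(dist, Some(2));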
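Hamming distance is only defined for strings of equal length, so a caller has to decide what happens when the lengths differ. The sketch below is a standard-library illustration of that point, not Eddie's implementation: the free function `hamming` is a hypothetical name, and returning `None` for unequal lengths is an assumption made for this example.

```rust
/// Counts the positions at which two strings differ, comparing Unicode
/// scalar values (`char`s) rather than bytes.
/// Returns `None` when the strings have different lengths in chars.
/// Hypothetical helper for illustration; not part of Eddie's API.
fn hamming(a: &str, b: &str) -> Option<usize> {
    let (mut xs, mut ys) = (a.chars(), b.chars());
    let mut dist = 0;
    loop {
        match (xs.next(), ys.next()) {
            // Both strings still have characters: compare them.
            (Some(x), Some(y)) => {
                if x != y {
                    dist += 1;
                }
            }
            // Both ended at the same time: the lengths match.
            (None, None) => return Some(dist),
            // One string ended before the other: the distance is undefined.
            _ => return None,
        }
    }
}
```

With this sketch, `hamming("martha", "marhta")` yields `Some(2)`, while `hamming("abc", "ab")` yields `None`.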
Jaro:
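use eddie::Jaro;
let jaro = Jaro::new();
// Similarity is a float, so compare with a tolerance.
let sim = jaro.similarity("martha", "marhta");
assert!((sim - 0.94).abs() < 0.01);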
Jaro-Winkler:
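use eddie::JaroWinkler;
let jarwin = JaroWinkler::new();
// The shared prefix "mar" raises the score above plain Jaro.
let sim = jarwin.similarity("martha", "marhta");
assert!((sim - 0.96).abs() < 0.01);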
Complementary metrics
The main metric methods are complemented with inverted and/or relative versions. The naming convention across the crate is as follows:
- distance: the number of edits required to transform one string into the other;
- rel_dist: the distance between two strings relative to string length (the inverse of similarity);
- similarity: the similarity between two strings (the inverse of relative distance).
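The relationship between the three kinds of metric can be sketched with the standard library alone. This is an illustration under stated assumptions, not Eddie's code: the free functions `levenshtein`, `rel_dist`, and `similarity` are hypothetical names, and normalising by the length of the longer string is an assumption chosen for this example.

```rust
/// Levenshtein distance over Unicode scalar values, using two rolling rows.
/// Hypothetical helper for illustration; not part of Eddie's API.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] is the distance between the first i-1 chars of `a`
    // and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            curr[j] = (prev[j] + 1) // deletion
                .min(curr[j - 1] + 1) // insertion
                .min(prev[j - 1] + cost); // substitution or match
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Distance relative to the longer string's length, in the range 0.0..=1.0.
/// Assumption: normalise by max(len(a), len(b)); two empty strings give 0.0.
fn rel_dist(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 0.0;
    }
    levenshtein(a, b) as f64 / max_len as f64
}

/// Similarity as the inverse of relative distance.
fn similarity(a: &str, b: &str) -> f64 {
    1.0 - rel_dist(a, b)
}
```

For "martha" and "marhta" this sketch gives a distance of 2, a relative distance of 2/6, and a similarity of 4/6, so the relative distance and the similarity always sum to 1.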
Performance
At the moment Eddie has the fastest implementations among the alternatives from crates.io that have Unicode support.
For example, when comparing common English words, you can expect at least a 1.5-2x speedup for any given algorithm except Hamming.
For detailed measurement tables, see the Benchmarks page.