diff-match-patch-rs: Efficient port of Google's diff-match-patch implemented in Rust
A very fast, accurate, and wasm-ready port of Diff Match Patch in Rust. The diff implementation is based on Myers' diff algorithm.
Highlights of this crate
- Exposes two modes of operating with `diff-match-patch`: an `Efficient` mode and a `Compat` mode. While the `Efficient` mode squeezes out the maximum performance, the `Compat` mode ensures compatibility with other libraries or implementations (Rust or otherwise). According to the benchmarks, our slower `Compat` mode is still faster than other implementations in Rust.
  - `Efficient` mode works on `&[u8]`, and the generated diffs break compatibility with other implementations. Use the `Efficient` mode ONLY if you are using this crate at both the source of diff generation and the destination.
  - `Compat` mode, on the other hand, works on `&[char]`, and the generated `diffs` and `patches` are compatible with other implementations of `diff-match-patch`. Check out `tests/compat.rs` for some test cases around this.
- `wasm` ready, with a demo available.
- Accurate: while working on this crate I realized that a number of implementations have major issues (wrong diffs, inaccurate flows, silent errors, etc.).
- The helper method `pretty_html` provided by this crate accepts some configuration to control the generated visual elements.
- Well tested
- Added a `fuzzer` for sanity
- Exposes the same APIs as Diff Match Patch, with minor changes to make them more idiomatic in Rust.
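The practical difference between working on `&[u8]` and on `&[char]` comes down to the unit being compared. The following is a minimal, std-only sketch that does not use this crate; `common_prefix_len` is a hypothetical helper written for illustration, and the two emoji strings are arbitrary. It shows that a byte-level comparison can stop in the middle of a multi-byte UTF-8 character, while a `char`-level comparison cannot.

```rust
/// Length of the shared prefix of two slices, counted in units of `T`.
/// Hypothetical helper for illustration; not part of diff-match-patch-rs.
fn common_prefix_len<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

fn main() {
    // Two strings that differ only in their final (4-byte) character.
    let old = "ab\u{1F680}";
    let new = "ab\u{1F681}";

    // Byte view: the kind of input a `&[u8]`-based diff compares.
    let byte_prefix = common_prefix_len(old.as_bytes(), new.as_bytes());

    // Char view: the kind of input a `&[char]`-based diff compares.
    let old_chars: Vec<char> = old.chars().collect();
    let new_chars: Vec<char> = new.chars().collect();
    let char_prefix = common_prefix_len(&old_chars, &new_chars);

    println!("shared bytes: {byte_prefix}, shared chars: {char_prefix}");

    // The byte-level boundary lands inside the last character, so it is
    // not a position where the string can be split as text.
    println!(
        "byte boundary is a char boundary: {}",
        old.is_char_boundary(byte_prefix)
    );
}
```

A diff expressed in byte offsets like these is only meaningful to a consumer that also counts in bytes, which is one way to read the advice to use the byte-oriented mode only when this crate sits at both ends.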
Usage Examples
[dependencies]
diff-match-patch-rs = "0.2.0"
`Efficient` mode
use ;
// This is the source text
const TXT_OLD: &str = "I am the very model of a modern Major-General, I've information on vegetable, animal, and mineral, ๐๐๐";
// Let's assume this to be the text that was edited from the source text
const TXT_NEW: &str = "I am the very model of a cartoon individual, My animation's comical, unusual, and whimsical.๐๐";
// An example of a function that creates a diff and returns a set of patches serialized
`Compat` mode
use ;
// This is the source text
const TXT_OLD: &str = "I am the very model of a modern Major-General, I've information on vegetable, animal, and mineral, ๐๐๐";
// Let's assume this to be the text that was edited from the source text
const TXT_NEW: &str = "I am the very model of a cartoon individual, My animation's comical, unusual, and whimsical.๐๐";
// An example of a function that creates a diff and returns a set of patches serialized
Note
The `Efficient` and `Compat` mode APIs are identical, the only change being the generic parameter declared during the calls.
E.g. we initiated a diff in the `Efficient` mode with `dmp.diff_main::<Efficient>( ... )`, while for `Compat` mode we did `dmp.diff_main::<Compat>( ... )`.
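Selecting behaviour through a generic parameter is a common Rust technique, often built on marker types. The sketch below illustrates that general pattern only and makes no claim about this crate's internals: the `Mode` trait, the `ByteMode` and `CharMode` marker types, and `unit_count` are all hypothetical names introduced for this example.

```rust
/// Hypothetical trait: each mode decides which unit a text is split into.
trait Mode {
    type Unit: PartialEq;
    fn units(text: &str) -> Vec<Self::Unit>;
}

/// Marker type standing in for a byte-oriented mode (assumption, not the crate's type).
struct ByteMode;
/// Marker type standing in for a char-oriented mode (assumption, not the crate's type).
struct CharMode;

impl Mode for ByteMode {
    type Unit = u8;
    fn units(text: &str) -> Vec<u8> {
        text.bytes().collect()
    }
}

impl Mode for CharMode {
    type Unit = char;
    fn units(text: &str) -> Vec<char> {
        text.chars().collect()
    }
}

/// One function body serves every mode; only the type parameter changes.
fn unit_count<M: Mode>(text: &str) -> usize {
    M::units(text).len()
}

fn main() {
    // "h", e-acute, "llo": the accented letter takes two bytes in UTF-8.
    let text = "h\u{e9}llo";
    println!("bytes: {}", unit_count::<ByteMode>(text));
    println!("chars: {}", unit_count::<CharMode>(text));
}
```

The call sites differ only in the turbofish, which mirrors how the two modes are chosen in the calls quoted above.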
Please check out the `examples` directory of the source repo for a few common use-cases.
Benchmarks
Benchmarks are maintained in the diff-match-patch-bench repository.
| Lang. | Library | Diff Avg. | Patch Avg. | Bencher | Mode | Correct |
|---|---|---|---|---|---|---|
| rust | diff_match_patch v0.1.1[^2] | 68.108 ms | 10.596 ms | Criterion | - | โ |
| rust | dmp v0.2.0 | 69.019 ms | 14.654 ms | Criterion | - | โ |
| rust | diff-match-patch-rs (ours) | 64.66 ms | 631.13 µs | Criterion | Efficient | โ |
| rust | diff-match-patch-rs (ours) | 64.68 ms | 1.1703 ms | Criterion | Compat | โ |
| go | go-diff | 50.31 ms | 135.2 ms | go test | - | โ |
| node | diff-match-patch[^1] | 246.90 ms | 1.07 ms | tinybench | - | โ |
| python | diff-match-patch | 1.01 s | 0.25 ms | timeit | - | โ |
[^1]: The `patch text` and `delta` generated by diff-match-patch break on `unicode surrogates`.
[^2]: Adds an extra clone to the iterator because the `patch_apply` method takes a mutable reference to `diffs`.
Gotchas
Diff incompatibility with `JavaScript` libs:
There are two kinds of implementations. One kind uses a `postprocessing` function to merge `unicode surrogates`, which breaks compatibility with every other popular `diff-match-patch` implementation. The other kind (packages based on the original implementation) breaks during `urlEncode()` of unicode surrogates.
As of now, this crate breaks compatibility when working with `JS`-generated diffs that use the surrogate patch.
If you are interfacing with `JavaScript` in the browser, using this crate through `wasm` would be ideal.
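To make the surrogate issue concrete, here is a small std-only sketch that is independent of this crate; `utf16_units` and `is_surrogate` are hypothetical helpers, and the sample character is arbitrary. It shows that a character outside the Basic Multilingual Plane becomes two UTF-16 code units, the representation JavaScript strings use, and that cutting between them leaves a lone surrogate that no longer decodes as text.

```rust
/// Encode a string as UTF-16 code units.
/// Hypothetical helper for illustration; not part of diff-match-patch-rs.
fn utf16_units(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}

/// True if `unit` is one half of a surrogate pair.
fn is_surrogate(unit: u16) -> bool {
    (0xD800..=0xDFFF).contains(&unit)
}

fn main() {
    // One `char` in Rust, but two code units in UTF-16.
    let units = utf16_units("\u{1F680}");
    let hex: Vec<String> = units.iter().map(|u| format!("{u:04X}")).collect();
    println!("code units: {hex:?}");
    println!("all surrogates: {}", units.iter().all(|u| is_surrogate(*u)));

    // A diff boundary placed between the two halves leaves a lone
    // surrogate, which cannot be decoded back into a string.
    let lone = &units[..1];
    println!("lone half decodes: {}", String::from_utf16(lone).is_ok());

    // The intact pair round-trips without loss.
    println!("full pair decodes: {}", String::from_utf16(&units).is_ok());
}
```

Any diff or patch format that counts in UTF-16 code units has to decide what happens at such a boundary, which is where the implementations described above diverge.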
Related projects
Diff Match Patch was originally built in 2006 to power Google Docs.
- Diff Match Patch (and its fork)
- Rust: Distil.io diff_match_patch
- Rust: dmp
- Rust: Dissimilar by the awesome David Tolnay
- Rust: diff_match_patch