diff_match_patch_rs/
lib.rs

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
//! # Efficient port of Google's diff-match-patch implemented in Rust
//!
//! [<img alt="github" src="https://img.shields.io/badge/github-Anubhab/diff_match_patch_rs-8da0cb?style=for-the-badge&labelColor=555555&logo=github" height="20">](https://github.com/AnubhabB/diff-match-patch-rs)
//! [<img alt="crates.io" src="https://img.shields.io/crates/v/diff-match-patch-rs" height="20">](https://crates.io/crates/diff-match-patch-rs)
//! [<img alt="docs.rs" src="https://img.shields.io/badge/docs.rs-diff_match_patch_rs?style=for-the-badge&logo=docs.rs&labelColor=%23555555" height="20">](https://docs.rs/diff-match-patch-rs)
//!
//!
//! A very **fast**, **accurate** and **wasm ready** port of [Diff Match Patch](https://github.com/dmsnell/diff-match-patch) in Rust. The
//! diff implementation is based on [Myers' diff algorithm](https://neil.fraser.name/writing/diff/myers.pdf).
//!
//! ## Highlights of this crate
//! - Exposes two modes of operating with `diff-match-patch`, a `Efficient` mode and `Compat` mode. While the `Efficient` mode squeezes out the max performance the `Compat` mode ensures compatibility across other libraries or implementations (rust or otherwise). According to [Benchmarks](#benchmarks), our slower `Compat` mode is still faster than other implementations in rust.
//!     - **`Efficient`** mode works on `&[u8]` and the generated diffs break compatibility with other implementation. Use the **`Efficient`** mode ONLY if you are using this [crate](https://crates.io/crates/diff-match-patch-rs) at the source of diff generation and the destination.
//!     - **`Compat`** mode on the other hand works on `&[char]` and the generated `diffs` and `patches` are compatible across other implementations of `diff-match-patch`. Checkout `tests/compat.rs` for some test cases around this.
//! -  `wasm` ready, you can check out a [demo here](https://github.com/AnubhabB/wasm-diff.git)
//! - **Accurate**, while working on this crate I've realized that there are a bunch of implementations that have major issues (wrong diffs, inaccurate flows, silent errors etc.).
//! - Helper method **pretty_html** provided by this crate allows some configurations to control the generated visuals elements.
//! - Well tested
//! - Added a `fuzzer` for sanity
//! - Exposes the same APIs as [Diff Match Patch](https://github.com/dmsnell/diff-match-patch) with minor changes to make it more idiomatic in Rust.
//!
//! ## Usage Examples
//!
//! ```toml
//! [dependencies]
//! diff-match-patch-rs = "0.3.2"
//! ```
//!
//! ### `Effitient` mode
//!
//! ```rust
//! use diff_match_patch_rs::{DiffMatchPatch, Efficient, Error, PatchInput};
//!
//! // This is the source text
//! const TXT_OLD: &str = "I am the very model of a modern Major-General, I've information on vegetable, animal, and mineral, ๐Ÿš€๐Ÿ‘๐Ÿ‘€";
//!
//! // Let's assume this to be the text that was editted from the source text
//! const TXT_NEW: &str = "I am the very model of a cartoon individual, My animation's comical, unusual, and whimsical.๐Ÿ˜Š๐Ÿ‘€";
//!
//! // An example of a function that creates a diff and returns a set of patches serialized
//! fn at_source() -> Result<String, Error> {
//!     // initializing the module
//!     let dmp = DiffMatchPatch::new();
//!     // create a list of diffs
//!     let diffs = dmp.diff_main::<Efficient>(TXT_OLD, TXT_NEW)?;
//!     // Now, we are going to create a list of `patches` to be applied to the old text to get the new text
//!     let patches = dmp.patch_make(PatchInput::new_diffs(&diffs))?;
//!     // in the real world you are going to transmit or store this diff serialized to undiff format to be consumed or used somewhere elese
//!     let patch_txt = dmp.patch_to_text(&patches);
//!
//!     Ok(patch_txt)
//! }
//!
//! fn at_destination(patches: &str) -> Result<(), Error> {
//!     // initializing the module
//!     let dmp = DiffMatchPatch::new();
//!     // lets recreate the diffs from patches
//!     let patches = dmp.patch_from_text::<Efficient>(patches)?;
//!     // Now, lets apply these patches to the `old_txt` which is the original to get the new text
//!     let (new_txt, ops) = dmp.patch_apply(&patches, TXT_OLD)?;
//!     // Lets print out if the ops succeeded or not
//!     ops.iter()
//!         .for_each(|&o| println!("{}", if o { "OK" } else { "FAIL" }));
//!
//!     // If everything goes as per plan you should see
//!     // OK
//!     // OK
//!     // ... and so on
//!
//!     // lets check out if our `NEW_TXT` (presumably the edited one)
//!     if new_txt != TXT_NEW {
//!         return Err(Error::InvalidInput);
//!     }
//!
//!     println!("Wallah! Patch applied successfully!");
//!
//!     Ok(())
//! }
//!
//! fn main() -> Result<(), Error> {
//!     // At the source of diff where the old text is being edited we'll create a set of patches
//!     let patches = at_source()?;
//!     // We'll send this diff to some destination e.g. db or the client where these changes are going to be applied
//!     // The destination will receive the patch string and will apply the patches to recreate the edits
//!     at_destination(&patches)
//! }
//!
//! ```
//!
//! ### `Compat` mode
//!
//! ```rust
//! use diff_match_patch_rs::{DiffMatchPatch, Compat, Error, PatchInput};
//!
//! // This is the source text
//! const TXT_OLD: &str = "I am the very model of a modern Major-General, I've information on vegetable, animal, and mineral, ๐Ÿš€๐Ÿ‘๐Ÿ‘€";
//!
//! // Let's assume this to be the text that was editted from the source text
//! const TXT_NEW: &str = "I am the very model of a cartoon individual, My animation's comical, unusual, and whimsical.๐Ÿ˜Š๐Ÿ‘€";
//!
//! // An example of a function that creates a diff and returns a set of patches serialized
//! fn at_source() -> Result<String, Error> {
//!     // initializing the module
//!     let dmp = DiffMatchPatch::new();
//!     // create a list of diffs
//!     let diffs = dmp.diff_main::<Compat>(TXT_OLD, TXT_NEW)?;
//!     // Now, we are going to create a list of `patches` to be applied to the old text to get the new text
//!     let patches = dmp.patch_make(PatchInput::new_diffs(&diffs))?;
//!     // in the real world you are going to transmit or store this diff serialized to undiff format to be consumed or used somewhere elese
//!     let patch_txt = dmp.patch_to_text(&patches);

//!     Ok(patch_txt)
//! }
//!
//! fn at_destination(patches: &str) -> Result<(), Error> {
//!     // initializing the module
//!     let dmp = DiffMatchPatch::new();
//!     // lets recreate the diffs from patches
//!     let patches = dmp.patch_from_text::<Compat>(patches)?;
//!     // Now, lets apply these patches to the `old_txt` which is the original to get the new text
//!     let (new_txt, ops) = dmp.patch_apply(&patches, TXT_OLD)?;
//!     // Lets print out if the ops succeeded or not
//!     ops.iter()
//!         .for_each(|&o| println!("{}", if o { "OK" } else { "FAIL" }));
//!
//!     // If everything goes as per plan you should see
//!     // OK
//!     // OK
//!     // ... and so on
//!
//!     // lets check out if our `NEW_TXT` (presumably the edited one)
//!     if new_txt != TXT_NEW {
//!         return Err(Error::InvalidInput);
//!     }
//!
//!     println!("Wallah! Patch applied successfully!");
//!
//!     Ok(())
//! }
//!
//! fn main() -> Result<(), Error> {
//!     // At the source of diff where the old text is being edited we'll create a set of patches
//!     let patches = at_source()?;
//!     // We'll send this diff to some destination e.g. db or the client where these changes are going to be applied
//!     // The destination will receive the patch string and will apply the patches to recreate the edits
//!     at_destination(&patches)
//! }
//! ```
//! ### `Match` - fuzzy match of pattern in Text
//!
//! ```rust
//! use diff_match_patch_rs::{DiffMatchPatch, Efficient, Error, PatchInput};
//! // This is the source text
//! const TXT: &str = "I am the very model of a modern Major-General, I've information on vegetable, animal, and mineral, ๐Ÿš€๐Ÿ‘๐Ÿ‘€";
//!
//! // The patter we are trying to fing
//! const PATTERN: &str = " that berry ";
//!
//! // Returns `location` of match if found, `None` if not found
//! fn main() {
//!     let dmp = DiffMatchPatch::new();
//!
//!     // works with both `Efficient` and `Compat` modes
//!     // `5` here is an approx location to find `nearby` matches
//!     let res = dmp.match_main::<Efficient>(TXT, PATTERN, 5);
//!     println!("{:?}", res); // Should print `Some(4)`
//! }
//! ```
//!
//! #### Note
//! The `Efficient` and `Compat` mode APIs are identical with the only chage being the `generic` parameter declared during the calls.
//!
//! E.g. we initiated a `diff` in the `Efficient` mode with `dmp.diff_main::<Efficient>( ... )` while for `Compat` mode we did `dmp.diff_main::<Compat>( ... )`.
//!
//! Please checkout the `examples` directory of the [source repo](https://github.com/AnubhabB/diff-match-patch-rs/tree/main/examples) for a few common use-cases.
//!
//! <div class="warning">The `Effitient` and `Compat` modes are mutually exclusive and will not generate correct output if used interchangibly at source and destination</div>
//!
//! ## Benchmarks
//! Benchmarks are maintained [diff-match-patch-bench repository](https://github.com/AnubhabB/diff-match-patch-rs-bench)
//!
//! | Lang.   | Library                                                                                  | Diff Avg. | Patch Avg. | Bencher    | Mode        | Correct |
//! |:-------:|:----------------------------------------------------------------------------------------:|:---------:|:----------:|:----------:|:-----------:|:-------:|
//! | `rust`  | [diff_match_patch v0.1.1](https://crates.io/crates/diff_match_patch)[^2]                 | 68.108 ms | 10.596 ms  | Criterion  | -           |    โœ…   |
//! | `rust`  | [dmp v0.2.0](https://crates.io/crates/dmp)                                               | 69.019 ms | 14.654 ms  | Criterion  | -           |    โœ…   |
//! | `rust`  | [diff-match-patch-rs](https://github.com/AnubhabB/diff-match-patch-rs.git)<sup>our</sup> | 64.66 ms  | 631.13 ยตs  | Criterion  | `Efficient` |    โœ…   |
//! | `rust`  | [diff-match-patch-rs](https://github.com/AnubhabB/diff-match-patch-rs.git)<sup>our</sup> | 64.68 ms  | 1.1703 ms  | Criterion  | `Compat`    |    โœ…   |
//! | `go`    | [go-diff](https://github.com/sergi/go-diff)                                              | 50.31 ms  | 135.2 ms   | go test    | -           |    โœ…   |
//! | `node`  | [diff-match-patch](https://www.npmjs.com/package/diff-match-patch)[^1]                   | 246.90 ms | 1.07 ms    | tinybench  | -           |    โŒ   |
//! | `python`| [diff-match-patch](https://pypi.org/project/diff-match-patch/)                           | 1.01 s    | 0.25 ms    | timeit     | -           |    โœ…   |
//!
//! [^1]: [diff-match-patch](https://www.npmjs.com/package/diff-match-patch) generated `patch text` and `delta` breaks on `unicode surrogates`.
//! [^2]: Adds an extra clone to the iterator because the `patch_apply` method takes mutable refc. to `diffs`.
//!
//!
//! ## Gotchas
//! **Diff incompatibility with `JavaScript` libs**:
//!
//! There are 2 kinds of implementations - one which use a `postprocessing` function for merging `unicode surrogates` which break compatibility with every other popular `diff-match-patch` implementations and the other kind (packages based on the original implementation) break while `urlEncode()` of unicode surrogates.
//! As of now, this crate brakes compatibility while working with `JS` generated diffs with the surrogate patch.
//! If you are interfacing with `JavaScript` in browser, using this crate through `wasm` would be ideal.
//!

pub mod dmp;
pub mod errors;
pub mod fuzz;
pub mod html;
pub mod patch_input;
pub mod traits;

pub use dmp::{DiffMatchPatch, Ops, Patch, Patches};
pub use errors::Error;
pub use html::HtmlConfig;
pub use patch_input::PatchInput;
pub(crate) use traits::DType;
pub use traits::{Compat, Efficient};