1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
//! This crate implements diffing utilities.  It attempts to provide an abstraction
//! interface over different types of diffing algorithms.  It's based on the
//! the diff algorithm implementations of [pijul](https://pijul.org/).
//!
//! The API of the crate is split into high and low level functionality.  Most
//! of what you probably want to use is available top level.  Additionally the
//! following sub modules exist:
//!
//! * [`algorithms`]: This implements the different types of diffing algorithms.
//!   It provides both low level access to the algorithms with the minimal
//!   trait bounds necessary, as well as a generic interface.
//! * [`udiff`]: Unified diff functionality.
//!
//! # Sequence Diffing
//!
//! If you want to diff sequences generally indexable things you can use the
//! [`capture_diff`] and [`capture_diff_slices`] functions.  They will directly
//! diff an indexable object or slice and return a vector of [`DiffOp`] objects.
//!
//! ```rust
//! use similar::{Algorithm, capture_diff_slices};
//!
//! let a = vec![1, 2, 3, 4, 5];
//! let b = vec![1, 2, 3, 4, 7];
//! let ops = capture_diff_slices(Algorithm::Myers, &a, &b);
//! ```
//!
//! # Text Diffing
//!
//! Similar provides helpful utilities for text (and more specifically line) diff
//! operations.  The main type you want to work with is [`TextDiff`] which
//! uses the underlying diff algorithms to expose a convenient API to work with
//! texts:
//!
//! ```rust
//! # #[cfg(feature = "text")] {
//! use similar::{ChangeTag, TextDiff};
//!
//! let diff = TextDiff::from_lines(
//!     "Hello World\nThis is the second line.\nThis is the third.",
//!     "Hallo Welt\nThis is the second line.\nThis is life.\nMoar and more",
//! );
//!
//! for op in diff.ops() {
//!     for change in diff.iter_changes(op) {
//!         let sign = match change.tag() {
//!             ChangeTag::Delete => "-",
//!             ChangeTag::Insert => "+",
//!             ChangeTag::Equal => " ",
//!         };
//!         print!("{}{}", sign, change);
//!     }
//! }
//! # }
//! ```
//!
//! ## Trailing Newlines
//!
//! When working with line diffs (and unified diffs in general) there are two
//! "philosophies" to look at lines.  One is to diff lines without their newline
//! character, the other is to diff with the newline character.  Typically the
//! latter is done because text files do not _have_ to end in a newline character.
//! As a result there is a difference between `foo\n` and `foo` as far as diffs
//! are concerned.
//!
//! In similar this is handled on the [`Change`] or [`InlineChange`] level.  If
//! a diff was created via [`TextDiff::from_lines`] the text diffing system is
//! instructed to check if there are missing newlines encountered
//! ([`TextDiff::newline_terminated`] returns true).
//!
//! In any case the [`Change`] object has a convenience method called
//! [`Change::missing_newline`] which returns `true` if the change is missing
//! a trailing newline.  Armed with that information the caller knows to handle
//! this by either rendering a virtual newline at that position or to indicate
//! it in different ways.  For instance the unified diff code will render the
//! special `\ No newline at end of file` marker.
//!
//! ## Bytes vs Unicode
//!
//! Similar module concerns itself with a loser definition of "text" than you would
//! normally see in Rust.  While by default it can only operate on [`str`] types
//! by enabling the `bytes` feature it gains support for byte slices with some
//! caveats.
//!
//! A lot of text diff functionality assumes that what is being diffed constitutes
//! text, but in the real world it can often be challenging to ensure that this is
//! all valid utf-8.  Because of this the crate is built so that most functionality
//! also still works with bytes for as long as they are roughly ASCII compatible.
//!
//! This means you will be successful in creating a unified diff from latin1
//! encoded bytes but if you try to do the same with EBCDIC encoded bytes you
//! will only get garbage.
//!
//! # Ops vs Changes
//!
//! Because very commonly two compared sequences will largely match this module
//! splits it's functionality into two layers:
//!
//! Changes are encoded as [diff operations](crate::DiffOp).  These are
//! ranges of the differences by index in the source sequence.  Because this
//! can be cumbersome to work with a separate method [`DiffOp::iter_changes`]
//! (and [`TextDiff::iter_changes`] when working with text diffs) is provided
//! which expands all the changes on an item by item level encoded in an operation.
//!
//! As the [`TextDiff::grouped_ops`] method can isolate clusters of changes
//! this even works for very long files if paired with this method.
//!
//! # Feature Flags
//!
//! The crate by default does not have any dependencies however for some use
//! cases it's useful to pull in extra functionality.  Likewise you can turn
//! off some functionality.
//!
//! * `text`: this feature is enabled by default and enables the text based
//!   diffing types such as [`TextDiff`].
//!   If the crate is used without default features it's removed.
//! * `unicode`: when this feature is enabled the text diffing functionality
//!   gains the ability to diff on a grapheme instead of character level.  This
//!   is particularly useful when working with text containing emojis.  This
//!   pulls in some relatively complex dependencies for working with the unicode
//!   database.
//! * `bytes`: this feature adds support for working with byte slices in text
//!   APIs in addition to unicode strings.  This pulls in the
//!   [`bstr`] dependency.
//! * `inline`: this feature gives access to additional functionality of the
//!   text diffing to provide inline information about which values changed
//!   in a line diff.  This currently also enables the `unicode` feature.
#![warn(missing_docs)]
pub mod algorithms;
#[cfg(feature = "text")]
pub mod udiff;

mod common;
#[cfg(feature = "text")]
mod text;
mod types;

pub use self::common::*;
#[cfg(feature = "text")]
pub use self::text::*;
pub use self::types::*;