//! # Deencode: Reverse engineer encoding errors
//!
//! The goal of this crate is to automatically explore the result of
//! successively encoding then decoding a string using different encoding
//! schemes, which usually results in some corruption of the non-ASCII
//! characters.
//!
//! ## Concepts
//!
//! * [Engines](engine/trait.Engine.html) are objects that represent an encoding
//! scheme, and can be used to encode (String to bytes) or decode (bytes to
//! String). A number of engines are already implemented in this crate and are
//! available as static instances.
//! * The structure of deencoding is a
//! [tree](deencodetree/struct.DeencodeTree.html): from an input string, every
//! engine may give an encoding, then every engine gives a decoding of that
//! encoding, and so on.
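//!
//! As a minimal sketch of the kind of corruption a single encode-then-decode
//! step produces, the standalone example below encodes a string as UTF-8 and
//! decodes the resulting bytes as ISO-8859-1. It uses only the standard
//! library, not this crate's engines: `decode_latin1` and `utf8_then_latin1`
//! are hypothetical helper names, and the strict byte-to-code-point mapping is
//! an assumption that may differ from what the provided `LATIN1` engine does.
//!
//! ```rust
//! /// Decode bytes as ISO-8859-1: every byte maps to the Unicode code point
//! /// with the same numeric value (assumed mapping, for illustration only).
//! fn decode_latin1(bytes: &[u8]) -> String {
//!     bytes.iter().map(|&b| b as char).collect()
//! }
//!
//! /// Encode `input` as UTF-8, then decode those bytes as ISO-8859-1.
//! fn utf8_then_latin1(input: &str) -> String {
//!     // A Rust `&str` is already UTF-8, so `as_bytes()` is the encoding step.
//!     decode_latin1(input.as_bytes())
//! }
//! ```
//!
//! With these helpers, `utf8_then_latin1("Clément")` yields `"ClÃ©ment"`,
//! while pure ASCII input comes back unchanged.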
//!
//! > _Note_: The deencoding process does not detect or skip steps it has
//! > already performed, so it is recommended to keep the depth small.
//! > Deduplication can then be applied to remove duplicate entries from the tree.
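//!
//! To give a rough idea of why small depths are advisable, the sketch below
//! computes an upper bound on the number of nodes below the root, assuming
//! every engine produces a result at every step and nothing is deduplicated.
//! `max_nodes` is a hypothetical helper, not part of this crate, and the real
//! tree may be smaller whenever an engine gives no result for a step.
//!
//! ```rust
//! /// Upper bound on the node count below the root for `engine_count` engines
//! /// and `encoding_depth` encoding steps (each followed by a decoding step).
//! fn max_nodes(engine_count: u64, encoding_depth: u32) -> u64 {
//!     // Each encoding step is followed by a decoding step.
//!     let levels = 2 * encoding_depth;
//!     let mut total: u64 = 0;
//!     let mut level_size: u64 = 1;
//!     for _ in 0..levels {
//!         // Every node of the previous level gets one child per engine.
//!         level_size = level_size.saturating_mul(engine_count);
//!         total = total.saturating_add(level_size);
//!     }
//!     total
//! }
//! ```
//!
//! Under these assumptions, 5 engines give at most 30 nodes for an encoding
//! depth of 1, 780 for a depth of 2, and 19530 for a depth of 3.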
//!
//! ## Usage
//!
//! ```rust
//! use deencode::*;
//!
//! // List the engines to use.
//! let engines: Vec<&dyn Engine> = vec![&UTF8, &LATIN1, &MIXED816BE, &MIXED816LE, &UTF7];
//! // Explore the tree of possible encodings and decodings.
//! let mut tree = deencode("Clément", &engines, 1);
//! // Remove duplicate entries from the tree.
//! let _ = tree.deduplicate();
//!
//! // Export the tree with box drawings.
//! println!("{}", tree);
//! // Export the tree as JSON.
//! println!("{}", serde_json::to_string(&tree).unwrap());
//! ```
pub use Engine;
pub use DeencodeTree;
/// Provided engine for ISO-8859-7 / Codepage 1253.
pub static CP1253: CP1253Engine = CP1253Engine ;
/// Provided engine for ISO-8859-9 / Codepage 1254.
pub static CP1254: CP1254Engine = CP1254Engine ;
/// Provided engine for ISO-8859-8 / Codepage 1255.
pub static CP1255: CP1255Engine = CP1255Engine ;
/// Provided engine for Latin-1 / ISO-8859-1 / Codepage 1252.
pub static LATIN1: Latin1Engine = Latin1Engine ;
/// Provided engine for Latin-2 / ISO-8859-2 / Codepage 1250.
pub static LATIN2: Latin2Engine = Latin2Engine ;
/// Provided engine for a mixed UTF-8/UTF-16BE scheme.
pub static MIXED816BE: Mixed816BEEngine =
Mixed816BEEngine ;
/// Provided engine for a mixed UTF-8/UTF-16LE scheme.
pub static MIXED816LE: Mixed816LEEngine =
Mixed816LEEngine ;
/// Provided engine for UTF-7.
pub static UTF7: Utf7Engine = Utf7Engine ;
/// Provided engine for UTF-8.
pub static UTF8: Utf8Engine = Utf8Engine ;
/// Build a [`DeencodeTree`] by successively running encodings and decodings
/// through the engines.
///
/// Alias of [`DeencodeTree::deencode()`].
///
/// `encoding_depth` specifies the number of _encoding_ steps, which are always
/// followed by a decoding step, so the actual depth of the generated tree is
/// `2 * encoding_depth`.
///
/// The process starts with an encoding step, so `encoding_depth` must not be
/// `0` (see
/// [`EncodeNode::make_nodes()`](deencodetree::EncodeNode::make_nodes)'s
/// documentation).
///
/// The order of the engines matters for [`DeencodeTree::deduplicate()`].