Crate data_encoding
Correct, efficient, canonical, and generic data-encoding functions
This crate provides little-endian ASCII base-conversion encodings for bases of size 2, 4, 8, 16, 32, and 64. It supports both padded and non-padded encodings, canonical encodings (trailing bits are checked), in-place encoding and decoding functions, non-canonical symbols, and both most-significant-first and least-significant-first bit order. The performance of the encoding and decoding functions is similar to existing implementations (see how to run the benchmarks on github).
This is the library documentation. If you are looking for the binary, see the installation instructions on github.
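To make the two bit orders mentioned above concrete, here is a minimal standard-library sketch for base 2 only. The function `encode_base2` and its `msb_first` flag are hypothetical names used for illustration; they are not part of this crate, and the crate's own behavior for larger bases is not shown here.

```rust
/// Encodes bytes in base 2 with the symbols '0' and '1', reading the bits
/// of each byte either most significant first or least significant first.
/// Hypothetical helper for illustration, not the crate's API.
fn encode_base2(data: &[u8], msb_first: bool) -> String {
    let mut out = String::with_capacity(data.len() * 8);
    for &byte in data {
        for i in 0..8 {
            // Pick which bit of the byte is emitted at position `i`.
            let shift = if msb_first { 7 - i } else { i };
            out.push(if (byte >> shift) & 1 == 1 { '1' } else { '0' });
        }
    }
    out
}

fn bit_order_demo() {
    // 0x05 is 0000_0101 in binary.
    assert_eq!(encode_base2(&[0x05], true), "00000101");
    assert_eq!(encode_base2(&[0x05], false), "10100000");
}
```

The same byte yields two different strings, so both sides of an exchange have to agree on the bit order.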
Examples
This crate provides predefined encodings as constants. These constants are
of type `Padded` or `NoPad`, depending on whether they use padding or not.
These types provide encoding and decoding functions with in-place or
allocating variants. Here is an example using the allocating encoding
function of base64:
    use data_encoding::BASE64;
    assert_eq!(BASE64.encode(b"Hello world"), "SGVsbG8gd29ybGQ=");
It is also possible to use the non-padded version of base64 by calling the
`no_pad` method of `Padded`:
    use data_encoding::BASE64;
    assert_eq!(BASE64.no_pad().encode(b"Hello world"), "SGVsbG8gd29ybGQ");
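The relation between the padded and non-padded outputs above can be sketched with the standard library alone. The helpers `padded_len`, `no_pad_len`, and `strip_padding` are hypothetical names for illustration and are not functions of this crate; the length formulas are the usual base64 arithmetic.

```rust
/// Length of the padded base64 encoding of `len` bytes:
/// every group of up to 3 bytes becomes exactly 4 symbols.
fn padded_len(len: usize) -> usize {
    (len + 2) / 3 * 4
}

/// Length of the non-padded base64 encoding of `len` bytes:
/// one symbol per 6 bits, rounded up.
fn no_pad_len(len: usize) -> usize {
    (len * 8 + 5) / 6
}

/// Removes the trailing '=' padding from a padded base64 string.
fn strip_padding(encoded: &str) -> &str {
    encoded.trim_end_matches('=')
}

fn padding_demo() {
    let padded = "SGVsbG8gd29ybGQ=";
    let no_pad = "SGVsbG8gd29ybGQ";
    assert_eq!(strip_padding(padded), no_pad);
    assert_eq!(padded.len(), padded_len(b"Hello world".len()));
    assert_eq!(no_pad.len(), no_pad_len(b"Hello world".len()));
}
```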
Here is an example using the in-place decoding function of base32:
    use data_encoding::BASE32;
    let input = b"JBSWY3DPEB3W64TMMQ======";
    let mut output = vec![0; BASE32.decode_len(input.len()).unwrap()];
    let len = BASE32.decode_mut(input, &mut output).unwrap();
    assert_eq!(&output[0 .. len], b"Hello world");
You are not limited to the predefined encodings. You may define your own
encodings (with the same correctness and performance properties as the
predefined ones) using the `Builder` type:
    use data_encoding::Builder;
    let hex = Builder::new(b"0123456789abcdef").no_pad().unwrap();
    assert_eq!(hex.encode(b"hello"), "68656c6c6f");
Properties
The base16, base32, base32hex, base64, and base64url predefined encodings conform to RFC4648.
The encoding and decoding functions satisfy the following properties:
- They are deterministic: their output only depends on their input
- They have no side-effects: they do not modify a hidden mutable state
- They are correct: encoding then decoding gives the initial data
- They are canonical (unless non-canonical symbols are used or checking trailing bits is disabled): decoding then encoding gives the initial data
This last property is usually not satisfied by common base64 implementations
(like the `rustc-serialize` crate, the `base64` crate, or the `base64` GNU
program). This is a matter of choice, and this crate has made the choice to
let the user choose. Support for canonical encoding as described by the
RFC is provided. But it is also possible to disable checking
trailing bits, to add non-canonical symbols, and to decode concatenated
padded inputs.
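The "correct" and "canonical" properties listed above can be stated as two round-trip checks. The sketch below uses a tiny lower-case hexadecimal codec written with the standard library only; `encode`, `decode`, and `value` are hypothetical stand-ins for illustration and are not this crate's implementation.

```rust
// Hypothetical stand-in for an encoding; not the crate's implementation.
const SYMBOLS: &[u8; 16] = b"0123456789abcdef";

/// Encodes each byte as two lower-case hexadecimal symbols.
fn encode(data: &[u8]) -> String {
    let mut out = String::with_capacity(data.len() * 2);
    for &byte in data {
        out.push(SYMBOLS[(byte >> 4) as usize] as char);
        out.push(SYMBOLS[(byte & 0x0f) as usize] as char);
    }
    out
}

/// Returns the value of a symbol, or `None` if it is not in `SYMBOLS`.
fn value(symbol: u8) -> Option<u8> {
    SYMBOLS.iter().position(|&s| s == symbol).map(|i| i as u8)
}

/// Decodes only strings that `encode` could have produced.
fn decode(text: &str) -> Option<Vec<u8>> {
    let bytes = text.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| Some((value(pair[0])? << 4) | value(pair[1])?))
        .collect()
}

fn properties_demo() {
    // Correct: encoding then decoding gives the initial data.
    let data = b"hello";
    assert_eq!(decode(&encode(data)).as_deref(), Some(&data[..]));
    // Canonical: decoding then encoding gives the initial text.
    let text = "68656c6c6f";
    assert_eq!(encode(&decode(text).unwrap()), text);
    // Upper-case symbols are never produced by `encode`, so they are rejected.
    assert_eq!(decode("68656C6C6F"), None);
}
```

A decoder that also accepted upper-case symbols would still be correct, but it would no longer be canonical, since two different texts would decode to the same data.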
Since the RFC specifies the encoding function on all inputs and the decoding
function on all possible encoded outputs, the differences between
implementations come from the decoding function which may be more or less
permissive. In this crate, the decoding function of canonical encodings
rejects all inputs that are not a possible output of the encoding function.
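As a rough illustration of what "rejects all inputs that are not a possible output of the encoding function" means for padded base64, here is a minimal strict decoder using only the standard library. It is a sketch under stated assumptions, not this crate's code: `decode_canonical` is a hypothetical name, and it returns `Option` instead of the crate's error type.

```rust
// Hypothetical sketch of a strict (canonical) padded base64 decoder.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Returns the 6-bit value of a symbol, or `None` if it is not in `ALPHABET`.
fn value(symbol: u8) -> Option<u32> {
    ALPHABET.iter().position(|&s| s == symbol).map(|i| i as u32)
}

/// Decodes padded base64, rejecting anything an encoder could not produce.
fn decode_canonical(input: &[u8]) -> Option<Vec<u8>> {
    // Padded output always has a length that is a multiple of 4.
    if input.len() % 4 != 0 {
        return None;
    }
    let blocks = input.len() / 4;
    let mut output = Vec::new();
    for (index, block) in input.chunks(4).enumerate() {
        // Padding may only appear at the end of the last block.
        let pad = block.iter().rev().take_while(|&&b| b == b'=').count();
        if pad > 2 || (pad > 0 && index + 1 != blocks) {
            return None;
        }
        // Gather the 6-bit values, then align them to 24 bits.
        let mut bits: u32 = 0;
        for &symbol in &block[..4 - pad] {
            bits = (bits << 6) | value(symbol)?;
        }
        bits <<= 6 * pad as u32;
        // Trailing bits: everything below the last full byte must be zero.
        let trailing_mask = (1u32 << (8 * pad)) - 1;
        if (bits & trailing_mask) != 0 {
            return None;
        }
        output.extend_from_slice(&bits.to_be_bytes()[1..4 - pad]);
    }
    Some(output)
}

fn canonical_demo() {
    assert_eq!(decode_canonical(b"AAA="), Some(vec![0, 0]));
    assert_eq!(decode_canonical(b"AAB="), None); // non-zero trailing bits
    assert_eq!(decode_canonical(b"AAB"), None); // missing padding
    assert_eq!(decode_canonical(b"AA==AA=="), None); // concatenated padded input
}
```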
Here are some concrete examples of decoding differences between this crate,
the `rustc-serialize` crate, the `base64` crate, and the `base64` GNU program:
Input | data-encoding | rustc | base64 | GNU base64 |
---|---|---|---|---|
AAB= | Trailing(2) | [0, 0] | [0, 0] | \x00\x00 |
AA\nB= | Length(4) | [0, 0] | Err(2) | \x00\x00 |
AAB | Length(0) | [0, 0] | [0, 0] | Invalid input |
A\rA\nB= | Length(4) | [0, 0] | Err(1) | Invalid input |
-_\r\n | Symbol(0) | [251] | Err(0) | Invalid input |
AA==AA== | Symbol(2) | Err | Err(2) | \x00\x00 |
We can summarize these discrepancies as follows:
Discrepancy | data-encoding | rustc | base64 | GNU base64 |
---|---|---|---|---|
Non-zero trailing bits | No | Yes | Yes | Yes |
Ignored characters | None | \r and \n | None | \n |
Translated characters | None | -_ mapped to +/ | None | None |
Padding is optional | No | Yes | Yes | No |
Concatenated padded input | No | No | No | Yes |
This crate can be configured to ignore non-zero trailing bits, to translate symbols, to use non-padded encodings, and to decode concatenated padded inputs. However, it cannot ignore characters. This has to be done in a preprocessing stage, as the binary does. Support in the library may be added in future versions.
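A preprocessing stage of this kind can be as small as a byte filter applied before decoding. The sketch below uses only the standard library; `strip_ignored` is a hypothetical helper for illustration, not part of this crate, and it is not taken from the binary's source.

```rust
/// Removes every byte listed in `ignored` from `input`, so that the result
/// can be handed to a decoder that does not ignore any character.
/// Hypothetical helper for illustration.
fn strip_ignored(input: &[u8], ignored: &[u8]) -> Vec<u8> {
    input
        .iter()
        .copied()
        .filter(|byte| !ignored.contains(byte))
        .collect()
}

fn preprocessing_demo() {
    // Line breaks inside the input are dropped before decoding.
    assert_eq!(strip_ignored(b"A\rA\nB=", b"\r\n"), b"AAB=".to_vec());
    // With an empty ignore list the input is unchanged.
    assert_eq!(strip_ignored(b"AA\nB=", b""), b"AA\nB=".to_vec());
}
```

Filtering into a new buffer keeps the decoder strict: which characters are ignored stays an explicit decision of the caller.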
Migration
The changelog describes the changes between v1 and v2. Here are the migration steps for common usage:
v1 | v2 |
---|---|
use data_encoding::baseNN | use data_encoding::BASENN |
baseNN::function | BASENN.method |
baseNN::function_nopad | BASENN.no_pad().method |
Structs
Array128 | Convenience wrapper for |
Builder | Base representation |
BuilderError | Base building error |
DecodeError | Decoding error |
NoPad | Base-conversion encoding (without padding) |
Padded | Padded base-conversion encoding |
Enums
BitOrder | Order in which bits are read from a byte |
DecodeKind | Decoding error kind |
Constants
BASE32 | RFC4648 base32 encoding |
BASE64 | RFC4648 base64 encoding |
BASE32HEX | RFC4648 base32hex encoding |
BASE64URL | RFC4648 base64url encoding |
HEXLOWER | Lower-case hexadecimal encoding |
HEXLOWER_PERMISSIVE | Lower-case permissive hexadecimal encoding |
HEXUPPER | RFC4648 hex encoding (upper-case hexadecimal encoding) |