Crate data_encoding
This crate provides generic data encoding functions.
Encoding and decoding functions with and without allocation are provided for common bases. Those functions are instantiated from generic functions using a base interface described in module base. The generic encoding and decoding functions are defined in the encode and decode modules respectively.
Examples
use data_encoding::hex;
use data_encoding::base64;
assert_eq!(hex::encode(b"some raw data"), "736F6D65207261772064617461");
assert_eq!(base64::decode(b"c29tZSByYXcgZGF0YQ==").unwrap(), b"some raw data");
A more involved example is available in the examples directory. It is similar to the base64 GNU program, but it works for all common bases and also for custom bases defined at runtime. The make encode command builds this example in target/release/examples/encode.
Conformance
This crate is meant to be conformant. The base16, hex, base32, base32hex, base64, and base64url modules conform to RFC 4648.
Properties
This crate is meant to provide strong properties. The encoding and decoding functions satisfy the following properties:
- They are deterministic: their output only depends on their input.
- They have no side-effects: they do not modify a hidden mutable state.
- They never panic, although the decoding function may return a decoding error on invalid input.
- They are inverses of each other:
  - For all data: Vec<u8>, we have decode(encode(&data).as_bytes()) == Ok(data).
  - For all repr: String, if there is data: Vec<u8> such that decode(repr.as_bytes()) == Ok(data), then encode(&data) == repr.
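The two inverse properties can be illustrated with a small self-contained sketch. The functions encode_hex and decode_hex below are hypothetical stand-ins written with the standard library only; they are not this crate's implementation or API. They only assume an uppercase base16 alphabet, as in the example above.

```rust
/// Hypothetical stand-in encoder (not the crate's API): uppercase base16.
fn encode_hex(data: &[u8]) -> String {
    const ALPHABET: &[u8; 16] = b"0123456789ABCDEF";
    let mut out = String::with_capacity(data.len() * 2);
    for &byte in data {
        out.push(ALPHABET[(byte >> 4) as usize] as char);
        out.push(ALPHABET[(byte & 0x0F) as usize] as char);
    }
    out
}

/// Hypothetical stand-in strict decoder: it accepts only strings that
/// `encode_hex` can produce (even length, digits and uppercase A-F).
fn decode_hex(repr: &[u8]) -> Result<Vec<u8>, String> {
    fn value(symbol: u8) -> Option<u8> {
        match symbol {
            b'0'..=b'9' => Some(symbol - b'0'),
            b'A'..=b'F' => Some(symbol - b'A' + 10),
            _ => None,
        }
    }
    if repr.len() % 2 != 0 {
        return Err("bad length".to_string());
    }
    let mut out = Vec::with_capacity(repr.len() / 2);
    for (i, pair) in repr.chunks(2).enumerate() {
        let hi = value(pair[0]).ok_or_else(|| format!("bad character at {}", 2 * i))?;
        let lo = value(pair[1]).ok_or_else(|| format!("bad character at {}", 2 * i + 1))?;
        out.push((hi << 4) | lo);
    }
    Ok(out)
}

/// First property: decoding an encoding gives back the data.
fn round_trips(data: &[u8]) -> bool {
    decode_hex(encode_hex(data).as_bytes()) == Ok(data.to_vec())
}

/// Second property: any accepted representation is the canonical one.
fn is_canonical(repr: &str) -> bool {
    match decode_hex(repr.as_bytes()) {
        Ok(data) => encode_hex(&data) == repr,
        Err(_) => true, // rejected inputs satisfy the property vacuously
    }
}
```

Because the stand-in decoder rejects lowercase digits, a string such as 6f is an error rather than a second representation of the byte 0x6F, which is what makes the second property hold in this sketch.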
This last property, that encode and decode are inverses of each other, is usually not satisfied by common base64 implementations, like the rustc-serialize crate or the base64 GNU program. This is a matter of choice, and this crate has made the choice to guarantee canonical encoding as described by section 3.5 of the RFC.
Since the RFC specifies encode on all inputs and decode on all possible encode outputs, the differences between implementations come from the decode function which may be more or less permissive. In this crate, the decode function rejects all inputs that are not a possible output of the encode function. A pre-treatment of the input has to be done to be more permissive (see the example in the examples directory). Here are some concrete examples of decoding differences between this crate, the rustc-serialize crate, and the base64 GNU program:
Input | data-encoding | rustc-serialize | GNU base64 |
---|---|---|---|
AAB= | Err(BadPadding) | Ok(vec![0, 0]) | \x00\x00 |
AA\nB= | Err(BadLength) | Ok(vec![0, 0]) | \x00\x00 |
AAB | Err(BadLength) | Ok(vec![0, 0]) | Invalid input |
A\rA\nB= | Err(BadLength) | Ok(vec![0, 0]) | Invalid input |
-_\r\n | Err(BadCharacter(0)) | Ok(vec![251]) | Invalid input |
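To see why an input such as AAB= is not canonical, it can help to look at the bits of the last symbol before the padding. The sketch below is a standalone illustration using only the standard library; the helper names base64_value and is_canonical_last_block are hypothetical and do not reflect how this crate performs the check or which error it reports.

```rust
/// Value of a symbol in the standard base64 alphabet of RFC 4648.
fn base64_value(symbol: u8) -> Option<u8> {
    match symbol {
        b'A'..=b'Z' => Some(symbol - b'A'),
        b'a'..=b'z' => Some(symbol - b'a' + 26),
        b'0'..=b'9' => Some(symbol - b'0' + 52),
        b'+' => Some(62),
        b'/' => Some(63),
        _ => None,
    }
}

/// Hypothetical helper: checks the final 4-character block of a padded
/// base64 input. Returns `None` if the block is malformed (bad symbol or
/// more than two padding characters), otherwise whether the bits of the
/// last symbol that carry no data are all zero.
fn is_canonical_last_block(block: &[u8; 4]) -> Option<bool> {
    let padding = block.iter().rev().take_while(|&&c| c == b'=').count();
    // Number of low bits of the last symbol that carry no data.
    let unused_bits = match padding {
        0 => 0,
        1 => 2, // 3 symbols = 18 bits, 16 of them are data
        2 => 4, // 2 symbols = 12 bits, 8 of them are data
        _ => return None,
    };
    let mut last = 0u8;
    for &symbol in &block[..4 - padding] {
        last = base64_value(symbol)?;
    }
    let mask = (1u8 << unused_bits) - 1;
    Some((last & mask) == 0)
}
```

For AAB= the last symbol B has value 1, so its two non-significant bits are 01 and the helper returns Some(false), whereas AAA= encodes the same two zero bytes canonically and returns Some(true).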
We can summarize these discrepancies as follows:
Discrepancy | data-encoding | rustc-serialize | GNU base64 |
---|---|---|---|
Non-significant bits before padding may be non-null | No | Yes | Yes |
Non-alphabet ignored characters | None | \r and \n | \n |
Non-alphabet translated characters | None | -_ mapped to +/ | None |
Padding is optional | No | Yes | No |
This crate may provide wrappers to accept these discrepancies in a generic way at some point in the future.
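Until such wrappers exist, a caller could apply a pre-treatment before handing the input to a strict decoder. The following is a minimal sketch, assuming the caller wants the three lenient behaviours from the table that can be handled by rewriting the input: ignoring \r and \n, mapping -_ to +/, and optional padding. The function name pretreat_base64 is hypothetical and is not part of this crate.

```rust
/// Hypothetical pre-treatment (not part of the crate): rewrites a lenient
/// base64 input into a form a strict decoder is more likely to accept.
fn pretreat_base64(input: &[u8]) -> Vec<u8> {
    let mut out: Vec<u8> = input
        .iter()
        // Drop the characters that lenient decoders ignore.
        .filter(|&&c| c != b'\r' && c != b'\n')
        // Translate the URL-safe symbols to the standard alphabet.
        .map(|&c| match c {
            b'-' => b'+',
            b'_' => b'/',
            other => other,
        })
        .collect();
    // Restore optional padding up to a multiple of 4 characters. An input
    // whose length is 1 modulo 4 gets three '=' and stays invalid.
    while out.len() % 4 != 0 {
        out.push(b'=');
    }
    out
}
```

This sketch deliberately leaves non-zero non-significant bits alone: A\rA\nB= becomes AAB= and -_\r\n becomes +/==, both of which a canonical decoder still rejects. Accepting them would require clearing those bits, which changes the input rather than merely reformatting it.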
Performance
This crate is meant to be efficient. It has comparable performance to the rustc-serialize crate and the base64 GNU program. The make bench command runs some benchmarks using cargo and a shell script.
Reexports
pub use base16 as hex;
Modules
base | Generic base module. |
base16 | Base 16 Encoding. |
base2 | Base 2 Encoding. |
base32 | Base 32 Encoding. |
base32hex | Base 32 Encoding with Extended Hex Alphabet. |
base4 | Base 4 Encoding. |
base64 | Base 64 Encoding. |
base64url | Base 64 Encoding with URL and Filename Safe Alphabet. |
base8 | Base 8 Encoding. |
decode | Generic decoding module. |
encode | Generic encoding module. |