Crate data_encoding

This crate provides generic data encoding functions.

Encoding and decoding functions with and without allocation are provided for common bases. Those functions are instantiated from generic functions using a base interface described in module base. The generic encoding and decoding functions are defined in the encode and decode modules respectively.

Examples

use data_encoding::hex;
use data_encoding::base64;
assert_eq!(hex::encode(b"some raw data"), "736F6D65207261772064617461");
assert_eq!(base64::decode(b"c29tZSByYXcgZGF0YQ==").unwrap(), b"some raw data");

A more involved example is available in the examples directory. It is similar to the base64 GNU program, but it works for all common bases and also for custom bases defined at runtime. The make encode command builds this example in target/release/examples/encode.

Conformance

This crate is meant to be conformant. The base16, hex, base32, base32hex, base64, and base64url modules conform to RFC 4648.

Properties

This crate is meant to provide strong properties. The encoding and decoding functions satisfy the following properties:

  • They are deterministic: their output only depends on their input.
  • They have no side effects: they do not modify any hidden mutable state.
  • They never panic, although the decoding function may return a decoding error on invalid input.
  • They are inverses of each other:
    • For all data: Vec<u8>, we have decode(encode(&data).as_bytes()) == Ok(data).
    • For all repr: String, if there is data: Vec<u8> such that decode(repr.as_bytes()) == Ok(data), then encode(&data) == repr.
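The two inverse properties can be illustrated with a small self-contained sketch. The code below is not this crate's implementation: it is a hypothetical upper-case hexadecimal codec written only for illustration, and the helper names roundtrip_holds and canonical_holds are assumptions introduced here to state each property as a checkable predicate.

```rust
// Illustrative stand-in for an upper-case hex codec; NOT the crate's code.
const HEX: &[u8; 16] = b"0123456789ABCDEF";

/// Encodes bytes as upper-case hexadecimal.
fn encode(data: &[u8]) -> String {
    let mut out = String::with_capacity(data.len() * 2);
    for &byte in data {
        out.push(HEX[(byte >> 4) as usize] as char);
        out.push(HEX[(byte & 0x0F) as usize] as char);
    }
    out
}

/// Decodes upper-case hexadecimal, rejecting anything `encode` cannot produce.
fn decode(repr: &[u8]) -> Result<Vec<u8>, String> {
    if repr.len() % 2 != 0 {
        return Err("bad length".to_string());
    }
    // Maps a symbol to its 4-bit value, or None if it is not in the alphabet.
    let value = |c: u8| -> Option<u8> { HEX.iter().position(|&h| h == c).map(|p| p as u8) };
    let mut out = Vec::with_capacity(repr.len() / 2);
    for pair in repr.chunks(2) {
        match (value(pair[0]), value(pair[1])) {
            (Some(hi), Some(lo)) => out.push((hi << 4) | lo),
            _ => return Err("bad character".to_string()),
        }
    }
    Ok(out)
}

/// First property: decoding an encoding yields the original data.
fn roundtrip_holds(data: &[u8]) -> bool {
    decode(encode(data).as_bytes()) == Ok(data.to_vec())
}

/// Second property: every accepted representation is the canonical one.
fn canonical_holds(repr: &str) -> bool {
    match decode(repr.as_bytes()) {
        Ok(data) => encode(&data) == repr,
        Err(_) => true, // rejected inputs satisfy the property vacuously
    }
}
```

In this sketch the second property holds because decode rejects lower-case digits; a decoder that also accepted "736f" would map two distinct representations to the same data and break it.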

This last property, that encode and decode are inverses of each other, is usually not satisfied by common base64 implementations, such as the rustc-serialize crate or the base64 GNU program. This is a matter of choice, and this crate has chosen to guarantee canonical encoding as described in section 3.5 of RFC 4648.

Since the RFC specifies encode on all inputs and decode on all possible encode outputs, the differences between implementations come from the decode function, which may be more or less permissive. In this crate, the decode function rejects every input that is not a possible output of the encode function. To be more permissive, the input has to be preprocessed before decoding (see the example in the examples directory). Here are some concrete examples of decoding differences between this crate, the rustc-serialize crate, and the base64 GNU program:

Input       data-encoding         rustc-serialize   GNU base64
AAB=        Err(BadPadding)       Ok(vec![0, 0])    \x00\x00
AA\nB=      Err(BadLength)        Ok(vec![0, 0])    \x00\x00
AAB         Err(BadLength)        Ok(vec![0, 0])    Invalid input
A\rA\nB=    Err(BadLength)        Ok(vec![0, 0])    Invalid input
-_\r\n      Err(BadCharacter(0))  Ok(vec![251])     Invalid input
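
The data-encoding column above can be reproduced with a short strict decoder. This is only a sketch under stated assumptions, not the crate's actual code: the Error variants reuse the names shown in the table, while strict_decode, value, ALPHABET, and the order of the checks (length, then characters, then padding bits) are hypothetical choices made for illustration.

```rust
// Sketch of a strict (canonical) base64 decoder; NOT the crate's implementation.
#[derive(Debug, PartialEq)]
enum Error {
    BadLength,
    BadCharacter(usize),
    BadPadding,
}

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Maps a symbol to its 6-bit value, or None if it is outside the alphabet.
fn value(c: u8) -> Option<u32> {
    ALPHABET.iter().position(|&a| a == c).map(|p| p as u32)
}

fn strict_decode(input: &[u8]) -> Result<Vec<u8>, Error> {
    // Encoded output always has a length that is a multiple of 4.
    if input.len() % 4 != 0 {
        return Err(Error::BadLength);
    }
    // Count trailing '=' characters; the encoder emits at most two.
    let pad = input.iter().rev().take_while(|&&c| c == b'=').count();
    if pad > 2 {
        return Err(Error::BadPadding);
    }
    // Every non-padding character must belong to the alphabet.
    let body = &input[..input.len() - pad];
    let mut symbols = Vec::with_capacity(body.len());
    for (i, &c) in body.iter().enumerate() {
        match value(c) {
            Some(v) => symbols.push(v),
            None => return Err(Error::BadCharacter(i)),
        }
    }
    let mut out = Vec::new();
    for chunk in symbols.chunks(4) {
        // Number of trailing bits that carry no data for this chunk length.
        let unused: u32 = match chunk.len() {
            4 => 0,
            3 => 2,
            2 => 4,
            _ => return Err(Error::BadPadding),
        };
        let mut acc = 0u32;
        for &v in chunk {
            acc = (acc << 6) | v;
        }
        // Canonical encoding requires the non-significant bits to be zero.
        if acc & ((1u32 << unused) - 1) != 0 {
            return Err(Error::BadPadding);
        }
        acc >>= unused;
        let nbytes = chunk.len() * 6 / 8;
        for i in (0..nbytes).rev() {
            out.push((acc >> (8 * i)) as u8);
        }
    }
    Ok(out)
}
```

With these checks, "AAB=" is rejected because the last symbol B leaves a non-zero bit before the padding, whereas the canonical "AAA=" decodes to two zero bytes.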

We can summarize these discrepancies as follows:

Discrepancy                                          data-encoding  rustc-serialize  GNU base64
Non-significant bits before padding may be non-null  No             Yes              Yes
Non-alphabet ignored characters                      None           \r and \n        \n
Non-alphabet translated characters                   None           -_ mapped to +/  None
Padding is optional                                  No             Yes              No

This crate may provide wrappers to accept these discrepancies in a generic way at some point in the future.
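
Until such wrappers exist, a caller can approximate the more permissive behaviours by preprocessing the input before handing it to a strict decoder. The function below is a hypothetical sketch, not part of this crate: the name normalize and the exact set of rules are assumptions that mirror three rows of the summary table (ignored \r and \n, -_ mapped to +/, optional padding).

```rust
/// Hypothetical pre-treatment that makes base64 input closer to canonical form.
/// It does NOT decode anything; its output is meant for a strict decoder.
fn normalize(input: &[u8]) -> Vec<u8> {
    let mut out: Vec<u8> = input
        .iter()
        // Drop the characters that permissive decoders silently ignore.
        .filter(|&&c| c != b'\r' && c != b'\n')
        // Translate the URL-safe symbols to the standard alphabet.
        .map(|&c| match c {
            b'-' => b'+',
            b'_' => b'/',
            other => other,
        })
        .collect();
    // Restore omitted padding so the length is a multiple of 4.
    // A remainder of 1 is never valid; a strict decoder still rejects the result.
    while out.len() % 4 != 0 {
        out.push(b'=');
    }
    out
}
```

This sketch deliberately leaves the first discrepancy alone: non-zero bits before the padding cannot be repaired without deciding which bits to discard, so an input such as "AAB=" passes through unchanged and is still rejected by a canonical decoder.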

Performance

This crate is meant to be efficient. Its performance is comparable to that of the rustc-serialize crate and the base64 GNU program. The make bench command runs some benchmarks using cargo and a shell script.

Reexports

pub use base16 as hex;

Modules

base

Generic base module.

base16

Base 16 Encoding.

base2

Base 2 Encoding.

base32

Base 32 Encoding.

base32hex

Base 32 Encoding with Extended Hex Alphabet.

base4

Base 4 Encoding.

base64

Base 64 Encoding.

base64url

Base 64 Encoding with URL and Filename Safe Alphabet.

base8

Base 8 Encoding.

decode

Generic decoding module.

encode

Generic encoding module.