Crate data_encoding

Correct, efficient, canonical, and generic data-encoding functions

This crate provides little-endian ASCII base-conversion encodings for bases of size 2, 4, 8, 16, 32, and 64. It supports both padded and non-padded encodings. It supports canonical encodings (trailing bits are checked). It supports in-place encoding and decoding functions. It supports non-canonical symbols. It also supports both most and least significant bit-order. The performance of the encoding and decoding functions is similar to that of existing implementations (see how to run the benchmarks on GitHub).
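To illustrate what the choice of bit-order means, here is a minimal standalone sketch for base 2. It does not use this crate: the `Order` enum and the `encode_base2` function are hypothetical names introduced only for this illustration, and the sketch assumes that each input byte is read bit by bit, starting from either its most or its least significant bit.

```rust
/// Order in which bits are read from a byte.
/// Hypothetical type for this sketch, not the crate's own `BitOrder`.
#[derive(Clone, Copy)]
enum Order {
    MostSignificantFirst,
    LeastSignificantFirst,
}

/// Encodes `input` in base 2 using the symbols '0' and '1'.
/// Each byte produces 8 symbols, emitted in the requested bit order.
fn encode_base2(input: &[u8], order: Order) -> String {
    let mut output = String::with_capacity(input.len() * 8);
    for &byte in input {
        for i in 0..8 {
            // Pick which bit of the byte becomes the i-th symbol.
            let shift = match order {
                Order::MostSignificantFirst => 7 - i,
                Order::LeastSignificantFirst => i,
            };
            let bit = (byte >> shift) & 1;
            output.push(if bit == 1 { '1' } else { '0' });
        }
    }
    output
}
```

With this sketch, the byte 0x01 would be written as 00000001 when the most significant bit is read first and as 10000000 when the least significant bit is read first.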

This is the library documentation. If you are looking for the binary, see the installation instructions on GitHub.

Examples

This crate provides predefined encodings as constants. These constants are of type Padded or NoPad, depending on whether they use padding. Both types provide encoding and decoding functions in in-place and allocating variants. Here is an example using the allocating encoding function of base64:

use data_encoding::BASE64;
assert_eq!(BASE64.encode(b"Hello world"), "SGVsbG8gd29ybGQ=");

It is also possible to use the non-padded version of base64 by calling the no_pad method of Padded:

use data_encoding::BASE64;
assert_eq!(BASE64.no_pad().encode(b"Hello world"), "SGVsbG8gd29ybGQ");

Here is an example using the in-place decoding function of base32:

use data_encoding::BASE32;
let input = b"JBSWY3DPEB3W64TMMQ======";
let mut output = vec![0; BASE32.decode_len(input.len()).unwrap()];
let len = BASE32.decode_mut(input, &mut output).unwrap();
assert_eq!(&output[0 .. len], b"Hello world");
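The example above slices the output buffer with the length returned by the decoding function, which suggests that the buffer is sized before the amount of padding is known. The following standalone sketch shows one way such a size could be computed for padded base32. It is an assumption made for illustration, not the crate's implementation, and `padded_base32_decode_len` is a hypothetical name: every block of 8 symbols decodes to at most 5 bytes, and a length that is not a multiple of 8 cannot be a padded base32 input.

```rust
/// Upper bound on the decoded size of a padded base32 input of `input_len` symbols.
/// Hypothetical helper for this sketch; returns `None` for lengths that are not
/// a multiple of 8, since padded base32 is made of 8-symbol blocks.
fn padded_base32_decode_len(input_len: usize) -> Option<usize> {
    if input_len % 8 != 0 {
        return None;
    }
    // Each 8-symbol block carries at most 40 bits, i.e. 5 bytes.
    Some(input_len / 8 * 5)
}
```

Under this assumption, the 24-symbol input above would need a 15-byte buffer, of which only the first 11 bytes ("Hello world") are meaningful after decoding.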

You are not limited to the predefined encodings. You may define your own encodings (with the same correctness and performance properties as the predefined ones) using the Builder type:

use data_encoding::Builder;
let hex = Builder::new(b"0123456789abcdef").no_pad().unwrap();
assert_eq!(hex.encode(b"hello"), "68656c6c6f");
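For readers who want to see what a base-16 encoding over a custom symbol table computes, here is a minimal standalone sketch. It does not use the Builder type, and `encode_base16` is a hypothetical function written only for this illustration: each byte is split into its high and low 4-bit halves, and each half indexes the 16-symbol table.

```rust
/// Encodes `input` in base 16 using the given 16-symbol table.
/// Hypothetical helper for this sketch, independent of the crate.
fn encode_base16(symbols: &[u8; 16], input: &[u8]) -> String {
    let mut output = String::with_capacity(input.len() * 2);
    for &byte in input {
        // High 4 bits first, then low 4 bits.
        output.push(symbols[(byte >> 4) as usize] as char);
        output.push(symbols[(byte & 0x0f) as usize] as char);
    }
    output
}
```

Calling `encode_base16(b"0123456789abcdef", b"hello")` gives the same string as the Builder example above, "68656c6c6f".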

Properties

The base16, base32, base32hex, base64, and base64url predefined encodings conform to RFC 4648.

The encoding and decoding functions satisfy the following properties:

  • They are deterministic: their output only depends on their input
  • They have no side-effects: they do not modify a hidden mutable state
  • They are correct: encoding then decoding gives the initial data
  • They are canonical (unless non-canonical symbols are used or checking trailing bits is disabled): decoding then encoding gives the initial data

This last property is usually not satisfied by common base64 implementations (like the rustc-serialize crate, the base64 crate, or the base64 GNU program). This is a design choice, and this crate leaves it to the user. Canonical encoding as described by the RFC is supported. It is also possible to disable checking trailing bits, to add non-canonical symbols, and to decode concatenated padded inputs.
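The canonical property can be made concrete with a small standalone sketch of padded base64. This is not the crate's code: `encode`, `decode_canonical`, and `value` are hypothetical names, and the decoder is a deliberately simple one that assumes the strictest behaviour (length must be a multiple of 4, padding only at the very end, trailing bits must be zero).

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Returns the 6-bit value of a symbol, or `None` if it is not in the alphabet.
fn value(symbol: u8) -> Option<u32> {
    ALPHABET.iter().position(|&s| s == symbol).map(|p| p as u32)
}

/// Padded base64 encoding (illustrative sketch).
fn encode(input: &[u8]) -> String {
    let mut output = String::new();
    for chunk in input.chunks(3) {
        // Pack up to 3 bytes into 24 bits, filling missing bytes with zeros.
        let mut n = 0u32;
        for i in 0..3 {
            n = (n << 8) | u32::from(*chunk.get(i).unwrap_or(&0));
        }
        let symbols = chunk.len() + 1; // 1 byte -> 2 symbols, 2 -> 3, 3 -> 4
        for i in 0..4 {
            if i < symbols {
                output.push(ALPHABET[((n >> (18 - 6 * i)) & 63) as usize] as char);
            } else {
                output.push('=');
            }
        }
    }
    output
}

/// Strict decoding: accepts only strings that `encode` can produce.
fn decode_canonical(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    if bytes.len() % 4 != 0 {
        return None;
    }
    let blocks = bytes.len() / 4;
    let mut output = Vec::new();
    for (index, block) in bytes.chunks(4).enumerate() {
        // Padding is allowed only at the end of the last block (1 or 2 symbols).
        let pad = block.iter().rev().take_while(|&&b| b == b'=').count();
        if pad > 2 || (pad > 0 && index + 1 != blocks) {
            return None;
        }
        let mut n = 0u32;
        for &symbol in &block[..4 - pad] {
            n = (n << 6) | value(symbol)?;
        }
        n <<= 6 * pad as u32;
        // Trailing bits (those not covering a full byte) must be zero.
        if n & ((1 << (8 * pad)) - 1) != 0 {
            return None;
        }
        for i in 0..3 - pad {
            output.push((n >> (16 - 8 * i)) as u8);
        }
    }
    Some(output)
}
```

In this sketch, "AAA=" decodes to two zero bytes and re-encodes to "AAA=", whereas "AAB=" is rejected because its trailing bits are not zero. A permissive decoder that accepted "AAB=" as two zero bytes would break the property, since encoding those bytes gives "AAA=" and not the original input.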

Since the RFC specifies the encoding function on all inputs and the decoding function on all possible encoded outputs, the differences between implementations come from the decoding function which may be more or less permissive. In this crate, the decoding function of canonical encodings rejects all inputs that are not a possible output of the encoding function. Here are some concrete examples of decoding differences between this crate, the rustc-serialize crate, the base64 crate, and the base64 GNU program:

Input      data-encoding   rustc    base64   GNU base64
AAB=       Trailing(2)     [0, 0]   [0, 0]   \x00\x00
AA\nB=     Length(4)       [0, 0]   Err(2)   \x00\x00
AAB        Length(0)       [0, 0]   [0, 0]   Invalid input
A\rA\nB=   Length(4)       [0, 0]   Err(1)   Invalid input
-_\r\n     Symbol(0)       [251]    Err(0)   Invalid input
AA==AA==   Symbol(2)       Err      Err(2)   \x00\x00

We can summarize these discrepancies as follows:

Discrepancy                 data-encoding   rustc             base64   GNU base64
Non-zero trailing bits      No              Yes               Yes      Yes
Ignored characters          None            \r and \n         None     \n
Translated characters       None            -_ mapped to +/   None     None
Padding is optional         No              Yes               Yes      No
Concatenated padded input   No              No                No       Yes

This crate can be configured to ignore non-zero trailing bits, to translate symbols, to use non-padded encodings, and to decode concatenated padded inputs. However, it cannot ignore characters. Ignored characters have to be removed in a preprocessing stage, as is done in the binary. Support in the library may be added in future versions.
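A preprocessing stage of this kind can be very small. The sketch below is a standalone illustration, not the code used by the binary, and `strip_ignored` is a hypothetical name: it copies the input while dropping every byte listed in `ignored`, so that the result can then be handed to a decoding function.

```rust
/// Returns a copy of `input` without the bytes listed in `ignored`.
/// Hypothetical helper for this sketch, to be applied before decoding.
fn strip_ignored(input: &[u8], ignored: &[u8]) -> Vec<u8> {
    input
        .iter()
        .copied()
        .filter(|byte| !ignored.contains(byte))
        .collect()
}
```

For instance, `strip_ignored(b"AA\nA=", b"\r\n")` yields the bytes of "AAA=", which no longer contain the line break.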

Migration

The changelog describes the changes between v1 and v2. Here are the migration steps for common usage:

v1                          v2
use data_encoding::baseNN   use data_encoding::BASENN
baseNN::function            BASENN.method
baseNN::function_nopad      BASENN.no_pad().method

Structs

Array128

Convenience wrapper for [u8; 128]

Builder

Base representation

BuilderError

Base building error

DecodeError

Decoding error

NoPad

Base-conversion encoding (without padding)

Padded

Padded base-conversion encoding

Enums

BitOrder

Order in which bits are read from a byte

DecodeKind

Decoding error kind

Constants

BASE32

RFC4648 base32 encoding

BASE64

RFC4648 base64 encoding

BASE32HEX

RFC4648 base32hex encoding

BASE64URL

RFC4648 base64url encoding

HEXLOWER

Lower-case hexadecimal encoding

HEXLOWER_PERMISSIVE

Lower-case permissive hexadecimal encoding

HEXUPPER

RFC4648 hex encoding (upper-case hexadecimal encoding)