base64 0.4.2

encodes and decodes base64 as bytes or utf8

rust-base64

It's base64. What more could anyone want?

Example

In Cargo.toml: base64 = "~0.4.0"

    extern crate base64;

    use base64::{encode, decode};

    fn main() {
        let a = b"hello world";
        let b = "aGVsbG8gd29ybGQ=";

        assert_eq!(encode(a), b);
        assert_eq!(a, &decode(b).unwrap()[..]);
    }
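
The trailing = in the example above is padding: 11 input bytes do not divide evenly into 3-byte groups, so the final group is padded out to 4 characters. As a rough sketch of that arithmetic (the name encoded_len is made up for illustration and is not part of this crate's API):

```rust
/// Number of base64 characters needed to encode `input_len` bytes.
/// Hypothetical helper for illustration; not part of the base64 crate.
fn encoded_len(input_len: usize, pad: bool) -> usize {
    if pad {
        // Every group of up to 3 input bytes becomes exactly 4 characters.
        (input_len + 2) / 3 * 4
    } else {
        // Each input byte carries 8 bits and each character carries 6,
        // so round 8 * n / 6 up to the next whole character.
        (input_len * 4 + 2) / 3
    }
}
```

For b"hello world" (11 bytes) this gives 16 characters with padding, matching the length of "aGVsbG8gd29ybGQ=", and 15 without.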

API

NOTE: return types have changed from 0.1.x. decode_ws is deprecated; it is functionally equivalent to the not-yet-implemented MIME mode, which will replace it (or perhaps an alternate way of passing options will, if there is a use case for whitespace-ignoring UrlSafe).

NOTE: previously (<= 0.2.x) decoding would ignore extraneous padding bytes; this is no longer the case, and such input will produce an InvalidByte error.

rust-base64 exposes seven functions:

    encode(&[u8]) -> String
    decode(&str) -> Result<Vec<u8>, Base64Error>
    encode_config(&[u8], Config) -> String
    decode_config(&str, Config) -> Result<Vec<u8>, Base64Error>
    encode_config_buf(&[u8], Config, &mut String)
    decode_config_buf(&str, Config, &mut Vec<u8>) -> Result<(), Base64Error>
    decode_ws(&str) -> Result<Vec<u8>, Base64Error>

The Configs supported out of the box are STANDARD, URL_SAFE and URL_SAFE_NO_PAD, which aim to be fully compliant with RFC 4648. MIME mode (RFC 2045) is forthcoming. encode and decode are convenience wrappers for the _config functions called with STANDARD; the _config functions in turn allocate an output buffer and call the _buf functions. decode_ws does the same as decode after first stripping whitespace ("whitespace" according to the rules of JavaScript's atob(), meaning \n \r \f \t and space).

Encoding produces valid padding in all cases. Decoding produces the same output whether padding is valid or omitted, but errors on invalid (superfluous) padding.
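
To make the whitespace rule concrete, here is a minimal sketch of the stripping step described above, using only the standard library. The name strip_ws is hypothetical and this is not the crate's actual implementation; it only shows which characters are dropped before the input reaches a decoder.

```rust
/// Remove the five characters treated as whitespace: \n, \r, \f, \t and space.
/// Hypothetical helper for illustration; not part of the base64 crate.
fn strip_ws(input: &str) -> String {
    input
        .chars()
        // '\x0C' is the form feed character (\f), which Rust has no short escape for.
        .filter(|c| !matches!(*c, '\n' | '\r' | '\x0C' | '\t' | ' '))
        .collect()
}
```

Input that has been wrapped across lines, such as "aGVsbG8g\r\nd29ybGQ=", comes out as a single unbroken string that a strict decoder can accept.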

Goals

MIME support, along with replacing the mode enum with config structs. It is unlikely I will add much, if anything, to the feature set beyond that. I'd like to improve on the test cases, confirm full compliance with the standard, and then focus on making it smaller and more performant.

I have a fondness for small dependency footprints, ecosystems where you can pick and choose what functionality you need, and no more. Unix philosophy sort of thing I guess, many tiny utilities interoperating across a common interface. One time making a Twitter bot, I ran into the need to correctly pluralize arbitrary words. I found on npm a module that did nothing but pluralize words. Nothing else, just a couple of functions. I'd like for this to be that "just a couple of functions."

Developing

Benchmarks are in benches/. Running them requires nightly rust, but rustup makes it easy:

rustup run nightly cargo bench

Decoding is aided by some pre-calculated tables, which are generated by:

cargo run --example make_tables > src/tables.rs.tmp && mv src/tables.rs.tmp src/tables.rs
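
For readers unfamiliar with table-driven decoding, the idea is a 256-entry lookup table that maps each possible input byte to its 6-bit value, with a sentinel for bytes outside the alphabet. The following is a sketch of how such a table could be built for the standard alphabet. It is not the output of make_tables: the function name build_decode_table and the sentinel value 0xFF are assumptions made for this example.

```rust
/// Sentinel stored for bytes that are not in the alphabet (assumed value).
const INVALID_VALUE: u8 = 0xFF;

/// The standard base64 alphabet from RFC 4648.
const STANDARD_ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Build a table mapping every byte to its 6-bit value, or INVALID_VALUE.
/// Hypothetical helper for illustration; not the crate's generator.
fn build_decode_table() -> [u8; 256] {
    // Start with every entry marked invalid, then fill in the 64 valid symbols.
    let mut table = [INVALID_VALUE; 256];
    for (value, &symbol) in STANDARD_ALPHABET.iter().enumerate() {
        table[symbol as usize] = value as u8;
    }
    table
}
```

A decoder can then turn each input byte into a 6-bit value with a single indexed load, and compare the result against the sentinel to detect invalid input.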

Profiling

On Linux, you can use perf for profiling. First, enable debug symbols in Cargo.toml. Don't commit this change, though, since it's usually not what you want (and costs some performance):

[profile.release]
debug = true

Then compile the benchmarks. (Just re-run them and ^C once the benchmarks start running; all that's needed is to recompile them.)

Run the benchmark binary with perf (shown here filtering to one particular benchmark, which will make the results easier to read). perf is only available to the root user on most systems as it fiddles with event counters in your CPU, so use sudo. We need to run the actual benchmark binary, hence the path into target. You can see the actual full path with rustup run nightly cargo bench -v; it will print out the commands it runs. If you use the exact path that bench outputs, make sure you get the one that's for the benchmarks, not the tests. You may also want to cargo clean so you have only one benchmarks- binary (they tend to accumulate).

sudo perf record target/release/deps/benchmarks-* --bench decode_10mib_reuse

Then analyze the results, again with perf:

sudo perf annotate -l

You'll see a bunch of interleaved Rust source and assembly like this. The section with lib.rs:327 is telling us that 4.02% of samples saw the movzbl (a zero-extending byte load, here the decode table lookup) as the active instruction. However, this percentage is not as exact as it seems, due to a phenomenon called skid. Basically, a consequence of how fancy modern CPUs are is that this sort of instruction profiling is inherently inaccurate, especially in branch-heavy code.

 lib.rs:322    0.70 :     10698:       mov    %rdi,%rax
    2.82 :        1069b:       shr    $0x38,%rax
         :                  if morsel == decode_tables::INVALID_VALUE {
         :                      bad_byte_index = input_index;
         :                      break;
         :                  };
         :                  accum = (morsel as u64) << 58;
 lib.rs:327    4.02 :     1069f:       movzbl (%r9,%rax,1),%r15d
         :              // fast loop of 8 bytes at a time
         :              while input_index < length_of_full_chunks {
         :                  let mut accum: u64;
         :
         :                  let input_chunk = BigEndian::read_u64(&input_bytes[input_index..(input_index + 8)]);
         :                  morsel = decode_table[(input_chunk >> 56) as usize];
 lib.rs:322    3.68 :     106a4:       cmp    $0xff,%r15
         :                  if morsel == decode_tables::INVALID_VALUE {
    0.00 :        106ab:       je     1090e <base64::decode_config_buf::hbf68a45fefa299c1+0x46e>