Crate ecoji [] [src]

A Rust implementation of Ecoji, a base-1024 encoding with an emoji alphabet.

This crate includes both encoding and decoding functionality, as well as a binary with an interface similar to the base64 tool to perform Ecoji encoding and/or decoding from the command line.

Features

Features of the Ecoji encoding are described in depth in the original implementation's repository. In short, it has the following key characteristics:

  • While Ecoji-encoded strings take more bytes than their base-64 or other ASCII-using counterparts, they take less visible characters. More specifically, each visible character in Ecoji encodes 10 bits of data, while for example each visible character in Base64 encodes 6 bits of data.

  • Ecoji-encoded strings can be concatenated and then decoded, giving the concatenation of the original strings:

    use ecoji::{encode_to_string, decode_to_string};
    
    let (input1, input2) = ("hello ", "world");
    
    // Encode both input strings and concatenate the encoded output
    let output1 = encode_to_string(&mut input1.as_bytes())?;
    let output2 = encode_to_string(&mut input2.as_bytes())?;
    let output = output1 + &output2;
    
    // Then decode the concatenated output
    let input = decode_to_string(&mut output.as_bytes())?;
    
    // The result is the same as concatenation of the input strings
    assert_eq!(input, input1.to_owned() + input2);
  • Data encoded with Ecoji has the same sorting order as the input data:

    use ecoji::{encode_to_string, decode_to_string};
    
    // The input vector is sorted
    let inputs = vec![
        "a", "ab", "abc", "abcd",
        "ac",
        "b", "ba"
    ];
    
    // Encode each element of input and sort the resulting strings again
    let mut outputs: Vec<_> = inputs.iter().cloned()
      .map(|s| encode_to_string(&mut s.as_bytes()))
      .collect::<Result<_, _>>()?;
    outputs.sort_unstable();
    
    // Decode each output item back
    let mut inputs2: Vec<_> = outputs.iter()
      .map(|mut s| decode_to_string(&mut s.as_bytes()))
      .collect::<Result<_, _>>()?;
    let mut inputs2: Vec<_> = inputs2.iter()
      .map(|s| s.as_str())
      .collect();  // to have a Vec<&str> instead of Vec<String> for assert below
    
    // Input (which is sorted) and decoded output (whose source is sorted) should be the same
    assert_eq!(inputs, inputs2);
    

Usage

The two main functions provided by this library are encode and decode, which both have the same signature: they accept a reference to an std::io::Read and a reference to std::io::Write and return an std::io::Result<usize> with the number of bytes written to the output std::io::Write.

Additionally, this library provides shortcut functions, encode_to_string, decode_to_vec and decode_to_string, whose output is an in-memory String or Vec<u8>. Note that there is no need to support special versions of the encode/decode operations which would accept strings or vectors, because slices of bytes (&[u8]) implement the std::io::Read trait by default. Therefore, if you have a string or a byte vector, you can invoke the encoding/decoding functions like this:

let input_1: &str = "some data";
let input_2: &[u8] = b"some data";

// Pass a mutable reference to the intermediate &[u8] object returned by `str::as_bytes()`
let result_1 = ecoji::encode_to_string(&mut input_1.as_bytes())?;

// Pass a mutable reference to a cloned &[u8] object if you already have a byte slice
let result_2 = ecoji::encode_to_string(&mut input_2.clone())?;

Command line tool

This crate also provides an executable binary, ecoji, which provides a command line interface similar to that of the standard base64 command and which can encode or decode data coming on the standard input and write the results of this processing to the standard output. You can install it by invoking the following command:

$ cargo install --bin ecoji --features build-binary ecoji

It will be installed in your default Cargo binaries directory (usually ~/.cargo/bin on Unix systems). Run ecoji --help (assuming the aforementioned directory is in your PATH) to see documentation on how to invoke itl.

Issues and limitations

Currently this crate does not provide an ability to do wrapping of the encoded text, like e.g. what the base64 command does with the -w flag. It is possible that this feature will be implemented in future; pull requests for this functionality are welcome!

This library is almost a direct line-by-line reimplementation of the original algorithm which is implemented in Go. There were almost zero attempts at optimization, therefore performance characteristics may not be stellar. No benchmarking is done either. This is another area where contributions are very welcome.

The core API of this library expects std::io::Read and std::io::Write instances. This implies that the only supported encoding for the emoji output is UTF-8.

Functions

decode

Decodes the entire source from the Ecoji format (assumed to be UTF-8-encoded) and writes the result of the decoding to the provided destination.

decode_to_string

Decodes the entire source from the Ecoji format (assumed to be UTF-8-encoded), storing the result of the decoding to a new owned string.

decode_to_vec

Decodes the entire source from the Ecoji format (assumed to be UTF-8-encoded), storing the result of the decoding to a new byte vector.

encode

Encodes the entire source into the Ecoji format and writes a UTF-8 representation of the encoded data to the provided destination.

encode_to_string

Encodes the entire source into the Ecoji format, storing the result of the encoding to a new owned string.