unicode_converter 0.1.2

# Unicode converter

This repository contains both a library and a CLI tool to convert data between various Unicode encodings.

The supported encodings are:

* UTF-8
* CESU-8
* UTF-16
* UTF-32
* UTF-1

## CLI tool

The CLI tool is meant as a demonstration of the library, but it can also be used on its own. It is contained in a single file, `src/main.rs`.

### Usage

```
A tool to convert Unicode text files between multiple Unicode encodings. The available encodings are
UTF-8, UTF-1, CESU-8, UTF-16, and UTF-32. By default, the data is assumed to be little-endian, but for encodings
with multi-byte words such as UTF-16 or UTF-32, you can add the `_be` suffix to indicate that you
want to work with big-endian data

USAGE:
    unicode_converter [OPTIONS] --input-file <INPUT_FILE> --decoding-input <DECODING_INPUT> --encoding-output <ENCODING_OUTPUT>

OPTIONS:
    -d, --decoding-input <DECODING_INPUT>
            Input file encoding

    -e, --encoding-output <ENCODING_OUTPUT>
            Output file encoding

    -h, --help
            Print help information

    -i, --input-file <INPUT_FILE>
            Input file used as input. You can use `-` if you mean `/dev/stdin`

    -o, --output-file <OUTPUT_FILE>
            Output file [default: /dev/stdout]
```
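The endianness note in the help text can be illustrated with the Rust standard library alone. The sketch below does not use this tool or its library; `encode_utf16_bytes` is a hypothetical helper written only for this example. It shows that the same text gives different byte sequences in little-endian and big-endian UTF-16, which is why the decoder has to be told which one to expect.

```rust
/// Hypothetical helper (not part of unicode_converter): encode `text` as
/// UTF-16 bytes, big-endian when `big_endian` is true, little-endian otherwise.
fn encode_utf16_bytes(text: &str, big_endian: bool) -> Vec<u8> {
    text.encode_utf16()
        .flat_map(|unit| {
            // Each 16-bit code unit becomes two bytes; only their order differs.
            if big_endian {
                unit.to_be_bytes()
            } else {
                unit.to_le_bytes()
            }
        })
        .collect()
}

fn main() {
    // "A" is U+0041, a single 16-bit code unit.
    let little = encode_utf16_bytes("A", false);
    let big = encode_utf16_bytes("A", true);

    // Little-endian stores the low byte first, big-endian the high byte first.
    assert_eq!(little, vec![0x41, 0x00]);
    assert_eq!(big, vec![0x00, 0x41]);
    println!("LE: {:?}, BE: {:?}", little, big);
}
```

Single-byte encodings such as UTF-8 have no byte order to choose, so the distinction only matters for UTF-16 and UTF-32.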

### Compilation

To compile it, run `cargo build`; the CLI is the only executable crate in this repository.

## Library

All the code in `src/`, except for `src/main.rs`, makes up the Unicode encoding conversion library.

### Behavior

Each Unicode encoding has its own type, and every such type implements the `UnicodeEncoding` trait. Running `cargo doc` gives complete information, but the intended way of using the library is the following:

* Read data from a file or a slice of bytes. For example, to read UTF-16 data from a file, use `let content = Utf16::from_file("filename.txt", false).unwrap();`. The `false` argument indicates that the data is little-endian.
* Then, convert it to another encoding. For example, to convert to UTF-8: `let converted = content.convert_to::<Utf8>();`.
* Finally, write the converted data to a new file: `converted.to_file("new_file.txt", false);`. As UTF-8 code units are a single byte, the boolean endianness argument is ignored.
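To make these three steps concrete without relying on the library's internals, here is a minimal sketch of the same UTF-16 to UTF-8 conversion using only the Rust standard library. The function `utf16_bytes_to_utf8` is hypothetical and is not part of `unicode_converter`. As in the calls above, a `big_endian` flag of `false` means little-endian input.

```rust
/// Hypothetical helper (not the library's API): decode raw UTF-16 bytes and
/// return the same text encoded as UTF-8 bytes.
/// Returns `None` for an odd byte count or an unpaired surrogate.
fn utf16_bytes_to_utf8(bytes: &[u8], big_endian: bool) -> Option<Vec<u8>> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Step 1: regroup the bytes into 16-bit code units with the chosen byte order.
    let units = bytes.chunks_exact(2).map(|pair| {
        let pair = [pair[0], pair[1]];
        if big_endian {
            u16::from_be_bytes(pair)
        } else {
            u16::from_le_bytes(pair)
        }
    });
    // Step 2: decode the code units into characters, failing on invalid surrogates.
    let text = std::char::decode_utf16(units)
        .collect::<Result<String, _>>()
        .ok()?;
    // Step 3: a Rust `String` is already UTF-8, so its bytes are the output.
    Some(text.into_bytes())
}

fn main() {
    // "hé" as little-endian UTF-16: U+0068 then U+00E9.
    let input = [0x68, 0x00, 0xE9, 0x00];
    let output = utf16_bytes_to_utf8(&input, false).expect("input is valid UTF-16");

    // U+00E9 takes two bytes in UTF-8 (0xC3 0xA9).
    assert_eq!(output, vec![0x68, 0xC3, 0xA9]);
    println!("{:?}", output);
}
```

In a real program the input bytes could come from `std::fs::read` and the output could go to `std::fs::write`; the library wraps those steps in its `from_file` and `to_file` methods.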