rut 0.3.4

A small UTF-8 parsing library for applications that need to parse individual `char`s.

Rut

Rut is a small UTF-8 parsing library for applications that need to parse individual `char`s.
It provides a byte-wise parsing mechanism, and functions for processing byte slices.

It is completely `#[no_std]` and should produce very small binaries.[citation needed]

Conformance

Rut aims to be fully conformant to the specifications and restrictions of the Unicode standard.
Because the byte-wise parser is fed one byte at a time, some extra caution might be required when using it directly.

The `parse_one` and `parse` functions take care of this.
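
As a rough illustration of the restrictions involved, the sketch below uses the standard library's `std::str::from_utf8` rather than Rut itself to show the kinds of byte sequences a conformant UTF-8 decoder has to reject. The helper `is_well_formed` is a hypothetical name introduced only for this example.

```rust
/// Hypothetical helper: true if `bytes` is well-formed UTF-8
/// according to the standard library's validator (not Rut).
fn is_well_formed(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

fn main() {
    // Well-formed: the three-byte encoding of '€' (U+20AC).
    assert!(is_well_formed(&[0xE2, 0x82, 0xAC]));

    // Overlong encoding of U+0000 (must be encoded as the single byte 0x00).
    assert!(!is_well_formed(&[0xC0, 0x80]));

    // Encoded surrogate U+D800 (surrogates are not Unicode scalar values).
    assert!(!is_well_formed(&[0xED, 0xA0, 0x80]));

    // Value above U+10FFFF, the last Unicode code point.
    assert!(!is_well_formed(&[0xF4, 0x90, 0x80, 0x80]));

    // Truncated sequence: a lead byte with a missing continuation byte.
    assert!(!is_well_formed(&[0xE2, 0x82]));

    // Continuation byte with no lead byte before it.
    assert!(!is_well_formed(&[0x80]));
}
```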

Testing

A few tests validating the expected behavior are already in place, but the test suite is not comprehensive by any means yet. More tests will be added.

I have thrown a fuzzer at it for several minutes, and it passes this stress test for UTF-8 decoders.

Examples

use rut::Utf8Parser;

// UTF-8 encoding of '€'
let bytes = [0xE2, 0x82, 0xAC];

let mut p = Utf8Parser::new();

assert_eq!(p.parse_byte(bytes[0]), Ok(None));
assert_eq!(p.parse_byte(bytes[1]), Ok(None));
assert_eq!(p.parse_byte(bytes[2]), Ok(Some('€')));
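
For readers curious how a byte-wise decoder of this shape can work, here is a minimal standalone sketch that depends only on the standard library. It is not Rut's implementation: `SketchParser`, `push`, `Invalid`, and `decode_all` are hypothetical names chosen for the example, and the error handling (drop the offending byte and reset) is an assumption, not a description of Rut's behavior.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct Invalid;

/// Hypothetical byte-wise UTF-8 decoder; a sketch, not Rut's `Utf8Parser`.
struct SketchParser {
    value: u32,    // code point bits accumulated so far
    remaining: u8, // continuation bytes still expected
    lo: u8,        // allowed range for the next continuation byte
    hi: u8,
}

impl SketchParser {
    fn new() -> Self {
        SketchParser { value: 0, remaining: 0, lo: 0x80, hi: 0xBF }
    }

    /// Feed one byte; yields a `char` once a sequence is complete.
    fn push(&mut self, byte: u8) -> Result<Option<char>, Invalid> {
        if self.remaining == 0 {
            // Lead byte: choose the length and the range allowed for the
            // first continuation byte (excludes overlongs, surrogates,
            // and values above U+10FFFF).
            let (value, remaining, lo, hi) = match byte {
                0x00..=0x7F => return Ok(Some(byte as char)),
                0xC2..=0xDF => (u32::from(byte & 0x1F), 1, 0x80, 0xBF),
                0xE0 => (u32::from(byte & 0x0F), 2, 0xA0, 0xBF),
                0xED => (u32::from(byte & 0x0F), 2, 0x80, 0x9F),
                0xE1..=0xEC | 0xEE..=0xEF => (u32::from(byte & 0x0F), 2, 0x80, 0xBF),
                0xF0 => (u32::from(byte & 0x07), 3, 0x90, 0xBF),
                0xF1..=0xF3 => (u32::from(byte & 0x07), 3, 0x80, 0xBF),
                0xF4 => (u32::from(byte & 0x07), 3, 0x80, 0x8F),
                _ => return Err(Invalid),
            };
            self.value = value;
            self.remaining = remaining;
            self.lo = lo;
            self.hi = hi;
            return Ok(None);
        }
        if byte < self.lo || byte > self.hi {
            // Assumption of this sketch: drop the offending byte and reset.
            self.remaining = 0;
            return Err(Invalid);
        }
        self.value = (self.value << 6) | u32::from(byte & 0x3F);
        self.remaining -= 1;
        // Later continuation bytes may use the full 0x80..=0xBF range.
        self.lo = 0x80;
        self.hi = 0xBF;
        if self.remaining == 0 {
            char::from_u32(self.value).map(Some).ok_or(Invalid)
        } else {
            Ok(None)
        }
    }
}

/// Decode a whole slice, treating a truncated final sequence as an error.
fn decode_all(bytes: &[u8]) -> Result<String, Invalid> {
    let mut parser = SketchParser::new();
    let mut out = String::new();
    for &byte in bytes {
        if let Some(c) = parser.push(byte)? {
            out.push(c);
        }
    }
    if parser.remaining != 0 {
        return Err(Invalid);
    }
    Ok(out)
}

fn main() {
    let mut p = SketchParser::new();
    assert_eq!(p.push(0xE2), Ok(None));
    assert_eq!(p.push(0x82), Ok(None));
    assert_eq!(p.push(0xAC), Ok(Some('€')));

    assert_eq!(decode_all("a€😀".as_bytes()), Ok(String::from("a€😀")));
    assert_eq!(decode_all(&[0xED, 0xA0, 0x80]), Err(Invalid)); // surrogate
    assert_eq!(decode_all(&[0xE2, 0x82]), Err(Invalid)); // truncated
}
```

Restricting the range of the first continuation byte per lead byte is one common way to reject ill-formed sequences as early as possible; other designs validate the finished code point instead.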