Crate mutf8

A library for converting between MUTF-8 and UTF-8.

MUTF-8 is the same as CESU-8 except for its handling of embedded null characters: a null character is encoded as the two-byte sequence 0xC0 0x80 rather than as a single 0x00 byte. This library builds on top of the residua-cesu8 crate.
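
The encoding rules described above can be illustrated with a minimal, std-only sketch. The function `encode_mutf8_sketch` is a hypothetical helper written for this illustration; it is not part of this crate and makes no claim about how the crate or residua-cesu8 is implemented. It assumes the usual MUTF-8 rules: null becomes 0xC0 0x80, characters up to U+FFFF keep their UTF-8 encoding, and supplementary characters are written as a surrogate pair of two 3-byte sequences.

```rust
/// Hypothetical helper (not part of this crate): encodes a string slice as MUTF-8.
fn encode_mutf8_sketch(s: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(s.len());
    for ch in s.chars() {
        let cp = ch as u32;
        if cp == 0 {
            // Embedded null: two-byte form, the point where MUTF-8 departs from CESU-8.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else if cp <= 0xFFFF {
            // Non-null BMP characters keep their ordinary UTF-8 encoding.
            let mut buf = [0u8; 4];
            out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        } else {
            // Supplementary characters: split into a UTF-16 surrogate pair,
            // then write each surrogate as its own 3-byte sequence.
            let mut units = [0u16; 2];
            for &unit in ch.encode_utf16(&mut units).iter() {
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

Unlike a `Cow`-returning API, this sketch always allocates, which keeps it short at the cost of copying input that needs no conversion.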

use std::borrow::Cow;
use mutf8::{to_mutf8, from_mutf8};

let str = "Hello, world!";
// Non-null characters in the Basic Multilingual Plane (U+0001 to U+FFFF)
// are encoded the same in UTF-8 and MUTF-8:
assert_eq!(to_mutf8(str), Cow::Borrowed(str.as_bytes()));
assert_eq!(from_mutf8(str.as_bytes()), Cow::Borrowed(str));

let str = "\u{10401}";
let mutf8_data = &[0xED, 0xA0, 0x81, 0xED, 0xB0, 0x81];
// 'mutf8_data' is a byte slice containing a 6-byte surrogate pair which
// becomes a 4-byte UTF-8 character.
assert_eq!(from_mutf8(mutf8_data), Cow::Borrowed(str));

let str = "\0";
let mutf8_data = &[0xC0, 0x80];
// 'str' is a null character which becomes a two-byte MUTF-8 representation.
assert_eq!(to_mutf8(str), Cow::Borrowed(mutf8_data));
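
To show why the 6-byte surrogate pair in the example collapses into one 4-byte UTF-8 character, here is a rough std-only decoding sketch. `decode_mutf8_sketch` is a hypothetical name, not this crate's API, and the approach (decode to UTF-16 code units, then let `String::from_utf16` pair the surrogates) is an assumption chosen for brevity. It is deliberately lenient: it does not reject overlong forms other than treating 0xC0 0x80 as null, and it accepts a raw 0x00 byte.

```rust
/// Hypothetical helper (not this crate's implementation): leniently decodes
/// MUTF-8 bytes into a `String`, returning `None` on malformed input.
fn decode_mutf8_sketch(bytes: &[u8]) -> Option<String> {
    // Continuation bytes have the bit pattern 10xxxxxx; return their payload bits.
    fn cont(b: u8) -> Option<u16> {
        if (b & 0xC0) == 0x80 {
            Some(u16::from(b & 0x3F))
        } else {
            None
        }
    }

    let mut units: Vec<u16> = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b0 = bytes[i];
        if b0 < 0x80 {
            // Single-byte (ASCII) sequence.
            units.push(u16::from(b0));
            i += 1;
        } else if (b0 & 0xE0) == 0xC0 {
            // Two-byte sequence; 0xC0 0x80 decodes to U+0000.
            let c1 = cont(*bytes.get(i + 1)?)?;
            units.push((u16::from(b0 & 0x1F) << 6) | c1);
            i += 2;
        } else if (b0 & 0xF0) == 0xE0 {
            // Three-byte sequence; may be one half of a surrogate pair.
            let c1 = cont(*bytes.get(i + 1)?)?;
            let c2 = cont(*bytes.get(i + 2)?)?;
            units.push((u16::from(b0 & 0x0F) << 12) | (c1 << 6) | c2);
            i += 3;
        } else {
            // Four-byte UTF-8 lead bytes and stray continuation bytes are rejected.
            return None;
        }
    }
    // Combines surrogate pairs into single characters; unpaired surrogates fail.
    String::from_utf16(&units).ok()
}
```

With this sketch, the bytes `[0xED, 0xA0, 0x81, 0xED, 0xB0, 0x81]` decode to the code units 0xD801 and 0xDC01, which together form U+10401.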

Functions

from_mutf8

Converts a slice of MUTF-8 bytes to a string slice.

is_valid_mutf8

Returns true if a string slice contains UTF-8 data that is also valid MUTF-8. This is mainly used to test whether a string slice needs to be explicitly encoded using to_mutf8.

mutf8_len

Returns the number of bytes required to encode a string slice as MUTF-8.

to_mutf8

Converts a string slice to MUTF-8 bytes.
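
The checks behind is_valid_mutf8 and mutf8_len can be approximated with the std-only sketch below. Both function names are hypothetical stand-ins, not the crate's own code, and they assume the standard MUTF-8 rules: only null and characters above U+FFFF are encoded differently from UTF-8, taking 2 and 6 bytes respectively.

```rust
/// Hypothetical helper: true if the UTF-8 bytes of `s` are already valid MUTF-8,
/// meaning no explicit re-encoding would be needed.
fn is_valid_mutf8_sketch(s: &str) -> bool {
    // Only null and supplementary characters (above U+FFFF) differ between
    // UTF-8 and MUTF-8.
    s.chars().all(|ch| ch != '\0' && (ch as u32) <= 0xFFFF)
}

/// Hypothetical helper: number of bytes needed to encode `s` as MUTF-8.
fn mutf8_len_sketch(s: &str) -> usize {
    s.chars()
        .map(|ch| match ch as u32 {
            0 => 2,              // null is written as 0xC0 0x80
            0x01..=0x7F => 1,    // ASCII
            0x80..=0x7FF => 2,   // two-byte UTF-8
            0x800..=0xFFFF => 3, // three-byte UTF-8
            _ => 6,              // surrogate pair, 3 bytes per surrogate
        })
        .sum()
}
```

When the validity check returns true, the computed length equals `s.len()`, which is the case where an encoder can borrow the input instead of allocating.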