A library for converting between MUTF-8 and UTF-8.
MUTF-8 is the same as CESU-8 except for its handling of embedded null
characters: U+0000 is encoded as the two-byte sequence 0xC0 0x80 rather
than as a single zero byte. This library builds on top of the
residua-cesu8 crate.
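To make the format concrete, here is a minimal hand-written sketch of an MUTF-8 encoder. The function name encode_mutf8_sketch is hypothetical and is not part of this crate's API; it only illustrates the three cases (null, Basic Multilingual Plane, supplementary characters) and makes no attempt to borrow the input when no conversion is needed.

```rust
/// Hypothetical helper for illustration only (not this crate's API):
/// encodes a string slice as MUTF-8 bytes by hand.
fn encode_mutf8_sketch(input: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for ch in input.chars() {
        let code = ch as u32;
        if code == 0 {
            // U+0000 uses a two-byte form so the output never contains 0x00.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else if code < 0x10000 {
            // Other BMP characters are encoded exactly as in UTF-8.
            let mut buf = [0u8; 4];
            out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        } else {
            // Supplementary characters are split into a UTF-16 surrogate
            // pair, and each surrogate is written as a 3-byte sequence.
            let mut units = [0u16; 2];
            for &unit in ch.encode_utf16(&mut units).iter() {
                let unit = unit as u32;
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

With this sketch, "\0" yields the bytes 0xC0 0x80 and "\u{10401}" yields the six bytes 0xED 0xA0 0x81 0xED 0xB0 0x81.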
Examples
Basic usage
use std::borrow::Cow;
let str = "Hello, world!";
// Characters in the Basic Multilingual Plane (other than U+0000) are
// encoded identically in UTF-8 and MUTF-8:
assert_eq!(mutf8::encode(str), Cow::Borrowed(str.as_bytes()));
assert_eq!(mutf8::decode(str.as_bytes()), Ok(Cow::Borrowed(str)));
let str = "\u{10401}";
let mutf8_data = &[0xED, 0xA0, 0x81, 0xED, 0xB0, 0x81];
// 'mutf8_data' is a byte slice containing a 6-byte surrogate pair which
// becomes a 4-byte UTF-8 character.
assert_eq!(mutf8::decode(mutf8_data), Ok(Cow::Owned(str.to_string())));
let str = "\0";
let mutf8_data = vec![0xC0, 0x80];
// 'str' is a null character which becomes a two-byte MUTF-8 representation.
assert_eq!(mutf8::encode(str), Cow::<[u8]>::Owned(mutf8_data));
Features
std: implements std::error::Error on Error. This feature is enabled by default.
Structs
Error
Functions
decode: Converts a slice of MUTF-8 bytes to a string slice.
encode: Converts a string slice to MUTF-8 bytes.
Returns true if a string slice contains UTF-8 data that is also valid MUTF-8. This is mainly used to test whether a string slice needs to be explicitly encoded using encode.
Given a string slice, returns how many bytes in MUTF-8 are required to encode it.
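The validity check and the length computation described above can be sketched by hand as follows. Both function names are hypothetical and are not this crate's API. The sketch assumes only the format rules: U+0000 takes two bytes, supplementary characters take six, and everything else matches UTF-8.

```rust
/// Hypothetical helper (not this crate's API): number of bytes needed to
/// encode `input` as MUTF-8.
fn mutf8_len_sketch(input: &str) -> usize {
    input
        .chars()
        .map(|ch| match ch as u32 {
            0 => 2,               // null becomes 0xC0 0x80
            0x0001..=0x007F => 1, // ASCII other than null
            0x0080..=0x07FF => 2,
            0x0800..=0xFFFF => 3,
            _ => 6,               // surrogate pair, 3 bytes per surrogate
        })
        .sum()
}

/// Hypothetical helper (not this crate's API): true when the UTF-8 bytes
/// of `input` are already valid MUTF-8, so no re-encoding is needed.
fn is_already_mutf8_sketch(input: &str) -> bool {
    // Only null and supplementary characters differ between the formats.
    input.chars().all(|ch| ch != '\0' && (ch as u32) < 0x10000)
}
```

When the check returns true, the MUTF-8 length equals the UTF-8 length of the string, which is why an encoder can return the input bytes unchanged in that case.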