A library for converting between MUTF-8 and UTF-8.
MUTF-8 is the same as CESU-8 except for its handling of embedded null
characters, which it encodes as the two-byte sequence 0xC0 0x80 rather than
a single zero byte. This library builds on top of the residua-cesu8 crate.
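To make the relationship between the encodings concrete, here is a minimal hand-written sketch of the encoding rules, using only the standard library. The name `encode_mutf8_sketch` is hypothetical and is not part of this crate's API; the crate's own `encode` may be implemented differently (for instance, it can borrow its input when no re-encoding is needed).

```rust
/// Illustrative MUTF-8 encoder (hypothetical helper, not the crate's API).
pub fn encode_mutf8_sketch(input: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for ch in input.chars() {
        let code = ch as u32;
        if code == 0 {
            // MUTF-8 rule: the null character becomes the two bytes 0xC0 0x80.
            out.extend_from_slice(&[0xC0, 0x80]);
        } else if code < 0x10000 {
            // Other Basic Multilingual Plane characters keep their UTF-8 form.
            let mut buf = [0u8; 4];
            out.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
        } else {
            // CESU-8 rule: split the character into a UTF-16 surrogate pair
            // and encode each surrogate as a three-byte sequence.
            let mut units = [0u16; 2];
            for &unit in ch.encode_utf16(&mut units).iter() {
                out.push(0xE0 | (unit >> 12) as u8);
                out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
                out.push(0x80 | (unit & 0x3F) as u8);
            }
        }
    }
    out
}
```

Under these rules, ASCII text passes through unchanged, while a supplementary character such as U+10401 grows from four bytes to six.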
§Examples
Basic usage
use alloc::borrow::Cow;
let str = "Hello, world!";
// Characters in the Basic Multilingual Plane, other than the null character,
// are encoded identically in UTF-8 and MUTF-8:
assert_eq!(mutf8::encode(str), Cow::Borrowed(str.as_bytes()));
assert_eq!(mutf8::decode(str.as_bytes()), Ok(Cow::Borrowed(str)));
let str = "\u{10401}";
let mutf8_data = &[0xED, 0xA0, 0x81, 0xED, 0xB0, 0x81];
// 'mutf8_data' is a byte slice containing a 6-byte surrogate pair which
// becomes a 4-byte UTF-8 character.
assert_eq!(mutf8::decode(mutf8_data), Ok(Cow::Owned(str.to_string())));
let str = "\0";
let mutf8_data = vec![0xC0, 0x80];
// 'str' is a null character which becomes a two-byte MUTF-8 representation.
assert_eq!(mutf8::encode(str), Cow::<[u8]>::Owned(mutf8_data));
§Features
std implements std::error::Error on Error. By default, this feature is enabled.
Structs§
Functions§
- decode
- Converts a slice of bytes to a string slice.
- encode
- Converts a string slice to MUTF-8 bytes.
- is_valid
- Returns true if a string slice contains UTF-8 data that is also valid MUTF-8. This is mainly used in testing if a string slice needs to be explicitly encoded using encode.
- len
- Given a string slice, this function returns how many bytes in MUTF-8 are required to encode the string slice.
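
The checks described for is_valid and len can be sketched with the standard library alone. The names `is_valid_sketch` and `len_sketch` are hypothetical, and the logic is an assumption derived from the encoding rules rather than the crate's actual implementation: a valid UTF-8 string fails to be valid MUTF-8 only if it contains a null byte or a four-byte sequence.

```rust
/// Hypothetical helper: true if the UTF-8 bytes of `input` are already
/// valid MUTF-8, so no re-encoding would be needed.
pub fn is_valid_sketch(input: &str) -> bool {
    // In valid UTF-8, bytes 0xF0 and above only occur as the lead byte of a
    // four-byte sequence, which MUTF-8 replaces with a surrogate pair.
    input.bytes().all(|b| b != 0 && b < 0xF0)
}

/// Hypothetical helper: number of bytes needed to encode `input` as MUTF-8.
pub fn len_sketch(input: &str) -> usize {
    input
        .chars()
        .map(|ch| {
            let code = ch as u32;
            if code == 0 {
                2 // null character: 0xC0 0x80
            } else if code >= 0x10000 {
                6 // surrogate pair: two three-byte sequences
            } else {
                ch.len_utf8() // unchanged from UTF-8
            }
        })
        .sum()
}
```

A caller could use a check like `is_valid_sketch` to skip allocation entirely and a count like `len_sketch` to size an output buffer before encoding.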