Crate cesu8
A library for converting between CESU-8 and UTF-8.
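Unicode code points in the Basic Multilingual Plane (BMP), i.e. in the range U+0000 to U+FFFF, are encoded in the same way as in UTF-8. Code points outside the BMP are encoded as a UTF-16 surrogate pair, with each surrogate written as a three-byte sequence.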
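As a rough illustration of the encoding rule, the sketch below encodes a single character by hand using only the standard library. The helper name encode_char_cesu8 is hypothetical and is not part of this crate's API; it assumes the usual CESU-8 definition, in which a supplementary code point becomes two three-byte sequences, one per UTF-16 surrogate.

```rust
/// Hypothetical helper (not the crate's API): encodes one character as CESU-8.
fn encode_char_cesu8(c: char) -> Vec<u8> {
    if (c as u32) <= 0xFFFF {
        // BMP code point: the CESU-8 bytes are identical to the UTF-8 bytes.
        let mut buf = [0u8; 4];
        c.encode_utf8(&mut buf).as_bytes().to_vec()
    } else {
        // Supplementary code point: split into a UTF-16 surrogate pair and
        // write each surrogate as a three-byte sequence.
        let mut units = [0u16; 2];
        let mut out = Vec::with_capacity(6);
        for &unit in c.encode_utf16(&mut units).iter() {
            out.push(0xE0 | (unit >> 12) as u8);
            out.push(0x80 | ((unit >> 6) & 0x3F) as u8);
            out.push(0x80 | (unit & 0x3F) as u8);
        }
        out
    }
}

fn main() {
    // 'A' is in the BMP, so it stays a single byte.
    assert_eq!(encode_char_cesu8('A'), vec![0x41]);
    // U+10400 is outside the BMP, so it takes six bytes.
    assert_eq!(
        encode_char_cesu8('\u{10400}'),
        vec![0xED, 0xA0, 0x81, 0xED, 0xB0, 0x80]
    );
}
```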
If from_cesu8 or to_cesu8 only encounters data that is both valid CESU-8 and valid UTF-8, the cesu8 crate takes advantage of this by returning a clone-on-write smart pointer (Cow) that borrows the input. This avoids unnecessary work and needless memory allocation:
Examples
use std::borrow::Cow;
use cesu8::{from_cesu8, to_cesu8};

let str = "Hello, world!";
assert_eq!(to_cesu8(str), Cow::Borrowed(str.as_bytes()));
assert_eq!(from_cesu8(str.as_bytes()), Cow::Borrowed(str));
When data needs to be encoded or decoded, it functions as one might expect:
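use std::borrow::Cow;
use cesu8::from_cesu8;

// U+10400 lies outside the BMP, so its CESU-8 form is a six-byte surrogate pair.
let str = "\u{10400}";
let cesu8_data = &[0xED, 0xA0, 0x81, 0xED, 0xB0, 0x80];
assert_eq!(from_cesu8(cesu8_data), Cow::Borrowed(str));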
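The borrow-or-allocate behaviour described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the crate's actual implementation: the function name to_cesu8_sketch is hypothetical, and it relies on the fact that in valid UTF-8 only four-byte sequences start with a byte of 0xF0 or higher.

```rust
use std::borrow::Cow;

/// Hypothetical sketch (not the crate's implementation) of a Cow-returning encoder.
fn to_cesu8_sketch(s: &str) -> Cow<'_, [u8]> {
    // Fast path: no four-byte UTF-8 sequences, so the bytes are already CESU-8.
    if s.bytes().all(|b| b < 0xF0) {
        return Cow::Borrowed(s.as_bytes());
    }
    // Slow path: re-encode supplementary characters as surrogate pairs.
    let mut out = Vec::with_capacity(s.len() + s.len() / 2);
    let mut utf8 = [0u8; 4];
    let mut units = [0u16; 2];
    for c in s.chars() {
        if c.len_utf8() < 4 {
            out.extend_from_slice(c.encode_utf8(&mut utf8).as_bytes());
        } else {
            for &unit in c.encode_utf16(&mut units).iter() {
                out.extend_from_slice(&[
                    0xE0 | (unit >> 12) as u8,
                    0x80 | ((unit >> 6) & 0x3F) as u8,
                    0x80 | (unit & 0x3F) as u8,
                ]);
            }
        }
    }
    Cow::Owned(out)
}

fn main() {
    // Pure BMP input is borrowed, with no allocation.
    assert!(matches!(to_cesu8_sketch("Hello, world!"), Cow::Borrowed(_)));
    // Input with a supplementary character has to be re-encoded into a new buffer.
    assert!(matches!(to_cesu8_sketch("\u{10400}"), Cow::Owned(_)));
}
```

Note that Cow compares by content, so an assertion against Cow::Borrowed can pass even when the returned value is Cow::Owned; matching on the variant, as in main, is what distinguishes the two paths.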
Functions
cesu8_len | Returns how many bytes in CESU-8 are required to encode a string slice. |
from_cesu8 | Converts a slice of bytes to a string slice. |
is_valid_cesu8 | Returns true if a slice of bytes is valid CESU-8. |
to_cesu8 | Converts a string slice to CESU-8 bytes. |
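To make the length calculation concrete, here is a small standard-library sketch of how the CESU-8 size of a string slice could be computed. The name cesu8_len_sketch is hypothetical and the crate's own cesu8_len may be implemented differently; the only assumption is that each four-byte UTF-8 character takes six bytes in CESU-8 while every other character keeps its UTF-8 length.

```rust
/// Hypothetical sketch: number of bytes needed to encode `s` as CESU-8.
fn cesu8_len_sketch(s: &str) -> usize {
    s.chars()
        .map(|c| match c.len_utf8() {
            4 => 6, // supplementary character: two three-byte surrogates
            n => n, // BMP character: same length as in UTF-8
        })
        .sum()
}

fn main() {
    assert_eq!(cesu8_len_sketch("Hello"), 5);
    assert_eq!(cesu8_len_sketch("\u{10400}"), 6);
}
```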