Crate utf16string
A UTF-16 little-endian string type.
This crate provides two string types to handle UTF-16 encoded bytes directly as strings:
WString and WStr. They are to UTF-16 exactly like String and str are to
UTF-8. Some of the concepts and functions here are documented rather tersely; in that
case, look up their equivalents on String or str. The behaviour should be exactly
the same; only the underlying byte encoding differs.
WString is the type that owns the bytes containing the string. Just like
String and the underlying Vec it is built on, it distinguishes length
(WString::len) from capacity (WString::capacity). Here, length is the number of
bytes used, while capacity is the number of bytes the string can hold without
reallocating.
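To make the length/capacity distinction concrete, here is a minimal sketch that uses only the standard library rather than this crate. The helper names `utf16_byte_len` and `len_and_capacity` are hypothetical, and a plain `Vec<u8>` stands in for the buffer a UTF-16 string would own.

```rust
/// Number of bytes `s` occupies once encoded as UTF-16
/// (two bytes per 16-bit code unit).
fn utf16_byte_len(s: &str) -> usize {
    s.encode_utf16().count() * 2
}

/// Encodes `s` as UTF-16LE into a buffer allocated with `extra` spare bytes
/// and returns the buffer's `(length, capacity)`.
fn len_and_capacity(s: &str, extra: usize) -> (usize, usize) {
    let mut buf: Vec<u8> = Vec::with_capacity(utf16_byte_len(s) + extra);
    for unit in s.encode_utf16() {
        buf.extend_from_slice(&unit.to_le_bytes());
    }
    // Length counts the bytes in use; capacity counts the bytes allocated.
    (buf.len(), buf.capacity())
}
```

For "hello" the length is 10 bytes, while the capacity is at least the amount requested up front; the allocator is free to hand out more.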
The WStr type does not own any bytes; it can only point to a slice of bytes
containing valid UTF-16. As such, you will only ever use it as a reference like &WStr,
just as you only use str as &str.
The WString type implements Deref<Target = WStr<ByteOrder>>.
UTF-16 ByteOrder
UTF-16 encodes text as unsigned 16-bit integers (u16) called code units. However,
different CPU architectures store these u16 integers using different byte orders:
little-endian and big-endian. Thus, when handling UTF-16 strings you need to be
aware of the byte order of the encoding. The two encoding variants are commonly known as
UTF-16LE and UTF-16BE respectively.
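The difference between the two variants can be seen with the standard library alone. The following sketch does not use this crate; `encode_utf16le` and `encode_utf16be` are hypothetical helper names chosen for illustration.

```rust
/// Encodes `s` as UTF-16LE: the low byte of each code unit comes first.
fn encode_utf16le(s: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for unit in s.encode_utf16() {
        out.extend_from_slice(&unit.to_le_bytes());
    }
    out
}

/// Encodes `s` as UTF-16BE: the high byte of each code unit comes first.
fn encode_utf16be(s: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for unit in s.encode_utf16() {
        out.extend_from_slice(&unit.to_be_bytes());
    }
    out
}
```

Both functions produce the same code units; only the order of the two bytes inside each unit differs, which is why a decoder has to know which variant it was given.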
For this crate this means the types need to be aware of the byte order, which is done
using the byteorder::ByteOrder trait as a generic parameter to the types:
WString<ByteOrder> and WStr<ByteOrder>, commonly written as WString<E> and
WStr<E>, where E stands for "endianness".
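As a rough illustration of how a byte order can be carried as a type parameter, the sketch below builds a small stand-in using only the standard library. Everything in it is an assumption made for the example: the `Endian` trait, the `Little` and `Big` marker types, and the `Utf16View` struct are hypothetical and are not the API of this crate or of byteorder.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a byte-order trait (not the byteorder API).
trait Endian {
    fn read_u16(bytes: [u8; 2]) -> u16;
}

/// Marker types without variants: they exist only at the type level.
enum Little {}
enum Big {}

impl Endian for Little {
    fn read_u16(bytes: [u8; 2]) -> u16 {
        u16::from_le_bytes(bytes)
    }
}

impl Endian for Big {
    fn read_u16(bytes: [u8; 2]) -> u16 {
        u16::from_be_bytes(bytes)
    }
}

/// Hypothetical borrowed view over UTF-16 bytes, tagged with byte order `E`.
struct Utf16View<'a, E: Endian> {
    bytes: &'a [u8],
    _endian: PhantomData<E>,
}

impl<'a, E: Endian> Utf16View<'a, E> {
    fn new(bytes: &'a [u8]) -> Self {
        Utf16View { bytes, _endian: PhantomData }
    }

    /// Reads the code units, interpreting each byte pair according to `E`.
    /// A trailing odd byte is ignored in this sketch.
    fn code_units(&self) -> Vec<u16> {
        self.bytes
            .chunks_exact(2)
            .map(|pair| E::read_u16([pair[0], pair[1]]))
            .collect()
    }
}
```

Because the byte order is part of the type, the same bytes viewed as `Utf16View<Little>` and `Utf16View<Big>` decode to different code units, and the compiler keeps the two from being mixed up by accident.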
This crate exports BigEndian, BE, LittleEndian and LE in case you need
to denote the type:
    use utf16string::{BigEndian, BE, WString};

    let s0: WString<BigEndian> = WString::from_str("hello");
    assert_eq!(s0.len(), 10);

    let s1: WString<BE> = WString::from_str("hello");
    assert_eq!(s0, s1);
Because these types can be cumbersome to write out, they can often be inferred,
especially with the help of the shorthand constructors like WString::from_utf16le,
WString::from_utf16be, WStr::from_utf16le, WStr::from_utf16be and related.
For example:
    use utf16string::{LE, WStr};

    let b = b"h\x00e\x00l\x00l\x00o\x00";

    let s0: &WStr<LE> = WStr::from_utf16(b)?;
    let s1 = WStr::from_utf16le(b)?;

    assert_eq!(s0, s1);
    assert_eq!(s0.to_utf8(), "hello");
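For readers curious what validating little-endian UTF-16 bytes and converting them to UTF-8 involves, here is a minimal sketch using only the standard library. It is not this crate's implementation; `decode_utf16le` is a hypothetical helper that returns `None` where the crate would report an error.

```rust
/// Decodes UTF-16LE bytes into a `String`.
/// Returns `None` for an odd number of bytes or an unpaired surrogate.
fn decode_utf16le(bytes: &[u8]) -> Option<String> {
    // Every code unit is exactly two bytes wide.
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]));
    // `decode_utf16` yields an error for each unpaired surrogate;
    // collecting into a `Result` stops at the first one.
    std::char::decode_utf16(units)
        .collect::<Result<String, _>>()
        .ok()
}
```

With the bytes from the example above, `decode_utf16le(b"h\x00e\x00l\x00l\x00o\x00")` gives `Some("hello".to_string())`, while truncated input or a lone surrogate gives `None`.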
Structs
| Utf16Error | Error for invalid UTF-16 encoded bytes. |
| WStr | A UTF-16 str-like type. |
| WStrCharIndices | Iterator yielding (index, char) tuples. |
| WStrChars | Iterator yielding char. |
| WString | A UTF-16 String-like type. |
Enums
| BigEndian | Defines big-endian serialization. |
| LittleEndian | Defines little-endian serialization. |
Traits
| SliceIndex | Our own version of SliceIndex. |
Type Definitions
| BE | A type alias for BigEndian. |
| LE | A type alias for LittleEndian. |