Struct widestring::ustring::U16String
pub struct U16String { /* fields omitted */ }
An owned, mutable 16-bit wide string with undefined encoding.
The string slice of a U16String is U16Str.
U16String values are strings that do not have a defined encoding. While it is sometimes assumed that they contain possibly invalid or ill-formed UTF-16 data, they may be used for any wide encoded string. This is because U16String is intended to be used with FFI functions, where proper encoding cannot be guaranteed. If you need strings that are always valid UTF-16, use Utf16String instead.
Because U16String does not have a defined encoding, no restrictions are placed on mutating or indexing the string. This means that even if the string contained properly encoded UTF-16 or other encoding data, mutating or indexing may result in malformed data. Convert to a Utf16String if retaining proper UTF-16 encoding is desired.
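To illustrate how unrestricted mutation can break encoding, here is a minimal std-only sketch. It uses a plain Vec<u16> as a stand-in for the string's buffer, and the helper name truncated_is_valid_utf16 is hypothetical, not part of this crate. Truncating in the middle of a surrogate pair leaves an unpaired surrogate, which strict UTF-16 decoding rejects.

```rust
/// Hypothetical helper: encodes `s` as UTF-16, keeps only the first `keep`
/// code units, and reports whether the result still decodes as valid UTF-16.
fn truncated_is_valid_utf16(s: &str, keep: usize) -> bool {
    // A plain Vec<u16> stands in for the wide string's buffer (assumption).
    let mut units: Vec<u16> = s.encode_utf16().collect();
    // Truncation works on code units and knows nothing about surrogate pairs.
    units.truncate(keep);
    String::from_utf16(&units).is_ok()
}
```

For a string made of 'a' followed by U+1F496, keeping two code units cuts the surrogate pair in half, so the helper would return false, while keeping one or three code units leaves valid UTF-16.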
FFI considerations
U16String is not aware of nul values. Strings may or may not be nul-terminated, and may contain invalid and ill-formed UTF-16. These strings are intended to be used with FFI functions that directly use string length, where the strings are known to have proper nul-termination already, or where strings are merely being passed through without modification.
U16CString should be used instead if nul-aware strings are required.
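The difference between length-based and nul-terminated buffers can be sketched with the standard library alone. Both helper names below are hypothetical and operate on plain u16 slices rather than on this crate's types.

```rust
/// Hypothetical helper: index of the first nul (0) code unit, if any.
/// A nul-aware API would treat this position as the end of the string.
fn first_nul(units: &[u16]) -> Option<usize> {
    units.iter().position(|&u| u == 0)
}

/// Hypothetical helper: encodes `s` as UTF-16 and appends one nul terminator,
/// the shape expected by C functions that take no explicit length.
fn encode_with_nul(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}
```

A buffer with an interior nul is acceptable for a function that receives an explicit length, but a nul-terminated consumer would stop reading at the index reported by first_nul.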
Examples
The easiest way to use U16String outside of FFI is with the u16str! macro to convert string literals into UTF-16 string slices at compile time:
use widestring::{u16str, U16String};
let hello = U16String::from(u16str!("Hello, world!"));
You can also convert any u16 slice or vector directly:
use widestring::{u16str, U16String};
let sparkle_heart = vec![0xd83d, 0xdc96];
let sparkle_heart = U16String::from_vec(sparkle_heart);
assert_eq!(u16str!("💖"), sparkle_heart);
// This unpaired UTF-16 surrogate is invalid UTF-16, but is perfectly valid in U16String
let malformed_utf16 = vec![0x0, 0xd83d]; // Note that nul values are also valid and untouched
let s = U16String::from_vec(malformed_utf16);
assert_eq!(s.len(), 2);
The following example constructs a U16String and shows how to convert a U16String to a regular Rust String.
use widestring::U16String;
let s = "Test";
// Create a wide string from the rust string
let wstr = U16String::from_str(s);
// Convert back to a rust string
let rust_str = wstr.to_string_lossy();
assert_eq!(rust_str, "Test");
Implementations
Constructs a wide string from a vector.
No checks are made on the contents of the vector. It may or may not be valid character data.
Examples
use widestring::U16String;
let v = vec![84u16, 104u16, 101u16]; // 'T' 'h' 'e'
// Create a wide string from the vector
let wstr = U16String::from_vec(v);
use widestring::U32String;
let v = vec![84u32, 104u32, 101u32]; // 'T' 'h' 'e'
// Create a wide string from the vector
let wstr = U32String::from_vec(v);
Constructs a wide string copy from a pointer and a length.
The len argument is the number of elements, not the number of bytes.
Safety
This function is unsafe as there is no guarantee that the given pointer is valid for len elements.
In addition, the data must meet the safety conditions of std::slice::from_raw_parts.
Panics
Panics if len is greater than 0 but p is a null pointer.
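The contract described above can be sketched with std::slice::from_raw_parts. The function copy_from_ptr is a hypothetical stand-in that returns a Vec<u16>; it is not this crate's implementation and only mirrors the documented behavior (element count rather than byte count, panic on a null pointer with non-zero length).

```rust
/// Hypothetical stand-in: copies `len` elements starting at `p` into a vector.
///
/// # Safety
/// `p` must be valid for reads of `len` elements of type `u16` and must meet
/// the safety conditions of `std::slice::from_raw_parts`.
unsafe fn copy_from_ptr(p: *const u16, len: usize) -> Vec<u16> {
    if len == 0 {
        // Nothing to read, so a null pointer is tolerated here.
        return Vec::new();
    }
    assert!(!p.is_null(), "null pointer with non-zero length");
    // SAFETY: the caller guarantees `p` is valid for `len` elements.
    let slice = unsafe { std::slice::from_raw_parts(p, len) };
    // Copy the data so the result does not borrow from the foreign buffer.
    slice.to_vec()
}
```

Because the data is copied, the returned vector stays valid after the foreign buffer is freed.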
Constructs a wide string with the given capacity.
The string will be able to hold exactly capacity elements without reallocating.
If capacity is set to 0, the string will not initially allocate.
Returns the capacity this wide string can hold without reallocating.
Reserves capacity for at least additional more elements to be inserted in the given wide string.
More space may be reserved to avoid frequent allocations.
Reserves the minimum capacity for exactly additional more elements to be inserted in the given wide string. Does nothing if the capacity is already sufficient.
Note that the allocator may give more space than is requested. Therefore capacity cannot be relied upon to be precisely minimal. Prefer reserve if future insertions are expected.
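The capacity guarantees read the same way as those of Vec, so a Vec<u16> can serve as a rough model. The sketch below is an illustration under that assumption, and capacity_demo is a hypothetical helper name.

```rust
/// Hypothetical helper: returns the capacity right after `with_capacity`
/// and the capacity after reserving room for `additional` more elements.
fn capacity_demo(initial: usize, additional: usize) -> (usize, usize) {
    let mut buf: Vec<u16> = Vec::with_capacity(initial);
    let before = buf.capacity();
    // The buffer is empty, so this guarantees room for `additional` elements.
    buf.reserve(additional);
    (before, buf.capacity())
}
```

Only lower bounds are guaranteed: the allocator may hand back more space than was requested.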
Converts the string into a Vec, consuming the string in the process.
Converts to a mutable wide string slice.
Returns a mutable reference to the contents of this string.
Extends the string with the given string slice.
No checks are performed on the strings. It is possible to end up with nul values inside the string, or invalid encoding, and it is up to the caller to determine if that is acceptable.
Examples
use widestring::U16String;
let s = "MyString";
let mut wstr = U16String::from_str(s);
let cloned = wstr.clone();
// Push the clone to the end, repeating the string twice.
wstr.push(cloned);
assert_eq!(wstr.to_string().unwrap(), "MyStringMyString");
Extends the string with the given slice.
No checks are performed on the strings. It is possible to end up with nul values inside the string, or invalid encoding, and it is up to the caller to determine if that is acceptable.
Examples
use widestring::U16String;
let s = "MyString";
let mut wstr = U16String::from_str(s);
let cloned = wstr.clone();
// Push the clone to the end, repeating the string twice.
wstr.push_slice(cloned);
assert_eq!(wstr.to_string().unwrap(), "MyStringMyString");
Shrinks the capacity of the wide string to match its length.
Shrinks the capacity of this string with a lower bound.
The capacity will remain at least as large as both the length and the supplied value.
If the current capacity is less than the lower limit, this is a no-op.
Converts this wide string into a boxed string slice.
Examples
use widestring::{U16String, U16Str};
let s = U16String::from_str("hello");
let b: Box<U16Str> = s.into_boxed_ustr();
Shortens this string to the specified length.
If new_len is greater than the string’s current length, this has no effect.
Note that this method has no effect on the allocated capacity of the string.
Inserts a string slice into this string at a specified position.
This is an O(n) operation as it requires copying every element in the buffer.
Panics
Panics if idx is larger than the string’s length.
Splits the string into two at the given index.
Returns a newly allocated string. self contains values [0, at), and the returned string contains values [at, len).
Note that the capacity of self does not change.
Panics
Panics if at is equal to or greater than the length of the string.
Retains only the elements specified by the predicate.
In other words, remove all elements e such that f(e) returns false. This method operates in place, visiting each element exactly once in the original order, and preserves the order of the retained elements.
Creates a draining iterator that removes the specified range in the string and yields the removed elements.
Note: The element range is removed even if the iterator is not consumed until the end.
Panics
Panics if the starting point or end point are out of bounds.
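The note about unconsumed iterators matches the behavior of Vec::drain, which the following std-only sketch uses as a model. The helper drain_first_only is hypothetical, and whether this crate delegates to Vec::drain internally is an assumption.

```rust
/// Hypothetical helper: drains `range` but reads only the first element.
/// The rest of the range is still removed when the iterator is dropped.
fn drain_first_only(units: &mut Vec<u16>, range: std::ops::Range<usize>) -> Option<u16> {
    let mut drained = units.drain(range);
    let first = drained.next();
    // Dropping the iterator removes every remaining element in the range.
    drop(drained);
    first
}
```

Draining 1..4 from the elements 1, 2, 3, 4, 5 yields 2 as the first removed element and leaves 1 and 5 in the buffer, even though 3 and 4 were never read.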
pub fn replace_range<R>(&mut self, range: R, replace_with: impl AsRef<U16Str>) where
R: RangeBounds<usize>,
Removes the specified range in the string, and replaces it with the given string.
The given string doesn’t need to be the same length as the range.
Panics
Panics if the starting point or end point are out of bounds.
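A replacement of differing length can be modeled with Vec::splice. The sketch below works on a plain Vec<u16>; splice_units is a hypothetical helper rather than this crate's method, and it illustrates that the buffer grows or shrinks to fit the replacement.

```rust
/// Hypothetical helper: replaces `range` in `units` with `replace_with`.
/// The replacement may be shorter or longer than the removed range.
fn splice_units(units: &mut Vec<u16>, range: std::ops::Range<usize>, replace_with: &[u16]) {
    // `splice` applies the replacement when the returned iterator is dropped.
    drop(units.splice(range, replace_with.iter().copied()));
}
```

Like the documented method, Vec::splice panics if the range is out of bounds.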
Constructs a U16String copy from a str, encoding it as UTF-16.
This makes a string copy of the str. Since str will always be valid UTF-8, the resulting U16String will also be valid UTF-16.
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
assert_eq!(wstr.to_string().unwrap(), s);
Constructs a U16String copy from an OsStr.
This makes a string copy of the OsStr. Since OsStr makes no guarantees that it is valid data, there is no guarantee that the resulting U16String will be valid UTF-16.
Note that the encoding of OsStr is platform-dependent, so on some platforms this may perform an encoding conversion, while on other platforms (such as Windows) no changes to the string will be made.
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_os_str(s);
assert_eq!(wstr.to_string().unwrap(), s);
Extends the string with the given string slice, encoding it as UTF-16.
No checks are performed on the strings. It is possible to end up with nul values inside the string, and it is up to the caller to determine if that is acceptable.
Examples
use widestring::U16String;
let s = "MyString";
let mut wstr = U16String::from_str(s);
// Push the original to the end, repeating the string twice.
wstr.push_str(s);
assert_eq!(wstr.to_string().unwrap(), "MyStringMyString");
Extends the string with the given string slice.
No checks are performed on the strings. It is possible to end up with nul values inside the string, and it is up to the caller to determine if that is acceptable.
Examples
use widestring::U16String;
let s = "MyString";
let mut wstr = U16String::from_str(s);
// Push the original to the end, repeating the string twice.
wstr.push_os_str(s);
assert_eq!(wstr.to_string().unwrap(), "MyStringMyString");
Appends the given char encoded as UTF-16 to the end of this string.
Removes the last character or unpaired surrogate from the string buffer and returns it.
This method assumes UTF-16 encoding, but handles invalid UTF-16 by returning unpaired surrogates.
Returns None if this string is empty. Otherwise, returns the character cast to a u32 or the value of the unpaired surrogate as a u32 value.
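The pop behavior described above can be sketched on a plain Vec<u16>. This is an illustrative reimplementation under the documented semantics, not the crate's source; pop_char_or_surrogate is a hypothetical name.

```rust
/// Hypothetical helper: removes the last character or unpaired surrogate
/// from `buf` and returns it as a `u32`, assuming UTF-16 contents.
fn pop_char_or_surrogate(buf: &mut Vec<u16>) -> Option<u32> {
    let low = buf.pop()?;
    // A trailing low surrogate may be the second half of a surrogate pair.
    if (0xDC00..=0xDFFF).contains(&low) {
        if let Some(&high) = buf.last() {
            if (0xD800..=0xDBFF).contains(&high) {
                buf.pop();
                // Combine the pair into a supplementary-plane code point.
                let high_bits = (u32::from(high) - 0xD800) << 10;
                let low_bits = u32::from(low) - 0xDC00;
                return Some(0x10000 + high_bits + low_bits);
            }
        }
    }
    // A BMP character or an unpaired surrogate is returned unchanged.
    Some(u32::from(low))
}
```

Returning a u32 instead of a char is what makes it possible to hand back an unpaired surrogate, which is not a valid char value.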
Removes a char or unpaired surrogate from this string at a position and returns it as a u32.
This method assumes UTF-16 encoding, but handles invalid UTF-16 by returning unpaired surrogates.
This is an O(n) operation, as it requires copying every element in the buffer.
Panics
Panics if idx is larger than or equal to the string’s length.
Inserts a character encoded as UTF-16 into this string at a specified position.
This is an O(n) operation as it requires copying every element in the buffer.
Panics
Panics if idx is larger than the string’s length.
Methods from Deref<Target = U16Str>
Copies the string reference to a new owned wide string.
Converts to a slice of the underlying elements of the string.
Converts to a mutable slice of the underlying elements of the string.
Returns a raw pointer to the string.
The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.
The caller must also ensure that the memory the pointer (non-transitively) points to is never written to (except inside an UnsafeCell) using this pointer or any pointer derived from it. If you need to mutate the contents of the string, use as_mut_ptr.
Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
Returns an unsafe mutable raw pointer to the string.
The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.
Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
Returns the two raw pointers spanning the string slice.
The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.
See as_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.
This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.
Returns the two unsafe mutable pointers spanning the string slice.
The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.
See as_mut_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.
This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.
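The half-open pointer range can be illustrated with the equivalent slice method from the standard library. The helper len_from_ptr_range is hypothetical and takes a plain u16 slice; it recovers the element count from the two pointers, much as a C++ style begin/end interface would.

```rust
/// Hypothetical helper: recovers the length of a slice from its pointer range.
fn len_from_ptr_range(units: &[u16]) -> usize {
    let range = units.as_ptr_range();
    // SAFETY: both pointers come from the same slice, and `end` is at most
    // one past its last element, so their distance is well defined.
    unsafe { range.end.offset_from(range.start) as usize }
}
```

The end pointer is only compared or subtracted here and never dereferenced, which is the caution the text above calls for.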
Returns the length of the string as number of elements (not number of bytes).
Returns an object that implements Display for printing strings that may contain non-Unicode data.
This method assumes this string is intended to be UTF-16 encoded, but handles ill-formed UTF-16 sequences lossily. The returned struct implements the Display trait by decoding the string as lossy UTF-16 without performing any heap allocations, unlike to_string_lossy.
By default, invalid Unicode data is replaced with U+FFFD REPLACEMENT CHARACTER (�). If you wish to simply skip any invalid Unicode data and forgo the replacement, you may use the alternate formatting with {:#}.
Examples
Basic usage:
use widestring::U16Str;
// 𝄞mus<invalid>ic<invalid>
let s = U16Str::from_slice(&[
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);
assert_eq!(format!("{}", s.display()),
"𝄞mus�ic�"
);
Using alternate formatting style to skip invalid values entirely:
use widestring::U16Str;
// 𝄞mus<invalid>ic<invalid>
let s = U16Str::from_slice(&[
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);
assert_eq!(format!("{:#}", s.display()),
"𝄞music"
);
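The two formatting modes can be approximated with std's char::decode_utf16. The sketch below is not how the display struct is implemented: decode_lossy is a hypothetical helper, and unlike display it allocates a String. It only mirrors the replace-versus-skip behavior shown in the examples above.

```rust
/// Hypothetical helper: decodes `units` as UTF-16. Invalid code units are
/// replaced with U+FFFD, or dropped when `skip_invalid` is true.
fn decode_lossy(units: &[u16], skip_invalid: bool) -> String {
    let mut out = String::new();
    for result in char::decode_utf16(units.iter().copied()) {
        match result {
            Ok(c) => out.push(c),
            // Corresponds to the alternate `{:#}` formatting: skip silently.
            Err(_) if skip_invalid => {}
            // Corresponds to the default formatting: emit U+FFFD.
            Err(_) => out.push(char::REPLACEMENT_CHARACTER),
        }
    }
    out
}
```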
Returns a subslice of the string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
Returns a mutable subslice of the string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
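The same checked-access pattern exists on ordinary slices, which makes for a compact illustration. The helper checked_sub below is hypothetical and operates on a u16 slice rather than on a wide string.

```rust
/// Hypothetical helper: returns the subslice `start..end`, or `None` where
/// the equivalent indexing expression `units[start..end]` would panic.
fn checked_sub(units: &[u16], start: usize, end: usize) -> Option<&[u16]> {
    units.get(start..end)
}
```

Both an out-of-bounds end and a start greater than the end produce None instead of a panic.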
pub unsafe fn get_unchecked<I>(&self, i: I) -> &Self where
I: SliceIndex<[u16], Output = [u16]>,
Returns an unchecked subslice of the string.
This is the unchecked alternative to indexing the string.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index;
- Indexes must be within bounds of the original slice.
Failing that, the returned string slice may reference invalid memory.
pub unsafe fn get_unchecked_mut<I>(&mut self, i: I) -> &mut Self where
I: SliceIndex<[u16], Output = [u16]>,
Returns a mutable, unchecked subslice of the string.
This is the unchecked alternative to indexing the string.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index;
- Indexes must be within bounds of the original slice.
Failing that, the returned string slice may reference invalid memory.
Divide one string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get mutable string slices instead, see the split_at_mut method.
Divide one mutable string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get immutable string slices instead, see the split_at method.
Decodes a string reference to an owned OsString.
This makes a string copy of the U16Str. Since U16Str makes no guarantees that its encoding is UTF-16 or that the data is valid UTF-16, there is no guarantee that the resulting OsString will have a valid underlying encoding either.
Note that the encoding of OsString is platform-dependent, so on some platforms this may perform an encoding conversion, while on other platforms (such as Windows) no changes to the string will be made.
Examples
use widestring::U16String;
use std::ffi::OsString;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create an OsString from the wide string
let osstr = wstr.to_os_string();
assert_eq!(osstr, OsString::from(s));
Decodes this string to a String if it contains valid UTF-16 data.
This method assumes this string is encoded as UTF-16 and attempts to decode it as such.
Failures
Returns an error if the string contains any invalid UTF-16 data.
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create a regular string from the wide string
let s2 = wstr.to_string().unwrap();
assert_eq!(s2, s);
Decodes the string to a String even if it is invalid UTF-16 data.
This method assumes this string is encoded as UTF-16 and attempts to decode it as such. Any invalid sequences are replaced with U+FFFD REPLACEMENT CHARACTER, which looks like this: �
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create a regular string from the wide string
let lossy = wstr.to_string_lossy();
assert_eq!(lossy, s);
pub fn chars(&self) -> CharsUtf16<'_>
Returns an iterator over the chars of a string slice.
As this string has no defined encoding, this method assumes the string is UTF-16. Since it may consist of invalid UTF-16, the iterator returned by this method is an iterator over Result<char, DecodeUtf16Error> instead of chars directly. If you would like a lossy iterator over chars directly, instead use chars_lossy.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
pub fn chars_lossy(&self) -> CharsLossyUtf16<'_>
Returns a lossy iterator over the chars of a string slice.
As this string has no defined encoding, this method assumes the string is UTF-16. Since it may consist of invalid UTF-16, the iterator returned by this method will replace unpaired surrogates with U+FFFD REPLACEMENT CHARACTER (�). This is a lossy version of chars.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
pub fn char_indices(&self) -> CharIndicesUtf16<'_>
Returns an iterator over the chars of a string slice, and their positions.
As this string has no defined encoding, this method assumes the string is UTF-16. Since it may consist of invalid UTF-16, the iterator returned by this method is an iterator over Result<char, DecodeUtf16Error> as well as their positions, instead of chars directly. If you would like a lossy indices iterator over chars directly, instead use char_indices_lossy.
The iterator yields tuples. The position is first, the char is second.
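A std-only sketch of such a position-tracking iterator follows. It assumes that positions are counted in 16-bit elements from the start of the string, and it reports errors as the raw unpaired surrogate instead of a DecodeUtf16Error so that results are easy to compare. The helper char_indices_utf16 is hypothetical and not the crate's implementation.

```rust
/// Hypothetical helper: decodes `units` as UTF-16 and pairs each result with
/// the index of its first code unit. Errors carry the unpaired surrogate.
fn char_indices_utf16(units: &[u16]) -> Vec<(usize, Result<char, u16>)> {
    let mut pos = 0;
    char::decode_utf16(units.iter().copied())
        .map(|result| {
            let start = pos;
            match result {
                Ok(c) => {
                    // A char occupies one or two 16-bit elements.
                    pos += c.len_utf16();
                    (start, Ok(c))
                }
                Err(e) => {
                    // An unpaired surrogate always occupies one element.
                    pos += 1;
                    (start, Err(e.unpaired_surrogate()))
                }
            }
        })
        .collect()
}
```

Characters encoded as surrogate pairs advance the position by two, so positions are not simply consecutive integers.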
pub fn char_indices_lossy(&self) -> CharIndicesLossyUtf16<'_>
Returns a lossy iterator over the chars of a string slice, and their positions.
As this string slice may consist of invalid UTF-16, the iterator returned by this method replaces unpaired surrogates with U+FFFD REPLACEMENT CHARACTER (�) and yields the positions of all characters. This is a lossy version of char_indices.
The iterator yields tuples. The position is first, the char is second.
Trait Implementations
Performs the += operation.
Mutably borrows from an owned value.
Extends a collection with the contents of an iterator.
Extends a collection with exactly one element (nightly-only experimental API: extend_one).
Reserves capacity in a collection for the given number of additional elements (nightly-only experimental API: extend_one).
Performs the conversion.
Creates a value from an iterator.
This method returns an ordering between self and other values if one exists.
This method tests less than (for self and other) and is used by the < operator.
This method tests less than or equal to (for self and other) and is used by the <= operator.
This method tests greater than (for self and other) and is used by the > operator.
Writes a string slice into this writer, returning whether the write succeeded.
Auto Trait Implementations
impl RefUnwindSafe for U16String
impl UnwindSafe for U16String
Blanket Implementations
Mutably borrows from an owned value.