Struct widestring::ustr::UStr
pub struct UStr<C: UChar> { /* fields omitted */ }
String slice reference for UString.
UStr is to UString as str is to String.
UStr is not aware of nul values. Strings may or may not be nul-terminated, and may contain invalid and ill-formed UTF-16 or UTF-32 data. These strings are intended to be used with FFI functions that directly use string length, where the strings are known to have proper nul-termination already, or where strings are merely being passed through without modification.
UCStr should be used instead if nul-aware strings are required.
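Because a length-based string treats nul as an ordinary element, a caller that needs C-style semantics has to trim at the first nul itself. The following is a minimal sketch that uses a plain u16 slice from the standard library as a stand-in for the string data; the helper name until_nul is hypothetical and not part of this crate.

```rust
/// Returns the part of `units` before the first nul element, or all of
/// `units` when it contains no nul.
/// Hypothetical helper for illustration; not part of widestring.
pub fn until_nul(units: &[u16]) -> &[u16] {
    match units.iter().position(|&u| u == 0) {
        // Stop just before the first nul; the nul itself is excluded.
        Some(nul) => &units[..nul],
        // No nul at all: the whole slice is the string.
        None => units,
    }
}
```

An interior nul therefore shortens the result, while a slice without any nul is returned unchanged.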
UStr can be converted to many other string types, including OsString and String, making proper Unicode FFI safe and easy.
Please prefer using the type aliases U16Str, U32Str or WideStr to using this type directly.
Implementations
Constructs a UStr from a pointer and a length.
The len argument is the number of elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.
Safety
This function is unsafe as there is no guarantee that the given pointer is valid for len elements.
In addition, the data must meet the safety conditions of std::slice::from_raw_parts. In particular, the returned string reference must not be mutated for the duration of lifetime 'a, except inside an UnsafeCell.
Panics
This function panics if p is null.
Caveat
The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.
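The helper-function approach mentioned in the caveat can be sketched with the standard library alone. The example below uses std::slice::from_raw_parts, whose safety conditions this function shares; the HostBuffer type and the units_of function are hypothetical names chosen for illustration.

```rust
use std::slice;

/// Hypothetical host type that owns a buffer of UTF-16 code units.
pub struct HostBuffer {
    pub units: Vec<u16>,
}

/// Rebuilds a slice from a pointer and an element count, with the returned
/// lifetime tied explicitly to the borrow of `host` rather than inferred.
pub fn units_of<'a>(host: &'a HostBuffer) -> &'a [u16] {
    let ptr: *const u16 = host.units.as_ptr();
    let len = host.units.len(); // element count, not byte count
    // SAFETY: `ptr` is non-null, aligned and valid for `len` elements because
    // it comes from a live Vec, and the shared borrow of `host` prevents
    // mutation of the buffer for the whole of 'a.
    unsafe { slice::from_raw_parts(ptr, len) }
}
```

With the lifetime spelled out in the signature, the borrow checker rejects any use of the returned slice after the host value has been dropped or mutated.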
Constructs a mutable UStr from a mutable pointer and a length.
The len argument is the number of elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.
Safety
This function is unsafe as there is no guarantee that the given pointer is valid for len elements.
In addition, the data must meet the safety conditions of std::slice::from_raw_parts_mut.
Panics
This function panics if p is null.
Caveat
The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.
Constructs a UStr from a slice of character data.
No checks are performed on the slice. It may or may not be valid for its encoding.
Constructs a mutable UStr from a mutable slice of character data.
No checks are performed on the slice. It may or may not be valid for its encoding.
Copies the string reference to a new owned UString.
Converts to a mutable slice of the string.
Returns a raw pointer to the string.
The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.
The caller must also ensure that the memory the pointer (non-transitively) points to is never written to (except inside an UnsafeCell) using this pointer or any pointer derived from it. If you need to mutate the contents of the string, use as_mut_ptr.
Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
Returns an unsafe mutable raw pointer to the string.
The caller must ensure that the string outlives the pointer this function returns, or else it will end up pointing to garbage.
Modifying the container referenced by this string may cause its buffer to be reallocated, which would also make any pointers to it invalid.
Returns the two raw pointers spanning the string slice.
The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.
See as_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.
This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.
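The half-open convention can be illustrated with the equivalent method on standard slices. This is a sketch on a plain u16 slice, assuming the string type follows the same convention; the function name len_via_range is hypothetical.

```rust
/// Recovers the element count of `units` from its half-open pointer range.
pub fn len_via_range(units: &[u16]) -> usize {
    let range = units.as_ptr_range();
    // SAFETY: both pointers are derived from the same slice, and `end` is
    // exactly one past its last element, so the distance is well defined.
    // `offset_from` counts elements, not bytes.
    unsafe { range.end.offset_from(range.start) as usize }
}
```

For an empty slice the two pointers are equal, so the computed length is zero. The end pointer is only ever compared or subtracted here, never dereferenced.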
Returns the two unsafe mutable pointers spanning the string slice.
The returned range is half-open, which means that the end pointer points one past the last element of the slice. This way, an empty slice is represented by two equal pointers, and the difference between the two pointers represents the size of the slice.
See as_mut_ptr for warnings on using these pointers. The end pointer requires extra caution, as it does not point to a valid element in the slice.
This function is useful for interacting with foreign interfaces which use two pointers to refer to a range of elements in memory, as is common in C++.
Returns the length of the string as number of elements (not number of bytes).
Returns an object that implements Display for printing strings that may contain non-Unicode data.
A UStr might contain ill-formed UTF encoding. This struct implements the Display trait so that decoding the string is lossy but no heap allocations are performed, unlike with to_string_lossy.
By default, invalid Unicode data is replaced with U+FFFD REPLACEMENT CHARACTER (�). If you wish to simply skip any invalid Unicode data and forgo the replacement, you may use the alternate formatting with {:#}.
Examples
Basic usage:
use widestring::U16Str;
// 𝄞mus<invalid>ic<invalid>
let s = U16Str::from_slice(&[
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);
assert_eq!(format!("{}", s.display()),
"𝄞mus�ic�"
);
Using alternate formatting style to skip invalid values entirely:
use widestring::U16Str;
// 𝄞mus<invalid>ic<invalid>
let s = U16Str::from_slice(&[
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
]);
assert_eq!(format!("{:#}", s.display()),
"𝄞music"
);
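To show how a formatter can offer both behaviours without building an intermediate String, here is a sketch of a Display wrapper over a plain u16 slice using only the standard library. The type name LossyUtf16 is hypothetical, and this is not a claim about how this crate implements display internally.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};
use std::fmt::{self, Write};

/// Hypothetical wrapper that prints UTF-16 code units lossily.
pub struct LossyUtf16<'a>(pub &'a [u16]);

impl fmt::Display for LossyUtf16<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Decode one item at a time and write it straight to the formatter.
        for item in decode_utf16(self.0.iter().copied()) {
            match item {
                Ok(c) => f.write_char(c)?,
                // `{:#}` sets the alternate flag: skip invalid data.
                Err(_) if f.alternate() => {}
                // Default: substitute U+FFFD for each unpaired surrogate.
                Err(_) => f.write_char(REPLACEMENT_CHARACTER)?,
            }
        }
        Ok(())
    }
}
```

The alternate flag is read through Formatter::alternate, which is the standard way for a Display implementation to react to {:#}.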
Returns a subslice of the string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
Returns a mutable subslice of the string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
Returns an unchecked subslice of the string.
This is the unchecked alternative to indexing the string.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index;
- Indexes must be within bounds of the original slice.
Failing that, the returned string slice may reference invalid memory.
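The two preconditions can be made concrete with a small wrapper that checks them before taking the unchecked path. This sketch works on a plain u16 slice and its standard get_unchecked method; the function name sub_units is hypothetical.

```rust
/// Returns `units[start..end]`, or `None` when the range is invalid.
/// Hypothetical helper that verifies both preconditions explicitly.
pub fn sub_units(units: &[u16], start: usize, end: usize) -> Option<&[u16]> {
    if start <= end && end <= units.len() {
        // SAFETY: the starting index does not exceed the ending index, and
        // the ending index is within bounds, so the range is valid.
        Some(unsafe { units.get_unchecked(start..end) })
    } else {
        None
    }
}
```

In practice the checked get method already does this; the unchecked form is only worth using when the bounds have been established elsewhere.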
pub unsafe fn get_unchecked_mut<I>(&mut self, i: I) -> &mut Self where
I: SliceIndex<[C], Output = [C]>,
Returns a mutable, unchecked subslice of the string.
This is the unchecked alternative to indexing the string.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index;
- Indexes must be within bounds of the original slice.
Failing that, the returned string slice may reference invalid memory.
Divide one string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get mutable string slices instead, see the split_at_mut method.
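The standard slice method of the same name panics when mid is greater than the length. The sketch below assumes the string type behaves the same way (this page does not say so) and shows a non-panicking wrapper over a plain u16 slice; the name split_checked is hypothetical.

```rust
/// Splits `units` at `mid`, returning `None` instead of panicking when
/// `mid` lies past the end. Hypothetical helper for illustration.
pub fn split_checked(units: &[u16], mid: usize) -> Option<(&[u16], &[u16])> {
    if mid <= units.len() {
        // First half is [0, mid), second half is [mid, len).
        Some(units.split_at(mid))
    } else {
        None
    }
}
```

Splitting at the length itself is valid and yields an empty second half.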
Divide one mutable string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get immutable string slices instead, see the split_at method.
Decodes a string reference to an owned OsString.
This makes a string copy of the U16Str. Since U16Str makes no guarantees that it is valid UTF-16, there is no guarantee that the resulting OsString will be validly encoded either.
Note that the encoding of OsString is platform-dependent, so on some platforms this may perform an encoding conversion, while on other platforms (such as Windows) no changes to the string will be made.
Examples
use widestring::U16String;
use std::ffi::OsString;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create an OsString from the wide string
let osstr = wstr.to_os_string();
assert_eq!(osstr, OsString::from(s));
Decodes the string reference to a String if it contains valid UTF-16 data.
Failures
Returns an error if the string contains any invalid UTF-16 data.
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create a regular string from the wide string
let s2 = wstr.to_string().unwrap();
assert_eq!(s2, s);
Decodes the string reference to a String even if it is invalid UTF-16 data.
Any non-Unicode sequences are replaced with U+FFFD REPLACEMENT CHARACTER.
Examples
use widestring::U16String;
let s = "MyString";
// Create a wide string from the string
let wstr = U16String::from_str(s);
// Create a regular string from the wide string
let lossy = wstr.to_string_lossy();
assert_eq!(lossy, s);
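The examples above only exercise valid input, so the difference between the strict and the lossy conversion does not show. The sketch below contrasts the two using the standard library's String::from_utf16 and String::from_utf16_lossy on a plain u16 slice; it assumes the methods documented here follow the same error and replacement rules, and the function names are hypothetical.

```rust
use std::string::FromUtf16Error;

/// Strict decoding: fails on the first unpaired surrogate.
pub fn decode_strict(units: &[u16]) -> Result<String, FromUtf16Error> {
    String::from_utf16(units)
}

/// Lossy decoding: each unpaired surrogate becomes U+FFFD.
pub fn decode_lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}
```

For the units 0x004D, 0xD800, 0x0079 the strict form returns an error, while the lossy form returns M, U+FFFD, y.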
pub fn chars(&self) -> Utf16Chars<'_>
impl<'a> Iterator for Utf16Chars<'a> type Item = Result<char, DecodeUtf16Error>;
Returns an iterator over the chars of a string slice.
As this string slice may consist of invalid UTF-16, the iterator returned by this method is an iterator over Result<char, DecodeUtf16Error> instead of chars directly. If you would like a lossy iterator over chars directly, instead use chars_lossy.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
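An iterator of Result items lets the caller decide what to do with each decoding error. The sketch below uses std::char::decode_utf16, which yields the same kind of item, on a plain u16 slice; the function names are hypothetical, and whether this crate's error type is the standard library's DecodeUtf16Error is an assumption.

```rust
use std::char::decode_utf16;

/// Returns the first unpaired surrogate found in `units`, if any.
pub fn first_unpaired_surrogate(units: &[u16]) -> Option<u16> {
    decode_utf16(units.iter().copied())
        // Keep only errors, and extract the offending code unit.
        .find_map(|item| item.err().map(|e| e.unpaired_surrogate()))
}

/// Counts decoded items: one per scalar value or per decoding error.
pub fn item_count(units: &[u16]) -> usize {
    decode_utf16(units.iter().copied()).count()
}
```

The count also illustrates the scalar value caveat: the letter e followed by U+0301 COMBINING ACUTE ACCENT decodes to two chars although it is displayed as a single character.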
pub fn chars_lossy(&self) -> CharsLossy<'_>
impl<'a> Iterator for CharsLossy<'a> type Item = char;
Returns a lossy iterator over the chars of a string slice.
As this string slice may consist of invalid UTF-16, the iterator returned by this method will replace unpaired surrogates with U+FFFD REPLACEMENT CHARACTER (�). This is a lossy version of chars.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
pub fn char_indices(&self) -> Utf16CharIndices<'_>
impl<'a> Iterator for Utf16CharIndices<'a> type Item = (Result<char, DecodeUtf16Error>, usize);
Returns an iterator over the chars of a string slice, and their positions.
As this string slice may consist of invalid UTF-16, the iterator returned by this method yields Result<char, DecodeUtf16Error> values together with their positions, instead of chars directly. If you would like a lossy indices iterator over chars directly, instead use char_indices_lossy.
The iterator yields tuples. As the Item type shows, the char result is first and the position is second.
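The positions can be understood as offsets in code units rather than in chars. The sketch below reproduces that bookkeeping with std::char::decode_utf16 on a plain u16 slice, following the tuple order of the Item type in the signature (decoding result, then position). That the position is the index of the element where each char starts is an assumption, and the function name is hypothetical.

```rust
use std::char::decode_utf16;

/// Pairs each decoded item with the index of the code unit where it starts.
/// Errors carry the unpaired surrogate instead of the std error type.
pub fn char_indices_utf16(units: &[u16]) -> Vec<(Result<char, u16>, usize)> {
    let mut next = 0;
    decode_utf16(units.iter().copied())
        .map(|item| {
            let start = next;
            match item {
                Ok(c) => {
                    // A char occupies one or two code units.
                    next += c.len_utf16();
                    (Ok(c), start)
                }
                Err(e) => {
                    // An unpaired surrogate always occupies one code unit.
                    next += 1;
                    (Err(e.unpaired_surrogate()), start)
                }
            }
        })
        .collect()
}
```

A char outside the Basic Multilingual Plane advances the position by two, so positions are not consecutive in general.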
pub fn char_indices_lossy(&self) -> Utf16CharIndicesLossy<'_>
impl<'a> Iterator for Utf16CharIndicesLossy<'a> type Item = (char, usize);
Returns a lossy iterator over the chars of a string slice, and their positions.
As this string slice may consist of invalid UTF-16, the iterator returned by this method will replace unpaired surrogates with U+FFFD REPLACEMENT CHARACTER (�), and yields the positions of all characters as well. This is a lossy version of char_indices.
The iterator yields tuples. As the Item type shows, the char is first and the position is second.
Constructs a U32Str from a char pointer and a length.
The len argument is the number of char elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.
Safety
This function is unsafe as there is no guarantee that the given pointer is valid for len elements.
In addition, the data must meet the safety conditions of std::slice::from_raw_parts. In particular, the returned string reference must not be mutated for the duration of lifetime 'a, except inside an UnsafeCell.
Panics
This function panics if p is null.
Caveat
The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.
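Viewing char data as 32-bit elements without copying is possible because Rust guarantees that char has the same size and alignment as u32, and every char is a valid u32. The sketch below shows that reinterpretation on plain slices with the standard library; the function name chars_as_u32 is hypothetical and this is not presented as the crate's implementation.

```rust
use std::slice;

/// Views a slice of `char` as a slice of `u32` without copying.
pub fn chars_as_u32(chars: &[char]) -> &[u32] {
    let ptr = chars.as_ptr() as *const u32;
    // SAFETY: `char` and `u32` share size and alignment, every `char` bit
    // pattern is a valid `u32`, and the length is an element count that is
    // identical for both views. The borrow of `chars` covers the result.
    unsafe { slice::from_raw_parts(ptr, chars.len()) }
}
```

The reverse direction is not sound in general, since not every u32 is a valid char.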
Constructs a mutable U32Str from a mutable char pointer and a length.
The len argument is the number of char elements, not the number of bytes. No copying or allocation is performed; the resulting value is a direct reference to the pointer bytes.
Safety
This function is unsafe as there is no guarantee that the given pointer is valid for len elements.
In addition, the data must meet the safety conditions of std::slice::from_raw_parts_mut.
Panics
This function panics if p is null.
Caveat
The lifetime for the returned string is inferred from its usage. To prevent accidental misuse, it’s suggested to tie the lifetime to whichever source lifetime is safe in the context, such as by providing a helper function taking the lifetime of a host value for the string, or by explicit annotation.
Decodes a string to an owned OsString.
This makes a string copy of the U32Str. Since U32Str makes no guarantees that it is valid UTF-32, there is no guarantee that the resulting OsString will be valid data.
Note that the encoding of OsString is platform-dependent, so on some platforms this may perform an encoding conversion, while on other platforms no changes to the string will be made.
Examples
use widestring::U32String;
use std::ffi::OsString;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create an OsString from the wide string
let osstr = wstr.to_os_string();
assert_eq!(osstr, OsString::from(s));
Decodes the string to a String if it contains valid UTF-32 data.
Failures
Returns an error if the string contains any invalid UTF-32 data.
Examples
use widestring::U32String;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create a regular string from the wide string
let s2 = wstr.to_string().unwrap();
assert_eq!(s2, s);
Decodes the string reference to a String even if it is invalid UTF-32 data.
Any non-Unicode sequences are replaced with U+FFFD REPLACEMENT CHARACTER.
Examples
use widestring::U32String;
let s = "MyString";
// Create a wide string from the string
let wstr = U32String::from_str(s);
// Create a regular string from the wide string
let lossy = wstr.to_string_lossy();
assert_eq!(lossy, s);
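For UTF-32, an element is invalid when it is a surrogate value (0xD800 to 0xDFFF) or lies above 0x10FFFF. The sketch below contrasts strict and lossy decoding of a plain u32 slice using std::char::from_u32, which rejects exactly those values; the function names are hypothetical, and returning the offending element as the error is a simplification, not this crate's error type.

```rust
use std::char::{from_u32, REPLACEMENT_CHARACTER};

/// Strict decoding: stops at the first invalid element and returns it.
pub fn decode_utf32_strict(units: &[u32]) -> Result<String, u32> {
    units.iter().map(|&u| from_u32(u).ok_or(u)).collect()
}

/// Lossy decoding: each invalid element becomes U+FFFD.
pub fn decode_utf32_lossy(units: &[u32]) -> String {
    units
        .iter()
        .map(|&u| from_u32(u).unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

Unlike UTF-16, each element is judged on its own, so one invalid element never affects how its neighbours decode.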
pub fn chars(&self) -> Utf32Chars<'_>
impl<'a> Iterator for Utf32Chars<'a> type Item = Result<char, DecodeUtf32Error>;
Returns an iterator over the chars of a string slice.
As this string slice may consist of invalid UTF-32, the iterator returned by this method is an iterator over Result<char, DecodeUtf32Error> instead of chars directly. If you would like a lossy iterator over chars directly, instead use chars_lossy.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
pub fn chars_lossy(&self) -> CharsLossy<'_>
impl<'a> Iterator for CharsLossy<'a> type Item = char;
Returns a lossy iterator over the chars of a string slice.
As this string slice may consist of invalid UTF-32, the iterator returned by this method will replace surrogate values or invalid code points with U+FFFD REPLACEMENT CHARACTER (�). This is a lossy version of chars.
It’s important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. That functionality is not provided by this crate.
Trait Implementations
Performs the += operation.
Mutably borrows from an owned value.
Extends a collection with the contents of an iterator.
Extends a collection with exactly one element (extend_one).
Reserves capacity in a collection for the given number of additional elements (extend_one).
Performs the conversion.
Performs the conversion.
Creates a value from an iterator.
This method returns an ordering between self and other values if one exists.
This method tests less than (for self and other) and is used by the < operator.
This method tests less than or equal to (for self and other) and is used by the <= operator.
This method tests greater than (for self and other) and is used by the > operator.