Struct widestring::utfstr::Utf32Str
pub struct Utf32Str { /* fields omitted */ }
UTF-32 string slice for Utf32String.
Utf32Str is to Utf32String as str is to String.
Utf32Str slices are string slices that are always valid UTF-32 encoding. This is unlike the U32Str string slices, which may not have valid encoding. In this way, Utf32Str string slices most resemble native str slices of all the types in this crate.
Examples
The easiest way to use Utf32Str is with the utf32str! macro to convert string literals into string slices at compile time:
use widestring::utf32str;
let hello = utf32str!("Hello, world!");
You can also convert a u32 slice directly, provided it is valid UTF-32:
use widestring::Utf32Str;
let sparkle_heart = [0x1f496];
let sparkle_heart = Utf32Str::from_slice(&sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);
Since char slices are valid UTF-32, a slice of chars can be easily converted to a string slice:
use widestring::Utf32Str;
let sparkle_heart = ['💖'; 3];
let sparkle_heart = Utf32Str::from_char_slice(&sparkle_heart);
assert_eq!("💖💖💖", sparkle_heart);
Implementations
Converts a slice to a string slice without checking that the string contains valid UTF-32.
See the safe version, from_slice, for more information.
Safety
This function is unsafe because it does not check that the slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Examples
use widestring::Utf32Str;
let sparkle_heart = vec![0x1f496];
let sparkle_heart = unsafe { Utf32Str::from_slice_unchecked(&sparkle_heart) };
assert_eq!("💖", sparkle_heart);
Converts a mutable slice to a mutable string slice without checking that the string contains valid UTF-32.
See the safe version, from_slice_mut, for more information.
Safety
This function is unsafe because it does not check that the slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Examples
use widestring::Utf32Str;
let mut sparkle_heart = vec![0x1f496];
let sparkle_heart = unsafe { Utf32Str::from_slice_unchecked_mut(&mut sparkle_heart) };
assert_eq!("💖", sparkle_heart);
pub unsafe fn get_unchecked<I>(&self, index: I) -> &Self where
I: SliceIndex<[u32], Output = [u32]>,
Returns an unchecked subslice of this string slice.
This is the unchecked alternative to indexing the string slice.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index.
- Indexes must be within the bounds of the original slice.
Failing that, the returned string slice may reference invalid memory or violate the invariants communicated by the type.
Examples
use widestring::utf32str;
let v = utf32str!("⚧️🏳️⚧️➡️s");
unsafe {
assert_eq!(utf32str!("⚧️"), v.get_unchecked(..2));
assert_eq!(utf32str!("🏳️⚧️"), v.get_unchecked(2..7));
assert_eq!(utf32str!("➡️"), v.get_unchecked(7..9));
assert_eq!(utf32str!("s"), v.get_unchecked(9..))
}
pub unsafe fn get_unchecked_mut<I>(&mut self, index: I) -> &mut Self where
I: SliceIndex<[u32], Output = [u32]>,
Returns a mutable, unchecked subslice of this string slice.
This is the unchecked alternative to indexing the string slice.
Safety
Callers of this function are responsible for ensuring that these preconditions are satisfied:
- The starting index must not exceed the ending index.
- Indexes must be within the bounds of the original slice.
Failing that, the returned string slice may reference invalid memory or violate the invariants communicated by the type.
Examples
use widestring::utf32str;
let mut v = utf32str!("⚧️🏳️⚧️➡️s").to_owned();
unsafe {
assert_eq!(utf32str!("⚧️"), v.get_unchecked_mut(..2));
assert_eq!(utf32str!("🏳️⚧️"), v.get_unchecked_mut(2..7));
assert_eq!(utf32str!("➡️"), v.get_unchecked_mut(7..9));
assert_eq!(utf32str!("s"), v.get_unchecked_mut(9..))
}
Returns the length of self.
The length is the number of chars in the slice, not the number of graphemes. In other words, it may not be what a human considers the length of the string.
Examples
use widestring::utf32str;
assert_eq!(utf32str!("foo").len(), 3);
let complex = utf32str!("⚧️🏳️⚧️➡️s");
assert_eq!(complex.len(), 10);
assert_eq!(complex.chars().count(), 10);
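To see why a string that renders as a handful of symbols can have a length of 10, the sketch below uses only the standard library rather than this crate. It assumes the example string is made of the code points written out as escapes here; `utf32_len` and `COMPLEX` are hypothetical names introduced for illustration.

```rust
/// Hypothetical helper (not part of widestring): the number of UTF-32
/// elements needed to store `s`, which is one element per `char`.
fn utf32_len(s: &str) -> usize {
    s.chars().count()
}

/// The example string spelled with explicit escapes, so that the invisible
/// code points (variation selectors and the zero-width joiner) are visible.
/// Assumption: this matches the string used in the example above.
const COMPLEX: &str = concat!(
    "\u{26A7}\u{FE0F}",                          // symbol + variation selector: 2
    "\u{1F3F3}\u{FE0F}\u{200D}\u{26A7}\u{FE0F}", // flag, VS16, ZWJ, symbol, VS16: 5
    "\u{27A1}\u{FE0F}",                          // arrow + variation selector: 2
    "s"                                          // plain ASCII letter: 1
);
```

Each escape is one `char`, so the total is 2 + 5 + 2 + 1 elements, even though a reader would likely count far fewer visible symbols.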
Converts a string to a slice of its underlying elements.
To convert the slice back into a string slice, use the from_slice function.
Converts a mutable string to a mutable slice of its underlying elements.
Safety
This function is unsafe because you can violate the invariants of this type when mutating the slice. The caller must ensure that the contents of the slice are valid UTF-32 before the borrow ends and the underlying string is used.
Using a string whose contents have been mutated to invalid UTF-32 is undefined behavior.
Converts a string slice to a raw pointer.
The pointer points to the first element of the string slice.
The caller must ensure that the returned pointer is never written to. If you need to mutate the contents of the string slice, use as_mut_ptr.
Converts a mutable string slice to a mutable pointer.
The pointer points to the first element of the string slice.
Safety
This function is unsafe because you can violate the invariants of this type when mutating the slice. The caller must ensure that the contents of the slice are valid UTF-32 before the borrow ends and the underlying string is used.
Using a string whose contents have been mutated to invalid UTF-32 is undefined behavior.
Returns this string as a wide string slice of undefined encoding.
Returns a string slice with leading and trailing whitespace removed.
‘Whitespace’ is defined according to the terms of the Unicode Derived Core Property White_Space.
Returns a string slice with leading whitespace removed.
‘Whitespace’ is defined according to the terms of the Unicode Derived Core Property White_Space.
Text directionality
A string is a sequence of elements. start in this context means the first position of that sequence; for a left-to-right language like English or Russian, this will be the left side, and for right-to-left languages like Arabic or Hebrew, this will be the right side.
Returns a string slice with trailing whitespace removed.
‘Whitespace’ is defined according to the terms of the Unicode Derived Core Property White_Space.
Text directionality
A string is a sequence of elements. end in this context means the last position of that sequence; for a left-to-right language like English or Russian, this will be the right side, and for right-to-left languages like Arabic or Hebrew, this will be the left side.
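The point about directionality is that trimming works on storage order, not display order. A minimal sketch of that idea over plain `char` slices, using only the standard library, might look like the following. The helper names are hypothetical and this is not the crate's implementation.

```rust
/// Hypothetical helper: drop whitespace from the start (lowest indices)
/// of a `char` slice, regardless of how the text is displayed.
fn trim_start_chars(s: &[char]) -> &[char] {
    // Index of the first non-whitespace element, or the length if none.
    let start = s
        .iter()
        .position(|c| !c.is_whitespace())
        .unwrap_or(s.len());
    &s[start..]
}

/// Hypothetical helper: drop whitespace from the end (highest indices).
fn trim_end_chars(s: &[char]) -> &[char] {
    // One past the index of the last non-whitespace element, or 0 if none.
    let end = s
        .iter()
        .rposition(|c| !c.is_whitespace())
        .map_or(0, |i| i + 1);
    &s[..end]
}
```

`char::is_whitespace` follows the Unicode `White_Space` property, which is the same definition the methods above refer to.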
Converts a boxed string into a boxed slice without copying or allocating.
Converts a boxed string slice into an owned UTF string without copying or allocating.
Creates a new owned string by repeating this string n times.
Panics
This function will panic if the capacity would overflow.
Converts a slice of UTF-32 data to a string slice.
Not all slices of u32 values are valid to convert, since Utf32Str requires that it is always valid UTF-32. This function checks to ensure that the values are valid UTF-32, and then does the conversion.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_slice_unchecked, which has the same behavior but skips the check.
If you need an owned string, consider using Utf32String::from_vec instead.
Because you can stack-allocate a [u32; N], this function is one way to have a stack-allocated string. Indeed, the utf32str! macro does exactly this after converting from UTF-8 to UTF-32.
Errors
Returns an error if the slice is not valid UTF-32, with a description of why it is not.
Examples
use widestring::Utf32Str;
let sparkle_heart = vec![0x1f496];
let sparkle_heart = Utf32Str::from_slice(&sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);
With incorrect values that return an error:
use widestring::Utf32Str;
let sparkle_heart = vec![0xd83d, 0xdc96]; // UTF-16 surrogates are invalid
assert!(Utf32Str::from_slice(&sparkle_heart).is_err());
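For intuition about what the validity check has to establish, here is a rough standard-library sketch. A `u32` is a valid UTF-32 element exactly when it is a Unicode scalar value, which is what `char::from_u32` tests. The function name `first_invalid_utf32` is hypothetical, and the crate's own check and error type may differ.

```rust
/// Hypothetical helper: index of the first element that is not a Unicode
/// scalar value, or `None` if the whole slice is valid UTF-32.
fn first_invalid_utf32(units: &[u32]) -> Option<usize> {
    // `char::from_u32` returns `None` for surrogates (0xD800..=0xDFFF)
    // and for values above 0x10FFFF.
    units.iter().position(|&u| char::from_u32(u).is_none())
}
```

Under this rule the UTF-16 surrogate pair from the failing example is rejected at its first element, while the single code point `0x1f496` passes.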
Converts a mutable slice of UTF-32 data to a mutable string slice.
Not all slices of u32 values are valid to convert, since Utf32Str requires that it is always valid UTF-32. This function checks to ensure that the values are valid UTF-32, and then does the conversion.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_slice_unchecked_mut, which has the same behavior but skips the check.
If you need an owned string, consider using Utf32String::from_vec instead.
Because you can stack-allocate a [u32; N], this function is one way to have a stack-allocated string. Indeed, the utf32str! macro does exactly this after converting from UTF-8 to UTF-32.
Errors
Returns an error if the slice is not valid UTF-32, with a description of why it is not.
Examples
use widestring::Utf32Str;
let mut sparkle_heart = vec![0x1f496];
let sparkle_heart = Utf32Str::from_slice_mut(&mut sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);
With incorrect values that return an error:
use widestring::Utf32Str;
let mut sparkle_heart = vec![0xd83d, 0xdc96]; // UTF-16 surrogates are invalid
assert!(Utf32Str::from_slice_mut(&mut sparkle_heart).is_err());
Converts a wide string slice of undefined encoding to a UTF-32 string slice without checking if the string slice is valid UTF-32.
See the safe version, from_ustr, for more information.
Safety
This function is unsafe because it does not check that the string slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Examples
use widestring::{Utf32Str, u32str};
let sparkle_heart = u32str!("💖");
let sparkle_heart = unsafe { Utf32Str::from_ustr_unchecked(sparkle_heart) };
assert_eq!("💖", sparkle_heart);
Converts a mutable wide string slice of undefined encoding to a mutable UTF-32 string slice without checking if the string slice is valid UTF-32.
See the safe version, from_ustr_mut, for more information.
Safety
This function is unsafe because it does not check that the string slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Converts a wide string slice of undefined encoding to a UTF-32 string slice.
Since U32Str does not have a specified encoding, this conversion may fail if the U32Str does not contain valid UTF-32 data.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_ustr_unchecked, which has the same behavior but skips the check.
Errors
Returns an error if the string slice is not valid UTF-32, with a description of why it is not.
Examples
use widestring::{Utf32Str, u32str};
let sparkle_heart = u32str!("💖");
let sparkle_heart = Utf32Str::from_ustr(sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);
Converts a mutable wide string slice of undefined encoding to a mutable UTF-32 string slice.
Since U32Str does not have a specified encoding, this conversion may fail if the U32Str does not contain valid UTF-32 data.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_ustr_unchecked_mut, which has the same behavior but skips the check.
Errors
Returns an error if the string slice is not valid UTF-32, with a description of why it is not.
Converts a wide C string slice to a UTF-32 string slice without checking if the string slice is valid UTF-32.
The resulting string slice does not contain the nul terminator.
See the safe version, from_ucstr, for more information.
Safety
This function is unsafe because it does not check that the string slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Examples
use widestring::{Utf32Str, u32cstr};
let sparkle_heart = u32cstr!("💖");
let sparkle_heart = unsafe { Utf32Str::from_ucstr_unchecked(sparkle_heart) };
assert_eq!("💖", sparkle_heart);
Converts a mutable wide C string slice to a mutable UTF-32 string slice without checking if the string slice is valid UTF-32.
The resulting string slice does not contain the nul terminator.
See the safe version, from_ucstr_mut, for more information.
Safety
This function is unsafe because it does not check that the string slice passed to it is valid UTF-32. If this constraint is violated, undefined behavior results, as it is assumed the Utf32Str is always valid UTF-32.
Converts a wide C string slice to a UTF-32 string slice.
The resulting string slice does not contain the nul terminator.
Since U32CStr does not have a specified encoding, this conversion may fail if the U32CStr does not contain valid UTF-32 data.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_ucstr_unchecked, which has the same behavior but skips the check.
Errors
Returns an error if the string slice is not valid UTF-32, with a description of why it is not.
Examples
use widestring::{Utf32Str, u32cstr};
let sparkle_heart = u32cstr!("💖");
let sparkle_heart = Utf32Str::from_ucstr(sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);
Converts a mutable wide C string slice to a mutable UTF-32 string slice.
The resulting string slice does not contain the nul terminator.
Since U32CStr does not have a specified encoding, this conversion may fail if the U32CStr does not contain valid UTF-32 data.
If you are sure that the slice is valid UTF-32, and you don’t want to incur the overhead of the validity check, there is an unsafe version of this function, from_ucstr_unchecked_mut, which has the same behavior but skips the check.
Safety
This method is unsafe because you can violate the invariants of U32CStr when mutating the slice (e.g. by adding interior nul values).
Errors
Returns an error if the string slice is not valid UTF-32, with a description of why it is not.
Converts a slice of chars to a string slice.
Since char slices are always valid UTF-32, this conversion always succeeds.
If you need an owned string, consider using Utf32String::from_chars instead.
Examples
use widestring::Utf32Str;
let sparkle_heart = ['💖'];
let sparkle_heart = Utf32Str::from_char_slice(&sparkle_heart);
assert_eq!("💖", sparkle_heart);
Converts a mutable slice of chars to a mutable string slice.
Since char slices are always valid UTF-32, this conversion always succeeds.
If you need an owned string, consider using Utf32String::from_chars instead.
Examples
use widestring::Utf32Str;
let mut sparkle_heart = ['💖'];
let sparkle_heart = Utf32Str::from_char_slice_mut(&mut sparkle_heart);
assert_eq!("💖", sparkle_heart);
Converts a string slice into a slice of chars.
Converts a mutable string slice into a mutable slice of chars.
Converts to a standard UTF-8 String.
Because this string is always valid UTF-32, the conversion is lossless and infallible.
Returns a subslice of this string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
Examples
use widestring::utf32str;
let v = utf32str!("⚧️🏳️⚧️➡️s");
assert_eq!(Some(utf32str!("⚧️")), v.get(..2));
assert_eq!(Some(utf32str!("🏳️⚧️")), v.get(2..7));
assert_eq!(Some(utf32str!("➡️")), v.get(7..9));
assert_eq!(Some(utf32str!("s")), v.get(9..));
Returns a mutable subslice of this string.
This is the non-panicking alternative to indexing the string. Returns None whenever the equivalent indexing operation would panic.
Examples
use widestring::utf32str;
let mut v = utf32str!("⚧️🏳️⚧️➡️s").to_owned();
assert_eq!(utf32str!("⚧️"), v.get_mut(..2).unwrap());
assert_eq!(utf32str!("🏳️⚧️"), v.get_mut(2..7).unwrap());
assert_eq!(utf32str!("➡️"), v.get_mut(7..9).unwrap());
assert_eq!(utf32str!("s"), v.get_mut(9..).unwrap());
Divide one string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get mutable string slices instead, see the split_at_mut method.
Panics
Panics if mid is past the end of the last code point of the string slice.
Examples
use widestring::utf32str;
let s = utf32str!("Per Martin-Löf");
let (first, last) = s.split_at(3);
assert_eq!("Per", first);
assert_eq!(" Martin-Löf", last);
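Because every element of a UTF-32 string is a whole code point, any `mid` up to the length is a valid split point. The standard-library sketch below imitates this with a `Vec<char>`, for contrast with `str::split_at`, where offsets are bytes and can fall inside a code point. `split_chars_at` is a hypothetical helper, not a crate function.

```rust
/// Hypothetical helper: split `s` after `mid` code points by going through
/// a `Vec<char>`, which is laid out like UTF-32 (one element per code point).
/// Panics if `mid` is greater than the number of code points.
fn split_chars_at(s: &str, mid: usize) -> (String, String) {
    let chars: Vec<char> = s.chars().collect();
    // Slice `split_at` panics when `mid > chars.len()`.
    let (first, last) = chars.split_at(mid);
    (first.iter().collect(), last.iter().collect())
}
```

For "Per Martin-Löf", offset 13 is a valid split point when counting code points, while byte offset 13 of the UTF-8 form falls in the middle of the two-byte "ö".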
Divide one mutable string slice into two at an index.
The argument, mid, should be an offset from the start of the string.
The two slices returned go from the start of the string slice to mid, and from mid to the end of the string slice.
To get immutable string slices instead, see the split_at method.
Panics
Panics if mid is past the end of the last code point of the string slice.
Examples
use widestring::utf32str;
let mut s = utf32str!("Per Martin-Löf").to_owned();
let (first, last) = s.split_at_mut(3);
assert_eq!("Per", first);
assert_eq!(" Martin-Löf", last);
pub fn chars(&self) -> CharsUtf32<'_>
Returns an iterator over the chars of a string slice.
As this string slice consists of valid UTF-32, we can iterate through a string slice by char. This method returns such an iterator.
It’s important to remember that char represents a Unicode Scalar Value, and might not match your idea of what a ‘character’ is. Iteration over grapheme clusters may be what you actually want. This functionality is not provided by this crate.
pub fn char_indices(&self) -> CharIndicesUtf32<'_>
Returns an iterator over the chars of a string slice and their positions.
As this string slice consists of valid UTF-32, we can iterate through a string slice by char. This method returns an iterator of both these chars as well as their offsets.
The iterator yields tuples. The position is first, the char is second.
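As a rough illustration, and assuming the offsets are element indices into the UTF-32 data, the positions would simply count up by one per `char`. The standard-library sketch below shows that shape; `utf32_char_indices` is a hypothetical stand-in, not the crate's iterator.

```rust
/// Hypothetical stand-in: pairs each `char` with its index counted in
/// code points, which is its element offset in a UTF-32 encoding.
fn utf32_char_indices(s: &str) -> Vec<(usize, char)> {
    s.chars().enumerate().collect()
}
```

This differs from `str::char_indices`, whose offsets are UTF-8 byte positions and therefore skip ahead after multi-byte characters.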
pub fn encode_utf8(&self) -> EncodeUtf8<CharsUtf32<'_>>
Returns an iterator of bytes over the string encoded as UTF-8.
pub fn encode_utf16(&self) -> EncodeUtf16<CharsUtf32<'_>>
Returns an iterator of u16 over the string encoded as UTF-16.
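To make the UTF-32 to UTF-16 relationship concrete, the sketch below re-encodes a slice of `char`s with the standard library's `char::encode_utf16`. The function `utf32_to_utf16` is a hypothetical helper and collects into a `Vec` instead of returning a lazy iterator as the method above does.

```rust
/// Hypothetical helper: re-encode UTF-32 data (a slice of `char`s) as UTF-16.
fn utf32_to_utf16(chars: &[char]) -> Vec<u16> {
    // A single code point needs at most two UTF-16 code units.
    let mut buf = [0u16; 2];
    let mut out = Vec::new();
    for c in chars {
        // `encode_utf16` writes into `buf` and returns the part it used.
        out.extend_from_slice(c.encode_utf16(&mut buf));
    }
    out
}
```

Code points above U+FFFF become a surrogate pair, so U+1F496 turns into `0xd83d, 0xdc96`, the same two values that are rejected as invalid UTF-32 in the `from_slice` example.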
pub fn escape_debug(&self) -> EscapeDebug<CharsUtf32<'_>>
Returns an iterator that escapes each char in self with char::escape_debug.
pub fn escape_default(&self) -> EscapeDefault<CharsUtf32<'_>>
Returns an iterator that escapes each char in self with char::escape_default.
pub fn escape_unicode(&self) -> EscapeUnicode<CharsUtf32<'_>>
Returns an iterator that escapes each char in self with char::escape_unicode.
Returns the lowercase equivalent of this string slice, as a new Utf32String.
‘Lowercase’ is defined according to the terms of the Unicode Derived Core Property Lowercase.
Since some characters can expand into multiple characters when changing the case, this function returns a Utf32String instead of modifying the parameter in-place.
Returns the uppercase equivalent of this string slice, as a new Utf32String.
‘Uppercase’ is defined according to the terms of the Unicode Derived Core Property Uppercase.
Since some characters can expand into multiple characters when changing the case, this function returns a Utf32String instead of modifying the parameter in-place.
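The expansion mentioned above is why an in-place conversion cannot work in general. A small standard-library sketch over `char` slices shows the length changing; `to_uppercase_chars` is a hypothetical helper, and the crate's own conversion may be implemented differently.

```rust
/// Hypothetical helper: uppercase each `char` and collect the results.
/// `char::to_uppercase` returns an iterator because one `char` may map
/// to several.
fn to_uppercase_chars(s: &[char]) -> Vec<char> {
    s.iter().flat_map(|c| c.to_uppercase()).collect()
}
```

For example, the six code points of "straße" become the seven code points of "STRASSE", because 'ß' uppercases to two characters.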
Trait Implementations
Performs the += operation.
Performs the += operation.
Mutably borrows from an owned value.
Extends a collection with the contents of an iterator.
(extend_one) Extends a collection with exactly one element.
(extend_one) Reserves capacity in a collection for the given number of additional elements.
Extends a collection with the contents of an iterator.
(extend_one) Extends a collection with exactly one element.
(extend_one) Reserves capacity in a collection for the given number of additional elements.
Creates a value from an iterator.
Creates a value from an iterator.
This method returns an ordering between self and other values if one exists.
This method tests less than (for self and other) and is used by the < operator.
This method tests less than or equal to (for self and other) and is used by the <= operator.
This method tests greater than (for self and other) and is used by the > operator.
type Owned = Utf32String
The resulting type after obtaining ownership.
Creates owned data from borrowed data, usually by cloning.
🔬 This is a nightly-only experimental API (toowned_clone_into).
Uses borrowed data to replace owned data, usually by cloning.