Struct icu::uniset::UnicodeSet
pub struct UnicodeSet { /* fields omitted */ }
A membership wrapper for UnicodeSet.
Provides membership functions, along with constructors from serialized UnicodeSets and predefined ranges.
Implementations
Returns a UnicodeSet built from an inversion list, represented as a Vec<u32> of code points.
The inversion list must have even length, be sorted in ascending order with non-overlapping ranges, and cover only code points within 0x0 -> 0x10FFFF inclusive. The end point of each range is exclusive.
Examples
use icu::uniset::UnicodeSet;
use icu::uniset::UnicodeSetError;
// An odd-length list cannot pair every start point with an end point.
let invalid: Vec<u32> = vec![0x0, 0x80, 0x3];
let result = UnicodeSet::from_inversion_list(invalid.clone());
assert!(matches!(result, Err(UnicodeSetError::InvalidSet(_))));
if let Err(UnicodeSetError::InvalidSet(actual)) = result {
    assert_eq!(invalid, actual);
}
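The constraints on the inversion list can be sketched as a standalone check. This is only an illustration, not the crate's validation code: `is_valid_inversion_list` is a hypothetical helper, and it assumes that boundaries must be strictly ascending and that an empty list is acceptable.

```rust
/// Hypothetical helper (not part of `icu::uniset`): checks the documented
/// constraints on a candidate inversion list.
fn is_valid_inversion_list(list: &[u32]) -> bool {
    // Exclusive upper bound: one past the last code point, 0x10FFFF.
    const EXCLUSIVE_MAX: u32 = 0x10FFFF + 1;

    // Start and end points come in pairs, so the length must be even.
    if list.len() % 2 != 0 {
        return false;
    }
    // Assumption: strictly ascending boundaries, which rules out empty
    // and overlapping ranges.
    if !list.windows(2).all(|pair| pair[0] < pair[1]) {
        return false;
    }
    // The list is sorted, so only the last (exclusive) end point needs
    // a bounds check. An empty list passes.
    list.last().map_or(true, |&end| end <= EXCLUSIVE_MAX)
}
```

Under these assumptions, `[0x0, 0x80, 0x3]` is rejected because of its odd length, while `[0x41, 0x44, 0x45, 0x46]` is accepted.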
Returns an owned inversion list representing the current UnicodeSet.
Returns a UnicodeSet spanning the entire Unicode range.
The range spans from 0x0 -> 0x10FFFF inclusive.
Returns a UnicodeSet spanning the BMP range.
The range spans from 0x0 -> 0xFFFF inclusive.
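Because end points in an inversion list are exclusive, an inclusive bound such as 0x10FFFF is stored as the next value up. The sketch below shows the lists these two predefined ranges would correspond to. Both function names are hypothetical and are not part of the crate.

```rust
/// Hypothetical: the inversion list covering 0x0 -> 0x10FFFF inclusive.
fn all_inversion_list() -> Vec<u32> {
    // The end point is exclusive, hence the `+ 1`.
    vec![0x0, 0x10FFFF + 1]
}

/// Hypothetical: the inversion list covering the BMP, 0x0 -> 0xFFFF inclusive.
fn bmp_inversion_list() -> Vec<u32> {
    vec![0x0, 0xFFFF + 1]
}
```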
Yields an Iterator over the characters in the UnicodeSet.
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44, 0x45, 0x46];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let mut ex_iter_chars = example.iter_chars();
assert_eq!(Some('A'), ex_iter_chars.next());
assert_eq!(Some('B'), ex_iter_chars.next());
assert_eq!(Some('C'), ex_iter_chars.next());
assert_eq!(Some('E'), ex_iter_chars.next());
assert_eq!(None, ex_iter_chars.next());
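As a rough illustration of how characters can be enumerated from an inversion list, the sketch below expands each start/end pair into its code points. `chars_of` is a hypothetical helper, and skipping values that are not valid `char`s is an assumption of this sketch, not a statement about the crate's iterator.

```rust
/// Hypothetical helper: expands a valid inversion list into its characters.
fn chars_of(inv_list: &[u32]) -> Vec<char> {
    inv_list
        .chunks_exact(2)
        // Each pair is an inclusive start and an exclusive end.
        .flat_map(|pair| pair[0]..pair[1])
        // Assumption: values that are not Unicode scalar values are skipped.
        .filter_map(char::from_u32)
        .collect()
}
```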
Yields an Iterator returning the ranges of the code points that are included in the UnicodeSet.
Ranges are returned as RangeInclusive, which is inclusive of its end bound value. This end-inclusive behavior matches the ICU4C/J behavior of ranges, e.g. UnicodeSet::contains(UChar32 start, UChar32 end).
Example
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44, 0x45, 0x46];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let mut example_iter_ranges = example.iter_ranges();
assert_eq!(Some(0x41..=0x43), example_iter_ranges.next());
assert_eq!(Some(0x45..=0x45), example_iter_ranges.next());
assert_eq!(None, example_iter_ranges.next());
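The conversion from exclusive end points to end-inclusive ranges amounts to subtracting one from each end point. A minimal sketch, assuming a valid inversion list; `inclusive_ranges` is a hypothetical name:

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: turns each (start, exclusive end) pair of a valid
/// inversion list into an end-inclusive range.
fn inclusive_ranges(inv_list: &[u32]) -> Vec<RangeInclusive<u32>> {
    inv_list
        .chunks_exact(2)
        // In a valid list end > start >= 0, so `end - 1` cannot underflow.
        .map(|pair| pair[0]..=(pair[1] - 1))
        .collect()
}
```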
Returns the number of ranges contained in this UnicodeSet.
Returns the number of elements of the UnicodeSet.
Returns whether or not the UnicodeSet is empty.
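For intuition, all three quantities can be derived from the inversion list alone. The helpers below are hypothetical and only sketch one way to compute them. Whether the crate counts elements this way is an assumption.

```rust
/// Hypothetical: every range contributes one start and one end point.
fn range_count(inv_list: &[u32]) -> usize {
    inv_list.len() / 2
}

/// Hypothetical: sums `end - start` over all ranges (ends are exclusive).
fn element_count(inv_list: &[u32]) -> usize {
    inv_list
        .chunks_exact(2)
        .map(|pair| (pair[1] - pair[0]) as usize)
        .sum()
}

/// Hypothetical: a valid list with no boundaries holds no elements.
fn has_no_elements(inv_list: &[u32]) -> bool {
    inv_list.is_empty()
}
```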
Checks whether the query character is in the UnicodeSet.
Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the std implementation.
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains('A'));
assert!(!example.contains('C'));
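The binary search works because the parity of a position in an inversion list tells whether it opens or closes a range. The following is a minimal sketch built on `slice::binary_search` from std. `contains_code_point` is a hypothetical helper and not the crate's internal code.

```rust
/// Hypothetical helper: membership test on a valid inversion list.
fn contains_code_point(inv_list: &[u32], query: u32) -> bool {
    match inv_list.binary_search(&query) {
        // Exact hit on a boundary: even indices are inclusive starts,
        // odd indices are exclusive ends.
        Ok(index) => index % 2 == 0,
        // Otherwise `index` is the insertion point. An odd insertion point
        // means the query lies after a start and before its matching end.
        Err(index) => index % 2 == 1,
    }
}
```

With the list `[0x41, 0x43, 0x44, 0x45]`, the query 0x42 has insertion point 1 and is a member, while 0x43 hits index 1 exactly, an exclusive end, and is not.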
Checks whether the unsigned integer is in the UnicodeSet.
Note: Even though u32 and char in Rust are both non-negative 4-byte values, there is an important difference. A u32 can take any value up to u32::MAX (0xFFFFFFFF), while a char in Rust is restricted to Unicode Scalar Values: 0x0 to 0x10FFFF, excluding the surrogate range 0xD800 to 0xDFFF.
Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the std implementation.
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains_u32(0x41));
assert!(!example.contains_u32(0x43));
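The difference between u32 and char noted above can be observed with `char::from_u32` from std, which returns `None` for any value that is not a Unicode Scalar Value. The wrapper name below is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper: true if the value can be represented as a `char`.
fn is_unicode_scalar_value(code_point: u32) -> bool {
    // `char::from_u32` rejects surrogates (0xD800 -> 0xDFFF) and anything
    // above 0x10FFFF.
    char::from_u32(code_point).is_some()
}
```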
Checks whether the whole range is in the UnicodeSet and returns a bool.
Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the Vec implementation. The search runs only once, on the start parameter; the end parameter is then checked in a single O(1) step.
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains_range(&('A'..'C')));
assert!(example.contains_range(&('A'..='B')));
assert!(!example.contains_range(&('A'..='C')));
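The "one search plus one O(1) check" idea can be sketched as follows: locate the range that holds the start point, then compare the requested end against that range's own end. `contains_half_open_range` is a hypothetical helper that works on raw u32 bounds with an exclusive end. Rejecting empty ranges is an assumption of this sketch.

```rust
/// Hypothetical helper: is `start..end` (end exclusive) fully inside one
/// range of a valid inversion list?
fn contains_half_open_range(inv_list: &[u32], start: u32, end: u32) -> bool {
    // Assumption: an empty or reversed range is never contained.
    if start >= end {
        return false;
    }
    // Single binary search: find the index of the range start that
    // covers `start`, if any.
    let start_index = match inv_list.binary_search(&start) {
        Ok(i) if i % 2 == 0 => i,
        Err(i) if i % 2 == 1 => i - 1,
        _ => return false,
    };
    // O(1) step: the requested end must not pass this range's exclusive end.
    end <= inv_list[start_index + 1]
}
```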
Surrogate points (0xD800 -> 0xDFFF) will return false if the Range contains them but the UnicodeSet does not.
Note: when comparing to ICU4C/J, keep in mind that Ranges in Rust are constructed inclusive of the start boundary and exclusive of the end boundary. The ICU4C/J UnicodeSet::contains(UChar32 start, UChar32 end) method differs by including the end boundary.
Examples
use icu::uniset::UnicodeSet;
use std::char;
// The queried range crosses the surrogate block, which the set omits.
let check = char::from_u32(0xD7FE).unwrap()..char::from_u32(0xE001).unwrap();
let example_list = vec![0xD7FE, 0xD7FF, 0xE000, 0xE001];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(!example.contains_range(&(check)));
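To make the boundary conventions concrete, the sketch below normalizes any std range of chars into a half-open pair of code points, using the `RangeBounds` trait. `to_half_open` is a hypothetical helper. Treating an unbounded end as 0x10FFFF + 1 is an assumption of this sketch.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical helper: converts a char range into (inclusive start,
/// exclusive end) code points.
fn to_half_open(range: &impl RangeBounds<char>) -> (u32, u32) {
    let start = match range.start_bound() {
        Bound::Included(&c) => c as u32,
        Bound::Excluded(&c) => c as u32 + 1,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        // An inclusive end (the ICU4C/J convention) becomes exclusive.
        Bound::Included(&c) => c as u32 + 1,
        Bound::Excluded(&c) => c as u32,
        // Assumption: unbounded means up to the last code point.
        Bound::Unbounded => char::MAX as u32 + 1,
    };
    (start, end)
}
```

Both `'A'..'C'` and `'A'..='B'` normalize to the same pair, which is why the two forms behave identically in the `contains_range` examples.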
Checks whether the calling UnicodeSet contains all the characters of the given UnicodeSet.
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x46, 0x55, 0x5B]; // A - E, U - Z
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let a_to_d = UnicodeSet::from_inversion_list(vec![0x41, 0x45]).unwrap();
let f_to_t = UnicodeSet::from_inversion_list(vec![0x46, 0x55]).unwrap();
let r_to_x = UnicodeSet::from_inversion_list(vec![0x52, 0x58]).unwrap();
assert!(example.contains_set(&a_to_d)); // contains all
assert!(!example.contains_set(&f_to_t)); // contains none
assert!(!example.contains_set(&r_to_x)); // contains some
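A subset test on inversion lists reduces to checking that every range of the inner set fits inside a single range of the outer set. The sketch below uses a plain nested scan for readability, so it is not tuned for speed and is not claimed to be how the crate does it. `contains_all_ranges` is a hypothetical name. The sketch assumes both lists are valid, with strictly ascending boundaries so that no two ranges touch.

```rust
/// Hypothetical helper: does `outer` contain every element of `inner`?
fn contains_all_ranges(outer: &[u32], inner: &[u32]) -> bool {
    inner.chunks_exact(2).all(|inner_range| {
        // Some outer range must start at or before the inner start and
        // end at or after the inner (exclusive) end.
        outer.chunks_exact(2).any(|outer_range| {
            outer_range[0] <= inner_range[0] && inner_range[1] <= outer_range[1]
        })
    })
}
```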
Returns the end of the initial substring in which every character is contained in the set (when the boolean argument is true) or not contained in the set (when it is false).
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44]; // {A, B, C}
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert_eq!(example.span("CABXYZ", true), 3);
assert_eq!(example.span("XYZC", false), 3);
assert_eq!(example.span("XYZ", true), 0);
assert_eq!(example.span("ABC", false), 0);
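The behavior shown in the example can be reproduced with a prefix count over the string's characters. In this sketch the set is stood in for by a membership closure. `span_prefix` and its `in_set` parameter are hypothetical names, and counting in characters instead of bytes is an assumption that matches the example values.

```rust
/// Hypothetical helper: number of leading characters whose membership
/// equals `contained`.
fn span_prefix(text: &str, contained: bool, in_set: impl Fn(char) -> bool) -> usize {
    text.chars()
        // Stop at the first character whose membership differs.
        .take_while(|&c| in_set(c) == contained)
        .count()
}
```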
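Returns the start of the trailing substring (scanning backward from the end of the string) in which every character is contained in the set (when the boolean argument is true) or not contained in the set (when it is false). Returns the length of the string if the last character already fails the condition.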
Examples
use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44]; // {A, B, C}
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert_eq!(example.span_back("XYZCAB", true), 3);
assert_eq!(example.span_back("ABCXYZ", true), 6);
assert_eq!(example.span_back("CABXYZ", false), 3);
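The backward variant can be sketched the same way: count matching characters from the end and subtract from the total. As before, `span_suffix_start` and `in_set` are hypothetical names, the closure stands in for the set, and positions are counted in characters.

```rust
/// Hypothetical helper: character index at which the trailing run of
/// characters with membership equal to `contained` begins.
fn span_suffix_start(text: &str, contained: bool, in_set: impl Fn(char) -> bool) -> usize {
    let total = text.chars().count();
    let trailing = text
        .chars()
        .rev()
        .take_while(|&c| in_set(c) == contained)
        .count();
    // With no matching trailing characters this is the string length.
    total - trailing
}
```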
Trait Implementations
pub fn deserialize<D>(
deserializer: D
) -> Result<UnicodeSet, <D as Deserializer<'de>>::Error> where
D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=.
pub fn serialize<S>(
&self,
serializer: S
) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error> where
S: Serializer,
Serialize this value into the given Serde serializer.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
pub fn try_from(
&RangeFull
) -> Result<UnicodeSet, <UnicodeSet as TryFrom<&'_ RangeFull>>::Error>
Performs the conversion.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
pub fn try_from(
range: &RangeInclusive<char>
) -> Result<UnicodeSet, <UnicodeSet as TryFrom<&'_ RangeInclusive<char>>>::Error>
Performs the conversion.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
pub fn try_from(
range: &RangeToInclusive<char>
) -> Result<UnicodeSet, <UnicodeSet as TryFrom<&'_ RangeToInclusive<char>>>::Error>
Performs the conversion.
type Error = UnicodeSetError
The type returned in the event of a conversion error.
pub fn try_into(
self
) -> Result<UnicodeSet, <UnicodePropertyV1<'data> as TryInto<UnicodeSet>>::Error>
Performs the conversion.
type Output = UnicodeSet
This type MUST be Self with the 'static replaced with 'a, i.e. Self<'a>.
This method must cast self between &'a Self<'static> and &'a Self<'a>.
This method must cast self between Self<'static> and Self<'a>.
This method can be used to cast away Self<'a>'s lifetime.
pub fn transform_mut<F>(&'a mut self, f: F) where
F: 'static + for<'b> FnOnce(&'b mut <UnicodeSet as Yokeable<'a>>::Output),
This method must cast self between &'a mut Self<'static> and &'a mut Self<'a>, and pass it to f.
Clone the cart C into a Yokeable struct, which may retain references into C.
Auto Trait Implementations
impl RefUnwindSafe for UnicodeSet
impl Send for UnicodeSet
impl Sync for UnicodeSet
impl Unpin for UnicodeSet
impl UnwindSafe for UnicodeSet
Blanket Implementations
Mutably borrows from an owned value.
pub fn clone_into_box(&self) -> Box<dyn ErasedDataStruct + 'static, Global>
Clone this trait object reference, returning a boxed trait object.