Struct icu_uniset::UnicodeSet

pub struct UnicodeSet { /* fields omitted */ }

A membership wrapper for UnicodeSet.

Exposes membership functions, along with constructors from serialized UnicodeSets and from predefined ranges.

Implementations

Returns a UnicodeSet built from an inversion list, represented as a Vec<u32> of code points.

The inversion list must have even length, be sorted in ascending order with non-overlapping ranges, and lie within the bounds 0x0 -> 0x10FFFF inclusive. The end point of each range is exclusive.

Examples

use icu::uniset::UnicodeSet;
use icu::uniset::UnicodeSetError;
let invalid: Vec<u32> = vec![0x0, 0x80, 0x3];
let result = UnicodeSet::from_inversion_list(invalid.clone());
assert!(matches!(result, Err(UnicodeSetError::InvalidSet(_))));
if let Err(UnicodeSetError::InvalidSet(actual)) = result {
    assert_eq!(invalid, actual);
}
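
The constraints on the inversion list can be illustrated with a small standalone check. This is a minimal sketch, not the crate's validation code: the helper name `is_valid_inversion_list` is hypothetical, and it assumes that boundaries must be strictly ascending, that an empty list is acceptable, and that the final exclusive end point may be at most 0x110000.

```rust
/// Hypothetical helper (not part of icu_uniset): checks the documented
/// constraints on an inversion list.
fn is_valid_inversion_list(list: &[u32]) -> bool {
    // Exclusive upper bound of the code point space (0x10FFFF + 1).
    const END_EXCLUSIVE_MAX: u32 = 0x11_0000;
    // Even length: every range start needs a matching exclusive end.
    list.len() % 2 == 0
        // Assumption: boundaries are strictly ascending, which implies
        // sorted, non-overlapping ranges.
        && list.windows(2).all(|pair| pair[0] < pair[1])
        // Assumption: an empty list is valid, and the last (exclusive)
        // end point may equal 0x110000.
        && list.last().map_or(true, |&end| end <= END_EXCLUSIVE_MAX)
}
```

Under these assumptions, the list `[0x0, 0x80, 0x3]` from the example above is rejected twice over: it has odd length and is not sorted.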

Returns an owned inversion list representing the current UnicodeSet.

Returns a UnicodeSet spanning the entire Unicode range.

The range spans from 0x0 -> 0x10FFFF inclusive.

Returns a UnicodeSet spanning the BMP range.

The range spans from 0x0 -> 0xFFFF inclusive.

Yields an Iterator over the characters contained in the UnicodeSet.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44, 0x45, 0x46];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let mut ex_iter_chars = example.iter_chars();
assert_eq!(Some('A'), ex_iter_chars.next());
assert_eq!(Some('B'), ex_iter_chars.next());
assert_eq!(Some('C'), ex_iter_chars.next());
assert_eq!(Some('E'), ex_iter_chars.next());
assert_eq!(None, ex_iter_chars.next());

Yields an Iterator over the ranges of code points that are included in the UnicodeSet.

Ranges are returned as RangeInclusive, which is inclusive of its end bound value. This end-inclusive behavior matches how ICU4C/J treats ranges, e.g. UnicodeSet::contains(UChar32 start, UChar32 end).

Example

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44, 0x45, 0x46];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let mut example_iter_ranges = example.iter_ranges();
assert_eq!(Some(0x41..=0x43), example_iter_ranges.next());
assert_eq!(Some(0x45..=0x45), example_iter_ranges.next());
assert_eq!(None, example_iter_ranges.next());
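
The relationship between the end-exclusive inversion list and the end-inclusive ranges can be sketched without the crate. The function name `inversion_list_to_ranges` is hypothetical, and the sketch assumes a valid inversion list (even length, each end greater than its start); it is not how the library implements iter_ranges.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper (not part of icu_uniset): converts each
/// `[start, end)` pair of an inversion list into `start..=end - 1`.
fn inversion_list_to_ranges(list: &[u32]) -> Vec<RangeInclusive<u32>> {
    list.chunks_exact(2)
        // The stored end is exclusive, so the inclusive end is one less.
        .map(|pair| pair[0]..=(pair[1] - 1))
        .collect()
}
```

For the list `[0x41, 0x44, 0x45, 0x46]` this produces `0x41..=0x43` and `0x45..=0x45`, the same ranges as in the example above.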

Returns the number of ranges contained in this UnicodeSet

Returns the number of elements of the UnicodeSet

Returns whether or not the UnicodeSet is empty

Checks whether the queried character is in the UnicodeSet.

Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the std implementation.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains('A'));
assert!(!example.contains('C'));
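
To see why a binary search over the start and end points is enough, here is a minimal standalone sketch using the standard library's slice::binary_search. The function name `inversion_list_contains` is hypothetical and this is not necessarily how the crate implements the lookup. The idea: range starts sit at even indices and exclusive ends at odd indices, so the parity of the search position decides membership.

```rust
/// Hypothetical helper (not part of icu_uniset): membership test on a
/// valid inversion list via binary search.
fn inversion_list_contains(list: &[u32], code_point: u32) -> bool {
    match list.binary_search(&code_point) {
        // Exact hit: a range start (even index) is included,
        // an exclusive range end (odd index) is not.
        Ok(index) => index % 2 == 0,
        // Miss: an odd insertion index means the value falls between
        // a start and its end, i.e. inside a range.
        Err(index) => index % 2 == 1,
    }
}
```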

Checks whether the unsigned integer is in the UnicodeSet.

Note: Even though u32 and char in Rust are both non-negative 4-byte values, there is an important difference. A u32 can take any value up to u32::MAX, while a char in Rust is restricted to Unicode Scalar Values, which range from 0 to 0x10FFFF and exclude the surrogate code points.

Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the std implementation.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains_u32(0x41));
assert!(!example.contains_u32(0x43));
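
The difference between u32 and char described in the note can be observed with the standard library alone. The wrapper name `to_scalar_value` below is hypothetical; it only forwards to char::from_u32, which returns None for any u32 that is not a Unicode Scalar Value.

```rust
/// Hypothetical helper: converts a u32 to a char when it is a valid
/// Unicode Scalar Value, and returns None otherwise.
fn to_scalar_value(code_point: u32) -> Option<char> {
    // Surrogates (0xD800..=0xDFFF) and values above 0x10FFFF are rejected.
    char::from_u32(code_point)
}
```

A query such as 0xD800 or 0x110000 is therefore expressible as a u32 but has no char counterpart.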

Checks whether the range is contained in the UnicodeSet.

Runs a binary search in O(log(n)), where n is the number of start and end points in the set, using the Vec implementation. The search runs only once, on the start parameter; the end parameter is then checked in a single O(1) step.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x43, 0x44, 0x45];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(example.contains_range(&('A'..'C')));
assert!(example.contains_range(&('A'..='B')));
assert!(!example.contains_range(&('A'..='C')));

A Range that spans the surrogate code points (0xD800 -> 0xDFFF) returns false when the UnicodeSet does not contain them.

Note: when comparing to ICU4C/J, keep in mind that Ranges in Rust are constructed inclusive of the start boundary and exclusive of the end boundary. The ICU4C/J UnicodeSet::contains(UChar32 start, UChar32 end) method differs by including the end boundary.

Examples

use icu::uniset::UnicodeSet;
use std::char;
let check = char::from_u32(0xD7FE).unwrap() .. char::from_u32(0xE001).unwrap();
let example_list = vec![0xD7FE, 0xD7FF, 0xE000, 0xE001];
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert!(!example.contains_range(&(check)));
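
The "one search plus one O(1) comparison" strategy can be sketched on a raw inversion list. The function name `inversion_list_contains_range` is hypothetical, the range is passed as a half-open `[start, end)` pair of u32 values, and the sketch assumes that an empty range is reported as not contained; the crate's own behavior for that case is not stated here.

```rust
/// Hypothetical helper (not part of icu_uniset): checks whether the
/// half-open range `[start, end)` lies inside a single range of a
/// valid inversion list.
fn inversion_list_contains_range(list: &[u32], start: u32, end: u32) -> bool {
    // One binary search locates the range that contains `start`, if any.
    let range_start_index = match list.binary_search(&start) {
        Ok(index) if index % 2 == 0 => index,
        Err(index) if index % 2 == 1 => index - 1,
        _ => return false,
    };
    // O(1) step: `end` must not pass that range's exclusive end.
    // Assumption: an empty range (`start >= end`) yields false.
    start < end && end <= list[range_start_index + 1]
}
```

With the surrogate example above, the search finds the range starting at 0xD7FE, and the comparison fails because 0xE001 lies past that range's end of 0xD7FF.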

Checks whether the calling UnicodeSet contains all the characters of the given UnicodeSet.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x46, 0x55, 0x5B]; // A - E, U - Z
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
let a_to_d = UnicodeSet::from_inversion_list(vec![0x41, 0x45]).unwrap();
let f_to_t = UnicodeSet::from_inversion_list(vec![0x46, 0x55]).unwrap();
let r_to_x = UnicodeSet::from_inversion_list(vec![0x52, 0x58]).unwrap();
assert!(example.contains_set(&a_to_d)); // contains all
assert!(!example.contains_set(&f_to_t)); // contains none
assert!(!example.contains_set(&r_to_x)); // contains some
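
One way to think about the subset check is that every range of the inner set must fit inside a single range of the outer set. The sketch below walks both inversion lists in order; the function name `inversion_list_contains_set` is hypothetical, both inputs are assumed to be valid inversion lists, and this is not necessarily the algorithm the crate uses.

```rust
/// Hypothetical helper (not part of icu_uniset): returns true when every
/// `[start, end)` range of `inner` lies within one range of `outer`.
fn inversion_list_contains_set(outer: &[u32], inner: &[u32]) -> bool {
    let mut outer_ranges = outer.chunks_exact(2).peekable();
    for inner_range in inner.chunks_exact(2) {
        // Skip outer ranges that end at or before this inner range starts.
        while outer_ranges
            .peek()
            .map_or(false, |outer_range| outer_range[1] <= inner_range[0])
        {
            outer_ranges.next();
        }
        // The next outer range must cover the whole inner range.
        match outer_ranges.peek() {
            Some(outer_range)
                if outer_range[0] <= inner_range[0] && inner_range[1] <= outer_range[1] => {}
            _ => return false,
        }
    }
    true
}
```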

Returns the end of the initial substring whose characters are all contained in the set (when the second argument is true) or all not contained in the set (when it is false).

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44]; // {A, B, C}
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert_eq!(example.span("CABXYZ", true), 3);
assert_eq!(example.span("XYZC", false), 3);
assert_eq!(example.span("XYZ", true), 0);
assert_eq!(example.span("ABC", false), 0);
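
The span behavior depends only on a membership test, so it can be sketched with any predicate. In this sketch the name `span_sketch` and its parameters are hypothetical, and it counts characters rather than bytes, which is an assumption; the two coincide for ASCII input like the examples above.

```rust
/// Hypothetical helper (not part of icu_uniset): counts the leading
/// characters whose membership equals `contained`.
fn span_sketch(text: &str, in_set: impl Fn(char) -> bool, contained: bool) -> usize {
    text.chars()
        // Stop at the first character whose membership differs.
        .take_while(|&c| in_set(c) == contained)
        .count()
}
```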

Returns the start of the trailing substring (scanning from the end of the string) whose characters are all contained in the set (when the second argument is true) or all not contained in the set (when it is false). Returns the length of the string if no trailing character qualifies.

Examples

use icu::uniset::UnicodeSet;
let example_list = vec![0x41, 0x44]; // {A, B, C}
let example = UnicodeSet::from_inversion_list(example_list).unwrap();
assert_eq!(example.span_back("XYZCAB", true), 3);
assert_eq!(example.span_back("ABCXYZ", true), 6);
assert_eq!(example.span_back("CABXYZ", false), 3);
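
The backward scan can be sketched the same way as the forward one, by counting qualifying characters from the end and subtracting from the total. The name `span_back_sketch` and its parameters are hypothetical, and the sketch again assumes positions are measured in characters, which matches byte offsets only for ASCII input.

```rust
/// Hypothetical helper (not part of icu_uniset): returns the position
/// where the trailing run of characters with membership `contained` starts.
fn span_back_sketch(text: &str, in_set: impl Fn(char) -> bool, contained: bool) -> usize {
    // Length of the qualifying run at the end of the string.
    let trailing = text
        .chars()
        .rev()
        .take_while(|&c| in_set(c) == contained)
        .count();
    // With no qualifying trailing character this is the full length.
    text.chars().count() - trailing
}
```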

Trait Implementations

Returns a copy of the value.

Performs copy-assignment from source.

Formats the value using the given formatter.

Deserialize this value from the given Serde deserializer.

Feeds this value into the given Hasher.

Feeds a slice of this type into the given Hasher.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.

Serialize this value into the given Serde serializer.

The type returned in the event of a conversion error.

Performs the conversion.

This type MUST be Self with the 'static replaced with 'a, i.e. Self<'a>

This method must cast self between &'a Self<'static> and &'a Self<'a>.

This method must cast self between Self<'static> and Self<'a>.

This method can be used to cast away Self<'a>’s lifetime.

This method must cast self between &'a mut Self<'static> and &'a mut Self<'a>, and pass it to f.

Clone the cart C into a [Yokeable] struct, which may retain references into C.

Auto Trait Implementations

Blanket Implementations

Gets the TypeId of self.

Immutably borrows from an owned value.

Mutably borrows from an owned value.

Clone this trait object reference, returning a boxed trait object.

Return this boxed trait object as Box<dyn Any>.

Return this trait object reference as &dyn Any.

Performs the conversion.

The resulting type after obtaining ownership.

Creates owned data from borrowed data, usually by cloning.

🔬 This is a nightly-only experimental API. (toowned_clone_into)

recently added

Uses borrowed data to replace owned data, usually by cloning.

The type returned in the event of a conversion error.

Performs the conversion.
