Crate icu_uniset

icu_uniset is one of the ICU4X components.

This API provides the functionality needed for highly efficient querying of sets of Unicode characters.

It is an implementation of the existing ICU4C UnicodeSet API.

Architecture

ICU4X UnicodeSet is split into independent levels: UnicodeSet provides the membership/query API, and UnicodeSetBuilder provides the builder API. A Properties API is planned as future work.

Examples:

Creating a UnicodeSet

A UnicodeSet can be created from a serialized UnicodeSet (represented as an inversion list), from the UnicodeSetBuilder, or from the Properties API, which is still to be announced.

use icu::uniset::{UnicodeSet, UnicodeSetBuilder};

let mut builder = UnicodeSetBuilder::new();
builder.add_range(&('A'..'Z')); // half-open range: 'Z' itself is not added
let set: UnicodeSet = builder.build();

assert!(set.contains('A'));
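
The inversion-list representation mentioned above can be illustrated with a minimal, standard-library-only sketch. The `InversionList` type and its `from_pairs` and `contains` methods are hypothetical names chosen for this illustration; they are not the crate's implementation, and the real UnicodeSet may differ in layout and validation.

```rust
/// A toy inversion list: a sorted vector of code points in which each
/// consecutive pair `[start, end)` marks one run of characters in the set.
/// Hypothetical type for illustration only; not part of icu_uniset.
pub struct InversionList {
    bounds: Vec<u32>,
}

impl InversionList {
    /// Builds a list from `[start, end)` pairs. The pairs are assumed to be
    /// sorted and non-overlapping; this sketch does not check that.
    pub fn from_pairs(pairs: &[(u32, u32)]) -> Self {
        let bounds = pairs.iter().flat_map(|&(s, e)| [s, e]).collect();
        InversionList { bounds }
    }

    /// Returns true if `c` falls inside one of the runs.
    pub fn contains(&self, c: char) -> bool {
        let cp = c as u32;
        // Count the boundaries that are <= cp. An odd count means the last
        // boundary crossed was a run start, so cp lies inside a run.
        let crossed = self.bounds.partition_point(|&b| b <= cp);
        crossed % 2 == 1
    }
}
```

Because the boundaries are sorted, a membership test is a binary search rather than a scan, which is one reason this representation suits fast queries. With the single run `('A' as u32, 'Z' as u32)`, the sketch reports 'A' and 'Y' as members and 'Z' as a non-member.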

Querying a UnicodeSet

Currently, you can check whether a character or a range of characters is contained in the UnicodeSet, or iterate over its characters.

use icu::uniset::{UnicodeSet, UnicodeSetBuilder};

let mut builder = UnicodeSetBuilder::new();
builder.add_range(&('A'..'Z'));
let set: UnicodeSet = builder.build();

assert!(set.contains('A'));
assert!(set.contains_range(&('A'..='C')));
assert_eq!(set.iter_chars().next(), Some('A'));
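
As a rough illustration of how range queries and iteration could work over an inversion list, the sketch below operates on a plain slice of `[start, end)` boundaries. The free functions `contains_range` and `iter_chars` here are hypothetical stand-ins that only mirror the method names above; they assume a sorted, even-length slice and are not the crate's code.

```rust
/// Returns true if every code point in the inclusive range `start..=end`
/// lies inside a single run of the inversion list `bounds`.
/// Hypothetical helper; assumes `bounds` is sorted with even length.
pub fn contains_range(bounds: &[u32], start: char, end: char) -> bool {
    let (lo, hi) = (start as u32, end as u32);
    if lo > hi {
        return false;
    }
    // Index of the first boundary strictly greater than `lo`.
    let idx = bounds.partition_point(|&b| b <= lo);
    // An odd index means `lo` is inside the run that ends at bounds[idx];
    // the whole range is contained only if `hi` is below that exclusive end.
    idx % 2 == 1 && bounds.get(idx).map_or(false, |&run_end| hi < run_end)
}

/// Iterates over every character covered by the runs, in ascending order.
pub fn iter_chars(bounds: &[u32]) -> impl Iterator<Item = char> + '_ {
    bounds
        .chunks_exact(2)
        .flat_map(|run| run[0]..run[1])
        // Skip values that are not valid `char`s (for example surrogates).
        .filter_map(char::from_u32)
}
```

For the boundaries `['A' as u32, 'Z' as u32]`, the range 'A'..='C' is contained, the range 'A'..='Z' is not (the end is exclusive), and iteration starts at 'A'.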

Modules

enum_props

fmt

Utilities for formatting and printing Strings.

props

Note: DO NOT USE THESE APIs FOR NOW. Performance improvements and other fixes are still needed on this component.

provider

Data provider struct definitions for this ICU4X component.

Structs

UnicodeSet

A membership wrapper for UnicodeSet.

UnicodeSetBuilder

A builder for UnicodeSet.

Enums

UnicodeSetError

Custom errors for UnicodeSet.

UnicodeSetSpanCondition

Functions

deconstruct_range

Returns the start (inclusive) and end (exclusive) bounds of a RangeBounds.
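
A minimal sketch of this kind of conversion, using only the standard library, might look like the following. The name `to_half_open`, its signature, and the choice to map unbounded sides to the full Unicode range are assumptions made for illustration; the crate's `deconstruct_range` may behave differently.

```rust
use std::ops::{Bound, RangeBounds};

/// Converts any `char` range into `(start, end)` code points, with `start`
/// inclusive and `end` exclusive. Hypothetical helper, not the crate's
/// function; unbounded sides are assumed to mean the full Unicode range.
pub fn to_half_open(range: &impl RangeBounds<char>) -> (u32, u32) {
    let start = match range.start_bound() {
        Bound::Included(&c) => c as u32,
        Bound::Excluded(&c) => c as u32 + 1,
        Bound::Unbounded => 0,
    };
    let end = match range.end_bound() {
        // An inclusive end becomes exclusive by stepping one past it.
        Bound::Included(&c) => c as u32 + 1,
        Bound::Excluded(&c) => c as u32,
        Bound::Unbounded => char::MAX as u32 + 1,
    };
    (start, end)
}
```

Normalizing every range form to one half-open pair lets callers accept `'A'..'Z'`, `'A'..='C'`, or `..` through a single code path.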

is_valid

Returns whether the vector is sorted in strictly ascending order (no repeated values), has even length, and has all values within the bounds 0x0 to 0x10FFFF inclusive.
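
The three conditions in that description can be sketched as a small standalone check. The function name `looks_valid` and the constant `MAX_CODE_POINT` are hypothetical, and the bounds test follows the description literally, so this is an approximation of the idea rather than the crate's actual `is_valid`.

```rust
/// Checks the three conditions described above on a candidate inversion
/// list. Hypothetical sketch; not the crate's `is_valid`.
pub fn looks_valid(bounds: &[u32]) -> bool {
    // Upper limit taken literally from the description (an assumption).
    const MAX_CODE_POINT: u32 = 0x10FFFF;

    // Even length: every run start needs a matching run end.
    bounds.len() % 2 == 0
        // Strictly ascending: each value is smaller than the next one.
        && bounds.windows(2).all(|w| w[0] < w[1])
        // Every boundary stays within the Unicode code point range.
        && bounds.iter().all(|&b| b <= MAX_CODE_POINT)
}
```

Under these rules an empty slice and `[65, 90]` pass, while `[65]` (odd length), `[90, 65]` (descending), and `[65, 65]` (repeated value) are rejected.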