This library provides a way to search for Unicode code point intervals by categories, ranges, and custom character sets.

The main purpose of unicode-intervals is to simplify generating strings that match specific criteria.

Examples

Raw Unicode codepoint intervals from the latest Unicode version:

use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(
        UnicodeCategory::UPPERCASE_LETTER |
        UnicodeCategory::LOWERCASE_LETTER
    )
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 90), (97, 122), (9731, 9731)]);
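Since the stated purpose is string generation, it may help to see how such inclusive `(start, end)` pairs can be turned into characters. The following is a minimal sketch that does not depend on the crate: the `Interval` alias and the `intervals_to_string` helper are hypothetical names introduced here for illustration, not part of the library's API.

```rust
/// Inclusive codepoint interval, mirroring the `(start, end)` pairs shown above.
/// (Hypothetical local alias, defined here so the sketch is self-contained.)
type Interval = (u32, u32);

/// Collect every valid `char` covered by the intervals into a `String`.
/// Values that are not valid Unicode scalar values (e.g. surrogates) are skipped.
fn intervals_to_string(intervals: &[Interval]) -> String {
    intervals
        .iter()
        // Expand each inclusive interval into its individual codepoints.
        .flat_map(|&(start, end)| start..=end)
        // `char::from_u32` returns `None` for invalid scalar values.
        .filter_map(char::from_u32)
        .collect()
}
```

Calling `intervals_to_string(&[(65, 90), (97, 122), (9731, 9731)])` would yield the 26 uppercase ASCII letters, the 26 lowercase ones, and the snowman, which could then serve as an alphabet for a string generator.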

IntervalSet for index-like access to the underlying codepoints:

use unicode_intervals::UnicodeCategory;

let interval_set = unicode_intervals::query()
    .include_categories(UnicodeCategory::UPPERCASE_LETTER)
    .interval_set()
    .expect("Invalid query input");
// Get the codepoint at index 10 (zero-based) in this interval set
assert_eq!(interval_set.codepoint_at(10), Some('K' as u32));
assert_eq!(interval_set.index_of('K'), Some(10));
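To illustrate what index-like access over intervals means, here is a rough sketch of how such lookups could be computed over sorted, non-overlapping inclusive intervals. It is an assumption-laden illustration, not the crate's actual implementation: the free functions `codepoint_at` and `index_of` below are hypothetical stand-ins that take the intervals as a slice.

```rust
/// Inclusive codepoint interval (hypothetical local alias for this sketch).
type Interval = (u32, u32);

/// Return the codepoint at zero-based `index`, counting across all intervals.
/// Assumes the intervals are sorted, non-overlapping, and within the Unicode range.
fn codepoint_at(intervals: &[Interval], index: u32) -> Option<u32> {
    let mut remaining = index;
    for &(start, end) in intervals {
        let len = end - start + 1;
        if remaining < len {
            return Some(start + remaining);
        }
        // Skip this interval and keep counting in the next one.
        remaining -= len;
    }
    None
}

/// Return the zero-based position of `codepoint` within the intervals, if present.
fn index_of(intervals: &[Interval], codepoint: u32) -> Option<u32> {
    let mut offset = 0;
    for &(start, end) in intervals {
        if codepoint < start {
            // Intervals are sorted, so the codepoint falls in a gap.
            return None;
        }
        if codepoint <= end {
            return Some(offset + (codepoint - start));
        }
        offset += end - start + 1;
    }
    None
}
```

With the intervals `[(65, 90), (97, 122)]`, index 10 maps to 75 (`'K'`) and index 26 maps to 97 (`'a'`), because positions continue across the gap between the two intervals.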

Query specific Unicode version:

use unicode_intervals::UnicodeVersion;

let intervals = UnicodeVersion::V11_0_0.query()
    .max_codepoint(128)
    .include_characters("☃")
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(0, 128), (9731, 9731)]);

Restrict the output to code points within a certain range:

let intervals = unicode_intervals::query()
    .min_codepoint(65)
    .max_codepoint(128)
    .intervals()
    .expect("Invalid query input");
assert_eq!(intervals, &[(65, 128)]);

Include or exclude specific characters:

use unicode_intervals::UnicodeCategory;

let intervals = unicode_intervals::query()
    .include_categories(UnicodeCategory::PARAGRAPH_SEPARATOR)
    .include_characters("-123")
    .intervals()
    .expect("Invalid query input");
// '-' is 45, '1'..'3' are 49..51, and U+2029 PARAGRAPH SEPARATOR is 8233
assert_eq!(intervals, &[(45, 45), (49, 51), (8233, 8233)]);

Unicode version support

unicode-intervals supports Unicode versions 9.0.0 through 15.0.0.

Structs

  • A Query builder for specifying the input parameters to the intervals() method in UnicodeVersion.
  • A collection of non-overlapping Unicode codepoint intervals that enables interval-based operations, such as iteration over all Unicode codepoints or finding the codepoint at a specific position within the intervals.
  • Set of Unicode categories.

Enums

Functions

  • Build a query that finds Unicode intervals matching the query criteria.

Type Definitions

  • Interval between two Unicode codepoints.