A library for parsing the Unicode character database.

Modules

Structs for parsing files in the extracted subdirectory.

Structs

A single row in the DerivedAge.txt file.
Represents a single row in the ArabicShaping.txt file.
Represents a single row in the BidiMirroring.txt file.
A single row in the CaseFolding.txt file.
A single Unicode codepoint.
An iterator over a range of Unicode codepoints.
A range of Unicode codepoints. The range is inclusive; both ends of the range are guaranteed to be valid codepoints.
A single row in the DerivedCoreProperties.txt file.
A single row in the EastAsianWidth.txt file, describing the value of the East_Asian_Width property.
A single row in the emoji-data.txt file.
Represents any kind of error that can occur while parsing the UCD.
A single row in the auxiliary/GraphemeBreakProperty.txt file.
A single row in the auxiliary/GraphemeBreakTest.txt file.
A single row in the Jamo.txt file.
A single row in the auxiliary/LineBreakTest.txt file.
A single row in the NameAliases.txt file.
A single row in the PropList.txt file.
A single row in the PropertyAliases.txt file.
A single row in the PropertyValueAliases.txt file.
A single row in the Scripts.txt file.
A single row in the ScriptExtensions.txt file.
A single row in the auxiliary/SentenceBreakProperty.txt file.
A single row in the auxiliary/SentenceBreakTest.txt file.
A single row in the SpecialCasing.txt file.
A line-oriented parser for a particular UCD file.
Represents a single row in the UnicodeData.txt file.
Represents a decomposition mapping of a single row in the UnicodeData.txt file.
An iterator adapter that expands rows in UnicodeData.txt.
A single row in the auxiliary/WordBreakProperty.txt file.
A single row in the auxiliary/WordBreakTest.txt file.
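
The codepoint, range, and range-iterator entries above fit together roughly as in the following sketch. It is a stand-alone illustration, not this library's code: the names `Cp` and `CpRange` are hypothetical, and "valid" is assumed to mean any value up to U+10FFFF (surrogates included, since they are codepoints even though they are not scalar values).

```rust
/// Hypothetical stand-in for a validated codepoint (not the library's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Cp(u32);

impl Cp {
    /// Accepts any value in 0..=0x10FFFF; assumption: surrogates are allowed.
    fn new(n: u32) -> Option<Cp> {
        if n <= 0x10FFFF {
            Some(Cp(n))
        } else {
            None
        }
    }
}

/// Inclusive range: both `start` and `end` belong to the range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CpRange {
    start: Cp,
    end: Cp,
}

impl CpRange {
    /// Yields every codepoint from `start` through `end`, inclusive.
    fn iter(self) -> impl Iterator<Item = Cp> {
        (self.start.0..=self.end.0).map(Cp)
    }
}
```

Because both ends are validated when they are constructed, iteration never needs to re-check the values it yields.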

Enums

The status of a particular case mapping.
A representation of either a single codepoint or a range of codepoints.
The kind of error that occurred while parsing the UCD.
The label of a name alias.
The formatting tag on a decomposition mapping.
A numeric value corresponding to characters with Numeric_Type=Numeric.
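
UCD data files write the first field of a row either as a single hexadecimal codepoint (`0041`) or as an inclusive range (`0041..005A`), which is what the "single codepoint or a range" enum above models. A minimal sketch of parsing such a field, using the hypothetical names `CpSpec` and `parse_cp_spec` rather than the library's own items:

```rust
/// Hypothetical enum: either one codepoint or an inclusive range of them.
#[derive(Debug, PartialEq, Eq)]
enum CpSpec {
    Single(u32),
    Range(u32, u32),
}

/// Parses `XXXX` or `XXXX..YYYY` (hexadecimal). Returns `None` for
/// malformed input, values above U+10FFFF, or a range with start > end.
fn parse_cp_spec(field: &str) -> Option<CpSpec> {
    let field = field.trim();
    // Parse one hex number and reject anything past the last codepoint.
    let hex = |s: &str| u32::from_str_radix(s, 16).ok().filter(|&n| n <= 0x10FFFF);
    match field.split_once("..") {
        Some((lo, hi)) => {
            let (lo, hi) = (hex(lo)?, hex(hi)?);
            if lo <= hi {
                Some(CpSpec::Range(lo, hi))
            } else {
                None
            }
        }
        None => hex(field).map(CpSpec::Single),
    }
}
```

A real parser would likely report why a field was rejected instead of returning `None`; the sketch keeps the error path minimal.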

Traits

Describes a single UCD file.
Describes a single UCD file where every record in the file is associated with one or more codepoints.
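
One way to picture the relationship between the two traits is sketched below. Everything here is an assumption made for illustration: the trait names `FileSpec` and `ByCodepoint`, their members, and the `ScriptRow` type are hypothetical and may not match the library's actual definitions. The only borrowed fact is the `codepoints ; value` layout of Scripts.txt data lines.

```rust
/// Hypothetical trait: one implementor per UCD file.
trait FileSpec: Sized {
    /// Path of the file relative to the UCD root directory.
    const RELATIVE_PATH: &'static str;
    /// Parses one data line (comments assumed to be stripped already).
    fn from_line(line: &str) -> Option<Self>;
}

/// Hypothetical refinement: every row covers one or more codepoints.
trait ByCodepoint: FileSpec {
    /// Inclusive (first, last) codepoints covered by this row.
    fn span(&self) -> (u32, u32);
}

#[derive(Debug, PartialEq)]
struct ScriptRow {
    first: u32,
    last: u32,
    script: String,
}

impl FileSpec for ScriptRow {
    const RELATIVE_PATH: &'static str = "Scripts.txt";

    fn from_line(line: &str) -> Option<Self> {
        let (cps, script) = line.split_once(';')?;
        let cps = cps.trim();
        let (first, last) = match cps.split_once("..") {
            Some((a, b)) => (
                u32::from_str_radix(a, 16).ok()?,
                u32::from_str_radix(b, 16).ok()?,
            ),
            None => {
                // A single codepoint is a range of length one.
                let n = u32::from_str_radix(cps, 16).ok()?;
                (n, n)
            }
        };
        Some(ScriptRow { first, last, script: script.trim().to_string() })
    }
}

impl ByCodepoint for ScriptRow {
    fn span(&self) -> (u32, u32) {
        (self.first, self.last)
    }
}
```

Splitting the codepoint-aware behaviour into a second trait lets generic code require it only when it needs to index rows by codepoint; files such as the test files, whose rows are not keyed by codepoint, would implement only the first trait.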

Functions

Parse a particular file in the UCD into a sequence of rows.
Parse a particular file in the UCD into a map from codepoint to the record.
Parse a particular file in the UCD into a map from codepoint to all records associated with that codepoint.
Given a path pointing at the root of the ucd_dir, attempts to determine its Unicode version.
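
The three parsing functions differ only in the shape of their result: a sequence of rows, a map from codepoint to one record, or a map from codepoint to every record that mentions it. The sketch below shows those three shapes on in-memory text. It is not the library's implementation: `Row`, `parse_rows`, `by_codepoint`, and `many_by_codepoint` are hypothetical names, and reading from a string instead of a directory on disk is a simplification.

```rust
use std::collections::BTreeMap;

/// Hypothetical row: an inclusive codepoint span plus one property value.
#[derive(Clone, Debug, PartialEq)]
struct Row {
    first: u32,
    last: u32,
    value: String,
}

/// Parses `XXXX` or `XXXX..YYYY` into an inclusive (first, last) pair.
fn parse_span(field: &str) -> Option<(u32, u32)> {
    let field = field.trim();
    match field.split_once("..") {
        Some((a, b)) => Some((
            u32::from_str_radix(a, 16).ok()?,
            u32::from_str_radix(b, 16).ok()?,
        )),
        None => {
            let n = u32::from_str_radix(field, 16).ok()?;
            Some((n, n))
        }
    }
}

/// Parses `codepoints ; value # comment` lines into a sequence of rows.
fn parse_rows(text: &str) -> Result<Vec<Row>, String> {
    let mut rows = Vec::new();
    for (i, line) in text.lines().enumerate() {
        // Everything after '#' is a comment; blank lines are skipped.
        let data = line.split('#').next().unwrap_or("").trim();
        if data.is_empty() {
            continue;
        }
        let (cps, value) = data
            .split_once(';')
            .ok_or_else(|| format!("line {}: missing ';'", i + 1))?;
        let (first, last) = parse_span(cps)
            .ok_or_else(|| format!("line {}: bad codepoint field", i + 1))?;
        rows.push(Row { first, last, value: value.trim().to_string() });
    }
    Ok(rows)
}

/// Map from codepoint to its record; a later row overwrites an earlier one.
fn by_codepoint(rows: &[Row]) -> BTreeMap<u32, Row> {
    let mut map = BTreeMap::new();
    for row in rows {
        for cp in row.first..=row.last {
            map.insert(cp, row.clone());
        }
    }
    map
}

/// Map from codepoint to all records associated with that codepoint.
fn many_by_codepoint(rows: &[Row]) -> BTreeMap<u32, Vec<Row>> {
    let mut map: BTreeMap<u32, Vec<Row>> = BTreeMap::new();
    for row in rows {
        for cp in row.first..=row.last {
            map.entry(cp).or_default().push(row.clone());
        }
    }
    map
}
```

The one-record map suits files where each codepoint appears in at most one row; the many-records map suits files in which a codepoint can appear on several rows. Note that expanding large ranges per codepoint, as this sketch does, trades memory for lookup simplicity.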