A library for parsing the Unicode Character Database (UCD).
Modules§
- Types for parsing files in the extracted subdirectory of the Unicode Character Database download.
Structs§
- A single row in the DerivedAge.txt file.
- Represents a single row in the ArabicShaping.txt file.
- Represents a single row in the BidiMirroring.txt file.
- A single row in the CaseFolding.txt file.
- A single Unicode codepoint.
- An iterator over a range of Unicode codepoints.
- A range of Unicode codepoints. The range is inclusive; both ends of the range are guaranteed to be valid codepoints.
- A single row in the DerivedCoreProperties.txt file.
- A single row in the DerivedNormalizationProps.txt file.
- A single row in the EastAsianWidth.txt file, describing the value of the East_Asian_Width property.
- A single row in the emoji-data.txt file.
- Represents any kind of error that can occur while parsing the UCD.
- A single row in the auxiliary/GraphemeBreakProperty.txt file.
- A single row in the auxiliary/GraphemeBreakTest.txt file.
- A single row in the Jamo.txt file.
- A single row in the auxiliary/LineBreakTest.txt file.
- A single row in the NameAliases.txt file.
- A single row in the PropList.txt file.
- A single row in the PropertyAliases.txt file.
- A single row in the PropertyValueAliases.txt file.
- A single row in the Scripts.txt file.
- A single row in the ScriptExtensions.txt file.
- A single row in the auxiliary/SentenceBreakProperty.txt file.
- A single row in the auxiliary/SentenceBreakTest.txt file.
- A single row in the SpecialCasing.txt file.
- A line-oriented parser for a particular UCD file.
- Represents a single row in the UnicodeData.txt file.
- Represents a decomposition mapping of a single row in the UnicodeData.txt file.
- An iterator adapter that expands rows in UnicodeData.txt.
- A single row in the auxiliary/WordBreakProperty.txt file.
- A single row in the auxiliary/WordBreakTest.txt file.
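The codepoint, inclusive range, and range iterator described in the list above can be pictured with a small standard-library sketch. The names `Cp`, `CpRange`, and `CpRangeIter` are hypothetical stand-ins, not this crate's types, and "valid" is assumed here to mean a value within `0..=0x10FFFF`.

```rust
/// Hypothetical stand-in for a validated codepoint (not the crate's own type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Cp(u32);

impl Cp {
    /// Largest value in the Unicode codespace.
    pub const MAX: u32 = 0x10FFFF;

    /// Returns `None` if `n` lies outside the Unicode codespace.
    /// Assumption: "valid" only means `n <= 0x10FFFF`.
    pub fn new(n: u32) -> Option<Cp> {
        if n <= Self::MAX {
            Some(Cp(n))
        } else {
            None
        }
    }

    /// The scalar value of this codepoint.
    pub fn value(self) -> u32 {
        self.0
    }
}

/// An inclusive range: both `start` and `end` are members of the range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CpRange {
    pub start: Cp,
    pub end: Cp,
}

/// Iterator state for walking a `CpRange` from `start` to `end`.
pub struct CpRangeIter {
    next: u32,
    end: u32,
}

impl IntoIterator for CpRange {
    type Item = Cp;
    type IntoIter = CpRangeIter;

    fn into_iter(self) -> CpRangeIter {
        CpRangeIter { next: self.start.0, end: self.end.0 }
    }
}

impl Iterator for CpRangeIter {
    type Item = Cp;

    fn next(&mut self) -> Option<Cp> {
        if self.next > self.end {
            return None;
        }
        let cp = Cp(self.next);
        // Cannot overflow: `end` is at most 0x10FFFF.
        self.next += 1;
        Some(cp)
    }
}
```

Because both ends are validated when the `Cp` values are constructed, the iterator never needs to re-check the values it yields.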
Enums§
- The status of a particular case mapping.
- A representation of either a single codepoint or a range of codepoints.
- The kind of error that occurred while parsing the UCD.
- The label of a name alias.
- The formatting tag on a decomposition mapping.
- A numeric value corresponding to characters with Numeric_Type=Numeric.
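To illustrate the "single codepoint or range" representation listed above: UCD data files write the first field of a row either as one hexadecimal value (`0041`) or as two values joined by `..` (`0041..005A`). The sketch below parses both forms. `CpSpec` is a hypothetical name, and the `String` error type is a simplification rather than the crate's error type.

```rust
use std::str::FromStr;

/// Hypothetical enum: either one codepoint or an inclusive range of them.
#[derive(Debug, PartialEq, Eq)]
pub enum CpSpec {
    Single(u32),
    Range(u32, u32),
}

impl FromStr for CpSpec {
    type Err = String;

    fn from_str(s: &str) -> Result<CpSpec, String> {
        // Parse one hexadecimal field, ignoring surrounding whitespace.
        let hex = |t: &str| {
            u32::from_str_radix(t.trim(), 16)
                .map_err(|e| format!("bad codepoint {:?}: {}", t, e))
        };
        match s.split_once("..") {
            Some((lo, hi)) => {
                let (lo, hi) = (hex(lo)?, hex(hi)?);
                if lo > hi {
                    return Err(format!("empty range {:?}", s));
                }
                Ok(CpSpec::Range(lo, hi))
            }
            None => Ok(CpSpec::Single(hex(s)?)),
        }
    }
}
```

With this in place, `"0041..005A".parse::<CpSpec>()` yields a `Range`, and `"0041".parse::<CpSpec>()` yields a `Single`.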
Traits§
- Describes a single UCD file.
- Describes a single UCD file where every record in the file is associated with one or more codepoints.
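One way to picture a trait that "describes a single UCD file" is a row type that knows its file's location and can be built from one line of text. The following is a minimal sketch under that assumption. `FileRow`, `AgeRow`, and `parse_text` are hypothetical names, and the crate's real traits may be shaped differently.

```rust
use std::str::FromStr;

/// Hypothetical trait: a row type that knows which file it comes from.
pub trait FileRow: FromStr {
    /// Path of the file relative to the UCD root directory.
    fn relative_path() -> &'static str;
}

/// Hypothetical row: an inclusive codepoint range and the version that added it.
#[derive(Debug, PartialEq, Eq)]
pub struct AgeRow {
    pub start: u32,
    pub end: u32,
    pub version: String,
}

impl FromStr for AgeRow {
    type Err = String;

    fn from_str(line: &str) -> Result<AgeRow, String> {
        let (cps, version) = line
            .split_once(';')
            .ok_or_else(|| format!("missing ';' in {:?}", line))?;
        let hex = |t: &str| u32::from_str_radix(t.trim(), 16).map_err(|e| e.to_string());
        // A lone codepoint is treated as a range of length one.
        let (start, end) = match cps.split_once("..") {
            Some((lo, hi)) => (hex(lo)?, hex(hi)?),
            None => {
                let v = hex(cps)?;
                (v, v)
            }
        };
        Ok(AgeRow { start, end, version: version.trim().to_string() })
    }
}

impl FileRow for AgeRow {
    fn relative_path() -> &'static str {
        "DerivedAge.txt"
    }
}

/// Parses every data line of `text`, skipping blank lines and `#` comments.
pub fn parse_text<D: FileRow>(text: &str) -> Result<Vec<D>, D::Err> {
    let mut rows: Vec<D> = Vec::new();
    for line in text.lines() {
        // Strip a trailing comment, then surrounding whitespace.
        let data = line.split('#').next().unwrap_or("").trim();
        if data.is_empty() {
            continue;
        }
        rows.push(data.parse::<D>()?);
    }
    Ok(rows)
}
```

Tying the path to the row type lets a single generic function serve every file: the caller picks the row type, and the function can derive both where to read and how to parse.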
Functions§
- Parse a particular file in the UCD into a sequence of rows.
- Parse a particular file in the UCD into a map from codepoint to the record.
- Parse a particular file in the UCD into a map from codepoint to all records associated with that codepoint.
- Given a path pointing at the root of the ucd_dir, attempts to determine its Unicode version.
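The "map from codepoint to all records" idea can be sketched with the standard library alone. This is an illustration of the grouping step under assumed types, not the crate's implementation; `Row` and `group_by_codepoint` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical record: a property value applying to an inclusive range.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Row {
    pub start: u32,
    pub end: u32,
    pub value: String,
}

/// Maps each codepoint to every row that covers it, preserving input order.
pub fn group_by_codepoint(rows: &[Row]) -> BTreeMap<u32, Vec<Row>> {
    let mut map: BTreeMap<u32, Vec<Row>> = BTreeMap::new();
    for row in rows {
        // Expand the inclusive range so each codepoint gets its own entry.
        for cp in row.start..=row.end {
            map.entry(cp).or_default().push(row.clone());
        }
    }
    map
}
```

A one-record-per-codepoint variant would store a single `Row` per key instead of a `Vec<Row>`. That suits files where each codepoint appears at most once, while the multi-record form suits files that can list a codepoint in several rows.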