Crate ucd_parse

A library for parsing the Unicode Character Database (UCD).

Modules

extracted: Types for parsing files in the extracted subdirectory of the Unicode Character Database download.

Structs

Age: A single row in the DerivedAge.txt file.
ArabicShaping: Represents a single row in the ArabicShaping.txt file.
BidiMirroring: Represents a single row in the BidiMirroring.txt file.
CaseFold: A single row in the CaseFolding.txt file.
Codepoint: A single Unicode codepoint.
CodepointIter: An iterator over a range of Unicode codepoints.
CodepointRange: A range of Unicode codepoints. The range is inclusive; both ends of the range are guaranteed to be valid codepoints.
CoreProperty: A single row in the DerivedCoreProperties.txt file.
DerivedNormalizationProperty: A single row in the DerivedNormalizationProps.txt file.
EastAsianWidth: A single row in the EastAsianWidth.txt file, describing the value of the East_Asian_Width property.
EmojiProperty: A single row in the emoji-data.txt file.
Error: Represents any kind of error that can occur while parsing the UCD.
GraphemeClusterBreak: A single row in the auxiliary/GraphemeBreakProperty.txt file.
GraphemeClusterBreakTest: A single row in the auxiliary/GraphemeBreakTest.txt file.
JamoShortName: A single row in the Jamo.txt file.
LineBreakTest: A single row in the auxiliary/LineBreakTest.txt file.
NameAlias: A single row in the NameAliases.txt file.
Property: A single row in the PropList.txt file.
PropertyAlias: A single row in the PropertyAliases.txt file.
PropertyValueAlias: A single row in the PropertyValueAliases.txt file.
Script: A single row in the Scripts.txt file.
ScriptExtension: A single row in the ScriptExtensions.txt file.
SentenceBreak: A single row in the auxiliary/SentenceBreakProperty.txt file.
SentenceBreakTest: A single row in the auxiliary/SentenceBreakTest.txt file.
SpecialCaseMapping: A single row in the SpecialCasing.txt file.
UcdLineParser: A line oriented parser for a particular UCD file.
UnicodeData: Represents a single row in the UnicodeData.txt file.
UnicodeDataDecomposition: Represents a decomposition mapping of a single row in the UnicodeData.txt file.
UnicodeDataExpander: An iterator adapter that expands rows in UnicodeData.txt.
WordBreak: A single row in the auxiliary/WordBreakProperty.txt file.
WordBreakTest: A single row in the auxiliary/WordBreakTest.txt file.
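Most of the structs above model "a single row" of a semicolon-delimited UCD data file, keyed by either one codepoint or an inclusive range. The following is a minimal, standard-library-only sketch of what parsing such a row involves. It does not use this crate: `PropRow`, `parse_codepoint`, and `parse_row` are hypothetical names invented for illustration, and the accepted line format is an assumption modeled on PropList.txt (`START..END ; Property # comment`).

```rust
use std::ops::RangeInclusive;

/// Hypothetical row type for this sketch; not a type from ucd_parse.
#[derive(Debug, PartialEq)]
struct PropRow {
    /// Inclusive range of codepoints; a single codepoint is `cp..=cp`.
    codepoints: RangeInclusive<u32>,
    /// The property name found after the semicolon.
    property: String,
}

/// Parse a hexadecimal codepoint, rejecting values above U+10FFFF.
fn parse_codepoint(s: &str) -> Option<u32> {
    let n = u32::from_str_radix(s.trim(), 16).ok()?;
    if n <= 0x10FFFF {
        Some(n)
    } else {
        None
    }
}

/// Parse one line; returns None for blank lines, comments, or malformed input.
fn parse_row(line: &str) -> Option<PropRow> {
    // Everything after '#' is a comment in UCD data files.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }
    let (cps, prop) = data.split_once(';')?;
    let cps = cps.trim();
    let (start, end) = match cps.split_once("..") {
        Some((a, b)) => (parse_codepoint(a)?, parse_codepoint(b)?),
        None => {
            let cp = parse_codepoint(cps)?;
            (cp, cp)
        }
    };
    if start > end {
        return None;
    }
    Some(PropRow {
        codepoints: start..=end,
        property: prop.trim().to_string(),
    })
}

fn main() {
    let line = "0009..000D    ; White_Space # Cc   [5] <control-0009>..<control-000D>";
    if let Some(row) = parse_row(line) {
        println!("{:?}", row);
    }
}
```

Representing a single codepoint as a one-element inclusive range keeps the row type uniform. The crate itself distinguishes the two cases with the Codepoints enum listed under Enums.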

Enums

CaseStatus: The status of a particular case mapping.
Codepoints: A representation of either a single codepoint or a range of codepoints.
ErrorKind: The kind of error that occurred while parsing the UCD.
NameAliasLabel: The label of a name alias.
UnicodeDataDecompositionTag: The formatting tag on a decomposition mapping.
UnicodeDataNumeric: A numeric value corresponding to characters with Numeric_Type=Numeric.

Traits

UcdFile: Describes a single UCD file.
UcdFileByCodepoint: Describes a single UCD file where every record in the file is associated with one or more codepoints.

Functions

parse: Parse a particular file in the UCD into a sequence of rows.
parse_by_codepoint: Parse a particular file in the UCD into a map from codepoint to the record.
parse_many_by_codepoint: Parse a particular file in the UCD into a map from codepoint to all records associated with that codepoint.
ucd_directory_version: Given a path pointing at the root of the ucd_dir, attempts to determine its Unicode version.
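
The difference between a map "from codepoint to the record" and a map "from codepoint to all records" matters for files in which one codepoint appears on several rows. The sketch below illustrates the second shape using only the standard library. It does not call this crate: `group_by_codepoint` is a hypothetical helper, and the `(u32, &str)` pairs stand in for parsed records (the sample values are modeled on NameAliases.txt, where U+0000 has more than one alias).

```rust
use std::collections::BTreeMap;

/// Hypothetical helper for this sketch; not a function from ucd_parse.
/// Groups every record under its codepoint, preserving input order.
fn group_by_codepoint<'a>(records: &[(u32, &'a str)]) -> BTreeMap<u32, Vec<&'a str>> {
    let mut map: BTreeMap<u32, Vec<&'a str>> = BTreeMap::new();
    for &(cp, record) in records {
        // Create the Vec on first sight of a codepoint, then append.
        map.entry(cp).or_default().push(record);
    }
    map
}

fn main() {
    // Stand-in records: (codepoint, alias).
    let records = [(0x0000, "NULL"), (0x0000, "NUL"), (0x0001, "START OF HEADING")];
    let grouped = group_by_codepoint(&records);
    for (cp, aliases) in &grouped {
        println!("U+{:04X}: {:?}", cp, aliases);
    }
}
```

A map holding a single value per key has room for only one of the two U+0000 entries, which is why a separate "many" variant is useful for such files.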
