Crate ucd_parse

A library for parsing the Unicode character database.

Structs

Age

A single row in the DerivedAge.txt file.

ArabicShaping

Represents a single row in the ArabicShaping.txt file.

BidiMirroring

Represents a single row in the BidiMirroring.txt file.

CaseFold

A single row in the CaseFolding.txt file.

Codepoint

A single Unicode codepoint.

CodepointIter

An iterator over a range of Unicode codepoints.

CodepointRange

A range of Unicode codepoints. The range is inclusive; both ends of the range are guaranteed to be valid codepoints.

CoreProperty

A single row in the DerivedCoreProperties.txt file.

EmojiProperty

A single row in the emoji-data.txt file.

Error

Represents any kind of error that can occur while parsing the UCD.

GraphemeClusterBreak

A single row in the auxiliary/GraphemeBreakProperty.txt file.

GraphemeClusterBreakTest

A single row in the auxiliary/GraphemeBreakTest.txt file.

JamoShortName

A single row in the Jamo.txt file.

LineBreakTest

A single row in the auxiliary/LineBreakTest.txt file.

NameAlias

A single row in the NameAliases.txt file.

Property

A single row in the PropList.txt file.

PropertyAlias

A single row in the PropertyAliases.txt file.

PropertyValueAlias

A single row in the PropertyValueAliases.txt file.

Script

A single row in the Scripts.txt file.

ScriptExtension

A single row in the ScriptExtensions.txt file.

SentenceBreak

A single row in the auxiliary/SentenceBreakProperty.txt file.

SentenceBreakTest

A single row in the auxiliary/SentenceBreakTest.txt file.

SpecialCaseMapping

A single row in the SpecialCasing.txt file.

UcdLineParser

A line-oriented parser for a particular UCD file.

UnicodeData

Represents a single row in the UnicodeData.txt file.

UnicodeDataDecomposition

Represents a decomposition mapping of a single row in the UnicodeData.txt file.

UnicodeDataExpander

An iterator adapter that expands rows in UnicodeData.txt.

WordBreak

A single row in the auxiliary/WordBreakProperty.txt file.

WordBreakTest

A single row in the auxiliary/WordBreakTest.txt file.
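
Many of the property files named above (PropList.txt, Scripts.txt, and similar) share a simple text layout: `#` starts a comment, blank lines carry no data, and fields are separated by `;`. The following is a minimal, std-only sketch of that line-oriented idea. `split_ucd_line` is a hypothetical helper written for illustration; it is not part of the ucd_parse API and does not show how UcdLineParser is implemented. It also assumes that `#` never appears inside a field, and it does not apply to the `*Test.txt` files, which use a different layout.

```rust
/// Hypothetical helper (not part of ucd_parse): splits one line of a
/// semicolon-delimited UCD data file into trimmed fields.
///
/// Returns `None` for blank lines and for lines that hold only a comment.
fn split_ucd_line(line: &str) -> Option<Vec<&str>> {
    // Assumption: everything from the first '#' onward is a comment.
    let data = match line.find('#') {
        Some(i) => &line[..i],
        None => line,
    };
    let data = data.trim();
    if data.is_empty() {
        return None;
    }
    // Fields are separated by ';' and may be padded with spaces.
    Some(data.split(';').map(|field| field.trim()).collect())
}
```

A row type would then interpret the returned fields, for example by reading the first field as a codepoint or codepoint range.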

Enums

CaseStatus

The status of a particular case mapping.

Codepoints

A representation of either a single codepoint or a range of codepoints.

ErrorKind

The kind of error that occurred while parsing the UCD.

NameAliasLabel

The label of a name alias.

UnicodeDataDecompositionTag

The formatting tag on a decomposition mapping.

UnicodeDataNumeric

A numeric value corresponding to characters with Numeric_Type=Numeric.
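
UCD data files write the first field of a row either as a single hexadecimal codepoint (`0041`) or as an inclusive range (`0009..000D`), which is the distinction the Codepoints enum describes. The sketch below illustrates that idea with std only. `CodepointSpec`, `parse_hex_codepoint`, `parse_codepoint_spec`, and `spec_len` are hypothetical names introduced here, not the crate's own types or functions; the validity check simply assumes that a codepoint is any value from 0 through 0x10FFFF.

```rust
/// Largest valid Unicode codepoint.
const MAX_CODEPOINT: u32 = 0x10FFFF;

/// Hypothetical stand-in for the idea behind `Codepoints`: either one
/// codepoint or an inclusive range. This is not the crate's own type.
#[derive(Debug, PartialEq, Eq)]
enum CodepointSpec {
    Single(u32),
    Range { start: u32, end: u32 },
}

/// Parses a hexadecimal codepoint, rejecting values above U+10FFFF.
fn parse_hex_codepoint(s: &str) -> Option<u32> {
    let n = u32::from_str_radix(s.trim(), 16).ok()?;
    if n <= MAX_CODEPOINT {
        Some(n)
    } else {
        None
    }
}

/// Parses either `XXXX` or `XXXX..YYYY` (both ends inclusive).
fn parse_codepoint_spec(field: &str) -> Option<CodepointSpec> {
    match field.split_once("..") {
        Some((lo, hi)) => {
            let start = parse_hex_codepoint(lo)?;
            let end = parse_hex_codepoint(hi)?;
            // An inclusive range must not run backwards.
            if start <= end {
                Some(CodepointSpec::Range { start, end })
            } else {
                None
            }
        }
        None => parse_hex_codepoint(field).map(CodepointSpec::Single),
    }
}

/// Number of codepoints covered; the `+ 1` accounts for the inclusive end.
fn spec_len(spec: &CodepointSpec) -> u32 {
    match *spec {
        CodepointSpec::Single(_) => 1,
        CodepointSpec::Range { start, end } => end - start + 1,
    }
}
```

Because both ends are validated before the range is built, any `Range` value produced this way satisfies the same guarantee that CodepointRange documents: an inclusive range whose ends are valid codepoints.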

Traits

UcdFile

Describes a single UCD file.

UcdFileByCodepoint

Describes a single UCD file where every record in the file is associated with one or more codepoints.

Functions

parse

Parse a particular file in the UCD into a sequence of rows.

parse_by_codepoint

Parse a particular file in the UCD into a map from codepoint to the record.

parse_many_by_codepoint

Parse a particular file in the UCD into a map from codepoint to all records associated with that codepoint.

ucd_directory_version

Given a path pointing at the root of the ucd_dir, attempts to determine its Unicode version.
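
The difference between a map from codepoint to one record and a map from codepoint to all records matters when a file lists the same codepoint on more than one row, as NameAliases.txt does. The std-only sketch below illustrates the two shapes. `Row`, `index_by_codepoint`, and `index_many_by_codepoint` are hypothetical names, not the crate's functions, and the "later row replaces earlier row" behavior in the single-record map is an assumption that follows from `BTreeMap::insert`; how parse_by_codepoint treats duplicates is not stated in this listing.

```rust
use std::collections::BTreeMap;

/// Hypothetical record: an inclusive codepoint range plus some value.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Row {
    start: u32,
    end: u32,
    value: String,
}

/// One record per codepoint. In this sketch, a later row for the same
/// codepoint replaces an earlier one (an assumption, see the text above).
fn index_by_codepoint(rows: &[Row]) -> BTreeMap<u32, Row> {
    let mut map = BTreeMap::new();
    for row in rows {
        // A row covering a range is stored under every codepoint in it.
        for cp in row.start..=row.end {
            map.insert(cp, row.clone());
        }
    }
    map
}

/// All records per codepoint, kept in the order the rows were given.
fn index_many_by_codepoint(rows: &[Row]) -> BTreeMap<u32, Vec<Row>> {
    let mut map: BTreeMap<u32, Vec<Row>> = BTreeMap::new();
    for row in rows {
        for cp in row.start..=row.end {
            map.entry(cp).or_default().push(row.clone());
        }
    }
    map
}
```

With two rows for U+0000 and one row covering U+0041 through U+0043, the first map holds a single record under U+0000 while the second holds both; each codepoint in the range has its own entry in either map.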