Enum unic_ucd::SentenceBreak
pub enum SentenceBreak {
    CR,
    LF,
    Extend,
    Sep,
    Format,
    Sp,
    Lower,
    Upper,
    OLetter,
    Numeric,
    ATerm,
    SContinue,
    STerm,
    Close,
    Other,
}
Represents the Unicode character Sentence_Break property.
Variants
CR
U+000D CARRIAGE RETURN (CR)
LF
U+000A LINE FEED (LF)
Extend
Grapheme_Extend = Yes, or
U+200D ZERO WIDTH JOINER (ZWJ), or
General_Category = Spacing_Mark
Sep
U+0085 NEXT LINE (NEL)
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR
Format
General_Category = Format
and not U+200C ZERO WIDTH NON-JOINER (ZWNJ)
and not U+200D ZERO WIDTH JOINER (ZWJ)
Sp
White_Space = Yes
and Sentence_Break ≠ Sep
and Sentence_Break ≠ CR
and Sentence_Break ≠ LF
Lower
Lowercase = Yes
and Grapheme_Extend = No
Upper
General_Category = Titlecase_Letter, or
Uppercase = Yes
OLetter
Alphabetic = Yes, or
U+00A0 NO-BREAK SPACE (NBSP), or
U+05F3 ( ׳ ) HEBREW PUNCTUATION GERESH
and Lower = No
and Upper = No
and Sentence_Break ≠ Extend
Numeric
Line_Break = Numeric
ATerm
U+002E ( . ) FULL STOP
U+2024 ( ․ ) ONE DOT LEADER
U+FE52 ( ﹒ ) SMALL FULL STOP
U+FF0E ( ． ) FULLWIDTH FULL STOP
SContinue
U+002C ( , ) COMMA
U+002D ( - ) HYPHEN-MINUS
U+003A ( : ) COLON
U+055D ( ՝ ) ARMENIAN COMMA
U+060C ( ، ) ARABIC COMMA
U+060D ( ؍ ) ARABIC DATE SEPARATOR
U+07F8 ( ߸ ) NKO COMMA
U+1802 ( ᠂ ) MONGOLIAN COMMA
U+1808 ( ᠈ ) MONGOLIAN MANCHU COMMA
U+2013 ( – ) EN DASH
U+2014 ( — ) EM DASH
U+3001 ( 、 ) IDEOGRAPHIC COMMA
U+FE10 ( ︐ ) PRESENTATION FORM FOR VERTICAL COMMA
U+FE11 ( ︑ ) PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC COMMA
U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
U+FE31 ( ︱ ) PRESENTATION FORM FOR VERTICAL EM DASH
U+FE32 ( ︲ ) PRESENTATION FORM FOR VERTICAL EN DASH
U+FE50 ( ﹐ ) SMALL COMMA
U+FE51 ( ﹑ ) SMALL IDEOGRAPHIC COMMA
U+FE55 ( ﹕ ) SMALL COLON
U+FE58 ( ﹘ ) SMALL EM DASH
U+FE63 ( ﹣ ) SMALL HYPHEN-MINUS
U+FF0C ( ， ) FULLWIDTH COMMA
U+FF0D ( － ) FULLWIDTH HYPHEN-MINUS
U+FF1A ( ： ) FULLWIDTH COLON
U+FF64 ( ､ ) HALFWIDTH IDEOGRAPHIC COMMA
STerm
Sentence_Terminal = Yes
Close
General_Category = Open_Punctuation, or
General_Category = Close_Punctuation, or
Line_Break = Quotation
and not U+05F3 ( ׳ ) HEBREW PUNCTUATION GERESH
and ATerm = No
and STerm = No
Other
All other characters
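Several of the variants above (CR, LF, Sep, ATerm, SContinue) are defined by explicit code point lists, so they can be checked without any Unicode data tables. The following is a minimal sketch of such a check, not the crate's implementation: the `ListedBreak` enum and the `listed_break` function are hypothetical names introduced here for illustration. Variants defined through other properties (General_Category, Line_Break, Alphabetic, and so on) need UCD tables and are deliberately left out.

```rust
/// Hypothetical stand-in covering only the variants whose members
/// are enumerated code point by code point in the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ListedBreak {
    CR,
    LF,
    Sep,
    ATerm,
    SContinue,
}

/// Returns the variant for an explicitly listed code point, or `None`
/// when the character's class depends on other Unicode properties.
fn listed_break(ch: char) -> Option<ListedBreak> {
    match ch {
        '\u{000D}' => Some(ListedBreak::CR),
        '\u{000A}' => Some(ListedBreak::LF),
        '\u{0085}' | '\u{2028}' | '\u{2029}' => Some(ListedBreak::Sep),
        '\u{002E}' | '\u{2024}' | '\u{FE52}' | '\u{FF0E}' => Some(ListedBreak::ATerm),
        '\u{002C}' | '\u{002D}' | '\u{003A}' | '\u{055D}' | '\u{060C}' | '\u{060D}'
        | '\u{07F8}' | '\u{1802}' | '\u{1808}' | '\u{2013}' | '\u{2014}' | '\u{3001}'
        | '\u{FE10}' | '\u{FE11}' | '\u{FE13}' | '\u{FE31}' | '\u{FE32}' | '\u{FE50}'
        | '\u{FE51}' | '\u{FE55}' | '\u{FE58}' | '\u{FE63}' | '\u{FF0C}' | '\u{FF0D}'
        | '\u{FF1A}' | '\u{FF64}' => Some(ListedBreak::SContinue),
        // Everything else requires property lookups that this sketch omits.
        _ => None,
    }
}
```

A `None` result here does not mean Other: it only means the character is not in one of the explicit lists.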
Methods
impl SentenceBreak
fn of(ch: char) -> SentenceBreak
Returns the Sentence_Break property value of the given character.
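A per-character lookup like this is usually applied across a whole string. The sketch below groups consecutive characters that share a class. The helper `class_runs` is a hypothetical name, not part of `unic_ucd`, and it is generic over the classifier so that it compiles on its own. Passing `SentenceBreak::of` as `classify` would additionally assume that `SentenceBreak` implements `PartialEq`, which this page does not confirm.

```rust
/// Groups consecutive characters that map to the same class.
/// `classify` stands in for a per-character lookup such as `SentenceBreak::of`.
fn class_runs<C, F>(text: &str, classify: F) -> Vec<(C, String)>
where
    C: PartialEq,
    F: Fn(char) -> C,
{
    let mut runs: Vec<(C, String)> = Vec::new();
    for ch in text.chars() {
        let class = classify(ch);
        // Does this character belong to the same class as the previous run?
        let extends_last = runs.last().map_or(false, |(last, _)| *last == class);
        if extends_last {
            if let Some((_, run)) = runs.last_mut() {
                run.push(ch);
            }
        } else {
            // Class changed (or this is the first character): start a new run.
            runs.push((class, String::from(ch)));
        }
    }
    runs
}
```

Any function from `char` to a comparable value works as the classifier, for example `class_runs("Hi. ok", |c| c.is_alphabetic())`.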