pinyin-parser-rs
Parses a string of pinyin syllables. Covers marginal cases such as ẑ, ŋ and ê.
Since pinyin strings in the wild do not necessarily conform to the standard, this parser offers two modes: strict and loose.
Strict mode:
- forbids the use of breve instead of hacek to represent the third tone
- forbids the use of IPA ɡ (U+0261) instead of g, and other such lookalike characters
- allows apostrophes only before an a, an e or an o
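The strict-mode rules above can be sketched as a standalone check. This is only an illustration using the standard library: the names StrictViolation, check_strict and is_a_e_or_o are hypothetical and are not part of this crate's API, and the sketch assumes lowercase, NFC-normalized input.

```rust
/// Reasons a string would be rejected under the three strict-mode rules.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum StrictViolation {
    /// A breve (e.g. U+012D) was used where a hacek (e.g. U+01D0) is expected.
    BreveForThirdTone(char),
    /// A lookalike such as IPA U+0261 was used instead of ASCII `g`.
    LookalikeLetter(char),
    /// An apostrophe is not directly followed by a, e or o.
    MisplacedApostrophe,
}

/// True for a, e, o and their tone-marked forms (plus e with circumflex).
fn is_a_e_or_o(c: char) -> bool {
    matches!(
        c,
        'a' | '\u{0101}' | '\u{00E1}' | '\u{01CE}' | '\u{00E0}'
            | 'e' | '\u{0113}' | '\u{00E9}' | '\u{011B}' | '\u{00E8}' | '\u{00EA}'
            | 'o' | '\u{014D}' | '\u{00F3}' | '\u{01D2}' | '\u{00F2}'
    )
}

/// Returns the first strict-mode violation found, or `None` if there is none.
fn check_strict(input: &str) -> Option<StrictViolation> {
    let chars: Vec<char> = input.chars().collect();
    for (i, &c) in chars.iter().enumerate() {
        match c {
            // Precomposed breve vowels and the combining breve itself.
            '\u{0103}' | '\u{0115}' | '\u{012D}' | '\u{014F}' | '\u{016D}' | '\u{0306}' => {
                return Some(StrictViolation::BreveForThirdTone(c));
            }
            // Only the one lookalike named above is covered here.
            '\u{0261}' => return Some(StrictViolation::LookalikeLetter(c)),
            '\'' => {
                let next_ok = chars.get(i + 1).map_or(false, |&n| is_a_e_or_o(n));
                if !next_ok {
                    return Some(StrictViolation::MisplacedApostrophe);
                }
            }
            _ => {}
        }
    }
    None
}
```

A loose mode would presumably accept or repair these inputs instead of rejecting them; how the crate does so is not shown here.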
Examples
use PinyinParser;
assert_eq!;
The resulting strings are NFC-normalized (i.e. the sample above gives a single-character ī, U+012B).
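To make the NFC point concrete, here is a minimal sketch of what composing a decomposed first-tone vowel looks like. It is not full NFC and not this crate's implementation: the function name compose_macron is hypothetical, and it handles only a base vowel followed by the combining macron (U+0304).

```rust
/// Composes `vowel + U+0304 (combining macron)` into the precomposed letter.
/// Covers only a, e, i, o, u; everything else is copied through unchanged.
fn compose_macron(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if chars.peek() == Some(&'\u{0304}') {
            let composed = match c {
                'a' => Some('\u{0101}'),
                'e' => Some('\u{0113}'),
                'i' => Some('\u{012B}'),
                'o' => Some('\u{014D}'),
                'u' => Some('\u{016B}'),
                _ => None,
            };
            if let Some(precomposed) = composed {
                chars.next(); // consume the combining mark
                out.push(precomposed);
                continue;
            }
        }
        out.push(c);
    }
    out
}
```

With this sketch, the two-code-point sequence i + U+0304 becomes the single code point U+012B, which is the form the parser's output is described as using.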
Erhua is supported.
use PinyinParser;
assert_eq!;
If you want r to be separated from the main syllable, use .split_erhua().
Note that syllables "er", "ēr", "ér", "ěr", and "èr" are exempt from this splitting.
use PinyinParser;
assert_eq!;
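The splitting rule and its exemption can be sketched independently of the parser. The free function split_erhua_sketch below is hypothetical and only mirrors the behaviour described in the text; it assumes a single lowercase, NFC-normalized syllable as input and says nothing about the real signature of .split_erhua().

```rust
/// The syllables that end in `r` but are not erhua: er in its five tone forms.
const EXEMPT: [&str; 5] = ["er", "\u{0113}r", "\u{00E9}r", "\u{011B}r", "\u{00E8}r"];

/// Splits a trailing erhua `r` off a syllable, leaving exempt syllables whole.
fn split_erhua_sketch(syllable: &str) -> Vec<String> {
    if EXEMPT.iter().any(|&e| e == syllable) {
        return vec![syllable.to_string()];
    }
    match syllable.strip_suffix('r') {
        // Split only when something remains in front of the `r`.
        Some(main) if !main.is_empty() => vec![main.to_string(), "r".to_string()],
        _ => vec![syllable.to_string()],
    }
}
```

Under these assumptions a syllable such as huār yields two pieces, while èr stays in one piece.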
This parser supports the use of ẑ, ĉ, ŝ and ŋ, though I have never seen anyone use them.
use PinyinParser;
assert_eq!
use PinyinParser;
assert_eq!;
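In the Hanyu Pinyin scheme these four letters are shorthand for zh, ch, sh and ng. One plausible way to handle them, sketched below, is to expand each to its digraph before further processing. Whether this parser expands them or preserves them in its output is not stated here, and the function name expand_shorthand is hypothetical.

```rust
/// Expands the single-letter shorthands to their usual digraphs.
fn expand_shorthand(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '\u{1E91}' => out.push_str("zh"), // z with circumflex
            '\u{0109}' => out.push_str("ch"), // c with circumflex
            '\u{015D}' => out.push_str("sh"), // s with circumflex
            '\u{014B}' => out.push_str("ng"), // eng
            _ => out.push(c),
        }
    }
    out
}
```

For example, the shorthand spelling of zhōng (U+1E91, U+014D, U+014B) expands to the five-letter form.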