[−][src]Crate chinese_dictionary
About
A searchable Chinese / English dictionary with helpful utilities.
Features
- Search with Traditional Chinese characters, Simplified Chinese characters, pinyin with tone marks, pinyin with tone numbers, pinyin with no tones, and English.
- Classify a string of text as either English, pinyin, or Chinese characters.
- Convert between Traditional and Simplified Chinese characters.
- Segment strings of Chinese characters into tokens using a dictionary-driven segmentation approach.
Usage
Querying the dictionary
extern crate chinese_dictionary; use chinese_dictionary::ChineseDictionary; let dictionary = ChineseDictionary::new(); // Instantiation may take a while // Querying the dictionary returns an `Option<Vec<&WordEntry>>` // Read more about the WordEntry struct below let query = "to run"; let results = dictionary.query(query).unwrap(); let first_result = results.first().unwrap(); println!("{}", first_result.simplified) // --> "执行"
Classifying a string of text
extern crate chinese_dictionary; use chinese_dictionary::ChineseDictionary; use chinese_dictionary::ClassificationResult; let dictionary = ChineseDictionary::new(); // Instantiation may take a while // Read more about the ClassificationResult enum below println!("{:?}", dictionary.classify("nihao")); // --> ClassificationResult::PY
Convert between Traditional and Simplified Chinese characters
extern crate chinese_dictionary; use chinese_dictionary::ChineseDictionary; let dictionary = ChineseDictionary::new(); // Instantiation may take a while println!("{}", dictionary.convert_to_simplified("簡體字")); // --> "简体字" println!("{}", dictionary.convert_to_traditional("繁体字")); // --> "繁體字"
Segment a string of characters
extern crate chinese_dictionary; use chinese_dictionary::ChineseDictionary; let dictionary = ChineseDictionary::new(); // Instantiation may take a while println!("{:?}", dictionary.segment("今天天气不错")); // --> ["今天", "天气", "不错"]
WordEntry
struct
extern crate chinese_dictionary; use chinese_dictionary::WordEntry; use chinese_dictionary::MeasureWord; let example_measure_word = MeasureWord { traditional: "example_traditional".to_string(), simplified: "example_simplified".to_string(), pinyin_marks: "example_pinyin_marks".to_string(), pinyin_numbers: "example_pinyin_numbers".to_string(), }; let example = WordEntry { traditional: "繁體字".to_string(), simplified: "繁体字".to_string(), pinyin_marks: "fán tǐ zì".to_string(), pinyin_numbers: "fan2 ti3 zi4".to_string(), english: vec!["traditional Chinese character".to_string()], tone_marks: vec![2 as u8, 3 as u8, 4 as u8], hash: 000000 as u64, measure_words: vec![example_measure_word], hsk: 6 as u8, word_id: 11111111 as u32, };
ClassificationResult
enum
The possible values for the ClassificationResult
enum are:
PY
: Represents PinyinEN
: Represents EnglishZH
: Represents ChineseUN
: Represents an uncertain classification result
Structs
ChineseDictionary | |
MeasureWord | |
WordEntry |
Enums
ClassificationResult |