chinese_dictionary
About
A searchable Chinese / English dictionary with helpful utilities.
Features
- Search with Traditional Chinese characters, Simplified Chinese characters, pinyin with tone marks, pinyin with tone numbers, pinyin with no tones, and English.
- Classify a string of text as either English, Pinyin, or Chinese characters.
- Convert between Traditional and Simplified Chinese characters.
- Segment strings of Chinese characters into tokens using a dictionary-driven segmentation approach.
Usage
Querying the dictionary
extern crate chinese_dictionary;
use chinese_dictionary::ChineseDictionary;
let dictionary = ChineseDictionary::new(); // Instantiation may take a while
// Querying the dictionary returns an `Option<Vec<&WordEntry>>`
// Read more about the WordEntry struct below
let query = "to run";
let results = dictionary.query(query).unwrap();
let first_result = results.first().unwrap();
println!("{}", first_result.simplified); // --> "执行"
Classifying a string of text
extern crate chinese_dictionary;
use chinese_dictionary::ChineseDictionary;
use chinese_dictionary::ClassificationResult;
let dictionary = ChineseDictionary::new(); // Instantiation may take a while
// Read more about the ClassificationResult enum below
println!("{:?}", dictionary.classify("nihao")); // --> ClassificationResult::PY
Converting between Traditional and Simplified Chinese characters
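extern crate chinese_dictionary;
use chinese_dictionary::ChineseDictionary;
let dictionary = ChineseDictionary::new(); // Instantiation may take a while
println!("{}", dictionary.traditional_to_simplified("簡體字")); // --> "简体字"
println!("{}", dictionary.simplified_to_traditional("繁体字")); // --> "繁體字"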
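
To illustrate what a character-level conversion involves, here is a minimal, self-contained sketch that uses only the standard library. It is not this crate's implementation: the `convert`, `traditional_to_simplified_table`, and `invert` functions and the two-entry table are hypothetical names and data chosen for the example.

```rust
use std::collections::HashMap;

/// Replaces every character that has an entry in `table` and leaves
/// all other characters unchanged.
fn convert(text: &str, table: &HashMap<char, char>) -> String {
    text.chars()
        .map(|c| table.get(&c).copied().unwrap_or(c))
        .collect()
}

/// A toy Traditional -> Simplified table covering only two characters.
/// (Assumption: a real table would hold thousands of entries.)
fn traditional_to_simplified_table() -> HashMap<char, char> {
    [('簡', '简'), ('體', '体')].iter().copied().collect()
}

/// Builds the reverse table by swapping keys and values.
fn invert(table: &HashMap<char, char>) -> HashMap<char, char> {
    table.iter().map(|(&from, &to)| (to, from)).collect()
}
```

With this toy table, `convert("簡體字", &table)` yields "简体字", and the inverted table maps "繁体字" back to "繁體字". Inverting a table like this is only safe for a toy example: several Traditional characters can share one Simplified form (發 and 髮 both simplify to 发), so a real Simplified-to-Traditional conversion needs more context than a one-to-one character map.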
Segmenting a string of characters
extern crate chinese_dictionary;
use chinese_dictionary::ChineseDictionary;
let dictionary = ChineseDictionary::new(); // Instantiation may take a while
println!("{:?}", dictionary.segment("今天天气不错")); // --> ["今天", "天气", "不错"]
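
For readers curious what a dictionary-driven segmentation approach can look like, the sketch below shows greedy forward maximum matching over a toy word set. This is one common technique and may differ from what this crate does internally; the free function `segment` and the three-word dictionary are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Splits `text` into tokens by greedy forward maximum matching:
/// at each position, take the longest dictionary word that starts there,
/// falling back to a single character when nothing matches.
fn segment(text: &str, dictionary: &HashSet<&str>) -> Vec<String> {
    // Work on chars, not bytes: a Chinese character spans several bytes in UTF-8.
    let chars: Vec<char> = text.chars().collect();

    // Length (in characters) of the longest word in the dictionary.
    let max_len = dictionary
        .iter()
        .map(|word| word.chars().count())
        .max()
        .unwrap_or(1);

    let mut tokens = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let longest = max_len.min(chars.len() - start);
        let mut matched = 1; // fallback: emit a single character

        // Try candidates from longest to shortest and stop at the first hit.
        for len in (2..=longest).rev() {
            let candidate: String = chars[start..start + len].iter().collect();
            if dictionary.contains(candidate.as_str()) {
                matched = len;
                break;
            }
        }

        tokens.push(chars[start..start + matched].iter().collect());
        start += matched;
    }
    tokens
}
```

Given a dictionary containing 今天, 天气, and 不错, calling `segment("今天天气不错", &dictionary)` produces the same three tokens as the example above. Greedy matching is simple and fast, but it can pick the wrong split when an earlier long match hides a better later one, which is why production segmenters often add frequency data or backtracking.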
WordEntry struct
extern crate chinese_dictionary;
use chinese_dictionary::WordEntry;
use chinese_dictionary::MeasureWord;
let example_measure_word = MeasureWord ;
let example = WordEntry ;
ClassificationResult enum
The possible values for the ClassificationResult enum are:
- PY: Represents Pinyin
- EN: Represents English
- ZH: Represents Chinese
- UN: Represents an uncertain classification result
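
Because the result is an enum, callers can handle every case with an exhaustive `match`, including the uncertain one. The sketch below uses a local stand-in enum with the same four variant names so that it compiles on its own; the derives and the `describe` helper are assumptions for illustration and are not part of this crate.

```rust
/// Local stand-in that mirrors the four variants listed above.
/// (Assumption: the real enum lives in the chinese_dictionary crate.)
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ClassificationResult {
    PY,
    EN,
    ZH,
    UN,
}

/// Maps a classification result to a human-readable label.
/// The match is exhaustive, so the compiler flags any unhandled variant.
fn describe(result: ClassificationResult) -> &'static str {
    match result {
        ClassificationResult::PY => "Pinyin",
        ClassificationResult::EN => "English",
        ClassificationResult::ZH => "Chinese",
        ClassificationResult::UN => "uncertain",
    }
}
```

Handling `UN` explicitly, rather than folding it into a catch-all arm, makes it easier to decide what an application should do when the input is ambiguous, for example searching as both Pinyin and English.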
License
This software is licensed under the MIT License.
This project uses data from CC-CEDICT, licensed under the Creative Commons Attribution-ShareAlike 4.0 License. This data has been formatted to work with this project. The .dictionary files within the data/ directory are licensed under the Creative Commons Attribution-ShareAlike 4.0 License.