Ferrous-opencc
A pure Rust implementation of the OpenCC project, dedicated to providing high-performance and reliable conversion between Traditional and Simplified Chinese.
Features
- High-Performance: Utilizes FST (Finite State Transducers) for efficient dictionary lookups, significantly outperforming HashMap-based implementations.
- Pure Rust: No C++ dependencies. Implemented entirely in Rust.
- Extensible: Supports loading custom OpenCC configuration files and dictionaries.
- Comprehensive Tooling: Includes a command-line tool to compile text dictionaries into an efficient .ocb binary format.
Quick Start
Add ferrous-opencc to your Cargo.toml:
[dependencies]
ferrous-opencc = "*"
Directory Structure
This library loads dictionaries and configuration files from the local filesystem. You can use the complete set of dictionary files I've prepared, or compile your own and place them in the assets/dictionaries/ folder.
your-project/
├── assets/
│   ├── dictionaries/
│   │   ├── STPhrases.txt
│   │   ├── STCharacters.txt
│   │   ├── TPhrases.txt
│   │   └── ... (other .txt dictionary files)
│   └── s2t.json
└── src/
    └── main.rs
You can obtain these dictionary and configuration files from the official OpenCC repository.
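If you want to inspect or validate a text dictionary before placing it in assets/dictionaries/, the sketch below may help. It assumes the usual OpenCC text layout, where each line holds a source key, a tab, and one or more space-separated candidates. The function name parse_dictionary is hypothetical and is not part of this library's API; the library's own loader may behave differently.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of ferrous-opencc): parses text in the
/// assumed OpenCC layout `key<TAB>candidate1 candidate2 ...`.
/// Blank lines and lines without a tab separator are skipped.
fn parse_dictionary(content: &str) -> HashMap<String, Vec<String>> {
    let mut entries = HashMap::new();
    for line in content.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // Split the line into the source key and its candidate list.
        let Some((key, values)) = line.split_once('\t') else {
            continue;
        };
        let candidates: Vec<String> =
            values.split_whitespace().map(str::to_owned).collect();
        if !candidates.is_empty() {
            entries.insert(key.to_owned(), candidates);
        }
    }
    entries
}

fn main() {
    // Two sample entries; the first key has two candidate conversions.
    let sample = "发\t發 髮\n开放\t開放\n";
    let dict = parse_dictionary(sample);
    println!("entries loaded: {}", dict.len());
    println!("candidates for 发: {:?}", dict.get("发"));
}
```

A key with several candidates, such as the first sample entry, is the reason phrase dictionaries exist alongside character dictionaries: the right candidate depends on the surrounding word.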
Example
A basic example of converting Simplified Chinese to Traditional Chinese.
use ;
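To illustrate the idea behind dictionary-based conversion without relying on this library's API, here is a minimal standalone sketch of greedy forward longest matching over a toy in-memory dictionary. The convert function, the HashMap dictionary, and the sample entries are all assumptions made for illustration; the library itself uses FST-backed dictionaries and the OpenCC configuration files described above.

```rust
use std::collections::HashMap;

/// Illustrative only: converts `text` by greedy forward longest matching
/// against a toy dictionary. This is not the ferrous-opencc API.
fn convert(text: &str, dict: &HashMap<&str, &str>) -> String {
    // The longest key (in characters) bounds the match window.
    let max_len = dict.keys().map(|k| k.chars().count()).max().unwrap_or(0);
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len());
    let mut i = 0;
    while i < chars.len() {
        let upper = max_len.min(chars.len() - i);
        let mut matched = false;
        // Try the longest candidate first, then shrink the window.
        for len in (1..=upper).rev() {
            let candidate: String = chars[i..i + len].iter().collect();
            if let Some(replacement) = dict.get(candidate.as_str()) {
                out.push_str(replacement);
                i += len;
                matched = true;
                break;
            }
        }
        // Characters with no dictionary entry pass through unchanged.
        if !matched {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

fn main() {
    // Toy entries: one phrase and two single characters.
    let dict: HashMap<&str, &str> =
        [("头发", "頭髮"), ("发", "發"), ("开", "開")].into_iter().collect();
    println!("{}", convert("开发头发", &dict));
}
```

Matching the phrase before falling back to single characters is what lets the same Simplified character map to different Traditional characters depending on context.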
Command-Line Tool
This library provides a dictionary compilation tool. You can install it by enabling the compiler-tools feature.
Then, you can compile text dictionaries into the binary .ocb format:
This will generate an STCharacters.ocb file in the same directory. The library will automatically use these .ocb files as a cache to speed up initial loading.
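The cache behaviour described above can be pictured roughly as follows. This is only a sketch of the general pattern, assuming the .ocb file sits next to the text dictionary with the same file stem; the helper names cache_path_for and preferred_dictionary are hypothetical, and the library's actual lookup logic may differ.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: path of the compiled cache next to a text dictionary.
fn cache_path_for(text_dict: &Path) -> PathBuf {
    text_dict.with_extension("ocb")
}

/// Hypothetical helper: prefer the compiled cache when it exists on disk,
/// otherwise fall back to the text dictionary.
fn preferred_dictionary(text_dict: &Path) -> PathBuf {
    let cache = cache_path_for(text_dict);
    if cache.is_file() {
        cache
    } else {
        text_dict.to_path_buf()
    }
}

fn main() {
    let text_dict = Path::new("assets/dictionaries/STCharacters.txt");
    println!("cache candidate: {}", cache_path_for(text_dict).display());
    println!("would load: {}", preferred_dictionary(text_dict).display());
}
```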
License
This project is licensed under the Apache-2.0 license.