AutoCorrect for Rust

Automatically add whitespace between CJK (Chinese, Japanese, Korean) and half-width characters (alphabetical letters, numerical digits and symbols).
Other implementations
Features
- Automatically add spaces between CJK (Chinese, Japanese, Korean) and English words.
- HTML content support.
- Fullwidth -> halfwidth (only for [a-zA-Z0-9], and : in times).
- Convert punctuation to fullwidth when it is adjacent to CJK text.
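The spacing rule in the first feature can be illustrated with a small standalone sketch. This is not the crate's implementation: `is_cjk` and `add_cjk_spacing` are hypothetical helpers, the Unicode ranges are a simplified subset, and punctuation and HTML handling are left out entirely.

```rust
/// Returns true for characters in a few common CJK blocks.
/// Assumption: an illustrative subset, not the crate's actual rule set.
fn is_cjk(c: char) -> bool {
    matches!(c,
        '\u{4E00}'..='\u{9FFF}'   // CJK Unified Ideographs
        | '\u{3040}'..='\u{30FF}' // Hiragana and Katakana
        | '\u{AC00}'..='\u{D7AF}' // Hangul Syllables
    )
}

/// Inserts one space wherever a CJK character directly touches
/// an ASCII letter or digit (hypothetical helper, for illustration only).
fn add_cjk_spacing(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 8);
    let mut prev: Option<char> = None;
    for c in input.chars() {
        if let Some(p) = prev {
            let boundary = (is_cjk(p) && c.is_ascii_alphanumeric())
                || (p.is_ascii_alphanumeric() && is_cjk(c));
            if boundary {
                out.push(' ');
            }
        }
        out.push(c);
        prev = Some(c);
    }
    out
}

fn main() {
    // prints "长桥 LongBridge App 下载"
    println!("{}", add_cjk_spacing("长桥LongBridge App下载"));
    // prints "于 3 月 10 日开始"
    println!("{}", add_cjk_spacing("于3月10日开始"));
}
```

Text that already contains a space at the boundary is left unchanged, because a space is neither CJK nor alphanumeric.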
Install
Add this to your Cargo.toml:
[dependencies]
autocorrect = "0.4.0"
Usage
Use autocorrect::format to format plain text.
extern crate autocorrect;
fn main() {
println!("{}", autocorrect::format("长桥LongBridge App下载"));
println!("{}", autocorrect::format("Ruby 2.7版本第1次发布"));
println!("{}", autocorrect::format("于3月10日开始"));
println!("{}", autocorrect::format("包装日期为2013年3月10日"));
println!("{}", autocorrect::format("全世界已有数百家公司在生产环境中使用Rust,以达到快速、跨平台、低资源占用的目的。"));
println!("{}", autocorrect::format("既に、世界中の数百という企業がRustを採用し、高速で低リソースのクロスプラットフォームソリューションを実現しています。"));
println!("{}", autocorrect::format("전 세계 수백 개의 회사가 프로덕션 환경에서 Rust를 사용하여 빠르고, 크로스 플랫폼 및 낮은 리소스 사용량을 달성했습니다."));
println!("{}", autocorrect::format("需要符号?自动转换全角字符、数字:我们将在16:32分出发去CBD中心."));
}
Use autocorrect::format_html to format HTML content.
extern crate autocorrect;
fn main() {
let html = r#"
<article>
<h1>这是Heading标题</h1>
<div class="content">
<p>你好Rust世界<strong>Bold文本</strong></p>
<p>这是第二行p标签</p>
</div>
</article>
"#;
println!("{}", autocorrect::format_html(html));
}
Benchmark
Format
Use cargo bench to run benchmark tests.
test tests::bench_format_100 ... bench: 19,410 ns/iter (+/- 1,571)
test tests::bench_format_400 ... bench: 45,957 ns/iter (+/- 3,444)
test tests::bench_format_50 ... bench: 14,538 ns/iter (+/- 1,555)
| Total chars | Duration |
| 50 | 0.014 ms |
| 100 | 0.019 ms |
| 400 | 0.045 ms |
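The durations in the table correspond to the ns/iter figures reported by cargo bench, converted to milliseconds and truncated to three decimals. A minimal sketch of that conversion, where `ns_to_ms_truncated` is a hypothetical helper and not part of the crate:

```rust
/// Formats a nanosecond count as milliseconds with three decimals,
/// truncating rather than rounding (hypothetical helper).
fn ns_to_ms_truncated(ns: u64) -> String {
    let micros = ns / 1_000; // drop the sub-microsecond digits
    format!("{}.{:03} ms", micros / 1_000, micros % 1_000)
}

fn main() {
    // (total chars, ns/iter) pairs taken from the benchmark output above
    for &(chars, ns) in &[(50u32, 14_538u64), (100, 19_410), (400, 45_957)] {
        println!("{:>4} chars: {}", chars, ns_to_ms_truncated(ns));
    }
}
```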
FormatHTML
TODO
License
This project is released under the MIT license.