truecase.rs
truecase.rs is a simple statistical truecaser written in Rust.
Truecasing is the restoration of the original letter case in text: for example, turning all-uppercase or all-lowercase text into text with proper casing (a capitalized first letter, capitalized names, etc.).
This crate attempts to solve this problem by gathering statistics from a set of training sentences, then using those statistics to restore correct casings in broken sentences. It comes with a command-line utility that makes training the statistical model easy.
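To make the idea of "gathering statistics, then restoring case" concrete, here is a minimal sketch of the general technique using only the standard library. It is not this crate's implementation: `ToyTruecaser` and `capitalize_first` are hypothetical names invented for illustration, and the approach (pick the most frequently seen casing of each word, ignoring context and punctuation) is a deliberately simplified assumption.

```rust
use std::collections::HashMap;

/// Hypothetical toy model (not the crate's `Model`): for every lowercased
/// word, counts how often each cased form appeared in training sentences.
#[derive(Default)]
pub struct ToyTruecaser {
    counts: HashMap<String, HashMap<String, usize>>,
}

impl ToyTruecaser {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record the casing of each word in a correctly cased sentence.
    /// The first word is skipped, because its capital letter usually
    /// reflects sentence position rather than the word itself.
    pub fn add_sentence(&mut self, sentence: &str) {
        for word in sentence.split_whitespace().skip(1) {
            *self
                .counts
                .entry(word.to_lowercase())
                .or_default()
                .entry(word.to_string())
                .or_insert(0) += 1;
        }
    }

    /// Replace each word with its most frequently seen cased form.
    /// Unknown words are lowercased; the first letter of the result
    /// is then capitalized.
    pub fn truecase(&self, text: &str) -> String {
        let words: Vec<String> = text
            .split_whitespace()
            .map(|w| {
                let key = w.to_lowercase();
                self.counts
                    .get(&key)
                    .and_then(|forms| {
                        forms
                            .iter()
                            // Highest count wins; ties go to the
                            // lexicographically smaller form so the
                            // result is deterministic.
                            .max_by(|a, b| a.1.cmp(b.1).then_with(|| b.0.cmp(a.0)))
                            .map(|(form, _)| form.clone())
                    })
                    .unwrap_or(key)
            })
            .collect();
        capitalize_first(&words.join(" "))
    }
}

/// Uppercase the first character of a string, leaving the rest untouched.
fn capitalize_first(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}
```

In this sketch, training on a sentence ending in "Shakespeare" is enough for `truecase` to restore the capital letter in "shakespeare" later on. A real truecaser would also need to handle punctuation attached to words and use the surrounding context, which this toy version ignores.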
Quick usage example
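use truecase::{Model, ModelTrainer};

// build a statistical model from sample sentences
let mut trainer = ModelTrainer::new();
trainer.add_sentence("There are very few writers as good as Shakespeare");
trainer.add_sentence("You and I will have to disagree about this");
trainer.add_sentence("She never came back from USSR");
let model = trainer.into_model();

// use gathered statistics to restore case in caseless text
let truecased_text = model.truecase("i don't think shakespeare was born in ussr");
assert_eq!(truecased_text, "I don't think Shakespeare was born in USSR");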
See documentation for more details.
License
truecase.rs is licensed under the terms of the MIT License or the Apache License 2.0, at your option.