Whatlang
Natural language detection for Rust. Documentation.
Features
- Supports 84 languages
- 100% written in Rust
- No external dependencies
- Fast
- Recognizes not only the language but also the script (Latin, Cyrillic, etc.)
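To make the language/script distinction concrete, here is a minimal sketch of script detection by counting characters per Unicode block. It is not whatlang's implementation: the `SketchScript` enum and `guess_script` function are hypothetical names made up for this illustration, and only two scripts are covered.

```rust
/// Scripts this sketch can tell apart (hypothetical, not whatlang's `Script`).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SketchScript {
    Latin,
    Cyrillic,
}

/// Returns the script with the most matching letters, or `None` when the
/// text contains no Latin or Cyrillic letters at all.
/// Ties are resolved in favor of Latin.
fn guess_script(text: &str) -> Option<SketchScript> {
    let mut latin = 0usize;
    let mut cyrillic = 0usize;
    for ch in text.chars() {
        match ch {
            // ASCII letters.
            'a'..='z' | 'A'..='Z' => latin += 1,
            // Latin-1 Supplement letters (skipping the × and ÷ signs)
            // plus Latin Extended-A and Latin Extended-B.
            '\u{00C0}'..='\u{00D6}' | '\u{00D8}'..='\u{00F6}' | '\u{00F8}'..='\u{024F}' => {
                latin += 1
            }
            // The main Cyrillic block.
            '\u{0400}'..='\u{04FF}' => cyrillic += 1,
            // Digits, punctuation, whitespace and other scripts are ignored.
            _ => {}
        }
    }
    if latin == 0 && cyrillic == 0 {
        None
    } else if latin >= cyrillic {
        Some(SketchScript::Latin)
    } else {
        Some(SketchScript::Cyrillic)
    }
}
```

Knowing the script first narrows the set of candidate languages, since, for example, a Cyrillic text cannot be English.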
Get started
The library is still in active development. Here is a short example of how to use it:
Add to your Cargo.toml:
[dependencies]
whatlang = "0.3.0"
Small example:
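use whatlang::{detect, Lang, Script};
// Detect Esperanto (there are also `detect_lang` and `detect_script` functions)
// Assumes the 0.3 API, where `Info` exposes `lang` and `script` as fields; the sample text is illustrative.
let info = detect("Ĉu vi ne volas eklerni Esperanton? Bonvolu!").unwrap();
assert_eq!(info.lang, Lang::Epo);
assert_eq!(info.script, Script::Latin);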
Blacklisting and whitelisting
You can create a configured detector that applies a blacklist or a whitelist:
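use whatlang::{Detector, Lang};
// The whitelist entries and the sample text below are illustrative.
const WHITELIST: &'static [Lang] = &[Lang::Eng, Lang::Rus];
// You can also create a detector using the `with_blacklist` function
let detector = Detector::with_whitelist(WHITELIST);
// There are also `detect` and `detect_script` functions
let lang = detector.detect_lang("There is no reason not to learn Esperanto.");
assert_eq!(lang, Some(Lang::Eng));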
For more details, please check the documentation.
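The effect of a whitelist or blacklist can be pictured as a filter over scored candidate languages. The sketch below is only an illustration of that idea and makes no claim about how whatlang works internally: `SketchLang`, `Filter`, `pick_best` and the integer scores are all hypothetical.

```rust
/// Hypothetical language tags for this sketch (not whatlang's `Lang`).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SketchLang {
    Eng,
    Rus,
    Epo,
}

/// How the candidate set is restricted.
enum Filter<'a> {
    /// Every language is allowed.
    AllowAll,
    /// Only the listed languages are allowed.
    Whitelist(&'a [SketchLang]),
    /// Every language except the listed ones is allowed.
    Blacklist(&'a [SketchLang]),
}

impl<'a> Filter<'a> {
    fn allows(&self, lang: SketchLang) -> bool {
        match self {
            Filter::AllowAll => true,
            Filter::Whitelist(list) => list.contains(&lang),
            Filter::Blacklist(list) => !list.contains(&lang),
        }
    }
}

/// Picks the highest-scoring candidate that the filter allows.
/// Returns `None` when the filter rejects every candidate.
fn pick_best(candidates: &[(SketchLang, u32)], filter: &Filter) -> Option<SketchLang> {
    candidates
        .iter()
        .filter(|(lang, _)| filter.allows(*lang))
        .max_by_key(|(_, score)| *score)
        .map(|(lang, _)| *lang)
}
```

With a whitelist, a text that would otherwise be classified as some other language is assigned to the best match among the permitted languages, which is why the whitelisted detector above can return English for a sentence about Esperanto.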
Running benchmarks
cargo bench
Roadmap
- Support about 100 languages (actually at the moment it's 84)
- Allow to specify blacklist for Query
- Allow to specify whitelist for Query
- Support new API
- Write doc for public structures and functions
- Improve README example
- Implement benchmarks
- Tune performance
- Create examples
- Provide some metrics about reliability (confidence) in Info struct
License
MIT
Acknowledgments
- Thanks to Franc JS for the trigrams dataset.