Crate whatlang


Whatlang is a Rust library for detecting (recognizing) natural languages. It also recognizes scripts (writing systems). Each language and script is represented by a variant of a predefined enum.


Using the detect function:

use whatlang::{detect, Lang, Script};

let text = "Ĉu vi ne volas eklerni Esperanton? Bonvolu! Estas unu de la plej bonaj aferoj!";
let info = detect(text).unwrap();
assert_eq!(info.lang(), Lang::Epo);
assert_eq!(info.script(), Script::Latin);

// Confidence is in the range from 0 to 1.
assert_eq!(info.confidence(), 1.0);

Using Detector with a specified denylist or allowlist:

use whatlang::{Detector, Lang};

let allowlist = vec![Lang::Eng, Lang::Rus];

// You can also create a detector using the with_denylist function
let detector = Detector::with_allowlist(allowlist);
let lang = detector.detect_lang("There is no reason not to learn Esperanto.");
assert_eq!(lang, Some(Lang::Eng));
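
To illustrate what an allowlist or denylist changes, here is a minimal std-only sketch. The `Code` enum, the `Filter` type, the `pick` function, and the scores are all hypothetical and are not part of whatlang; the library's internals may well differ. The idea shown is only that a list restricts which candidates may win.

```rust
/// Hypothetical language codes for illustration; not whatlang's `Lang` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Code {
    Eng,
    Rus,
    Epo,
}

/// Hypothetical candidate filter mirroring the allowlist/denylist idea.
enum Filter {
    /// No restriction: every candidate may be returned.
    Any,
    /// Only the listed codes may be returned.
    Allow(Vec<Code>),
    /// Every code except the listed ones may be returned.
    Deny(Vec<Code>),
}

impl Filter {
    /// Returns true if `code` may be reported under this filter.
    fn permits(&self, code: Code) -> bool {
        match self {
            Filter::Any => true,
            Filter::Allow(list) => list.contains(&code),
            Filter::Deny(list) => !list.contains(&code),
        }
    }
}

/// Picks the highest-scoring candidate that the filter permits.
/// Returns `None` when no candidate passes the filter.
fn pick(scores: &[(Code, f64)], filter: &Filter) -> Option<Code> {
    let mut best: Option<(Code, f64)> = None;
    for &(code, score) in scores {
        if !filter.permits(code) {
            continue;
        }
        // Keep the first candidate seen, then replace it only on a strictly higher score.
        if best.map_or(true, |(_, s)| score > s) {
            best = Some((code, score));
        }
    }
    best.map(|(code, _)| code)
}
```

With made-up scores where `Code::Epo` ranks first and `Code::Eng` second, `Filter::Any` yields `Epo`, while `Filter::Allow(vec![Code::Eng, Code::Rus])` or `Filter::Deny(vec![Code::Epo])` yields `Eng`. An empty allowlist permits nothing, so the result is `None`.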


Feature toggles:

  • enum-map: Lang and Script implement the Enum trait from enum-map.
  • arbitrary: Supports Arbitrary.
  • serde: Implements Serialize and Deserialize for Lang and Script.
  • dev: Enables the whatlang::dev module, which provides some internal API. It exists for profiling purposes, and normal users are discouraged from relying on this API.


Structs:

  • Detector: Configurable structure that holds detection options and provides functions to detect language and script.
  • Info: Represents the full outcome of language detection.


Enums:

  • Lang: Represents a language following the ISO 639-3 standard.
  • Script: Represents a writing system (Latin, Cyrillic, Arabic, etc.).


Functions:

  • detect: Detect a language and a script from the given text.
  • detect_lang: Detect only a language from the given text.
  • detect_script: Detect only a script from the given text. Works much faster than a complete detection with detect.
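
One plausible reason script-only detection can be cheap is that a script can often be guessed from code point ranges alone, with no language model involved. The following is a minimal std-only sketch of that idea. `SimpleScript`, `classify`, and `dominant_script` are hypothetical names, the ranges cover only a few Unicode blocks, and none of this is whatlang's actual implementation.

```rust
/// Hypothetical, simplified script set; not the `Script` enum from whatlang.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SimpleScript {
    Latin,
    Cyrillic,
    Arabic,
}

/// Maps one character to a script using rough Unicode block ranges.
/// Digits, punctuation, whitespace, and all other scripts map to `None`.
fn classify(ch: char) -> Option<SimpleScript> {
    match ch {
        // Basic Latin letters plus Latin-1 Supplement and Latin Extended-A/B (approximate).
        'a'..='z' | 'A'..='Z' | '\u{00C0}'..='\u{024F}' => Some(SimpleScript::Latin),
        // Cyrillic block.
        '\u{0400}'..='\u{04FF}' => Some(SimpleScript::Cyrillic),
        // Arabic block.
        '\u{0600}'..='\u{06FF}' => Some(SimpleScript::Arabic),
        _ => None,
    }
}

/// Returns the script with the most classified characters in a single pass,
/// or `None` if no character belongs to a known script.
/// Ties go to the script listed first in `scripts`.
fn dominant_script(text: &str) -> Option<SimpleScript> {
    let scripts = [
        SimpleScript::Latin,
        SimpleScript::Cyrillic,
        SimpleScript::Arabic,
    ];
    let mut counts = [0usize; 3];

    for ch in text.chars() {
        if let Some(script) = classify(ch) {
            // Find the counter slot that belongs to this script.
            if let Some(i) = scripts.iter().position(|s| *s == script) {
                counts[i] += 1;
            }
        }
    }

    let mut best: Option<(SimpleScript, usize)> = None;
    for (script, count) in scripts.iter().copied().zip(counts.iter().copied()) {
        if count > 0 && best.map_or(true, |(_, c)| count > c) {
            best = Some((script, count));
        }
    }
    best.map(|(script, _)| script)
}
```

The work here is one linear scan with a range check per character, which is why a script-only pass can be much cheaper than scoring a text against many languages. For example, `dominant_script("Привет, мир")` returns `Some(SimpleScript::Cyrillic)`, and a string containing only digits and punctuation returns `None`.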