Crate cld2

DEPRECATED in favor of whatlang, which is native Rust and smaller. If you have a compelling use case for this code, please open an issue.

Detect the language of a string using the cld2 library from the Chromium project.

use cld2::{detect_language, Format, Reliable, Lang};

let text = "It is an ancient Mariner,
And he stoppeth one of three.
'By thy long grey beard and glittering eye,
Now wherefore stopp'st thou me?";

assert_eq!((Some(Lang("en")), Reliable),
           detect_language(text, Format::Text));
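
The example above shows that detect_language returns a pair: an optional Lang and a reliability flag. As a minimal sketch of how a caller might act on that pair, the block below uses stand-in definitions of Lang and Reliability so that it compiles without the crate. These stand-ins are assumptions that only mirror the shapes visible in the example, and the helper names trusted_code and code_or are hypothetical, not part of cld2.

```rust
// Stand-in types, NOT the real cld2 definitions: they only mirror the shapes
// visible in the example above (`Lang("en")`, `Reliable`, `Unreliable`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Lang(pub &'static str);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Reliability {
    Reliable,
    Unreliable,
}

/// Hypothetical helper: return the language code only when a language was
/// detected and the detector marked the result as reliable.
pub fn trusted_code(result: (Option<Lang>, Reliability)) -> Option<&'static str> {
    match result {
        (Some(Lang(code)), Reliability::Reliable) => Some(code),
        // Either no language was found, or the detector does not vouch for it.
        _ => None,
    }
}

/// Hypothetical helper: fall back to a caller-chosen default code.
pub fn code_or(result: (Option<Lang>, Reliability), default: &'static str) -> &'static str {
    trusted_code(result).unwrap_or(default)
}
```

With the real crate, the tuple returned by detect_language(text, Format::Text) could presumably be passed to a helper of this shape. Whether to discard unreliable guesses or keep them as a weak signal is a design choice left to the application.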

This library wraps the cld2-sys library, which provides a raw interface to cld2. The only major feature not yet wrapped is the ResultChunk interface, because it tends to give fairly imprecise answers; it wouldn’t make a very good component for a multilingual spellchecker, for example. As always, pull requests are welcome!

WARNING: We assume that nobody tries to change the loaded cld2 data tables or calls the C++ function CLD2::DetectLanguageVersion behind our backs. These configuration and debugging APIs in cld2 are not thread safe.

For more information, see the GitHub project for this library.

Re-exports

pub use self::Reliability::Reliable;
pub use self::Reliability::Unreliable;

Structs

DetectionResult
Detailed language detection results.
Hints
Hints to the decoder, which it will use to make better guesses.
Lang
A language code, normally two letters for common languages.
LanguageScore
Detailed information about how well the input text matched a specific language.

Enums

Format
Possible data formats.
Reliability
Is the output of the language decoder reliable?

Functions

detect_language
Detect the language of the input text.
detect_language_ext
Detect the language of the input text, using optional hints, and return detailed statistics.
detector_version
Get the version of cld2 and its embedded data files as a string.