Struct lingua::LanguageDetectorBuilder
pub struct LanguageDetectorBuilder { /* private fields */ }
This struct configures and creates an instance of LanguageDetector.
Implementations
impl LanguageDetectorBuilder
pub fn from_all_languages() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages.
pub fn from_all_spoken_languages() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in spoken languages.
pub fn from_all_languages_with_arabic_script() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages supporting the Arabic script.
pub fn from_all_languages_with_cyrillic_script() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages supporting the Cyrillic script.
pub fn from_all_languages_with_devanagari_script() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages supporting the Devanagari script.
pub fn from_all_languages_with_latin_script() -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages supporting the Latin script.
pub fn from_all_languages_without(languages: &[Language]) -> Self
Creates and returns an instance of LanguageDetectorBuilder
with all built-in languages except those specified in languages.
⚠ Panics if fewer than two languages are used to build the LanguageDetector.
pub fn from_languages(languages: &[Language]) -> Self
Creates and returns an instance of LanguageDetectorBuilder
with the specified languages.
⚠ Panics if fewer than two languages are specified.
pub fn from_iso_codes_639_1(iso_codes: &[IsoCode639_1]) -> Self
Creates and returns an instance of LanguageDetectorBuilder
with the languages specified by the respective ISO 639-1 codes.
⚠ Panics if fewer than two iso_codes are specified.
pub fn from_iso_codes_639_3(iso_codes: &[IsoCode639_3]) -> Self
Creates and returns an instance of LanguageDetectorBuilder
with the languages specified by the respective ISO 639-3 codes.
⚠ Panics if fewer than two iso_codes are specified.
pub fn with_minimum_relative_distance(&mut self, distance: f64) -> &mut Self
Sets the desired value for the minimum relative distance measure.
By default, Lingua returns the most likely language for a given input text. However, some words are spelled the same in more than one language. The word prologue, for instance, is valid in both English and French. Lingua would output either English or French, which might be wrong in the given context. For such cases, it is possible to specify a minimum relative distance that the logarithmized and summed-up probabilities for each possible language have to satisfy.
Be aware that the distance between the language probabilities depends on the length of the input text. The longer the input text, the larger the distance between the languages. So if you want to classify very short text phrases, do not set the minimum relative distance too high. Otherwise, most results will be returned as None, which is the return value for cases where language detection is not reliably possible.
⚠ Panics if distance is smaller than 0.0 or greater than 0.99.
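The effect of a minimum relative distance can be illustrated with a small standalone sketch. It is not Lingua's implementation: the function pick_language, the normalized scores in the range 0.0 to 1.0, and the simple "best minus runner-up" rule are all assumptions made for illustration only.

```rust
/// Hypothetical helper, not part of Lingua's API.
/// `scores` holds (language name, normalized score) pairs; the normalization
/// to the range 0.0..=1.0 is an assumption made for this sketch.
fn pick_language<'a>(
    scores: &[(&'a str, f64)],
    minimum_relative_distance: f64,
) -> Option<&'a str> {
    // Sort a copy of the scores from highest to lowest.
    let mut sorted: Vec<(&str, f64)> = scores.to_vec();
    sorted.sort_by(|a, b| b.1.total_cmp(&a.1));

    match sorted.as_slice() {
        // No candidates: nothing can be detected.
        [] => None,
        // A single candidate has no runner-up to compare against.
        [(only, _)] => Some(*only),
        // Otherwise the best candidate must lead the runner-up by at least
        // the configured distance, or the result is treated as unreliable.
        [(best, best_score), (_, runner_up_score), ..] => {
            if best_score - runner_up_score < minimum_relative_distance {
                None
            } else {
                Some(*best)
            }
        }
    }
}
```

With scores of 0.52 for English and 0.48 for French, a distance of 0.0 yields English, whereas a distance of 0.25 yields None because the two candidates are too close to call.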
pub fn with_preloaded_language_models(&mut self) -> &mut Self
Configures LanguageDetectorBuilder to preload all language models when creating the instance of LanguageDetector.
By default, Lingua uses lazy loading to load on demand only those language models that the rule-based filter engine considers relevant. For web services, for instance, it is beneficial to preload all language models into memory to avoid unexpected latency while waiting for the service response. This method allows switching between these two loading modes.
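The difference between the two loading modes can be sketched with the standard library alone. This is only an illustration of lazy loading versus preloading in general. ModelStore, load_model, and the string "models" are hypothetical and do not reflect how Lingua stores or loads its models.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical stand-in for an expensive model load (e.g. reading from disk).
fn load_model(language: &str) -> String {
    format!("model data for {language}")
}

/// Hypothetical store with one lazily initialized slot per language.
struct ModelStore {
    models: HashMap<&'static str, OnceLock<String>>,
}

impl ModelStore {
    fn new(languages: &[&'static str]) -> Self {
        let models = languages.iter().map(|&l| (l, OnceLock::new())).collect();
        Self { models }
    }

    /// Lazy mode: the model is loaded on first access and reused afterwards.
    fn model(&self, language: &str) -> Option<&String> {
        self.models
            .get(language)
            .map(|slot| slot.get_or_init(|| load_model(language)))
    }

    /// Preload mode: every slot is filled up front, so later lookups
    /// never pay the loading cost.
    fn preload_all(&self) {
        for (language, slot) in &self.models {
            slot.get_or_init(|| load_model(language));
        }
    }

    /// Number of models currently held in memory.
    fn loaded_count(&self) -> usize {
        self.models.values().filter(|slot| slot.get().is_some()).count()
    }
}
```

In this sketch, a store created for three languages holds no models until model is called, and holds all three after preload_all. The trade-off is the one described above: higher memory use and start-up cost in exchange for predictable latency.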
pub fn with_low_accuracy_mode(&mut self) -> &mut Self
Disables the high accuracy mode in order to save memory and increase performance.
By default, Lingua’s high detection accuracy comes at the cost of loading large language models into memory, which might not be feasible for systems running low on resources.
This method disables the high accuracy mode so that only a small subset of language models is loaded into memory. The downside of this approach is that detection accuracy for short texts consisting of fewer than 120 characters will drop significantly. However, detection accuracy for texts longer than 120 characters will remain mostly unaffected.
pub fn build(&mut self) -> LanguageDetector
Creates and returns the configured instance of LanguageDetector.
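The signatures above follow a chaining pattern: a constructor returns the builder by value, each with_ method takes and returns &mut Self, and build produces the final value. The following standalone sketch mirrors that shape with hypothetical types (ConfigBuilder and DetectorConfig are not part of Lingua) and reproduces the documented panic conditions as plain assertions.

```rust
/// Hypothetical result type standing in for a configured detector.
#[derive(Clone, Debug, PartialEq)]
struct DetectorConfig {
    languages: Vec<String>,
    minimum_relative_distance: f64,
    preload_models: bool,
    low_accuracy_mode: bool,
}

/// Hypothetical builder mirroring the method shapes documented above.
#[derive(Clone)]
struct ConfigBuilder {
    config: DetectorConfig,
}

impl ConfigBuilder {
    /// Panics if fewer than two languages are specified.
    fn from_languages(languages: &[&str]) -> Self {
        assert!(languages.len() >= 2, "at least two languages are required");
        Self {
            config: DetectorConfig {
                languages: languages.iter().map(|l| l.to_string()).collect(),
                minimum_relative_distance: 0.0,
                preload_models: false,
                low_accuracy_mode: false,
            },
        }
    }

    /// Panics if `distance` is smaller than 0.0 or greater than 0.99.
    fn with_minimum_relative_distance(&mut self, distance: f64) -> &mut Self {
        assert!(
            (0.0..=0.99).contains(&distance),
            "distance must lie between 0.0 and 0.99"
        );
        self.config.minimum_relative_distance = distance;
        self
    }

    fn with_preloaded_models(&mut self) -> &mut Self {
        self.config.preload_models = true;
        self
    }

    fn with_low_accuracy_mode(&mut self) -> &mut Self {
        self.config.low_accuracy_mode = true;
        self
    }

    /// Takes `&mut self` so it can end a chain of `&mut Self` calls.
    fn build(&mut self) -> DetectorConfig {
        self.config.clone()
    }
}
```

Because every configuration method returns &mut Self and build accepts &mut self, a call such as ConfigBuilder::from_languages(&["English", "French"]).with_minimum_relative_distance(0.25).with_preloaded_models().build() compiles as a single expression on a temporary builder.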
Trait Implementations
impl Clone for LanguageDetectorBuilder
fn clone(&self) -> LanguageDetectorBuilder
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.