Module lm

Language models.

This module provides n-gram language models that can be trained on tokenized text and used to score and generate word sequences.

§Example

use rustling::lm::MLE;
use rustling::lm::BaseLanguageModel;

// Create a bigram MLE language model
let mut model = MLE::new(2).unwrap();
model.fit(vec![
    vec!["the".into(), "cat".into(), "sat".into()],
    vec!["the".into(), "dog".into(), "ran".into()],
]);
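// "the" is followed once by "cat" and once by "dog" in the training data,
// so the MLE estimate of P(cat | the) is 1/2.
let score = model.score("cat".into(), Some(vec!["the".into()])).unwrap();
assert!((score - 0.5).abs() < 1e-9);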

Structs§

Laplace: Laplace (add-one) smoothing language model (Lidstone with gamma=1).
Lidstone: Lidstone (additive) smoothing language model.
MLE: Maximum Likelihood Estimation language model (no smoothing).
Vocabulary: A vocabulary of known words, with OOV mapping to <UNK>.
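
The three model structs differ only in how much additive smoothing they apply: MLE corresponds to gamma = 0, Laplace to gamma = 1, and Lidstone to an arbitrary gamma. The following is a minimal standalone sketch of that relationship for bigrams. It does not use this module's API; `BigramCounts`, `count_bigrams`, and `additive_prob` are hypothetical helpers written for illustration, and the sketch ignores padding and `<UNK>` handling, so values from the real structs may differ.

```rust
use std::collections::{HashMap, HashSet};

/// Bigram statistics gathered from tokenized sentences.
/// Hypothetical helper type, not part of this module.
struct BigramCounts {
    pair: HashMap<(String, String), u64>, // count of (context, word)
    context: HashMap<String, u64>,        // count of context as a bigram prefix
    vocab: HashSet<String>,               // distinct word types seen
}

/// Count bigrams without any padding or OOV mapping (a simplifying assumption).
fn count_bigrams(sentences: &[Vec<&str>]) -> BigramCounts {
    let mut counts = BigramCounts {
        pair: HashMap::new(),
        context: HashMap::new(),
        vocab: HashSet::new(),
    };
    for sent in sentences {
        for w in sent {
            counts.vocab.insert(w.to_string());
        }
        for win in sent.windows(2) {
            *counts
                .pair
                .entry((win[0].to_string(), win[1].to_string()))
                .or_insert(0) += 1;
            *counts.context.entry(win[0].to_string()).or_insert(0) += 1;
        }
    }
    counts
}

/// Additive (Lidstone) estimate:
///   P(word | context) = (c(context, word) + gamma) / (c(context) + gamma * V)
/// gamma = 0.0 gives the unsmoothed MLE estimate; gamma = 1.0 gives Laplace.
/// With gamma = 0.0 and an unseen context the result is NaN (0 / 0).
fn additive_prob(counts: &BigramCounts, word: &str, context: &str, gamma: f64) -> f64 {
    let key = (context.to_string(), word.to_string());
    let c_pair = *counts.pair.get(&key).unwrap_or(&0) as f64;
    let c_ctx = *counts.context.get(context).unwrap_or(&0) as f64;
    let v = counts.vocab.len() as f64;
    (c_pair + gamma) / (c_ctx + gamma * v)
}
```

On the two training sentences from the example above, `additive_prob(&counts, "cat", "the", 0.0)` reproduces the MLE value of 0.5, while gamma = 1.0 moves some probability mass to unseen continuations of "the".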

Enums§

Smoothing: Smoothing method for language model probability estimation.

Constants§

BOS_LABEL
EOS_LABEL
UNK_LABEL
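
Label constants like these are typically used to mark sentence boundaries and to replace out-of-vocabulary words before n-grams are counted. The sketch below illustrates that preprocessing step under stated assumptions: the marker strings `"<s>"` and `"</s>"` are placeholders (only `<UNK>` appears in the Vocabulary description above), `pad_and_map` is a hypothetical helper, and padding each side with `order - 1` markers is a common convention that this module may or may not follow.

```rust
use std::collections::HashSet;

// Placeholder marker strings for illustration. The actual values of
// BOS_LABEL and EOS_LABEL are not listed on this page.
const BOS: &str = "<s>";
const EOS: &str = "</s>";
const UNK: &str = "<UNK>";

/// Map words outside `known` to UNK, then pad the sentence with
/// `order - 1` BOS markers in front and `order - 1` EOS markers behind.
/// Hypothetical helper, not part of this module.
fn pad_and_map(sentence: &[&str], known: &HashSet<&str>, order: usize) -> Vec<String> {
    let pad = order.saturating_sub(1);
    let mut out = Vec::with_capacity(sentence.len() + 2 * pad);
    out.extend(std::iter::repeat(BOS.to_string()).take(pad));
    for &w in sentence {
        if known.contains(w) {
            out.push(w.to_string());
        } else {
            out.push(UNK.to_string());
        }
    }
    out.extend(std::iter::repeat(EOS.to_string()).take(pad));
    out
}
```

For a bigram model (`order = 2`) this adds one marker on each side, so the first real word gets a start-of-sentence context and the model can assign a probability to ending the sentence.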

Traits§

BaseLanguageModel: Core language model behavior with default implementations.