Crate rustfst

source ·
Expand description

Rustfst

Rust implementation of Weighted Finite States Transducers.

Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition’s input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.

FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.

References

Implementation heavily inspired from Mehryar Mohri’s, Cyril Alluzen’s and Michael Riley’s work :

Example

use rustfst::utils::transducer;
use rustfst::semirings::{Semiring, IntegerWeight};
use rustfst::fst_impls::VectorFst;
use rustfst::fst_traits::{MutableFst, PathsIterator};
use rustfst::arc::Arc;

// Creates a empty wFST
let mut fst = VectorFst::new();

// Add some states
let s0 = fst.add_state();
let s1 = fst.add_state();
let s2 = fst.add_state();

// Set s0 as the start state
fst.set_start(&s0).unwrap();

// Add an arc from s0 to s1
fst.add_arc(&s0, Arc::new(3, 5, IntegerWeight::new(10), s1))
    .unwrap();

// Add an arc from s0 to s2
fst.add_arc(&s0, Arc::new(5, 7, IntegerWeight::new(18), s2))
    .unwrap();

// Set s1 and s2 as final states
fst.set_final(&s1, IntegerWeight::new(31)).unwrap();
fst.set_final(&s2, IntegerWeight::new(45)).unwrap();

// Iter over all the paths in the wFST
for p in fst.paths_iter() {
    println!("{:?}", p);
}

Status

Not all algorithms are (yet) implemented, this is still work in progress.

Modules

Provides algorithms that are generic for all wFST.
Implementation of the transitions inside a wFST.
Implementation of the wFST traits with different data structure.
Provides trait that must be implemented to be able to use generic algorithms.
Implementation of a successful path inside a wFST.
Provides a trait that shall be implemented for all weights stored inside a wFST.
A few utilities to manipulate wFSTs.

Statics

Epsilon label representing the epsilon transition (empty transition).

Type Definitions

Type used for the input label and output label of an arc in a wFST.
Type used to identify a state in a wFST.