Expand description
Rust implementation of Weighted Finite States Transducers.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition’s input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
References
Implementation heavily inspired from Mehryar Mohri’s, Cyril Allauzen’s and Michael Riley’s work :
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::{DeterminizeType, determinize};
use rustfst::algorithms::rm_epsilon::rm_epsilon;
fn main() -> Result<()> {
// Creates a empty wFST
let mut fst = VectorFst::<TropicalWeight>::new();
// Add some states
let s0 = fst.add_state();
let s1 = fst.add_state();
let s2 = fst.add_state();
// Set s0 as the start state
fst.set_start(s0)?;
// Add a transition from s0 to s1
fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;
// Add a transition from s0 to s2
fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;
// Set s1 and s2 as final states
fst.set_final(s1, 31.0)?;
fst.set_final(s2, 45.0)?;
// Iter over all the paths in the wFST
for p in fst.paths_iter() {
println!("{:?}", p);
}
// A lot of operations are available to modify/optimize the FST.
// Here are a few examples :
// - Remove useless states.
connect(&mut fst)?;
// - Optimize the FST by merging states with the same behaviour.
minimize(&mut fst)?;
// - Copy all the input labels in the output.
project(&mut fst, ProjectType::ProjectInput);
// - Remove epsilon transitions.
rm_epsilon(&mut fst)?;
// - Compute an equivalent FST but deterministic.
fst = determinize(&fst)?;
Ok(())
}
Re-exports
Modules
Provides algorithms that are generic to all Fst.
Implementation of the wFST traits with different data structures.
Provides the FstProperties
struct and some utils functions around it.
Useful to assert some properties on a Fst.
Provides traits that must be implemented to be able to use generic algorithms.
Module re-exporting most of the objects from this crate.
Provides a trait that shall be implemented for all weights stored inside a wFST.
A few utilities to manipulate wFSTs.
Macros
Creates a linear Fst containing the arguments.
Creates a Path containing the arguments.
Creates a SymbolTable
containing the arguments.
Structs
Struct to configure how the FST should be drawn.
Structure representing a path in a FST (list of input labels, list of output labels and total weight).
A symbol table stores a bidirectional mapping between transition labels and “symbols” (strings).
Structure representing a transition from a state to another state in a FST.
Enums
Constants
Epsilon label representing the epsilon transition (empty transition) = 0
.
Epsilon symbol representing the epsilon transition (empty transition) = <eps>
.
A representable float near .001. (Used in Quantize)
Statics
Traits
The weight on an Fst must implement the Semiring
trait.
Indeed, the weight set associated to a Fst must have the structure of a semiring.
(S, +, *, 0, 1)
is a semiring if (S, +, 0)
is a commutative monoid with identity element 0,
(S, *, 1)
is a monoid with identity element 1
, *
distributes over +
,
0
is an annihilator for *
.
Thus, a semiring is a ring that may lack negation.
For more information : https://cs.nyu.edu/~mohri/pub/hwa.pdf
Functions
Check if a FstPath can be generated by the given Fst.