[−][src]Crate rustfst
Rustfst
Rust implementation of Weighted Finite States Transducers.
Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.
FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.
References
Implementation heavily inspired from Mehryar Mohri's, Cyril Allauzen's and Michael Riley's work :
- Weighted automata algorithms
- The design principles of a weighted finite-state transducer library
- OpenFst: A general and efficient weighted finite-state transducer library
- Weighted finite-state transducers in speech recognition
Example
// Creates a empty wFST let mut fst = VectorFst::new(); // Add some states let s0 = fst.add_state(); let s1 = fst.add_state(); let s2 = fst.add_state(); // Set s0 as the start state fst.set_start(s0).unwrap(); // Add an arc from s0 to s1 fst.add_arc(s0, Arc::new(3, 5, IntegerWeight::new(10), s1)) .unwrap(); // Add an arc from s0 to s2 fst.add_arc(s0, Arc::new(5, 7, IntegerWeight::new(18), s2)) .unwrap(); // Set s1 and s2 as final states fst.set_final(s1, IntegerWeight::new(31)).unwrap(); fst.set_final(s2, IntegerWeight::new(45)).unwrap(); // Iter over all the paths in the wFST for p in fst.paths_iter() { println!("{:?}", p); }
Status
Not all algorithms are (yet) implemented, this is still work in progress.
Modules
algorithms | Provides algorithms that are generic to all wFST. |
fst_impls | Implementation of the wFST traits with different data structures. |
fst_traits | Provides traits that must be implemented to be able to use generic algorithms. |
semirings | Provides a trait that shall be implemented for all weights stored inside a wFST. |
utils | A few utilities to manipulate wFSTs. |
Macros
fst | Creates a linear Fst containing the arguments. |
fst_path | Creates a Path containing the arguments. |
symt | Creates a |
Structs
Arc | Structure representing a transition from a state to another state in a FST. |
DrawingConfig | Struct to configure how the FST should be drawn. |
FstPath | Structure representing a path in a FST (list of input labels, list of output labels and total weight). |
SymbolTable | A symbol table stores a bidirectional mapping between arc labels and "symbols" (strings). |
Constants
EPS_LABEL | Epsilon label representing the epsilon transition (empty transition) = |
EPS_SYMBOL | Epsilon symbol representing the epsilon transition (empty transition) = |
Type Definitions
Label | Type used for the input label and output label of an arc in a wFST -> usize |
StateId | Type used to identify a state in a wFST -> usize |
Symbol | Symbol to map in the Symbol Table -> String |