Crate rustfst

source ·
Expand description

Rust implementation of Weighted Finite States Transducers.

Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition’s input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.

FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.

fst

References

Implementation heavily inspired from Mehryar Mohri’s, Cyril Allauzen’s and Michael Riley’s work :

Example

use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::{DeterminizeType, determinize};
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Creates a empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iter over all the paths in the wFST
    for p in fst.paths_iter() {
         println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples :

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels in the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent FST but deterministic.
    fst = determinize(&fst)?;

    Ok(())
}

Re-exports

pub use self::trs::Trs;
pub use self::trs::TrsConst;
pub use self::trs::TrsVec;

Modules

Provides algorithms that are generic to all Fst.
Implementation of the wFST traits with different data structures.
Provides the FstProperties struct and some utils functions around it. Useful to assert some properties on a Fst.
Provides traits that must be implemented to be able to use generic algorithms.
Module re-exporting most of the objects from this crate.
Provides a trait that shall be implemented for all weights stored inside a wFST.
A few utilities to manipulate wFSTs.

Macros

Creates a linear Fst containing the arguments.
Creates a Path containing the arguments.
Creates a SymbolTable containing the arguments.

Structs

Struct to configure how the FST should be drawn.
Structure representing a path in a FST (list of input labels, list of output labels and total weight).
Wrapper around FstPath to nicely handle SymbolTables.
A symbol table stores a bidirectional mapping between transition labels and “symbols” (strings).
Structure representing a transition from a state to another state in a FST.

Enums

Constants

Epsilon label representing the epsilon transition (empty transition) = 0.
Epsilon symbol representing the epsilon transition (empty transition) = <eps>.
A representable float near .001. (Used in Quantize)

Statics

Traits

The weight on an Fst must implement the Semiring trait. Indeed, the weight set associated to a Fst must have the structure of a semiring. (S, +, *, 0, 1) is a semiring if (S, +, 0) is a commutative monoid with identity element 0, (S, *, 1) is a monoid with identity element 1, * distributes over +, 0 is an annihilator for *. Thus, a semiring is a ring that may lack negation. For more information : https://cs.nyu.edu/~mohri/pub/hwa.pdf

Functions

Check if a FstPath can be generated by the given Fst.

Type Definitions

Type used for the input label and output label of a transition in a wFST -> usize
Type used to identify a state in a wFST -> usize
Symbol to map in the Symbol Table -> String