[][src]Crate rustfst

Rust implementation of Weighted Finite States Transducers.

Rustfst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.

FSTs have key applications in speech recognition and synthesis, machine translation, optical character recognition, pattern matching, string processing, machine learning, information extraction and retrieval among others. Often a weighted transducer is used to represent a probabilistic model (e.g., an n-gram model, pronunciation model). FSTs can be optimized by determinization and minimization, models can be applied to hypothesis sets (also represented as automata) or cascaded by finite-state composition, and the best results can be selected by shortest-path algorithms.

fst

References

Implementation heavily inspired from Mehryar Mohri's, Cyril Allauzen's and Michael Riley's work :

Example

use anyhow::Result;
use rustfst::prelude::*;
use rustfst::algorithms::determinize::{DeterminizeType, determinize};
use rustfst::algorithms::rm_epsilon::rm_epsilon;

fn main() -> Result<()> {
    // Creates a empty wFST
    let mut fst = VectorFst::<TropicalWeight>::new();

    // Add some states
    let s0 = fst.add_state();
    let s1 = fst.add_state();
    let s2 = fst.add_state();

    // Set s0 as the start state
    fst.set_start(s0)?;

    // Add a transition from s0 to s1
    fst.add_tr(s0, Tr::new(3, 5, 10.0, s1))?;

    // Add a transition from s0 to s2
    fst.add_tr(s0, Tr::new(5, 7, 18.0, s2))?;

    // Set s1 and s2 as final states
    fst.set_final(s1, 31.0)?;
    fst.set_final(s2, 45.0)?;

    // Iter over all the paths in the wFST
    for p in fst.paths_iter() {
         println!("{:?}", p);
    }

    // A lot of operations are available to modify/optimize the FST.
    // Here are a few examples :

    // - Remove useless states.
    connect(&mut fst)?;

    // - Optimize the FST by merging states with the same behaviour.
    minimize(&mut fst)?;

    // - Copy all the input labels in the output.
    project(&mut fst, ProjectType::ProjectInput);

    // - Remove epsilon transitions.
    rm_epsilon(&mut fst)?;

    // - Compute an equivalent FST but deterministic.
    fst = determinize(&fst)?;

    Ok(())
}

Status

A big number of algorithms are already implemented. The main one missing is the Composition.

Re-exports

pub use self::trs::Trs;
pub use self::trs::TrsConst;
pub use self::trs::TrsVec;

Modules

algorithms

Provides algorithms that are generic to all wFST.

fst_impls

Implementation of the wFST traits with different data structures.

fst_properties

Provides the FstProperties struct and some utils functions around it. Useful to assert some properties on a Fst.

fst_traits

Provides traits that must be implemented to be able to use generic algorithms.

prelude

Module re-exporting most of the objects from this crate.

semirings

Provides a trait that shall be implemented for all weights stored inside a wFST.

trs
trs_iter_mut
utils

A few utilities to manipulate wFSTs.

Macros

fst

Creates a linear Fst containing the arguments.

fst_path

Creates a Path containing the arguments.

symt

Creates a SymbolTable containing the arguments.

Structs

DrawingConfig

Struct to configure how the FST should be drawn.

FstPath

Structure representing a path in a FST (list of input labels, list of output labels and total weight).

SymbolTable

A symbol table stores a bidirectional mapping between transition labels and "symbols" (strings).

Tr

Structure representing a transition from a state to another state in a FST.

Constants

EPS_LABEL

Epsilon label representing the epsilon transition (empty transition) = 0.

EPS_SYMBOL

Epsilon symbol representing the epsilon transition (empty transition) = <eps>.

KDELTA

A representable float near .001. (Used in Quantize)

KSHORTESTDELTA

Statics

NO_LABEL
NO_STATE_ID

Traits

Semiring

For some operations, the weight set associated to a wFST must have the structure of a semiring. (S, +, *, 0, 1) is a semiring if (S, +, 0) is a commutative monoid with identity element 0, (S, *, 1) is a monoid with identity element 1, * distributes over +, 0 is an annihilator for *. Thus, a semiring is a ring that may lack negation. For more information : https://cs.nyu.edu/~mohri/pub/hwa.pdf

Functions

check_path_in_fst

Check if a FstPath can be generated by the given Fst.

Type Definitions

Label

Type used for the input label and output label of a transition in a wFST -> usize

StateId

Type used to identify a state in a wFST -> usize

Symbol

Symbol to map in the Symbol Table -> String