Skip to main content

Crate arcweight

Crate arcweight 

Source
Expand description

§ArcWeight - Weighted Finite State Transducers for Rust

A high-performance, type-safe library for constructing, combining, optimizing, and searching weighted finite state transducers (WFSTs) and automata.

§Overview

ArcWeight provides a toolkit for working with finite state machines, from simple acceptors to complex weighted transducers. Built with performance and correctness in mind, it leverages Rust’s type system to ensure compile-time safety while maintaining the flexibility needed for FST operations.

The library implements the theoretical framework of weighted finite-state transducers as formalized by Mohri, providing efficient algorithms for composition, determinization, minimization, and shortest-path computation over arbitrary semirings.

§Key Features

§FST Support

  • Multiple implementations: fst::VectorFst for construction, fst::ConstFst for deployment
  • Lazy evaluation: On-demand computation with fst::LazyFstImpl
  • Memory efficiency: Compact representations and caching strategies

§Extensible Semiring Library

§Extensive Algorithm Library

§Property Analysis

  • Automatic property tracking: Detect determinism, cyclicity, connectivity
  • Optimization guidance: Algorithm selection based on FST properties
  • Efficient computation: $O(|V| + |E|)$ property analysis

§Quick Start

use arcweight::prelude::*;

// Build a simple acceptor for the pattern "ab+"
let mut fst = VectorFst::<TropicalWeight>::new();
let s0 = fst.add_state();
let s1 = fst.add_state();
let s2 = fst.add_state();

fst.set_start(s0);
fst.set_final(s2, TropicalWeight::one());

// Add transitions: 'a' -> 'b' -> 'b'*
fst.add_arc(s0, Arc::new('a' as u32, 'a' as u32, TropicalWeight::new(1.0), s1));
fst.add_arc(s1, Arc::new('b' as u32, 'b' as u32, TropicalWeight::new(0.5), s2));
fst.add_arc(s2, Arc::new('b' as u32, 'b' as u32, TropicalWeight::new(0.5), s2));

// Find shortest accepting path
let shortest: VectorFst<TropicalWeight> = shortest_path(&fst, ShortestPathConfig::default())?;

// Minimize the FST
let minimal: VectorFst<TropicalWeight> = minimize(&fst)?;

§Common Use Cases

§Building a Transducer

use arcweight::prelude::*;

// Create a transducer that uppercases ASCII letters
let mut fst = VectorFst::<TropicalWeight>::new();
let state = fst.add_state();
fst.set_start(state);
fst.set_final(state, TropicalWeight::one());

// Add lowercase->uppercase mappings
for c in b'a'..=b'z' {
    let lower = c as u32;
    let upper = (c - 32) as u32;  // ASCII uppercase
    fst.add_arc(state, Arc::new(lower, upper, TropicalWeight::one(), state));
}

§Composing FSTs

use arcweight::prelude::*;

// Compose two transducers to create a pipeline
fn create_pipeline() -> Result<VectorFst<TropicalWeight>> {
    let fst1 = VectorFst::<TropicalWeight>::new();  // First transducer
    let fst2 = VectorFst::<TropicalWeight>::new();  // Second transducer

    // Compose: output of fst1 feeds into input of fst2
    compose_default(&fst1, &fst2)
}

§Module Organization

The library is organized into focused modules, each handling a specific aspect of FST functionality:

§Core Modules

  • arc - Arc types representing FST transitions
  • fst - FST implementations and traits
  • semiring - Weight types and algebraic operations

§Algorithm Modules

§Support Modules

  • io - Serialization and file format support
  • utils - Symbol tables and utility types
  • prelude - Convenient re-exports of common types

§Performance Guidelines

  1. Choose the right FST type: Use fst::VectorFst for construction, fst::ConstFst for deployment
  2. Leverage properties: Let the library detect and optimize based on FST properties
  3. Batch operations: Combine multiple operations when possible
  4. Use lazy evaluation: For large FSTs, consider lazy implementations

§Additional Topics

  • Custom Semirings: Implement the Semiring trait for domain-specific weights
  • Lazy Computation: Use fst::LazyFstImpl for on-demand arc generation
  • Memory Optimization: Employ fst::CompactFst for memory-constrained environments
  • Parallel Algorithms: Some algorithms support parallel execution (feature-gated)

§References

The algorithms and data structures in this library are based on foundational work in weighted finite-state transducer theory:

  • Mehryar Mohri. 1997. Finite-state transducers in language and speech processing. Computational Linguistics 23, 2 (June 1997), 269-311.

  • Mehryar Mohri, Fernando Pereira, and Michael Riley. 2002. Weighted finite-state transducers in speech recognition. Computer Speech & Language 16, 1 (2002), 69-88.

  • Cyril Allauzen, Michael Riley, Johan Schalkwyk, Wojciech Skut, and Mehryar Mohri. 2007. OpenFst: A general and efficient weighted finite-state transducer library. In Proceedings of the 12th International Conference on Implementation and Application of Automata (CIAA’07). Springer-Verlag, Berlin, Heidelberg, 11-23.

Re-exports§

pub use arc::Arc;
pub use arc::ArcIterator;
pub use fst::ExpandedFst;
pub use fst::Fst;
pub use fst::MutableFst;
pub use fst::VectorFst;
pub use semiring::BooleanWeight;
pub use semiring::ProbabilityWeight;
pub use semiring::Semiring;
pub use semiring::TropicalWeight;

Modules§

algorithms
Algorithms for weighted finite-state transducers.
arc
Arc
fst
Finite State Transducer (FST) implementations for weighted automata.
gpu
GPU acceleration for FST operations.
io
FST serialization and deserialization in multiple formats.
macros
Macros
optimization
Performance optimization utilities for ArcWeight.
prelude
Prelude
properties
FST properties computation and tracking.
semiring
Semiring implementations for weighted finite-state transducers.
utils
Utility types and data structures for FST operations.

Macros§

fst
Creates a VectorFst<TropicalWeight> from a declarative specification.
fst_path
Creates an FstPath representing a path through an FST.
symt
Creates a SymbolTable from a list of symbol strings.

Enums§

Error
Library-wide error type for FST operations.

Type Aliases§

Result
Library-wide result type for fallible FST operations.