//! # Neural Transducer Module
//!
//! This module provides weighted finite-state transducer (WFST) infrastructure for neural
//! transducers (RNN-T models), a widely deployed architecture for production ASR systems.
//!
//! ## Architecture
//!
//! A neural transducer consists of three components: an encoder, a predictor, and a joiner
//! that combines their outputs:
//!
//! ```text
//! ┌─────────────┐     ┌─────────────┐
//! │   Encoder   │     │  Predictor  │
//! │ (Conformer) │     │   (LSTM)    │
//! └──────┬──────┘     └──────┬──────┘
//!        │                   │
//!        └───────┬───────────┘
//!                ▼
//!         ┌────────────┐
//!         │   Joiner   │
//!         │   (FFN)    │
//!         └──────┬─────┘
//!                ▼
//!         P(y|x,history)
//! ```
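//!
//! As a rough illustration of the joiner step in the diagram, the sketch below adds one
//! encoder frame to one predictor state, applies `tanh`, projects to vocabulary logits, and
//! normalizes with a log-softmax. The function name `joiner_log_probs`, the additive
//! combination, and the convention that index 0 is the blank symbol are assumptions made for
//! this example; they are not this module's API.
//!
//! ```rust
//! /// Hypothetical joiner: returns log P(symbol | encoder frame, predictor state).
//! /// `w` holds one weight row per output symbol (index 0 = blank, by assumption).
//! fn joiner_log_probs(enc: &[f32], pred: &[f32], w: &[Vec<f32>], b: &[f32]) -> Vec<f32> {
//!     // Combine the two streams elementwise and apply the nonlinearity.
//!     let hidden: Vec<f32> = enc.iter().zip(pred).map(|(e, p)| (e + p).tanh()).collect();
//!     // Linear projection from the hidden vector to one logit per symbol.
//!     let logits: Vec<f32> = w
//!         .iter()
//!         .zip(b)
//!         .map(|(row, bias)| row.iter().zip(&hidden).map(|(wi, h)| wi * h).sum::<f32>() + bias)
//!         .collect();
//!     // Numerically stable log-softmax.
//!     let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
//!     let log_z = max + logits.iter().map(|l| (l - max).exp()).sum::<f32>().ln();
//!     logits.iter().map(|l| l - log_z).collect()
//! }
//! ```
//!
//! The returned log-probabilities exponentiate to a distribution over the vocabulary, which
//! is what `P(y|x,history)` denotes in the diagram.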
//!
//! The WFST framework enables:
//! - Efficient beam search decoding with external LM composition
//! - Differentiable loss computation (k2-style)
//! - Contextual biasing via WFST composition
//! - Unified framework for CTC and RNN-T
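//!
//! To make the loss-computation point concrete: the transducer likelihood of Graves (2012)
//! is the total weight, in the log semiring, of all alignment paths through a
//! `T x (U+1)` lattice. The sketch below is a plain forward recursion over that lattice.
//! The names `log_add` and `transducer_log_likelihood` and the input layout are assumptions
//! for illustration only; this is neither this module's API nor k2's.
//!
//! ```rust
//! /// log(exp(a) + exp(b)): the "plus" operation of the log semiring.
//! fn log_add(a: f64, b: f64) -> f64 {
//!     if a == f64::NEG_INFINITY { return b; }
//!     if b == f64::NEG_INFINITY { return a; }
//!     let m = a.max(b);
//!     m + ((a - m).exp() + (b - m).exp()).ln()
//! }
//!
//! /// Hypothetical helper: log-probability summed over all alignments.
//! /// `blank[t][u]` (shape T x (U+1)) and `label[t][u]` (shape T x U) are the
//! /// log-probs of emitting blank / the (u+1)-th target label at node (t, u).
//! /// Assumes T >= 1.
//! fn transducer_log_likelihood(blank: &[Vec<f64>], label: &[Vec<f64>]) -> f64 {
//!     let t_len = blank.len();
//!     let u_len = blank[0].len() - 1;
//!     let mut alpha = vec![vec![f64::NEG_INFINITY; u_len + 1]; t_len];
//!     alpha[0][0] = 0.0; // log(1): every path starts at node (0, 0)
//!     for t in 0..t_len {
//!         for u in 0..=u_len {
//!             if t > 0 {
//!                 // Arrive by emitting blank at (t-1, u).
//!                 alpha[t][u] = log_add(alpha[t][u], alpha[t - 1][u] + blank[t - 1][u]);
//!             }
//!             if u > 0 {
//!                 // Arrive by emitting the u-th label at (t, u-1).
//!                 alpha[t][u] = log_add(alpha[t][u], alpha[t][u - 1] + label[t][u - 1]);
//!             }
//!         }
//!     }
//!     // A final blank leaves the lattice from the last node.
//!     alpha[t_len - 1][u_len] + blank[t_len - 1][u_len]
//! }
//! ```
//!
//! The training loss is the negative of this value. Every step is a sum or a log-sum-exp of
//! its inputs, which is why the same computation can be differentiated.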
//!
//! ## References
//!
//! - [Sequence Transduction with Recurrent Neural Networks (Graves, 2012)](https://arxiv.org/abs/1211.3711)
//! - [k2-fsa/k2: FSA/FST algorithms, differentiable](https://github.com/k2-fsa/k2)
//! - [Advanced Long-Content Speech Recognition with Factorized Neural Transducer](https://arxiv.org/abs/2403.13423)
pub use *;
pub use *;
pub use *;
pub use *;
pub use *;