oxicuda-seq
Sequence models and structured prediction -- HMMs, CRFs, Kalman filters, alignment, beam search, and stochastic decoders in pure Rust.
Part of the OxiCUDA project.
Overview
oxicuda-seq is the sequence-modelling volume of the OxiCUDA stack. It
collects the canonical algorithms for discrete-state sequence models
(HMMs, CRFs, MEMMs, structured SVMs), continuous-state filtering
(Kalman / EKF / UKF, particle filters, RTS smoothing), pairwise / grid MRFs,
classical alignment, and the decoding strategies (Viterbi, beam search,
top-k / nucleus / typical sampling) that sit on top of them.
All algorithms are implemented in pure Rust with no external linear-algebra
dependencies. GPU PTX kernel generators in ptx_kernels provide kernel
strings for the operations whose inner loops are amenable to direct kernel
mapping, parameterised on SM compute capability.
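As a rough illustration of what "kernel strings parameterised on SM compute capability" can mean, the sketch below formats a PTX module containing a single empty kernel. The `Sm` struct and `noop_kernel_ptx` function are hypothetical names made up for this example and are not the crate's actual ptx_kernels API; the caller is assumed to pass a PTX ISA version that supports the chosen target.

```rust
/// Hypothetical compute-capability tag, e.g. `Sm { major: 8, minor: 0 }` for sm_80.
/// This is an illustration only, not the crate's `SmVersion` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Sm {
    pub major: u32,
    pub minor: u32,
}

/// Build the text of a PTX module that holds one empty kernel.
/// `ptx_version` is written verbatim into the `.version` directive, so the
/// caller must choose a version that supports the requested target.
pub fn noop_kernel_ptx(sm: Sm, ptx_version: &str, name: &str) -> String {
    format!(
        ".version {}\n.target sm_{}{}\n.address_size 64\n\n.visible .entry {}()\n{{\n    ret;\n}}\n",
        ptx_version, sm.major, sm.minor, name
    )
}
```

A real generator would emit parameter declarations and a loop body in place of the bare `ret;`, and could switch instruction choices on the `Sm` value.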
The numerical kernels in this crate prefer straight integer-indexed loops
over iterator chains because most bodies touch multiple parallel arrays
indexed by the same (state, observation, time) triplets, which is why
clippy::needless_range_loop is allowed crate-wide.
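The loop style described above might look like the following minimal sketch, which fills a table of Gaussian log-emission values. The function name and the flat `t * n_states + k` layout are assumptions chosen for illustration; the lint is allowed on the function here, whereas a crate-wide allowance would use `#![allow(clippy::needless_range_loop)]` at the crate root.

```rust
/// Fill `log_b[t * n_states + k]` with log N(obs[t]; mean[k], var[k]).
/// The body reads `obs`, `mean`, and `var` and writes `log_b`, all through
/// the same (t, k) indices, which is where plain index loops read most clearly.
#[allow(clippy::needless_range_loop)]
pub fn gaussian_log_emissions(obs: &[f64], mean: &[f64], var: &[f64], log_b: &mut [f64]) {
    let n_states = mean.len();
    assert_eq!(var.len(), n_states);
    assert_eq!(log_b.len(), obs.len() * n_states);
    let ln_2pi = (2.0 * std::f64::consts::PI).ln();
    for t in 0..obs.len() {
        for k in 0..n_states {
            let d = obs[t] - mean[k];
            log_b[t * n_states + k] = -0.5 * (ln_2pi + var[k].ln() + d * d / var[k]);
        }
    }
}
```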
Modules
| Module | Description |
|---|---|
| hmm | Discrete and Gaussian HMMs, forward-backward, Viterbi, Baum-Welch, variational EM, semi-Markov |
| crf | Linear-chain Conditional Random Fields with L-BFGS-B training, Viterbi decoding, skip-chain extension |
| memm | Maximum-Entropy Markov Models |
| ssvm | Structured SVM (linear-chain) with cutting-plane optimisation |
| beam | Generic beam search with length normalisation and diverse-beam decoding |
| decoders | Stochastic decoders: top-k, nucleus (top-p), typical sampling |
| alignment | Needleman-Wunsch, Smith-Waterman, Gotoh affine-gap, Hirschberg |
| grid_crf | Pairwise 2D CRF with mean-field variational inference |
| kalman | Linear / Extended / Unscented Kalman filter, RTS smoother, EM parameter learning, particle filter |
| mrf | General MRF + Ising model, Gibbs sampler, loopy belief propagation |
| metrics | Token / sequence accuracy, edit distance, BLEU, chrF, perplexity, log-loss |
| handle | SeqHandle, SmVersion, LcgRng |
| ptx_kernels | GPU PTX kernels for sequence-model operations |
| error | SeqError / SeqResult |
Method Coverage
Hidden Markov Models (hmm)
- Discrete and Gaussian emissions
- Forward-backward with scaling for numerical stability
- Viterbi decoding
- Baum-Welch (EM) parameter learning
- Variational Bayes EM with Dirichlet priors
- Hidden semi-Markov extension
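To make "forward-backward with scaling" concrete, here is a minimal sketch of the scaled forward pass for a discrete HMM. It is a generic textbook formulation, not the crate's hmm API: the function name and the `Vec<Vec<f64>>` layout are assumptions. Each step normalises the forward vector to sum to one and accumulates the log of the normaliser, so long sequences do not underflow.

```rust
/// Scaled forward pass for a discrete HMM; returns log P(obs).
/// `init[i]` = P(state i at t = 0), `trans[i][j]` = P(j | i),
/// `emit[j][o]` = P(observation o | state j).
pub fn forward_log_likelihood(
    init: &[f64],
    trans: &[Vec<f64>],
    emit: &[Vec<f64>],
    obs: &[usize],
) -> f64 {
    let n = init.len();
    if obs.is_empty() {
        return 0.0; // the empty sequence has probability 1
    }
    let mut alpha: Vec<f64> = (0..n).map(|i| init[i] * emit[i][obs[0]]).collect();
    let mut log_lik = 0.0;
    for t in 0..obs.len() {
        if t > 0 {
            // Propagate through the transition matrix, then weight by the emission.
            let mut next = vec![0.0; n];
            for j in 0..n {
                let mut acc = 0.0;
                for i in 0..n {
                    acc += alpha[i] * trans[i][j];
                }
                next[j] = acc * emit[j][obs[t]];
            }
            alpha = next;
        }
        // Rescale so alpha sums to 1; the product of the scales is P(obs).
        let scale: f64 = alpha.iter().sum();
        if scale == 0.0 {
            return f64::NEG_INFINITY; // the sequence is impossible under the model
        }
        for a in alpha.iter_mut() {
            *a /= scale;
        }
        log_lik += scale.ln();
    }
    log_lik
}
```

The backward pass can reuse the same per-step scale factors, which keeps the products that form the state posteriors in a safe numeric range.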
Conditional Random Fields (crf)
- Linear-chain CRF: forward-backward in score space, Viterbi decoding
- L-BFGS-B training
- Skip-chain extension for long-range dependencies
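"Forward-backward in score space" can be sketched as follows: the forward recursion runs over unnormalised log-scores with log-sum-exp in place of sums of products, and yields the log-partition function log Z. The function names and score layout are illustrative assumptions, not the crate's crf API.

```rust
/// Numerically stable log(sum(exp(x))) over a slice.
fn log_sum_exp(xs: &[f64]) -> f64 {
    let max = xs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    if max == f64::NEG_INFINITY {
        return f64::NEG_INFINITY;
    }
    max + xs.iter().map(|x| (x - max).exp()).sum::<f64>().ln()
}

/// log Z of a linear-chain CRF with unary scores `unary[t][y]` and
/// transition scores `trans[y_prev][y]`.
pub fn log_partition(unary: &[Vec<f64>], trans: &[Vec<f64>]) -> f64 {
    if unary.is_empty() {
        return 0.0;
    }
    let n = trans.len();
    let mut alpha = unary[0].clone();
    let mut buf = vec![0.0; n];
    for t in 1..unary.len() {
        let mut next = vec![0.0; n];
        for y in 0..n {
            for p in 0..n {
                buf[p] = alpha[p] + trans[p][y];
            }
            next[y] = log_sum_exp(&buf) + unary[t][y];
        }
        alpha = next;
    }
    log_sum_exp(&alpha)
}

/// Unnormalised score of one label sequence under the same potentials.
pub fn sequence_score(unary: &[Vec<f64>], trans: &[Vec<f64>], labels: &[usize]) -> f64 {
    let mut score = 0.0;
    for t in 0..labels.len() {
        score += unary[t][labels[t]];
        if t > 0 {
            score += trans[labels[t - 1]][labels[t]];
        }
    }
    score
}
```

The conditional log-probability of a labelling is then `sequence_score(..) - log_partition(..)`, which is the quantity a gradient-based trainer such as L-BFGS-B would maximise over the training set.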
State-space filters (kalman)
- Linear Kalman filter and RTS smoother
- Extended Kalman Filter (EKF)
- Unscented Kalman Filter (UKF)
- Particle filter
- EM parameter learning for linear-Gaussian models
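The predict/update cycle shared by these filters is easiest to see in the scalar linear case. The sketch below is a generic one-dimensional Kalman filter; the `Kalman1D` type and its field names are made up for this example and say nothing about the shape of the crate's kalman module.

```rust
/// Scalar linear-Gaussian model: x' = a * x + w, z = h * x + v,
/// with Var(w) = q and Var(v) = r.
#[derive(Debug, Clone, Copy)]
pub struct Kalman1D {
    pub x: f64, // state estimate
    pub p: f64, // estimate variance
    pub a: f64, // state transition coefficient
    pub h: f64, // observation coefficient
    pub q: f64, // process noise variance
    pub r: f64, // measurement noise variance
}

impl Kalman1D {
    /// Time update: push the estimate and its variance through the dynamics.
    pub fn predict(&mut self) {
        self.x = self.a * self.x;
        self.p = self.a * self.p * self.a + self.q;
    }

    /// Measurement update: blend the prediction with observation `z`.
    pub fn update(&mut self, z: f64) {
        let s = self.h * self.p * self.h + self.r; // innovation variance
        let k = self.p * self.h / s; // Kalman gain
        self.x += k * (z - self.h * self.x);
        self.p = (1.0 - k * self.h) * self.p;
    }
}
```

The EKF keeps this structure but replaces `a` and `h` with local derivatives of nonlinear functions, and the UKF replaces the linear propagation with sigma points.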
Sequence alignment (alignment)
- Needleman-Wunsch (global)
- Smith-Waterman (local)
- Gotoh (affine-gap penalty)
- Hirschberg (linear-space)
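As a minimal sketch of the global-alignment recurrence, the function below computes the Needleman-Wunsch score with a linear gap penalty while keeping only two DP rows. The name and signature are assumptions for illustration. Hirschberg's method relies on the same two-row score pass to recover a full alignment in linear space.

```rust
/// Needleman-Wunsch global alignment score with a linear gap penalty.
/// Only two rows of the DP table are kept, so memory is O(b.len()).
pub fn nw_score(a: &[u8], b: &[u8], match_score: i32, mismatch: i32, gap: i32) -> i32 {
    // Row 0: aligning an empty prefix of `a` against b[..j] costs j gaps.
    let mut prev: Vec<i32> = (0..=b.len()).map(|j| j as i32 * gap).collect();
    let mut curr = vec![0; b.len() + 1];
    for i in 1..=a.len() {
        curr[0] = i as i32 * gap;
        for j in 1..=b.len() {
            let sub = if a[i - 1] == b[j - 1] { match_score } else { mismatch };
            curr[j] = (prev[j - 1] + sub) // align a[i-1] with b[j-1]
                .max(prev[j] + gap) // gap in b
                .max(curr[j - 1] + gap); // gap in a
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}
```

Smith-Waterman changes the recurrence by clamping each cell at zero and taking the maximum over the whole table; Gotoh adds two extra tables to distinguish opening a gap from extending one.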
Decoding (beam + decoders)
- Beam search with length normalisation and diverse-beam penalty
- Top-k sampling, nucleus / top-p sampling, typical sampling
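Nucleus (top-p) sampling can be split into a deterministic filtering step and a draw. The sketch below shows both under stated assumptions: `probs` is already a normalised distribution, and the uniform variate `u` in [0, 1) is supplied by the caller, so no particular random number generator is implied. The function names are hypothetical, not the crate's decoders API.

```rust
/// Keep the smallest set of highest-probability tokens whose total mass
/// reaches `top_p`, and renormalise their probabilities to sum to 1.
/// Returns (token index, renormalised probability) pairs, most likely first.
pub fn nucleus_filter(probs: &[f64], top_p: f64) -> Vec<(usize, f64)> {
    let mut order: Vec<usize> = (0..probs.len()).collect();
    // Stable sort, descending by probability; ties keep index order.
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    let mut kept = Vec::new();
    let mut mass = 0.0;
    for idx in order {
        kept.push(idx);
        mass += probs[idx];
        if mass >= top_p {
            break;
        }
    }
    kept.into_iter().map(|i| (i, probs[i] / mass)).collect()
}

/// Inverse-CDF draw from the filtered set using a caller-supplied `u` in [0, 1).
pub fn sample_with(kept: &[(usize, f64)], u: f64) -> Option<usize> {
    let mut cumulative = 0.0;
    for &(idx, p) in kept {
        cumulative += p;
        if u < cumulative {
            return Some(idx);
        }
    }
    // Guard against rounding leaving the cumulative sum slightly below 1.
    kept.last().map(|&(idx, _)| idx)
}
```

Top-k sampling follows the same pattern with the stopping rule replaced by a fixed count of kept tokens.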
Graphical models (grid_crf + mrf)
- Pairwise 2D CRF with mean-field variational inference
- General MRF + Ising model
- Gibbs sampler and loopy belief propagation
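For intuition about mean-field inference on a grid, the sketch below runs the classic mean-field fixed-point update on an Ising model with spins in {-1, +1}, energy E(s) = -J Σ s_i s_j - h Σ s_i over 4-connected neighbours, and free boundaries. This is the simplest special case rather than the crate's grid_crf or mrf implementation, and all names are illustrative.

```rust
/// Mean-field magnetisations for a 2D Ising model on a `rows` x `cols` grid.
/// Each sweep applies m_i = tanh(beta * (coupling * sum_of_neighbour_m + field))
/// in place, in row-major order, starting from m = 0 everywhere.
pub fn mean_field_ising(
    rows: usize,
    cols: usize,
    coupling: f64,
    field: f64,
    beta: f64,
    sweeps: usize,
) -> Vec<f64> {
    let mut m = vec![0.0; rows * cols];
    for _ in 0..sweeps {
        for r in 0..rows {
            for c in 0..cols {
                // Sum the current estimates of the 4-connected neighbours.
                let mut nb = 0.0;
                if r > 0 {
                    nb += m[(r - 1) * cols + c];
                }
                if r + 1 < rows {
                    nb += m[(r + 1) * cols + c];
                }
                if c > 0 {
                    nb += m[r * cols + c - 1];
                }
                if c + 1 < cols {
                    nb += m[r * cols + c + 1];
                }
                m[r * cols + c] = (beta * (coupling * nb + field)).tanh();
            }
        }
    }
    m
}
```

A Gibbs sampler visits the same neighbourhoods but draws each spin from its conditional distribution instead of storing an expectation, and loopy belief propagation passes messages along the same edges.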
Quick Start
use HmmDiscrete;
use viterbi;
use SeqResult;
Status
Alpha -- 20,887 SLoC, 706 passing tests. API may evolve before v1.0.
License
Apache-2.0 -- (C) 2026 COOLJAPAN OU (Team KitaSan)