flowmatch 0.1.6

Flow matching primitives (ndarray-first; backend-agnostic) with semidiscrete FM and RFM experiments.

flowmatch

Flow matching.

Problem

You have a set of target points -- protein backbone angles, earthquake epicenters, token embeddings -- and want to train a vector field that transforms Gaussian noise into samples from the same distribution. Flow matching [1] does this by regressing a conditional vector field along straight (or geodesic) interpolation paths, then sampling via ODE integration.
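
As a minimal sketch of the regression target in the straight-line (Euclidean) case: for a noise sample x0 and a data point x1, the path is x_t = (1 - t) x0 + t x1, and the conditional velocity to regress onto is x1 - x0. The function names below are illustrative assumptions and are not taken from this crate's API.

```rust
/// Point on the straight path from `x0` (noise) to `x1` (data) at time `t` in [0, 1].
fn interpolate(x0: &[f64], x1: &[f64], t: f64) -> Vec<f64> {
    x0.iter().zip(x1).map(|(a, b)| (1.0 - t) * a + t * b).collect()
}

/// Conditional velocity along that path; constant in `t` for straight paths.
fn target_velocity(x0: &[f64], x1: &[f64]) -> Vec<f64> {
    x0.iter().zip(x1).map(|(a, b)| b - a).collect()
}

/// Mean squared error between a model prediction and the conditional velocity
/// for one (x0, x1) pair. `predicted` stands in for the output of a learned field.
fn fm_loss(predicted: &[f64], x0: &[f64], x1: &[f64]) -> f64 {
    let target = target_velocity(x0, x1);
    let sum_sq: f64 = predicted
        .iter()
        .zip(&target)
        .map(|(p, u)| (p - u).powi(2))
        .sum();
    sum_sq / target.len() as f64
}
```

Training would average `fm_loss` over sampled times and (noise, data) pairs; on a manifold, the straight line is replaced by a geodesic.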

This library provides the training loop, OT-based coupling, ODE integration, and evaluation metrics. It works on flat spaces and on Riemannian manifolds.

Examples

Transport noise to discrete targets (simplest case). Semidiscrete FM pairs Gaussian noise with fixed target points via optimal transport, trains a linear conditional field, and integrates an ODE to produce samples:

cargo run --release --example sd_fm_semidiscrete_linear
n=16 d=8
pot_cfg: steps=2000 batch=1024 seed=7
fm_cfg:  steps=800 batch=256 lr=0.008 seed=9 euler_steps=40
sample_mse_to_assigned_y = 0.0367

Straighter trajectories via minibatch OT. Rectified flow matching (RFM) [7] uses Sinkhorn coupling within each minibatch so that noise-to-data paths cross less, which reduces integration error:

cargo run --release --example rfm_minibatch_ot_linear
sample_mse_to_assigned_y = 0.0684
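
To illustrate what Sinkhorn coupling computes inside a minibatch, here is a minimal sketch using plain `Vec`s, uniform marginals, and a fixed iteration count. The function name and signature are assumptions made for illustration; the crate's own coupling code may differ (for example, by working in the log domain for numerical stability).

```rust
/// Entropic OT plan between two uniformly weighted point sets via Sinkhorn iterations.
/// `cost[i][j]` is the cost of pairing noise sample i with data sample j.
fn sinkhorn_plan(cost: &[Vec<f64>], eps: f64, iters: usize) -> Vec<Vec<f64>> {
    let n = cost.len();
    let m = cost[0].len();
    // Uniform marginal mass per row and per column.
    let (a, b) = (1.0 / n as f64, 1.0 / m as f64);
    // Gibbs kernel K = exp(-C / eps).
    let k: Vec<Vec<f64>> = cost
        .iter()
        .map(|row| row.iter().map(|c| (-c / eps).exp()).collect())
        .collect();
    let mut u = vec![1.0; n];
    let mut v = vec![1.0; m];
    for _ in 0..iters {
        // Rescale rows so that row sums match `a`.
        for i in 0..n {
            let kv: f64 = (0..m).map(|j| k[i][j] * v[j]).sum();
            u[i] = a / kv;
        }
        // Rescale columns so that column sums match `b`.
        for j in 0..m {
            let ktu: f64 = (0..n).map(|i| k[i][j] * u[i]).sum();
            v[j] = b / ktu;
        }
    }
    // Plan P = diag(u) K diag(v).
    (0..n)
        .map(|i| (0..m).map(|j| u[i] * k[i][j] * v[j]).collect())
        .collect()
}
```

Pairs (i, j) can then be drawn in proportion to the plan entries, so that nearby noise and data points are matched more often than distant ones.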

Protein torsion angles on a torus. Backbone phi/psi angles live on S^1 x S^1. This example trains on real angles from PDB 1BPI (BPTI), then measures sample quality by the Jensen-Shannon (JS) divergence between generated and observed Ramachandran histograms:

cargo run --release --example rfm_protein_torsions_1bpi
PDB 1BPI φ/ψ (n=56) as a torus via R^4 embedding
Ramachandran histogram JS divergence (lower is better):
- baseline (Gaussian decode): 0.6391
- trained  (RFM+minibatch OT): 0.4105
- ratio trained/baseline: 0.642
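
The output above mentions an R^4 embedding of the torus. One common choice, assumed here because the exact embedding is not spelled out, maps each angle to its (cos, sin) pair and decodes with `atan2`. The function names are hypothetical.

```rust
/// Embed a (phi, psi) pair, in radians, on S^1 x S^1 as a point in R^4.
fn embed_torus(phi: f64, psi: f64) -> [f64; 4] {
    [phi.cos(), phi.sin(), psi.cos(), psi.sin()]
}

/// Map an R^4 point back to angles in (-pi, pi].
/// `atan2` ignores the radius of each (cos, sin) pair, so points that have
/// drifted off the torus during integration still decode to valid angles.
fn decode_torus(p: [f64; 4]) -> (f64, f64) {
    (p[1].atan2(p[0]), p[3].atan2(p[2]))
}
```

Decoded angles can then be binned into a 2D phi/psi histogram for comparison against the observed one.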

Earthquake locations on a sphere. USGS M6+ earthquake epicenters (2024) mapped to S^2. Evaluation uses entropic OT cost between generated and observed locations:

cargo run --release --example rfm_usgs_earthquakes_sphere
USGS earthquakes (n=50), embedding=R^3 with S^2 projection
OT cost (lower is better):
- baseline (near-noise): 0.6496
- trained  (RFM+minibatch OT): 0.3129
- ratio trained/baseline: 0.482

Some generated samples (lat, lon):
   0: lat=  12.63°, lon= -104.96°
   1: lat=  58.20°, lon=  169.16°
   2: lat= -13.11°, lon= -167.62°
   3: lat= -35.47°, lon=  -79.28°
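
The (lat, lon) samples above imply a conversion between geographic coordinates and unit vectors in R^3, plus a projection back onto S^2. A minimal sketch of that round trip follows; the function names and the axis convention (z through the north pole, x through lat = 0, lon = 0) are assumptions, not the example's actual code.

```rust
/// Convert latitude/longitude in degrees to a unit vector on S^2.
fn latlon_to_unit(lat_deg: f64, lon_deg: f64) -> [f64; 3] {
    let (lat, lon) = (lat_deg.to_radians(), lon_deg.to_radians());
    [lat.cos() * lon.cos(), lat.cos() * lon.sin(), lat.sin()]
}

/// Project a nonzero point in R^3 onto the unit sphere by normalizing it.
fn project_to_sphere(p: [f64; 3]) -> [f64; 3] {
    let norm = (p[0] * p[0] + p[1] * p[1] + p[2] * p[2]).sqrt();
    [p[0] / norm, p[1] / norm, p[2] / norm]
}

/// Convert a unit vector back to (latitude, longitude) in degrees.
fn unit_to_latlon(p: [f64; 3]) -> (f64, f64) {
    // Clamp guards against |z| slightly above 1 from rounding, which would make asin return NaN.
    let lat = p[2].clamp(-1.0, 1.0).asin().to_degrees();
    let lon = p[1].atan2(p[0]).to_degrees();
    (lat, lon)
}
```

For instance, `unit_to_latlon(project_to_sphere(x))` turns an integrated sample `x` in R^3 into a printable location.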

Geodesics on the Poincare ball. Riemannian ODE integration on hyperbolic space, using the skel::Manifold trait implemented by hyperball:

cargo run --release --example rfm_poincare_geodesic_ode --features riemannian

See examples/ for more, including discrete CTMC, two-moons, conditional 2D, Burn backend variants (--features burn), Riemannian Poincare disk (--features riemannian), and cluster/community evaluation (--features sheaf-evals).

What it provides

Training: Semidiscrete FM, rectified flow matching with minibatch OT coupling, time schedules (uniform, U-shaped, logit-normal).

Sampling: Fixed-step ODE integrators (Euler, Heun) for Euclidean spaces and Riemannian manifolds.
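
For orientation, fixed-step Euler and Heun integration over t in [0, 1] can be sketched as below for the Euclidean case. The signatures are illustrative assumptions, not the crate's API, and a Riemannian version would replace the additive update with a step along the manifold (for example, via an exponential map).

```rust
/// Integrate dx/dt = field(x, t) from t = 0 to t = 1 with `steps` explicit Euler steps.
fn euler(field: impl Fn(&[f64], f64) -> Vec<f64>, x0: &[f64], steps: usize) -> Vec<f64> {
    let h = 1.0 / steps as f64;
    let mut x = x0.to_vec();
    for k in 0..steps {
        let t = k as f64 * h;
        let v = field(&x, t);
        for (xi, vi) in x.iter_mut().zip(&v) {
            *xi += h * vi;
        }
    }
    x
}

/// Same integration with Heun's method (second order, two field evaluations per step).
fn heun(field: impl Fn(&[f64], f64) -> Vec<f64>, x0: &[f64], steps: usize) -> Vec<f64> {
    let h = 1.0 / steps as f64;
    let mut x = x0.to_vec();
    for k in 0..steps {
        let t = k as f64 * h;
        let v0 = field(&x, t);
        // Euler predictor.
        let pred: Vec<f64> = x.iter().zip(&v0).map(|(xi, vi)| xi + h * vi).collect();
        let v1 = field(&pred, t + h);
        // Trapezoidal corrector: average the slopes at both ends of the step.
        for ((xi, a), b) in x.iter_mut().zip(&v0).zip(&v1) {
            *xi += 0.5 * h * (a + b);
        }
    }
    x
}
```

Heun costs two field evaluations per step but is typically more accurate at the same step count, which matters when each evaluation is a network forward pass.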

Coupling: Sinkhorn OT pairing, greedy matching, partial/selective pairing for outlier handling.

Discrete FM: CTMC generator scaffolding with cosine-squared schedule [3], conditional probability paths, conditional rate matrices.

Evaluation: JS divergence on histograms, entropic OT cost.
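
As a reference point for the histogram metric, the JS divergence between two histograms over the same bins can be sketched as follows. This version uses natural logarithms, so values range from 0 to ln 2; the function names, and the choice of logarithm base, are assumptions rather than a description of the crate's implementation.

```rust
/// Normalize non-negative bin counts (with a positive total) into a probability vector.
fn normalize(counts: &[f64]) -> Vec<f64> {
    let total: f64 = counts.iter().sum();
    counts.iter().map(|c| c / total).collect()
}

/// KL(p || q) in nats, skipping bins where p is zero (their contribution is 0).
fn kl(p: &[f64], q: &[f64]) -> f64 {
    p.iter()
        .zip(q)
        .filter(|(pi, _)| **pi > 0.0)
        .map(|(pi, qi)| pi * (pi / qi).ln())
        .sum()
}

/// Jensen-Shannon divergence between two histograms over the same bins.
fn js_divergence(p_counts: &[f64], q_counts: &[f64]) -> f64 {
    let (p, q) = (normalize(p_counts), normalize(q_counts));
    // Mixture m = (p + q) / 2 is positive wherever p or q is, so kl never divides by zero.
    let m: Vec<f64> = p.iter().zip(&q).map(|(a, b)| 0.5 * (a + b)).collect();
    0.5 * kl(&p, &m) + 0.5 * kl(&q, &m)
}
```

Identical histograms give 0, and histograms with no overlapping bins give ln 2.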

Status

MSRV: 1.80.

Tests

cargo test -p flowmatch                        # 85 tests
cargo test -p flowmatch --features burn        # + burn backend tests
cargo test -p flowmatch --features riemannian  # + Riemannian manifold tests
cargo test -p flowmatch --features sheaf-evals # + cluster/community eval tests

References

  1. Lipman et al., Flow Matching for Generative Modeling (2022)
  2. Lipman et al., Flow Matching Guide and Code (2024) -- comprehensive tutorial
  3. Gat et al., Discrete Flow Matching (NeurIPS 2024) -- CTMC-based discrete FM
  4. Chen & Lipman, Riemannian Flow Matching on General Geometries (2023)
  5. de Kruiff et al., Pullback Flow Matching on Data Manifolds (2024) -- FM on implicit manifolds without closed-form exp/log maps
  6. Sherry & Smets, Flow Matching on Lie Groups (2025) -- specialization to SO(3) and SE(3)
  7. Liu et al., Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow (2022) -- rectified flow

License

MIT OR Apache-2.0