Expand description
§wass
Optimal transport: move mass from one distribution to another at minimum cost.
§The Problem
Given two probability distributions (piles of sand), find the cheapest way to transform one into the other. The “cost” is how much mass moves times how far it moves.
§Key Functions
| Function | Use Case | Complexity |
|---|---|---|
wasserstein_1d | 1D distributions | O(n log n) |
sinkhorn | General transport (dense) | O(n² × iterations) |
sparse::solve_semidual_l2 | Sparse transport (L2) | O(n² × L-BFGS iter) |
earth_mover_distance | Exact transport | O(n³) |
§Quick Start
use wass::{wasserstein_1d, sinkhorn, sinkhorn_divergence};
use ndarray::array;
// 1D Wasserstein (fast, closed-form)
let a = [0.0, 0.25, 0.5, 0.25];
let b = [0.25, 0.5, 0.25, 0.0];
let w1 = wasserstein_1d(&a, &b);
// General transport with Sinkhorn
let cost = array![[0.0, 1.0], [1.0, 0.0]];
let a = array![0.5, 0.5];
let b = array![0.5, 0.5];
let (plan, distance) = sinkhorn(&a, &b, &cost, 0.1, 100);
// Sinkhorn Divergence (Robust Distribution Metric)
let div = sinkhorn_divergence(&a, &b, &cost, 0.1, 100);§Why Optimal Transport?
- Better distance metric: Unlike KL divergence, OT compares distributions with different supports (no divide-by-zero issues)
- Geometry-aware: Respects the underlying metric space
- Interpolation: Can create meaningful “in-between” distributions
§Applications in ML
- Wasserstein GAN: More stable training via W1 critic
- Domain adaptation: Align source/target feature distributions
- Embedding comparison: Compare document/image embeddings as distributions
- Fair ML: Measure/minimize distribution shift across groups
§Connections
rkhs: MMD vs Wasserstein—both compare distributionslogp: KL divergence vs optimal transportfynch: Differentiable sorting/ranking (includes a Sinkhorn-based sorter)sparse: Sparse OT with L2 regularization (interpretable alignments)
§What Can Go Wrong
- Sinkhorn not converging: Decrease epsilon or increase iterations.
- Numerical overflow: Large cost/small epsilon → exp overflow. Scale costs.
- Marginal mismatch: Sinkhorn assumes both margins sum to 1. Normalize inputs.
- Sliced approximation bias: Fewer projections = noisier estimate.
- 1D vs general:
wasserstein_1donly works for 1D histograms on same bins.
§References
- Kantorovich (1942). “On the Translocation of Masses”
- Cuturi (2013). “Sinkhorn Distances: Lightspeed Computation of Optimal Transport”
- Peyré & Cuturi (2019). “Computational Optimal Transport”
- Memoli (2011). “Gromov-Wasserstein Distances and Metric Measure Spaces”
- Blondel, Seguy, Rolet (2018). “Smooth and Sparse OT” (AISTATS)
- Chizat et al. (2018). “Scaling Algorithms for Unbalanced OT Problems”
- Sejourne et al. (2023). “Unbalanced OT Meets Sliced-Wasserstein”
- Rabin et al. (2012). “Wasserstein Barycenter and Its Application to Texture Mixing”
Re-exports§
pub use flow::flow_drift;pub use flow::VectorField;
Modules§
- flow
- Flow / drift primitives.
- gromov
- Entropic Gromov-Wasserstein optimal transport.
- semidiscrete
- Semidiscrete optimal transport (SD-OT) primitives.
- sparse
- Sparse optimal transport via L2 regularization.
Enums§
- Error
- Optimal Transport error variants.
Functions§
- earth_
mover_ distance - Approximate Earth Mover’s Distance (EMD) via log-domain Sinkhorn.
- euclidean_
cost_ matrix - Create Euclidean cost matrix from point positions.
- sinkhorn
- Sinkhorn algorithm for entropic-regularized optimal transport (matrix scaling).
- sinkhorn_
divergence Deprecated - Sinkhorn Divergence (de-biased entropic OT).
- sinkhorn_
divergence_ general - Correct Sinkhorn divergence for different supports.
- sinkhorn_
divergence_ same_ support - De-biased Sinkhorn divergence for distributions on the same support.
- sinkhorn_
log - Log-domain Sinkhorn algorithm for entropic optimal transport.
- sinkhorn_
log_ with_ convergence - Sinkhorn in log-space with a convergence check on marginal constraints.
- sinkhorn_
with_ convergence - Sinkhorn with convergence check.
- sliced_
wasserstein - Sliced Wasserstein distance – a scalable approximation for high dimensions.
- sq_
euclidean_ cost_ matrix - Create squared Euclidean cost matrix from point positions.
- unbalanced_
sinkhorn_ divergence_ general - Unbalanced Sinkhorn divergence for different supports.
- unbalanced_
sinkhorn_ divergence_ same_ support - Unbalanced Sinkhorn divergence for measures on the same support.
- unbalanced_
sinkhorn_ log_ with_ convergence - Unbalanced Sinkhorn in log-space (KL-penalized marginals), scaling form.
- wasserstein_
1d - 1D Wasserstein distance (Earth Mover’s Distance) via CDF integration.
Type Aliases§
- Result
- Result type for Optimal Transport operations.