rsomics-gradient-trajectory 0.1.0

Gradient/trajectory ANOVA over ordination coordinates (QIIME-style microbiome trajectory analysis): per-group trajectory vectors plus closed-form one-way ANOVA F/p, selectable algorithm — a Rust reimplementation of scikit-bio's skbio.stats.gradient.

rsomics-gradient-trajectory

Gradient/trajectory ANOVA over ordination coordinates — the QIIME-style microbiome trajectory analysis of skbio.stats.gradient, as a single fast binary. Given a precomputed ordination (e.g. PCoA), the per-axis proportion explained, and sample metadata, it builds a trajectory through the ordination space for each group in a category and runs one-way ANOVA to test whether the groups differ.

This crate analyses trajectories through an ordination; it does not compute the ordination itself.

Install

cargo install rsomics-gradient-trajectory

Usage

rsomics-gradient-trajectory coords.tsv \
    --prop prop.tsv \
    --metadata meta.tsv \
    --algorithm trajectory \
    --trajectory-categories Treatment \
    --sort-category Time \
    --axes 3
  • coords.tsv — samples × PC axes (#id header then one row per sample), or stdin with -.
  • --prop — proportion-explained vector, one value per axis.
  • --metadata — sample metadata (#SampleID header then id + columns).
  • --algorithm — trajectory (RMS, default), average, first-difference, or window-difference.
  • --sort-category — metadata column whose value orders samples within a group (the gradient axis, e.g. time). Natural-sorted exactly as scikit-bio does.
  • --trajectory-categories — comma-separated categories to analyse (all if omitted).
  • --axes — number of PC axes to use (default 3).
  • --weighted — weight trajectories by spacing in the numeric sort category.
  • --window-size — window for window-difference (default 3).
  • --csv — parse inputs as comma-separated.
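
As an illustration of the default trajectory algorithm, here is a minimal sketch. It assumes that a group's trajectory vector is the sequence of Euclidean distances between consecutive samples (already ordered by the sort category) and that the per-group summary is the 2-norm of that vector, which is how scikit-bio's trajectory algorithm behaves for groups with two or more samples. The function names are hypothetical and are not this crate's API.

```rust
/// Euclidean distance between two points of equal dimension.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Distances between consecutive samples of one group.
/// Assumption: `points` is already sorted along the gradient (e.g. by time)
/// and truncated to the chosen number of PC axes.
fn step_lengths(points: &[Vec<f64>]) -> Vec<f64> {
    points.windows(2).map(|w| euclidean(&w[0], &w[1])).collect()
}

/// 2-norm of a vector, used here as the per-group trajectory summary.
fn two_norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}
```

For three samples at (0, 0), (3, 4) and (3, 16), step_lengths gives the two step lengths 5 and 12, and two_norm of those steps is 13. Weighting and the other three algorithms are not covered by this sketch.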

Output is one block per category: a line with the ANOVA probability (or the skip message when ANOVA cannot run), then a line per group with its mean and trajectory components.

Origin

This crate is a Rust reimplementation of scikit-bio's skbio.stats.gradient (the GradientANOVA family — RMS trajectory, RMS average, first-difference, and windowed-difference algorithms) and the closed-form one-way ANOVA of scipy.stats.f_oneway. scikit-bio and scipy are BSD-licensed, so their source was read and is cited:

  • Method: ordination-trajectory gradient analysis, as in the QIIME 1 pipeline and the microbiome "movement through ordination space" framework. Caporaso et al., QIIME allows analysis of high-throughput community sequencing data, Nat. Methods 7:335-336 (2010), DOI 10.1038/nmeth.f.303; and Vázquez-Baeza et al., EMPeror: a tool for visualizing high-throughput microbial community data, GigaScience 2:16 (2013), DOI 10.1186/2047-217X-2-16.
  • skbio.stats.gradient (scikit-bio 0.7.2, BSD-3-Clause) — algorithm semantics.
  • scipy.stats.f_oneway / scipy.special.betainc (scipy, BSD) — the closed-form ANOVA p-value (F-distribution survival via the regularised incomplete beta function).
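
For reference, the closed form in the last item is the standard textbook identity for one-way ANOVA, written out below. The symbols are introduced here for illustration and do not come from the crate: k groups, N observations in total, group sizes n_j, group means \bar{y}_j, grand mean \bar{y}, and I_x(a, b) for the regularised incomplete beta function.

```latex
\[
F = \frac{\sum_{j=1}^{k} n_j (\bar{y}_j - \bar{y})^2 / (k - 1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 / (N - k)},
\qquad d_1 = k - 1, \quad d_2 = N - k
\]
\[
p = \Pr(F_{d_1, d_2} \ge F)
  = I_{x}\!\left(\tfrac{d_2}{2}, \tfrac{d_1}{2}\right),
\qquad x = \frac{d_2}{d_2 + d_1 F}
\]
```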

The computation is pure linear algebra (vector norms, means, differences) plus a closed-form ANOVA: no RNG, no iterative solver. Results therefore agree with scikit-bio to within ~1e-12: tests/compat.rs runs the differential test against scikit-bio, and a committed golden file captured from scikit-bio serves as the always-on regression check.
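
To make the "closed-form ANOVA" part concrete, the following is a minimal sketch of the one-way ANOVA F statistic computed from per-group values. It is a textbook formulation, not this crate's code: the function name, the return type, and the conditions under which it returns None are assumptions made for illustration. The p-value step (the F-distribution survival function) is left out.

```rust
/// One-way ANOVA F statistic with its (between, within) degrees of freedom.
/// Hypothetical helper: returns None when the statistic is undefined
/// (fewer than two groups, an empty group, or no residual degrees of freedom).
fn f_oneway(groups: &[Vec<f64>]) -> Option<(f64, f64, f64)> {
    let k = groups.len();
    let n: usize = groups.iter().map(|g| g.len()).sum();
    if k < 2 || n <= k || groups.iter().any(|g| g.is_empty()) {
        return None;
    }

    // Mean over all observations, ignoring group membership.
    let grand_mean = groups.iter().flatten().sum::<f64>() / n as f64;

    let mut ss_between = 0.0;
    let mut ss_within = 0.0;
    for g in groups {
        let mean = g.iter().sum::<f64>() / g.len() as f64;
        // Spread of the group mean around the grand mean, weighted by group size.
        ss_between += g.len() as f64 * (mean - grand_mean).powi(2);
        // Spread of the observations around their own group mean.
        ss_within += g.iter().map(|x| (x - mean).powi(2)).sum::<f64>();
    }

    let df_between = (k - 1) as f64;
    let df_within = (n - k) as f64;
    let f = (ss_between / df_between) / (ss_within / df_within);
    Some((f, df_between, df_within))
}
```

For the groups [1, 2, 3], [2, 3, 4] and [3, 4, 5], both sums of squares equal 6, so F = (6 / 2) / (6 / 6) = 3 with 2 and 6 degrees of freedom. The sketch does not guard against a zero within-group sum of squares, which would make F infinite or NaN.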

License: MIT OR Apache-2.0. Upstream credit: scikit-bio https://scikit-bio.org (BSD-3-Clause), scipy https://scipy.org (BSD-3-Clause).