Crate limma


limma — a pure-Rust port of the Bioconductor limma package (Linear Models for Microarray and RNA-seq Data), validated function by function against limma 3.68.3 running on R 4.6.0.

The crate has no BLAS/LAPACK, Python or R dependency at runtime: all linear algebra and special functions are implemented in pure Rust (see linalg and special), with distribution functions from statrs.

The current scope is the statistical core of limma — lmFit (least squares) -> contrasts.fit -> eBayes -> topTable/topTableF — with further functions ported incrementally. See README.md for the per-function status table. Microarray IO readers, plotting and annotation helpers are out of scope for a pure-Rust port.
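As a rough illustration of what the least-squares step of lmFit computes for a single gene, the sketch below solves the normal equations with a Cholesky factorization in plain Rust. It is not the crate's implementation and does not use its API: the function name `ols`, its signature and the row-major `Vec<Vec<f64>>` layout are assumptions made for this example only. The cost is O(n p^2) for n samples and p coefficients.

```rust
/// Hypothetical helper, not part of the limma crate API.
/// Solves min ||y - X b||^2 via the normal equations X'X b = X'y,
/// factoring X'X = L L' (Cholesky). `x` holds one row per sample.
/// Returns None if dimensions disagree or a Cholesky pivot is not
/// positive (for example an all-zero design column). Near-singular
/// designs are not reliably detected by this simple check.
fn ols(x: &[Vec<f64>], y: &[f64]) -> Option<Vec<f64>> {
    let n = x.len();
    if n == 0 || y.len() != n {
        return None;
    }
    let p = x[0].len();
    if p == 0 || x.iter().any(|row| row.len() != p) {
        return None;
    }

    // Accumulate the lower triangle of X'X and the vector X'y.
    let mut xtx = vec![vec![0.0_f64; p]; p];
    let mut xty = vec![0.0_f64; p];
    for (row, &yi) in x.iter().zip(y) {
        for j in 0..p {
            xty[j] += row[j] * yi;
            for k in 0..=j {
                xtx[j][k] += row[j] * row[k];
            }
        }
    }

    // Cholesky factor L (lower triangular), computed row by row.
    let mut l = vec![vec![0.0_f64; p]; p];
    for j in 0..p {
        for k in 0..=j {
            let s: f64 = (0..k).map(|m| l[j][m] * l[k][m]).sum();
            if j == k {
                let d = xtx[j][j] - s;
                if d <= 0.0 {
                    return None;
                }
                l[j][j] = d.sqrt();
            } else {
                l[j][k] = (xtx[j][k] - s) / l[k][k];
            }
        }
    }

    // Forward substitution: L z = X'y.
    let mut z = vec![0.0_f64; p];
    for j in 0..p {
        let s: f64 = (0..j).map(|m| l[j][m] * z[m]).sum();
        z[j] = (xty[j] - s) / l[j][j];
    }

    // Back substitution: L' b = z.
    let mut b = vec![0.0_f64; p];
    for j in (0..p).rev() {
        let s: f64 = (j + 1..p).map(|m| l[m][j] * b[m]).sum();
        b[j] = (z[j] - s) / l[j][j];
    }
    Some(b)
}
```

Calling `ols` with an intercept column and a 0/1 group indicator returns the baseline mean and the group difference. In a gene-wise setting the same design would be reused for every row of the expression matrix.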

Re-exports

pub use arrayweights::array_weights;
pub use arrayweights::array_weights_gene_by_gene;
pub use arrayweights::array_weights_prwts_reml;
pub use arrayweights::array_weights_quick;
pub use auroc::auroc;
pub use avereps::avearrays;
pub use avereps::avereps;
pub use batch::remove_batch_effect;
pub use beadcountweights::bead_count_weights;
pub use beadcountweights::BeadCountWeights;
pub use beadcountweights::BeadDispersion;
pub use bwss::bwss;
pub use bwss::bwss_matrix;
pub use bwss::Bwss;
pub use classifytestsf::classify_tests_f;
pub use classifytestsf::classify_tests_fstat;
pub use combine::make_unique;
pub use contrasts::contrasts_fit;
pub use contrasts::make_contrasts;
pub use cumoverlap::cum_overlap;
pub use cumoverlap::CumOverlap;
pub use decidetests::classify_tests_p;
pub use decidetests::decide_tests;
pub use decidetests::decide_tests_pvalues;
pub use decidetests::p_adjust;
pub use decidetests::Adjust;
pub use decidetests::DecideMethod;
pub use decidetests::TestResults;
pub use decidetests::TestResultsSummary;
pub use detectionpvalues::detection_p_values;
pub use diffsplice::diff_splice;
pub use diffsplice::top_splice;
pub use diffsplice::DiffSplice;
pub use diffsplice::SpliceSort;
pub use diffsplice::SpliceTest;
pub use diffsplice::TopSpliceRow;
pub use dups::avedups;
pub use dups::duplicate_correlation;
pub use dups::uniquegenelist;
pub use dups::unwrapdups;
pub use dups::DupCorOutput;
pub use ebayes::ebayes;
pub use fit::is_fullrank;
pub use fit::lmfit;
pub use fit::lmfit_weighted;
pub use fit::non_estimable;
pub use fit::MArrayLM;
pub use fitgamma::fit_gamma_intercept;
pub use fitmixture::fitmixture;
pub use fitmixture::FitMixture;
pub use genas::genas;
pub use genas::Genas;
pub use genas::GenasSubset;
pub use geneset::camera;
pub use geneset::camera_pr;
pub use geneset::contrast_as_coef;
pub use geneset::fry;
pub use geneset::gene_set_test;
pub use geneset::ids2indices;
pub use geneset::inter_gene_correlation;
pub use geneset::mroast;
pub use geneset::rank_sum_test_with_correlation;
pub use geneset::roast;
pub use geneset::romer;
pub use geneset::top_romer;
pub use geneset::wilcox_gst;
pub use geneset::Alternative;
pub use geneset::CameraResult;
pub use geneset::ContrastAsCoef;
pub use geneset::Direction;
pub use geneset::FryResult;
pub use geneset::FrySort;
pub use geneset::MroastRow;
pub use geneset::Roast;
pub use geneset::RomerAlternative;
pub use geneset::RomerRow;
pub use geneset::RomerStatistic;
pub use glsseries::gls_series;
pub use linalg::block_diag;
pub use logsumexp::logcosh;
pub use logsumexp::logsumexp;
pub use lowess::loess_fit;
pub use lowess::weighted_lowess;
pub use lowess::LoessFit;
pub use lowess::WeightedLowess;
pub use ma3x3::ma3x3_matrix;
pub use ma3x3::ma3x3_spottedarray;
pub use ma3x3::Ma3x3Fun;
pub use modelmatrix::model_matrix;
pub use modelmatrix::unique_targets;
pub use modelmatrix::ModelMatrix;
pub use modelmatrix::ModelParam;
pub use mrlm::mrlm;
pub use neqc::nec;
pub use neqc::neqc;
pub use neqc::normexp_fit_control;
pub use neqc::normexp_fit_detection_p;
pub use norm::normalize_between_arrays;
pub use norm::normalize_cyclic_loess;
pub use norm::normalize_median_abs_values;
pub use norm::normalize_median_values;
pub use norm::normalize_quantiles;
pub use norm::CyclicMethod;
pub use norm::NormalizeMethod;
pub use normexp::background_correct_matrix;
pub use normexp::normexp_fit_saddle;
pub use normexp::normexp_signal;
pub use normexp::BackgroundMethod;
pub use normexp::NormexpFit;
pub use normwithin::ma_from_rg;
pub use normwithin::normalize_within_arrays;
pub use normwithin::rg_from_ma;
pub use normwithin::PrinterLayout;
pub use normwithin::WithinArrayMethod;
pub use optim::nelder_mead;
pub use optim::nelder_mead_with;
pub use optim::NelderMead;
pub use poolvar::pool_var;
pub use poolvar::PoolVar;
pub use predfcm::pred_fcm;
pub use printtipweights::printtip_weights;
pub use propexpr::propexpr;
pub use proptruenull::convest;
pub use proptruenull::prop_true_null;
pub use proptruenull::PropTrueNullMethod;
pub use qqt::qqf;
pub use qqt::qqt;
pub use rng::qnorm;
pub use rng::RRng;
pub use selectmodel::select_model;
pub use selectmodel::SelectCriterion;
pub use selectmodel::SelectModelResult;
pub use sepchannel::design_i2a;
pub use sepchannel::design_i2m;
pub use sepchannel::exprs_ma;
pub use sepchannel::lmsc_fit;
pub use toptable::p_adjust_bh;
pub use toptable::top_table;
pub use toptable::top_table_f;
pub use toptable::top_treat;
pub use toptable::SortBy;
pub use toptable::TopRow;
pub use toptable::TopRowF;
pub use treat::treat;
pub use tricube::tricube_moving_average;
pub use voom::choose_lowess_span;
pub use voom::voom;
pub use voom::voom_with_quality_weights;
pub use voom::vooma;
pub use voom::vooma_by_group;
pub use voom::vooma_lm_fit;
pub use voom::VoomOutput;
pub use voom::VoomQualityWeights;
pub use voom::VoomaByGroupOutput;
pub use voom::VoomaLmFit;
pub use voom::VoomaOutput;
pub use weightedmedian::weighted_median;
pub use weights::as_matrix_weights;
pub use weights::modify_weights;
pub use wsva::wsva;
pub use zscore::t_zscore;
pub use zscore::zscore_from_log_tails;
pub use zscore::zscore_gamma;
pub use zscore::zscore_t;
pub use zscore::ZscoreTMethod;
pub use zscorehyper::zscore_hyper;

Modules

arrayweights: arrayWeights — estimate array quality weights.
auroc: Area under the ROC curve for empirical data.
avereps: Averaging over replicate probes (avereps) or replicate arrays (avearrays).
batch: Remove batch effects from an expression matrix. Port of limma’s removeBatchEffect (removeBatchEffect.R).
beadcountweights: beadCountWeights (beadCountWeights.R): bead-count quality weights for Illumina BeadChips. Each probe’s variance is split into a technical component predicted from the bead-level coefficient of variation and bead counts, and a biological component recovered from the residual variance after fitting the design. The returned weights are 1 / (tech + bio), row-normalised to mean 1.
bwss: Between- and within-group sums of squares.
classifytestsf: Nested F-test classification of t-statistics (limma classifyTestsF).
combine: Make a string vector unique (limma combine.R makeUnique).
contrasts: Contrasts of a fitted linear model. Ports limma’s contrasts.fit (contrasts.R, numeric-matrix path) and makeContrasts (modelmatrix.R).
cumoverlap: Cumulative overlap analysis of two ranked ID lists.
decidetests: Multiple-testing decisions. Port of limma’s decideTests (decidetests.R): the separate, global, hierarchical and nestedF methods, plus the p.adjust family limma uses (none, bonferroni, holm, BH/fdr, BY) and the row-wise .classifyTestsP classifier (classify_tests_p). For separate/global, NA p-values (missing-value data) are treated as NotSig rather than R’s NA outcome; nestedF errors on any NA F.p.value, as in limma. Complete-data inputs match R exactly.
detectionpvalues: Detection p-values from negative controls (limma detectionPValues).
diffsplice: Differential exon usage. Port of limma’s diffSplice and topSplice (diffSplice.R, topSplice.R).
dups: duplicateCorrelation (dups.R) and its helpers. Estimates the intra-block (or intra-duplicate) correlation of a series of arrays via REML, reproducing limma’s call into statmod’s mixedModel2Fit(only.varcomp=TRUE) and glmgam.fit. Only the weights = NULL path is implemented; weighted correlation estimation is out of scope for this pure-Rust port.
ebayes: Empirical Bayes moderation. Port of limma’s eBayes/.ebayes (ebayes.R), squeezeVar (squeezeVar.R), fitFDist (fitFDist.R), fitFDistRobustly (fitFDistRobustly.R) and the fstat.only path of classifyTestsF (decidetests.R). trend and robust are independently selectable, including their combination (trended-robust via loessFit).
fit: Gene-wise linear model fitting. Port of limma’s lmFit (lm.series / nonEstimable), least-squares path only.
fitgamma: Intercept of an additive Gamma GLM by Newton iteration.
fitmixture: Mixture-model fit by nonlinear least squares.
genas: Genuine association of gene expression profiles. Port of limma’s genas (genas.R).
geneset: Competitive gene-set tests: the deterministic, rank-based members of limma’s gene-set family.
glsseries: gls.series (lmfit.R): gene-wise generalized least squares allowing for a known correlation between duplicate spots (ndups) or between samples that share a block. This is the fitting engine lmFit dispatches to whenever ndups > 1 or block is supplied together with a correlation (typically the consensus.correlation from crate::duplicate_correlation).
io: Input/output for expression matrices, design and contrast matrices, and result tables. The delimited-text reader (read_matrix) lives behind the cli feature – it is the only consumer of the csv crate; everything else here (the design/contrast aligners and every table writer, including write_fit) is std-only and always compiled.
linalg: Small dense linear-algebra routines implemented in pure Rust so the crate has no BLAS/LAPACK system dependency. Design matrices in limma are n_samples x n_coef with very few columns, so simple O(n p^2) algorithms are more than fast enough.
logsumexp: Overflow-safe log(cosh(x)) and log(exp(x) + exp(y)).
lowess: Cleveland’s LOWESS scatterplot smoother: a faithful port of R’s lowess() / clowess (src/library/stats/src/lowess.c) and the weights = NULL classic path of limma’s loessFit (loessFit.R), together with the prior-weight-aware weighted_lowess (weightedLowess.R + src/weighted_lowess.c) used by the weighted path of loessFit.
ma3x3: 3x3 moving-window filters (limma background.R: ma3x3.matrix, ma3x3.spottedarray). These underlie backgroundCorrect(method = "movingmin"), replacing each spot’s value by a summary (typically min) over its 3x3 spatial neighbourhood. Border neighbours are treated as missing and dropped (na.rm = TRUE), as is any non-finite neighbour.
modelmatrix: Port of limma’s modelMatrix / uniqueTargets (modelmatrix.R): construct the design matrix for a two-color (Cy3/Cy5) microarray experiment from a table of target names. Each array contributes a log-ratio Cy5 − Cy3, and the design expresses those log-ratios in terms of either a common-reference parametrization or a caller-supplied parameter matrix.
mrlm: mrlm (lmfit.R): robust gene-wise regression, the engine lmFit(method = "robust") dispatches to. Each gene is fit by MASS::rlm’s iteratively reweighted least squares (IRLS), reproducing MASS 7.3.65’s rlm.default defaults exactly: init = "ls", psi = psi.huber (k = 1.345), scale.est = "MAD", wt.method = "inv.var", maxit = 20, acc = 1e-4, test.vec = "resid".
neqc: neqc.R: normexp background correction for Illumina BeadArray data.
norm: Between-array normalization of single-channel matrices. Port of limma’s normalizeBetweenArrays matrix path and its constituents normalizeQuantiles (quantile), normalizeMedianValues (scale) and normalizeCyclicLoess (cyclicloess). Two-colour (RGList/MAList) and vsn methods are out of scope for the pure-Rust statistical port.
normexp: normexp convolution model and background correction.
normwithin: Two-colour within-array normalization. Port of limma’s MA.RG (RGList -> MAList), RG.MA (the inverse), and normalizeWithinArrays for the deterministic intensity-dependent methods none/median/loess/printtiploess. The composite/control methods (which call R’s stats::loess restricted to control spots) and robustspline (normalizeRobustSpline) are out of scope for this pure-Rust statistical port.
optim: Nelder–Mead simplex minimizer, a faithful port of R’s nmmin (src/appl/optim.c), which backs optim(method = "Nelder-Mead").
poolvar: Satterthwaite pooling of sample variances.
predfcm: Predictive (empirical-Bayes-shrunk) fold changes. Pure-Rust port of limma’s predFCm (Phipson & Smyth).
printtipweights: printtipWeights (printtipWeights.R): print-tip array quality weights for two-colour arrays. Each print-tip block (a contiguous run of nspots rows) gets its own array-weight estimate via the gene-by-gene update algorithm — the same machinery as arrayWeights(method = "genebygene") but with a contr.sum variance design and a per-block prior — and the resulting narrays weights are broadcast across the block’s rows.
propexpr: Proportion of expressed probes per array (Shi & Smyth).
proptruenull: Proportion of true null hypotheses.
qqt: Quantile-quantile plot points for Student’s t and F distributions, ports of limma’s qqt and qqf. Only the numeric core is ported — the plotting arguments (plot.it, main, xlab, …) are out of scope — so each function returns (x, y), where x are the theoretical quantiles ordered to match the ranks of y and y is the input with NaNs removed.
rng: Bit-exact port of R’s default random number generator: the Mersenne-Twister uniform generator with R’s set.seed initialisation, plus the Inversion normal generator. This reproduces set.seed, runif and rnorm exactly, which the rotation gene-set tests (roast, romer) depend on for reproducible p-values.
selectmodel: Model selection by information criterion (limma selectModel, selmod.R).
sepchannel: Separate-channel analysis (limma sepchannel.R).
special: Special functions and distribution helpers.
splines: Natural cubic spline basis, a port of splines::ns(x, df, intercept=TRUE) (R’s splines package) and the B-spline machinery (splineDesign / de Boor) it needs. Used by the trended variance moderation in eBayes (fitFDist with a covariate).
toptable: Ranked result tables. Port of limma’s topTable (topTable / topTableF) plus the Benjamini-Hochberg adjustment provided by R’s p.adjust(method = "BH").
treat: Moderated t-tests relative to a fold-change threshold. Port of limma’s treat/topTreat (treat.R), default path only: trend=FALSE, robust=FALSE, upshot=FALSE. Trend/robust variants depend on the trended and robust squeezeVar paths, which are not yet ported.
tricube: Tricube-weighted moving average.
voom: Mean-variance modelling of count data at the observation level. Port of limma’s voom (voom.R) and the span heuristic chooseLowessSpan.
weightedmedian: Weighted median.
weights: Weight utilities (limma weights.R).
wsva: Weighted surrogate variable analysis. Port of limma’s wsva (wsva.R).
zscore: z-score equivalents of t-distribution deviates.
zscorehyper: Mid-p z-score equivalents of hypergeometric deviates.
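The Benjamini-Hochberg adjustment named under decidetests and toptable follows the step-up rule of R's p.adjust(method = "BH"): scale each p-value by n / rank, then take a running minimum from the largest p-value downwards. The sketch below is a standalone illustration of that rule. The function name `bh_adjust` and its signature are assumptions for this example and are not taken from the crate, and inputs are assumed to be finite values in [0, 1] with no missing entries.

```rust
/// Hypothetical helper, not part of the limma crate API.
/// Benjamini-Hochberg adjusted p-values, returned in input order.
/// Assumes every input is a finite probability in [0, 1].
fn bh_adjust(p: &[f64]) -> Vec<f64> {
    let n = p.len();

    // Indices sorted so that the largest p-value comes first.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| p[b].total_cmp(&p[a]));

    let mut adjusted = vec![0.0_f64; n];
    // Starting the running minimum at 1 also caps the result at 1.
    let mut running_min = 1.0_f64;
    for (k, &idx) in order.iter().enumerate() {
        // Rank of this p-value when sorted in ascending order (1-based).
        let rank = (n - k) as f64;
        let candidate = p[idx] * n as f64 / rank;
        running_min = running_min.min(candidate);
        adjusted[idx] = running_min;
    }
    adjusted
}
```

For the four p-values 0.01, 0.04, 0.03 and 0.005 the scaled values in ascending rank order are 0.02, 0.02, 0.04 and 0.04, so the adjusted values in input order are 0.02, 0.04, 0.04 and 0.02. The running minimum is what keeps the adjusted values monotone in the original p-values.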
