Crate limma

limma — a pure-Rust port of the Bioconductor limma package (Linear Models for Microarray and RNA-seq Data), validated function by function against limma 3.68.3 running on R 4.6.0.

The crate has no BLAS/LAPACK, Python or R dependency at runtime: all linear algebra and special functions are implemented in pure Rust (see linalg and special), with distribution functions from statrs.

The current scope is the statistical core of limma — lmFit (least squares) -> contrasts.fit -> eBayes -> topTable/topTableF — with further functions ported incrementally. See README.md for the per-function status table. Microarray IO readers, plotting and annotation helpers are out of scope for a pure-Rust port.

Re-exports

pub use arrayweights::array_weights;
pub use arrayweights::array_weights_gene_by_gene;
pub use arrayweights::array_weights_prwts_reml;
pub use arrayweights::array_weights_quick;
pub use auroc::auroc;
pub use avereps::avearrays;
pub use avereps::avereps;
pub use batch::remove_batch_effect;
pub use beadcountweights::bead_count_weights;
pub use beadcountweights::BeadCountWeights;
pub use beadcountweights::BeadDispersion;
pub use bwss::bwss;
pub use bwss::bwss_matrix;
pub use bwss::Bwss;
pub use classifytestsf::classify_tests_f;
pub use classifytestsf::classify_tests_fstat;
pub use combine::make_unique;
pub use contrasts::contrasts_fit;
pub use contrasts::make_contrasts;
pub use cumoverlap::cum_overlap;
pub use cumoverlap::CumOverlap;
pub use decidetests::classify_tests_p;
pub use decidetests::decide_tests;
pub use decidetests::decide_tests_pvalues;
pub use decidetests::p_adjust;
pub use decidetests::Adjust;
pub use decidetests::DecideMethod;
pub use decidetests::TestResults;
pub use decidetests::TestResultsSummary;
pub use detectionpvalues::detection_p_values;
pub use diffsplice::diff_splice;
pub use diffsplice::top_splice;
pub use diffsplice::DiffSplice;
pub use diffsplice::SpliceSort;
pub use diffsplice::SpliceTest;
pub use diffsplice::TopSpliceRow;
pub use dups::avedups;
pub use dups::duplicate_correlation;
pub use dups::uniquegenelist;
pub use dups::unwrapdups;
pub use dups::DupCorOutput;
pub use ebayes::ebayes;
pub use fit::is_fullrank;
pub use fit::lmfit;
pub use fit::lmfit_weighted;
pub use fit::non_estimable;
pub use fit::MArrayLM;
pub use fitgamma::fit_gamma_intercept;
pub use fitmixture::fitmixture;
pub use fitmixture::FitMixture;
pub use genas::genas;
pub use genas::Genas;
pub use genas::GenasSubset;
pub use geneset::camera;
pub use geneset::camera_pr;
pub use geneset::contrast_as_coef;
pub use geneset::fry;
pub use geneset::gene_set_test;
pub use geneset::ids2indices;
pub use geneset::inter_gene_correlation;
pub use geneset::mroast;
pub use geneset::rank_sum_test_with_correlation;
pub use geneset::roast;
pub use geneset::romer;
pub use geneset::top_romer;
pub use geneset::wilcox_gst;
pub use geneset::Alternative;
pub use geneset::CameraResult;
pub use geneset::ContrastAsCoef;
pub use geneset::Direction;
pub use geneset::FryResult;
pub use geneset::FrySort;
pub use geneset::MroastRow;
pub use geneset::Roast;
pub use geneset::RomerAlternative;
pub use geneset::RomerRow;
pub use geneset::RomerStatistic;
pub use glsseries::gls_series;
pub use linalg::block_diag;
pub use logsumexp::logcosh;
pub use logsumexp::logsumexp;
pub use lowess::loess_fit;
pub use lowess::weighted_lowess;
pub use lowess::LoessFit;
pub use lowess::WeightedLowess;
pub use ma3x3::ma3x3_matrix;
pub use ma3x3::ma3x3_spottedarray;
pub use ma3x3::Ma3x3Fun;
pub use modelmatrix::model_matrix;
pub use modelmatrix::unique_targets;
pub use modelmatrix::ModelMatrix;
pub use modelmatrix::ModelParam;
pub use mrlm::mrlm;
pub use neqc::nec;
pub use neqc::neqc;
pub use neqc::normexp_fit_control;
pub use neqc::normexp_fit_detection_p;
pub use norm::normalize_between_arrays;
pub use norm::normalize_cyclic_loess;
pub use norm::normalize_median_abs_values;
pub use norm::normalize_median_values;
pub use norm::normalize_quantiles;
pub use norm::CyclicMethod;
pub use norm::NormalizeMethod;
pub use normexp::background_correct_matrix;
pub use normexp::normexp_fit_saddle;
pub use normexp::normexp_signal;
pub use normexp::BackgroundMethod;
pub use normexp::NormexpFit;
pub use normwithin::ma_from_rg;
pub use normwithin::normalize_within_arrays;
pub use normwithin::rg_from_ma;
pub use normwithin::PrinterLayout;
pub use normwithin::WithinArrayMethod;
pub use optim::nelder_mead;
pub use optim::nelder_mead_with;
pub use optim::NelderMead;
pub use poolvar::pool_var;
pub use poolvar::PoolVar;
pub use predfcm::pred_fcm;
pub use printtipweights::printtip_weights;
pub use propexpr::propexpr;
pub use proptruenull::convest;
pub use proptruenull::prop_true_null;
pub use proptruenull::PropTrueNullMethod;
pub use qqt::qqf;
pub use qqt::qqt;
pub use rng::qnorm;
pub use rng::RRng;
pub use selectmodel::select_model;
pub use selectmodel::SelectCriterion;
pub use selectmodel::SelectModelResult;
pub use sepchannel::design_i2a;
pub use sepchannel::design_i2m;
pub use sepchannel::exprs_ma;
pub use sepchannel::lmsc_fit;
pub use toptable::p_adjust_bh;
pub use toptable::top_table;
pub use toptable::top_table_f;
pub use toptable::top_treat;
pub use toptable::SortBy;
pub use toptable::TopRow;
pub use toptable::TopRowF;
pub use treat::treat;
pub use tricube::tricube_moving_average;
pub use voom::choose_lowess_span;
pub use voom::voom;
pub use voom::voom_with_quality_weights;
pub use voom::vooma;
pub use voom::vooma_by_group;
pub use voom::vooma_lm_fit;
pub use voom::VoomOutput;
pub use voom::VoomQualityWeights;
pub use voom::VoomaByGroupOutput;
pub use voom::VoomaLmFit;
pub use voom::VoomaOutput;
pub use weightedmedian::weighted_median;
pub use weights::as_matrix_weights;
pub use weights::modify_weights;
pub use wsva::wsva;
pub use zscore::t_zscore;
pub use zscore::zscore_from_log_tails;
pub use zscore::zscore_gamma;
pub use zscore::zscore_t;
pub use zscore::ZscoreTMethod;
pub use zscorehyper::zscore_hyper;

Modules

arrayweights
arrayWeights — estimate array quality weights.
auroc
Area under the ROC curve for empirical data.
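As a rough illustration of what an empirical AUC measures, the sketch below uses the Mann-Whitney identity: the fraction of (positive, negative) pairs in which the positive case scores higher, with ties counted as one half. The function name `auroc_pairs` and its signature are hypothetical and are not this crate's `auroc` API; the quadratic pair loop is chosen for clarity, not speed.

```rust
/// Empirical AUC via the Mann-Whitney identity (hypothetical helper,
/// not the crate's `auroc`). Ties count as half a win.
/// Returns None when either class is empty.
fn auroc_pairs(truth: &[bool], score: &[f64]) -> Option<f64> {
    assert_eq!(truth.len(), score.len(), "one score per label");
    let mut wins = 0.0;
    let mut pairs = 0u64;
    for (i, &ti) in truth.iter().enumerate() {
        if !ti {
            continue; // outer loop visits positives only
        }
        for (j, &tj) in truth.iter().enumerate() {
            if tj {
                continue; // inner loop visits negatives only
            }
            pairs += 1;
            if score[i] > score[j] {
                wins += 1.0;
            } else if score[i] == score[j] {
                wins += 0.5;
            }
        }
    }
    if pairs == 0 {
        None
    } else {
        Some(wins / pairs as f64)
    }
}
```

Calling `auroc_pairs(&[true, false, true, false], &[0.9, 0.8, 0.3, 0.1])` compares four pairs, of which the positive case wins three.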
avereps
Averaging over replicate probes (avereps) or replicate arrays (avearrays).
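The idea of averaging replicate probes can be sketched as grouping rows by ID and taking column-wise means. This is a minimal standalone sketch: `average_replicate_rows` is a hypothetical name, IDs are kept in order of first appearance, and missing values are not handled; none of this is taken from the crate's `avereps` signature.

```rust
use std::collections::HashMap;

/// Average rows of `x` that share an ID (hypothetical helper).
/// Assumes every row has the same length and contains no missing values.
fn average_replicate_rows(ids: &[&str], x: &[Vec<f64>]) -> (Vec<String>, Vec<Vec<f64>>) {
    assert_eq!(ids.len(), x.len(), "one ID per row");
    let mut index: HashMap<&str, usize> = HashMap::new();
    let mut out_ids: Vec<String> = Vec::new();
    let mut sums: Vec<Vec<f64>> = Vec::new();
    let mut counts: Vec<usize> = Vec::new();
    for (&id, row) in ids.iter().zip(x) {
        // Look up the output row for this ID, creating it on first sight.
        let k = match index.get(id).copied() {
            Some(k) => k,
            None => {
                let k = out_ids.len();
                index.insert(id, k);
                out_ids.push(id.to_string());
                sums.push(vec![0.0; row.len()]);
                counts.push(0);
                k
            }
        };
        for (s, &v) in sums[k].iter_mut().zip(row) {
            *s += v;
        }
        counts[k] += 1;
    }
    // Turn column sums into column means.
    for (row, &c) in sums.iter_mut().zip(&counts) {
        for s in row.iter_mut() {
            *s /= c as f64;
        }
    }
    (out_ids, sums)
}
```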
batch
Remove batch effects from an expression matrix. Port of limma’s removeBatchEffect (removeBatchEffect.R).
beadcountweights
beadCountWeights (beadCountWeights.R): bead-count quality weights for Illumina BeadChips. Each probe’s variance is split into a technical component predicted from the bead-level coefficient of variation and bead counts, and a biological component recovered from the residual variance after fitting the design. The returned weights are 1 / (tech + bio), row-normalised to mean 1.
bwss
Between- and within-group sums of squares.
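For reference, the decomposition can be sketched for a single vector as follows. The helper `between_within_ss` and its `(bss, wss)` return shape are assumptions made for illustration; the crate's `bwss` and `Bwss` types may expose more (for example degrees of freedom).

```rust
/// Between- and within-group sums of squares for one vector
/// (hypothetical helper). `group[i]` is a zero-based group index.
fn between_within_ss(x: &[f64], group: &[usize]) -> (f64, f64) {
    assert_eq!(x.len(), group.len(), "one group label per value");
    let n_groups = group.iter().max().map_or(0, |&g| g + 1);
    let mut sum = vec![0.0; n_groups];
    let mut count = vec![0usize; n_groups];
    for (&v, &g) in x.iter().zip(group) {
        sum[g] += v;
        count[g] += 1;
    }
    let grand_mean = x.iter().sum::<f64>() / x.len() as f64;
    // Between: group size times squared deviation of group mean from grand mean.
    let mut bss = 0.0;
    for g in 0..n_groups {
        if count[g] > 0 {
            let m = sum[g] / count[g] as f64;
            bss += count[g] as f64 * (m - grand_mean).powi(2);
        }
    }
    // Within: squared deviation of each value from its own group mean.
    let wss: f64 = x
        .iter()
        .zip(group)
        .map(|(&v, &g)| (v - sum[g] / count[g] as f64).powi(2))
        .sum();
    (bss, wss)
}
```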
classifytestsf
Nested F-test classification of t-statistics (limma classifyTestsF).
combine
Make a string vector unique (limma combine.R makeUnique).
contrasts
Contrasts of a fitted linear model. Ports limma’s contrasts.fit (contrasts.R, numeric-matrix path) and makeContrasts (modelmatrix.R).
cumoverlap
Cumulative overlap analysis of two ranked ID lists.
decidetests
Multiple-testing decisions. Port of limma’s decideTests (decidetests.R): the separate, global, hierarchical and nestedF methods, plus the p.adjust family limma uses (none, bonferroni, holm, BH/fdr, BY) and the row-wise .classifyTestsP classifier (classify_tests_p). For separate/global, NA p-values (missing-value data) are treated as NotSig rather than R’s NA outcome; nestedF errors on any NA F.p.value, as in limma. Complete-data inputs match R exactly.
detectionpvalues
Detection p-values from negative controls (limma detectionPValues).
diffsplice
Differential exon usage. Port of limma’s diffSplice and topSplice (diffSplice.R, topSplice.R).
dups
duplicateCorrelation (dups.R) and its helpers. Estimates the intra-block (or intra-duplicate) correlation of a series of arrays via REML, reproducing limma’s call into statmod’s mixedModel2Fit(only.varcomp=TRUE) and glmgam.fit. Only the weights = NULL path is implemented; weighted correlation estimation is out of scope for this pure-Rust port.
ebayes
Empirical Bayes moderation. Port of limma’s eBayes/.ebayes (ebayes.R), squeezeVar (squeezeVar.R), fitFDist (fitFDist.R), fitFDistRobustly (fitFDistRobustly.R) and the fstat.only path of classifyTestsF (decidetests.R). trend and robust are independently selectable, including their combination (trended-robust via loessFit).
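The core of the moderation step is a weighted average of each gene's residual variance with a prior variance, followed by a t-statistic computed with the shrunken variance. The sketch below assumes the prior variance and prior degrees of freedom are already known (estimating them is the job of fitFDist and is not shown); the function names and signatures are hypothetical and are not the crate's `ebayes` API.

```rust
/// Posterior (squeezed) variance: a degrees-of-freedom-weighted average of
/// the prior variance and the observed residual variance.
/// Hypothetical helper; hyperparameters are taken as given.
fn squeeze_var(s2: f64, df: f64, s2_prior: f64, df_prior: f64) -> f64 {
    if df_prior.is_infinite() {
        // Infinite prior df means complete shrinkage to the prior.
        return s2_prior;
    }
    (df_prior * s2_prior + df * s2) / (df_prior + df)
}

/// Moderated t-statistic: the ordinary t with the residual variance
/// replaced by the posterior variance.
fn moderated_t(coef: f64, stdev_unscaled: f64, s2_post: f64) -> f64 {
    coef / (stdev_unscaled * s2_post.sqrt())
}
```

With `df = 4`, `s2 = 3`, `df_prior = 4` and `s2_prior = 1`, the posterior variance is `(4*1 + 4*3) / 8 = 2`; the moderated t then has `df + df_prior` degrees of freedom.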
fit
Gene-wise linear model fitting. Port of limma’s lmFit (lm.series / nonEstimable), least-squares path only.
fitgamma
Intercept of an additive Gamma GLM by Newton iteration.
fitmixture
Mixture-model fit by nonlinear least squares.
genas
Genuine association of gene expression profiles. Port of limma’s genas (genas.R).
geneset
Competitive gene-set tests: the deterministic, rank-based members of limma’s gene-set family.
glsseries
gls.series (lmfit.R): gene-wise generalized least squares allowing for a known correlation between duplicate spots (ndups) or between samples that share a block. This is the fitting engine lmFit dispatches to whenever ndups > 1 or block is supplied together with a correlation (typically the consensus.correlation from crate::duplicate_correlation).
io
Input/output for expression matrices, design and contrast matrices, and result tables. The delimited-text reader (read_matrix) lives behind the cli feature – it is the only consumer of the csv crate; everything else here (the design/contrast aligners and every table writer, including write_fit) is std-only and always compiled.
linalg
Small dense linear-algebra routines implemented in pure Rust so the crate has no BLAS/LAPACK system dependency. Design matrices in limma are n_samples x n_coef with very few columns, so simple O(n p^2) algorithms are more than fast enough.
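To make the O(n p^2) remark concrete, here is a minimal sketch of least squares through the normal equations and a Cholesky factorisation. It is an illustration only: the function `ols_normal_equations` is hypothetical, the normal-equations route is a swapped-in technique chosen for brevity, and nothing here is claimed to match the routines actually used in this module.

```rust
/// Least-squares coefficients via X'X b = X'y and Cholesky (hypothetical
/// helper). `x` holds one row per sample; all rows are assumed to have the
/// same length. Returns None if X'X is not positive definite.
fn ols_normal_equations(x: &[Vec<f64>], y: &[f64]) -> Option<Vec<f64>> {
    let n = x.len();
    let p = x.first()?.len();
    if y.len() != n {
        return None;
    }
    // Accumulate the lower triangle of X'X and the vector X'y: O(n p^2).
    let mut xtx = vec![vec![0.0; p]; p];
    let mut xty = vec![0.0; p];
    for (row, &yi) in x.iter().zip(y) {
        for j in 0..p {
            xty[j] += row[j] * yi;
            for k in 0..=j {
                xtx[j][k] += row[j] * row[k];
            }
        }
    }
    // Cholesky factor L (lower triangular) with X'X = L L'.
    let mut l = vec![vec![0.0; p]; p];
    for j in 0..p {
        for k in 0..=j {
            let mut s = xtx[j][k];
            for m in 0..k {
                s -= l[j][m] * l[k][m];
            }
            if j == k {
                if s <= 0.0 {
                    return None;
                }
                l[j][j] = s.sqrt();
            } else {
                l[j][k] = s / l[k][k];
            }
        }
    }
    // Forward solve L z = X'y.
    let mut z = vec![0.0; p];
    for j in 0..p {
        let mut s = xty[j];
        for m in 0..j {
            s -= l[j][m] * z[m];
        }
        z[j] = s / l[j][j];
    }
    // Back solve L' b = z.
    let mut b = vec![0.0; p];
    for j in (0..p).rev() {
        let mut s = z[j];
        for m in (j + 1)..p {
            s -= l[m][j] * b[m];
        }
        b[j] = s / l[j][j];
    }
    Some(b)
}
```

For a two-group design with an intercept and a group indicator, the returned coefficients are the first group's mean and the difference between group means.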
logsumexp
Overflow-safe log(cosh(x)) and log(exp(x) + exp(y)).
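The standard way to keep both quantities finite is to factor out the largest exponent before exponentiating. The sketch below shows that rearrangement; the names `log_sum_exp2` and `log_cosh` are hypothetical stand-ins, and the crate's own `logsumexp` and `logcosh` may use different formulas or thresholds.

```rust
/// log(exp(x) + exp(y)) computed as max + log1p(exp(-|x - y|)),
/// so the exponential never overflows (hypothetical helper).
fn log_sum_exp2(x: f64, y: f64) -> f64 {
    let m = x.max(y);
    if m.is_infinite() {
        // Avoids inf - inf = NaN when both inputs are the same infinity.
        return m;
    }
    m + (-(x - y).abs()).exp().ln_1p()
}

/// log(cosh(x)) computed as |x| + log1p(exp(-2|x|)) - ln 2,
/// which stays finite for large |x| where cosh(x) would overflow.
fn log_cosh(x: f64) -> f64 {
    let a = x.abs();
    a + (-2.0 * a).exp().ln_1p() - std::f64::consts::LN_2
}
```

A naive `(x.exp() + y.exp()).ln()` returns infinity for `x = y = 1000.0`, whereas the rearranged form returns `1000 + ln 2`.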
lowess
Cleveland’s LOWESS scatterplot smoother: a faithful port of R’s lowess() / clowess (src/library/stats/src/lowess.c) and the weights = NULL classic path of limma’s loessFit (loessFit.R), together with the prior-weight-aware weighted_lowess (weightedLowess.R + src/weighted_lowess.c) used by the weighted path of loessFit.
ma3x3
3x3 moving-window filters (limma background.R: ma3x3.matrix, ma3x3.spottedarray). These underlie backgroundCorrect(method = "movingmin"), replacing each spot’s value by a summary (typically min) over its 3x3 spatial neighbourhood. Border neighbours are treated as missing and dropped (na.rm = TRUE), as is any non-finite neighbour.
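A moving minimum with the border and non-finite handling described above can be sketched as follows. The function `moving_min_3x3` is hypothetical, the window is assumed to include the centre spot, and returning NaN when a neighbourhood has no finite value is an assumption of this sketch rather than documented behaviour.

```rust
/// 3x3 moving minimum over a row-major matrix (hypothetical helper).
/// Out-of-bounds and non-finite neighbours are skipped; a window with no
/// finite value yields NaN (an assumption of this sketch).
fn moving_min_3x3(x: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let nrow = x.len();
    let ncol = x.first().map_or(0, |r| r.len());
    let mut out = vec![vec![f64::NAN; ncol]; nrow];
    for i in 0..nrow {
        for j in 0..ncol {
            let mut best = f64::INFINITY;
            let mut found = false;
            // Clamp the window to the matrix so border cells see fewer neighbours.
            for ii in i.saturating_sub(1)..=(i + 1).min(nrow - 1) {
                for jj in j.saturating_sub(1)..=(j + 1).min(ncol - 1) {
                    let v = x[ii][jj];
                    if v.is_finite() {
                        best = best.min(v);
                        found = true;
                    }
                }
            }
            if found {
                out[i][j] = best;
            }
        }
    }
    out
}
```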
modelmatrix
Port of limma’s modelMatrix / uniqueTargets (modelmatrix.R): construct the design matrix for a two-color (Cy3/Cy5) microarray experiment from a table of target names. Each array contributes a log-ratio Cy5 − Cy3, and the design expresses those log-ratios in terms of either a common-reference parametrization or a caller-supplied parameter matrix.
mrlm
mrlm (lmfit.R): robust gene-wise regression, the engine lmFit(method = "robust") dispatches to. Each gene is fit by MASS::rlm’s iteratively reweighted least squares (IRLS), reproducing MASS 7.3.65’s rlm.default defaults exactly: init = "ls", psi = psi.huber (k = 1.345), scale.est = "MAD", wt.method = "inv.var", maxit = 20, acc = 1e-4, test.vec = "resid".
neqc
neqc.R: normexp background correction for Illumina BeadArray data.
norm
Between-array normalization of single-channel matrices. Port of limma’s normalizeBetweenArrays matrix path and its constituents normalizeQuantiles (quantile), normalizeMedianValues (scale) and normalizeCyclicLoess (cyclicloess). Two-colour (RGList/MAList) and vsn methods are out of scope for the pure-Rust statistical port.
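The idea behind the quantile method can be sketched in a few lines: sort each array, average the sorted values across arrays rank by rank, and write those averages back in each array's original order. This simplified sketch ignores ties and missing values, which a faithful normalizeQuantiles port has to handle; `quantile_normalize_simple` is a hypothetical name, and the input is assumed to be a list of equal-length columns (one per array).

```rust
/// Simplified quantile normalization (hypothetical helper).
/// `columns[k]` is one array; ties and missing values are not treated specially.
fn quantile_normalize_simple(columns: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = columns.first().map_or(0, |c| c.len());
    assert!(columns.iter().all(|c| c.len() == n), "equal-length columns");
    // For each column, the row indices in increasing order of value.
    let orders: Vec<Vec<usize>> = columns
        .iter()
        .map(|c| {
            let mut idx: Vec<usize> = (0..n).collect();
            idx.sort_by(|&a, &b| c[a].total_cmp(&c[b]));
            idx
        })
        .collect();
    // Target distribution: mean of the r-th smallest value across columns.
    let mut target = vec![0.0; n];
    for (c, ord) in columns.iter().zip(&orders) {
        for (r, &i) in ord.iter().enumerate() {
            target[r] += c[i];
        }
    }
    for t in target.iter_mut() {
        *t /= columns.len() as f64;
    }
    // Each value is replaced by the target quantile matching its rank.
    orders
        .iter()
        .map(|ord| {
            let mut out = vec![0.0; n];
            for (r, &i) in ord.iter().enumerate() {
                out[i] = target[r];
            }
            out
        })
        .collect()
}
```

After this step every column contains exactly the same set of values, only in different row order.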
normexp
normexp convolution model and background correction.
normwithin
Two-colour within-array normalization. Port of limma’s MA.RG (RGList -> MAList), RG.MA (the inverse), and normalizeWithinArrays for the deterministic intensity-dependent methods none/median/loess/printtiploess. The composite/control methods (which call R’s stats::loess restricted to control spots) and robustspline (normalizeRobustSpline) are out of scope for this pure-Rust statistical port.
optim
Nelder–Mead simplex minimizer, a faithful port of R’s nmmin (src/appl/optim.c), which backs optim(method = "Nelder-Mead").
poolvar
Satterthwaite pooling of sample variances.
predfcm
Predictive (empirical-Bayes-shrunk) fold changes. Pure-Rust port of limma’s predFCm (Phipson & Smyth).
printtipweights
printtipWeights (printtipWeights.R): print-tip array quality weights for two-colour arrays. Each print-tip block (a contiguous run of nspots rows) gets its own array-weight estimate via the gene-by-gene update algorithm — the same machinery as arrayWeights(method = "genebygene") but with a contr.sum variance design and a per-block prior — and the resulting narrays weights are broadcast across the block’s rows.
propexpr
Proportion of expressed probes per array (Shi & Smyth).
proptruenull
Proportion of true null hypotheses.
qqt
Quantile-quantile plot points for Student’s t and F distributions, ports of limma’s qqt and qqf. Only the numeric core is ported — the plotting arguments (plot.it, main, xlab, …) are out of scope — so each function returns (x, y), where x are the theoretical quantiles ordered to match the ranks of y and y is the input with NaNs removed.
rng
Bit-exact port of R’s default random number generator: the Mersenne-Twister uniform generator with R’s set.seed initialisation, plus the Inversion normal generator. This reproduces set.seed, runif and rnorm exactly, which the rotation gene-set tests (roast, romer) depend on for reproducible p-values.
selectmodel
Model selection by information criterion (limma selectModel, selmod.R).
sepchannel
Separate-channel analysis (limma sepchannel.R).
special
Special functions and distribution helpers.
splines
Natural cubic spline basis, a port of splines::ns(x, df, intercept=TRUE) (R’s splines package) and the B-spline machinery (splineDesign / de Boor) it needs. Used by the trended variance moderation in eBayes (fitFDist with a covariate).
toptable
Ranked result tables. Port of limma’s topTable (topTable / topTableF) plus the Benjamini-Hochberg adjustment provided by R’s p.adjust(method = "BH").
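The Benjamini-Hochberg adjustment itself is short: walk the p-values from largest to smallest, scale each by n / rank, and carry a running minimum so the adjusted values stay monotone. The sketch below follows that recipe; `bh_adjust` is a hypothetical name rather than the crate's `p_adjust_bh`, and NaN inputs are assumed absent.

```rust
/// Benjamini-Hochberg adjusted p-values (hypothetical helper).
/// Assumes all inputs are finite values in [0, 1].
fn bh_adjust(p: &[f64]) -> Vec<f64> {
    let n = p.len();
    // Indices sorted by decreasing p-value.
    let mut order: Vec<usize> = (0..n).collect();
    order.sort_by(|&a, &b| p[b].total_cmp(&p[a]));
    let mut out = vec![0.0; n];
    let mut running_min: f64 = 1.0; // also caps adjusted values at 1
    for (k, &idx) in order.iter().enumerate() {
        let rank = (n - k) as f64; // rank in increasing order, 1-based
        running_min = running_min.min(p[idx] * n as f64 / rank);
        out[idx] = running_min;
    }
    out
}
```

For the input `[0.01, 0.04, 0.03, 0.005]` the scaled values in decreasing-p order are 0.04, 0.04, 0.02 and 0.02, so the adjusted vector in the original order is `[0.02, 0.04, 0.04, 0.02]`.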
treat
Moderated t-tests relative to a fold-change threshold. Port of limma’s treat/topTreat (treat.R), default path only: trend=FALSE, robust=FALSE, upshot=FALSE. Trend/robust variants depend on the trended and robust squeezeVar paths, which are not yet ported.
tricube
Tricube-weighted moving average.
voom
Mean-variance modelling of count data at the observation level. Port of limma’s voom (voom.R) and the span heuristic chooseLowessSpan.
weightedmedian
Weighted median.
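One common convention for a weighted median is sketched below: sort the values, accumulate normalised weights, and return the first value whose cumulative weight exceeds one half, averaging two adjacent values when the cumulative weight lands exactly on one half. The name `weighted_median_sketch` and the tie-breaking rule are assumptions of this example and may differ from the crate's `weighted_median`.

```rust
/// Weighted median under one common convention (hypothetical helper).
/// Returns None for empty input, mismatched lengths, or non-positive total weight.
fn weighted_median_sketch(x: &[f64], w: &[f64]) -> Option<f64> {
    if x.is_empty() || x.len() != w.len() {
        return None;
    }
    let total: f64 = w.iter().sum();
    if !(total > 0.0) {
        return None;
    }
    // Visit values in increasing order, carrying their weights along.
    let mut order: Vec<usize> = (0..x.len()).collect();
    order.sort_by(|&a, &b| x[a].total_cmp(&x[b]));
    let mut cum = 0.0;
    for (k, &i) in order.iter().enumerate() {
        cum += w[i];
        let frac = cum / total;
        if frac > 0.5 {
            return Some(x[i]);
        }
        if frac == 0.5 {
            // Exactly half the weight lies at or below x[i]: average with the next value.
            return match order.get(k + 1) {
                Some(&next) => Some((x[i] + x[next]) / 2.0),
                None => Some(x[i]),
            };
        }
    }
    // Reached only if rounding kept the cumulative fraction below one half.
    order.last().map(|&i| x[i])
}
```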
weights
Weight utilities (limma weights.R).
wsva
Weighted surrogate variable analysis. Port of limma’s wsva (wsva.R).
zscore
z-score equivalents of t-distribution deviates.
zscorehyper
Mid-p z-score equivalents of hypergeometric deviates.