
Module glsseries


gls.series (lmfit.R): gene-wise generalized least squares allowing for a known correlation between duplicate spots (ndups) or between samples that share a block. This is the fitting engine lmFit dispatches to whenever ndups > 1 or block is supplied together with a correlation (typically the consensus.correlation from crate::duplicate_correlation).

Two cormatrix constructions are reproduced:

  • duplicates: cormatrix = diag(correlation, narrays) ⊗ J(ndups) with unit diagonal, after unwrapdups folds the ndups spots of each gene into extra columns and the design is replicated row-wise;
  • blocks: cormatrix[i,j] = correlation when block[i] == block[j], unit diagonal.
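
The two constructions above can be sketched in plain Rust. This is an illustrative sketch rather than the crate's own code: the function names duplicates_cormatrix and blocks_cormatrix and the Vec<Vec<f64>> matrix representation are assumptions made for the example. The duplicate case also assumes the unwrapped columns are ordered array by array (all ndups spots of one array adjacent), which is the ordering the Kronecker product implies.

```rust
/// diag(correlation, narrays) ⊗ J(ndups), then the diagonal is set to 1.
/// Index i belongs to array i / ndups (assumed array-major column order).
fn duplicates_cormatrix(correlation: f64, narrays: usize, ndups: usize) -> Vec<Vec<f64>> {
    let n = narrays * ndups;
    let mut v = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            v[i][j] = if i == j {
                1.0 // unit diagonal
            } else if i / ndups == j / ndups {
                correlation // two duplicate spots on the same array
            } else {
                0.0 // different arrays are uncorrelated
            };
        }
    }
    v
}

/// cormatrix[i][j] = correlation when block[i] == block[j], with unit diagonal.
fn blocks_cormatrix<T: PartialEq>(correlation: f64, block: &[T]) -> Vec<Vec<f64>> {
    let n = block.len();
    let mut v = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..n {
            v[i][j] = if i == j {
                1.0
            } else if block[i] == block[j] {
                correlation
            } else {
                0.0
            };
        }
    }
    v
}
```

With two arrays and two duplicates, the first function yields two 2×2 blocks on the diagonal, each with the correlation off the diagonal.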

Both of limma’s code paths are reproduced as well:

  • the fast multi-response lm.fit when every value is finite and no probe weights are supplied (one shared chol(V) and design QR);
  • the slow per-gene iteration when probe weights or missing values are present (each gene re-derives V over its observed arrays).
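
The choice between the two paths, and the per-gene restriction of V on the slow path, can be sketched as follows. The names Path, choose_path and observed_submatrix are hypothetical and exist only for this example. The sketch covers the dispatch condition and the subsetting of V to observed arrays; how probe weights then rescale V is deliberately left out.

```rust
/// Which fitting path applies (hypothetical enum for this sketch).
#[derive(Debug, PartialEq)]
enum Path {
    Fast,
    Slow,
}

/// Fast path only when every expression value is finite and no probe
/// weights are supplied; otherwise fall back to the per-gene loop.
fn choose_path(m: &[Vec<f64>], probe_weights: Option<&[Vec<f64>]>) -> Path {
    let all_finite = m.iter().flatten().all(|x| x.is_finite());
    if all_finite && probe_weights.is_none() {
        Path::Fast
    } else {
        Path::Slow
    }
}

/// For one gene, keep only the arrays with a finite value and return
/// their indices together with the matching rows and columns of V.
fn observed_submatrix(v: &[Vec<f64>], y: &[f64]) -> (Vec<usize>, Vec<Vec<f64>>) {
    let obs: Vec<usize> = (0..y.len()).filter(|&i| y[i].is_finite()).collect();
    let sub: Vec<Vec<f64>> = obs
        .iter()
        .map(|&i| obs.iter().map(|&j| v[i][j]).collect::<Vec<f64>>())
        .collect();
    (obs, sub)
}
```

On the slow path the returned indices would also be used to select the matching rows of the design before that gene is fitted.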

cov.coefficients is always the unweighted (Xᵀ V⁻¹ X)⁻¹ of the full design (limma computes it from chol(cormatrix) even on the weighted path).
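
As a minimal sketch of what (Xᵀ V⁻¹ X)⁻¹ means computationally, the code below whitens the design and one response vector with the Cholesky factor of V and then solves the resulting ordinary least-squares problem. All function names are hypothetical. One deliberate simplification: the whitened problem is solved through the normal equations and a second Cholesky factorization, whereas the text above describes a design QR, so this illustrates the algebra and is not a port of the implementation.

```rust
/// Lower Cholesky factor L with A = L Lᵀ; None if A is not positive definite.
fn cholesky(a: &[Vec<f64>]) -> Option<Vec<Vec<f64>>> {
    let n = a.len();
    let mut l = vec![vec![0.0; n]; n];
    for i in 0..n {
        for j in 0..=i {
            let s = (0..j).map(|k| l[i][k] * l[j][k]).sum::<f64>();
            if i == j {
                let d = a[i][i] - s;
                if d <= 0.0 {
                    return None;
                }
                l[i][j] = d.sqrt();
            } else {
                l[i][j] = (a[i][j] - s) / l[j][j];
            }
        }
    }
    Some(l)
}

/// Solve L z = b by forward substitution.
fn forward_solve(l: &[Vec<f64>], b: &[f64]) -> Vec<f64> {
    let mut z = vec![0.0; b.len()];
    for i in 0..b.len() {
        let s = (0..i).map(|k| l[i][k] * z[k]).sum::<f64>();
        z[i] = (b[i] - s) / l[i][i];
    }
    z
}

/// Solve Lᵀ x = b by back substitution.
fn back_solve_t(l: &[Vec<f64>], b: &[f64]) -> Vec<f64> {
    let n = b.len();
    let mut x = vec![0.0; n];
    for i in (0..n).rev() {
        let s = ((i + 1)..n).map(|k| l[k][i] * x[k]).sum::<f64>();
        x[i] = (b[i] - s) / l[i][i];
    }
    x
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(s, t)| s * t).sum::<f64>()
}

/// Returns (coefficients, cov_coefficients) for one response vector.
/// `x` is n×p stored by rows, `y` has length n, `v` is the n×n cormatrix.
fn gls_fit(x: &[Vec<f64>], y: &[f64], v: &[Vec<f64>]) -> Option<(Vec<f64>, Vec<Vec<f64>>)> {
    let (n, p) = (y.len(), x[0].len());
    let l = cholesky(v)?;
    // Whitening: z = L⁻¹ y, and one whitened column of the design per coefficient.
    let z = forward_solve(&l, y);
    let w: Vec<Vec<f64>> = (0..p)
        .map(|j| forward_solve(&l, &(0..n).map(|i| x[i][j]).collect::<Vec<f64>>()))
        .collect();
    // Normal equations of the whitened problem: (Xᵀ V⁻¹ X) β = Xᵀ V⁻¹ y.
    let xtx: Vec<Vec<f64>> = (0..p)
        .map(|j| (0..p).map(|k| dot(&w[j], &w[k])).collect::<Vec<f64>>())
        .collect();
    let xty: Vec<f64> = (0..p).map(|j| dot(&w[j], &z)).collect();
    let c = cholesky(&xtx)?;
    let beta = back_solve_t(&c, &forward_solve(&c, &xty));
    // Invert Xᵀ V⁻¹ X one unit vector at a time; the result is symmetric,
    // so storing the solutions as rows is equivalent to storing columns.
    let cov: Vec<Vec<f64>> = (0..p)
        .map(|j| {
            let mut e = vec![0.0; p];
            e[j] = 1.0;
            back_solve_t(&c, &forward_solve(&c, &e))
        })
        .collect();
    Some((beta, cov))
}
```

The covariance returned here depends only on the design and V, not on the response, which is why a single matrix can be shared by every gene.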

Not reproduced (rare, documented):

  • the array-weight fast path, which scales the rows and columns of V by 1/sqrt(arrayweights). Array weights are uncommon with gls.series, and a full probe-weight matrix flows correctly through the slow path instead;
  • per-gene rank deficiency (an observed-array subset smaller than the number of coefficients), which limma resolves by pivoted column dropping.

Functions

gls_series
Fit gene-wise linear models by generalized least squares. Port of limma’s gls.series. correlation must be supplied (limma would otherwise call duplicateCorrelation); pass crate::duplicate_correlation’s consensus.