rsomics-sc-scale 0.1.0

Z-score scaling of a single-cell count matrix — matches scanpy pp.scale (zero_center, ddof=1 std, symmetric clip)

rsomics-sc-scale

Per-gene z-score scaling of a single-cell count matrix, numerically matching scanpy's sc.pp.scale.

Each gene is centered and scaled across cells: z = (x − mean) / std, where mean and std are computed over all cells (the implicit zeros of the sparse matrix included) and the standard deviation uses the ddof=1 (sample, n − 1 divisor) convention scanpy enforces. A gene with zero variance has its std set to 1, so its centered row is exactly zero.
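The per-gene arithmetic can be sketched in a few lines of plain Python. This is an illustration of the rule described above, not the crate's Rust code: the function name scale_gene and the {cell_index: count} input shape are assumptions made for the example.

```python
import math


def scale_gene(nonzeros, n_cells):
    """Z-score one gene given its nonzero entries as {cell_index: count}.

    Hypothetical helper for illustration. Cells missing from `nonzeros`
    are implicit zeros and still count toward the mean and the std.
    """
    if n_cells < 1:
        raise ValueError("n_cells must be at least 1")
    mean = sum(nonzeros.values()) / n_cells
    # Sum of squared deviations: explicit entries plus the implicit zeros,
    # each of which deviates from the mean by exactly -mean.
    ss = sum((v - mean) ** 2 for v in nonzeros.values())
    ss += (n_cells - len(nonzeros)) * mean ** 2
    # ddof=1: divide by n_cells - 1 (sample variance).
    var = ss / (n_cells - 1) if n_cells > 1 else 0.0
    std = math.sqrt(var)
    if std == 0.0:
        std = 1.0  # zero-variance rule: the centered row stays at zero
    return [(nonzeros.get(c, 0.0) - mean) / std for c in range(n_cells)]
```

For a gene with counts 2 and 4 in cells 0 and 2 of a 4-cell matrix, scale_gene({0: 2.0, 2: 4.0}, 4) uses mean 1.5 over all four cells, so the two implicit zeros come out as −1.5 / std instead of staying at zero.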

Scaling densifies the matrix: subtracting a nonzero gene mean turns every implicit zero into −mean/std, so the output is a full genes × cells dense matrix written in MatrixMarket array (column-major) layout.
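To make the column-major array layout concrete, here is a minimal Python sketch that serializes a small dense genes × cells matrix the way the MatrixMarket array format orders values. The function name to_matrix_market_array is hypothetical, and the number formatting is an assumption: the exact digits the tool prints may differ.

```python
def to_matrix_market_array(rows):
    """Serialize a dense genes x cells matrix (a list of gene rows) as text.

    Hypothetical helper for illustration; value formatting is an assumption.
    """
    n_genes = len(rows)
    n_cells = len(rows[0]) if rows else 0
    lines = ["%%MatrixMarket matrix array real general", f"{n_genes} {n_cells}"]
    # Column-major: cells in the outer loop, genes in the inner loop,
    # so all genes of cell 0 are written before any value of cell 1.
    for c in range(n_cells):
        for g in range(n_genes):
            lines.append(repr(float(rows[g][c])))
    return "\n".join(lines) + "\n"
```

Unlike the sparse coordinate format of the input, the array format carries no row or column indices, so every one of the genes × cells values appears in the file.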

With --max-value, the z-scores are symmetrically clipped to [−max-value, max-value] after scaling (scanpy's zero_center=True clip).
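The clip itself is a simple clamp applied to already-scaled values. A minimal Python sketch, with clip_symmetric as a hypothetical name chosen for the example:

```python
def clip_symmetric(z_scores, max_value):
    """Clamp each z-score to the closed interval [-max_value, max_value].

    Hypothetical helper for illustration; runs after centering and scaling.
    """
    return [max(-max_value, min(max_value, z)) for z in z_scores]
```

With max_value = 10.0, a z-score of 11.0 becomes 10.0 and −12.5 becomes −10.0, while values already inside the interval are unchanged.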

Usage

# scanpy default: zero-center, no clipping
rsomics-sc-scale filtered_feature_bc_matrix/ -o scaled.mtx

# scale and clip to ±10 (a common scanpy idiom)
rsomics-sc-scale mtx_dir/ --max-value 10 -o scaled.mtx

Input is a 10x MTX directory (matrix.mtx or matrix.mtx.gz, genes × cells). Output is a dense MatrixMarket array real general matrix in genes × cells layout, one value per line in column-major (cell-major) order.
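For downstream use, the output can be read back without any third-party library. The sketch below parses MatrixMarket array text into a list of gene rows; read_matrix_market_array is a hypothetical name, and the parser assumes the layout described above (a banner line, optional % comment lines, a size line, then one value per line).

```python
def read_matrix_market_array(text):
    """Parse dense MatrixMarket array text into a list of gene rows.

    Hypothetical helper for illustration; assumes one value per line.
    """
    lines = [ln.strip() for ln in text.splitlines()]
    if not lines or not lines[0].lower().startswith("%%matrixmarket matrix array"):
        raise ValueError("not a MatrixMarket array file")
    # Skip the banner, comment lines (starting with %), and blank lines.
    body = [ln for ln in lines[1:] if ln and not ln.startswith("%")]
    if not body:
        raise ValueError("missing size line")
    n_genes, n_cells = (int(tok) for tok in body[0].split())
    values = [float(ln) for ln in body[1:]]
    if len(values) != n_genes * n_cells:
        raise ValueError("value count does not match the declared dimensions")
    # Column-major: the value for gene g in cell c sits at index c * n_genes + g.
    return [[values[c * n_genes + g] for c in range(n_cells)] for g in range(n_genes)]
```

Calling read_matrix_market_array(open("scaled.mtx").read()) returns rows indexed as matrix[gene][cell]. For large matrices a streaming or array-based reader would be a better fit than Python lists.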

zero_center=False (scanpy's optional mode, which divides by std without centering and so preserves sparsity) is not implemented. Centered z-scoring is scanpy's default, and the uncentered variant is a niche memory optimization for matrices that stay sparse downstream.

Origin

This crate is an independent Rust reimplementation of scanpy's sc.pp.scale based on:

  • The published method (Wolf, Angerer & Theis, "SCANPY: large-scale single-cell gene expression data analysis", Genome Biology 2018, doi:10.1186/s13059-017-1382-0).
  • The public MatrixMarket and 10x Genomics matrix file-format specs.
  • Reading scanpy's _scale.py / _utils._get_mean_var (BSD-3-Clause) to match the exact std convention (ddof=1), the zero-variance std=1 rule, and the symmetric clip semantics.
  • Black-box value-level testing against the scanpy Python package.

License: MIT OR Apache-2.0. Upstream credit: scanpy https://github.com/scverse/scanpy (BSD-3-Clause).