# rsomics-sc-scale
Per-gene z-score scaling of a single-cell count matrix, numerically matching
scanpy's `sc.pp.scale`.
Each gene is centered and scaled across cells: `z = (x − mean) / std`, where
`mean` and `std` are computed over **all** cells (the implicit zeros of the
sparse matrix included) and the standard deviation uses the sample (**ddof=1**)
convention that scanpy applies. A gene with zero variance has its `std` set to 1,
so its centered row is exactly zero.
Scaling **densifies** the matrix: subtracting a nonzero gene mean turns every
implicit zero into `−mean/std`, so the output is a full genes × cells dense
matrix written in MatrixMarket `array` (column-major) layout.
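
The scaling rule and the resulting densification can be sketched for a single gene as follows. This is a minimal illustration, not the crate's actual code: the function name `scale_gene`, the `(cell_index, value)` input layout, and the two-pass variance computation are assumptions made for the example, and the handling of a matrix with fewer than two cells (variance treated as zero) is likewise an assumption.

```rust
/// Scale one gene (one matrix row) given only its stored nonzero entries.
/// `nonzeros` holds hypothetical (cell_index, value) pairs; `n_cells` is the
/// total number of cells, including those where the gene is an implicit zero.
fn scale_gene(nonzeros: &[(usize, f64)], n_cells: usize) -> Vec<f64> {
    let n = n_cells as f64;

    // Mean over ALL cells: implicit zeros add nothing to the sum but count in n.
    let sum: f64 = nonzeros.iter().map(|&(_, v)| v).sum();
    let mean = sum / n;

    // Sum of squared deviations: stored entries plus the implicit zeros,
    // each of which deviates from the mean by exactly `mean`.
    let n_zeros = (n_cells - nonzeros.len()) as f64;
    let ss = nonzeros
        .iter()
        .map(|&(_, v)| (v - mean).powi(2))
        .sum::<f64>()
        + n_zeros * mean * mean;

    // Sample variance (ddof = 1). Assumption: with fewer than two cells the
    // variance is treated as zero instead of dividing by zero.
    let var = if n_cells > 1 { ss / (n - 1.0) } else { 0.0 };

    // Zero-variance rule: std = 1, so the centered row stays at zero.
    let mut std = var.sqrt();
    if std == 0.0 {
        std = 1.0;
    }

    // Dense output: every implicit zero becomes -mean/std.
    let mut row = vec![-mean / std; n_cells];
    for &(cell, v) in nonzeros {
        row[cell] = (v - mean) / std;
    }
    row
}
```

For a gene with counts `0, 2, 4` across three cells (stored as the two nonzero entries), the mean is 2 and the sample standard deviation is 2, so the dense scaled row is `-1, 0, 1`: the implicit zero has become a stored `-1`.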
With `--max-value`, the z-scores are symmetrically clipped to
`[−max-value, max-value]` after scaling (scanpy's `zero_center=True` clip).
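
As a rough sketch of the clipping step, the snippet below clamps already-scaled values in place. The function name `clip_symmetric` is hypothetical, and the sketch assumes `max_value` is non-negative and not NaN (`f64::clamp` panics otherwise).

```rust
/// Clip already-scaled z-scores in place to [-max_value, max_value].
/// Assumes `max_value >= 0.0`; `f64::clamp` panics if min > max or on NaN bounds.
fn clip_symmetric(values: &mut [f64], max_value: f64) {
    for v in values.iter_mut() {
        *v = v.clamp(-max_value, max_value);
    }
}
```

With `max_value = 10.0`, a z-score of `42.0` becomes `10.0` and `-12.5` becomes `-10.0`, while values already inside the interval are left unchanged.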
## Usage
```bash
# scanpy default: zero-center, no clipping
rsomics-sc-scale filtered_feature_bc_matrix/ -o scaled.mtx
# scale and clip to ±10 (a common scanpy idiom)
rsomics-sc-scale mtx_dir/ --max-value 10 -o scaled.mtx
```
Input is a 10x MTX directory (`matrix.mtx` or `matrix.mtx.gz`, genes × cells).
Output is a dense MatrixMarket `array real general` matrix in genes × cells
layout, one value per line in column-major (cell-major) order.
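
To make the output layout concrete, here is a small sketch of a writer for a dense MatrixMarket `array real general` file. It is illustrative only: the function name `write_dense_mtx`, the in-memory `rows[gene][cell]` representation, and the default `{}` float formatting are assumptions, and the tool's real output may format numbers differently.

```rust
use std::io::{self, Write};

/// Write a genes × cells matrix as MatrixMarket `array real general`.
/// `rows[g][c]` is the scaled value for gene `g` in cell `c` (hypothetical layout).
fn write_dense_mtx<W: Write>(out: &mut W, rows: &[Vec<f64>]) -> io::Result<()> {
    let n_genes = rows.len();
    let n_cells = rows.first().map_or(0, |r| r.len());

    // Banner line, then the dimensions line: rows (genes) and columns (cells).
    writeln!(out, "%%MatrixMarket matrix array real general")?;
    writeln!(out, "{} {}", n_genes, n_cells)?;

    // Column-major order: all genes of cell 0, then all genes of cell 1, ...
    for c in 0..n_cells {
        for row in rows {
            writeln!(out, "{}", row[c])?;
        }
    }
    Ok(())
}
```

For a 2 × 3 matrix, the file holds the banner, the line `2 3`, and then six values: both genes for the first cell, both genes for the second cell, and both genes for the third.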
`zero_center=False`, scanpy's optional mode that divides by the standard
deviation without centering and therefore preserves sparsity, is **not
implemented**. Centered z-scoring is the usual default; the uncentered variant
is a niche memory optimization for matrices that stay sparse downstream.
## Origin
This crate is an independent Rust reimplementation of scanpy's `sc.pp.scale`
based on:
- The published method (Wolf, Angerer & Theis, "SCANPY: large-scale
  single-cell gene expression data analysis", *Genome Biology* 2018,
  doi:10.1186/s13059-017-1382-0).
- The public MatrixMarket and 10x Genomics matrix file-format specs.
- Reading scanpy's `_scale.py` / `_utils._get_mean_var` (BSD-3-Clause) to match
  the exact std convention (ddof=1), the zero-variance `std=1` rule, and the
  symmetric clip semantics.
- Black-box value-level testing against the scanpy Python package.
License: MIT OR Apache-2.0.
Upstream credit: scanpy <https://github.com/scverse/scanpy> (BSD-3-Clause).