inferCNAsc
Copy number alteration (CNA) inference from single-cell RNA-seq data. A Rust core with optional Python bindings via PyO3.
Python
# From an AnnData object (Ensembl lookup runs automatically when needed)
=
# From raw arrays (no AnnData dependency required)
=
= # DataFrame of detected CNA regions
# per-cell CNA heatmap
gene_df is a DataFrame with columns gene, chrom, start, end.
Use infercnasc.io.annotate_genes(gene_ids) to fetch these from Ensembl.
Rust
[]
= "0.1"
No feature flags are needed for the native Rust API.
use ;
let smoothed = smooth_expression?;
let = find_cnas;
let cnas = assign_cnas_to_cells;
smooth_expression returns Result<Array2<f64>, InferError>.
find_cnas and assign_cnas_to_cells are infallible.
How it works
- Gene annotation: gene IDs are mapped to genomic coordinates via the Ensembl REST API (results cached locally with requests-cache).
- Smoothing: a sliding-window mean is applied across neighboring genes within each chromosome. The window resets at chromosome boundaries.
- CNA calling: z-scores are computed per gene across cells. Genes above the threshold are flagged as gains; genes below are flagged as losses.
- Region assembly: consecutive flagged genes on the same chromosome are merged into CNA regions using a run-length scan.
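The smoothing, calling, and region-assembly steps can be illustrated with a minimal pure-Python sketch. This is not the crate's implementation: the function names (`smooth_row`, `zscores_per_gene`, `call_flags`, `merge_runs`), the centered window, the symmetric `±threshold` cutoff, and the per-cell flagging are all assumptions made for illustration. It also assumes genes are already sorted by chromosome and position.

```python
from statistics import mean, pstdev

def smooth_row(row, chroms, window=3):
    """Centered sliding-window mean for one cell (hypothetical helper).
    Neighbors on a different chromosome are excluded, so the window
    effectively resets at chromosome boundaries."""
    half = window // 2
    out = []
    for i in range(len(row)):
        lo, hi = max(0, i - half), min(len(row), i + half + 1)
        out.append(mean(row[j] for j in range(lo, hi) if chroms[j] == chroms[i]))
    return out

def zscores_per_gene(matrix):
    """z-score each gene (column) across cells (rows).
    Genes with zero variance get z = 0 (an assumption of this sketch)."""
    n_genes = len(matrix[0])
    cols = [[r[g] for r in matrix] for g in range(n_genes)]
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(r[g] - mus[g]) / sds[g] if sds[g] > 0 else 0.0
             for g in range(n_genes)] for r in matrix]

def call_flags(z_row, threshold):
    """+1 = gain, -1 = loss, 0 = neutral, using an assumed symmetric cutoff."""
    return [1 if z > threshold else -1 if z < -threshold else 0 for z in z_row]

def merge_runs(flags, chroms):
    """Run-length scan: merge consecutive genes with the same flag on the
    same chromosome into (chrom, first_idx, last_idx, kind) regions."""
    regions, start = [], 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start] or chroms[i] != chroms[start]:
            if flags[start] != 0:
                kind = "gain" if flags[start] > 0 else "loss"
                regions.append((chroms[start], start, i - 1, kind))
            start = i
    return regions
```

A run of flagged genes that crosses a chromosome boundary is split into two regions, mirroring the way the smoothing window never mixes genes from different chromosomes.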
Authors
Alejandro J. Soto Franco (primary author)
The algorithm design and original Python prototype (v0.2) were co-developed with Raeann Kalinowski and Amy Liu as a final project for 580.447 Computational Stem Cell Biology, Spring 2025, Johns Hopkins University Department of Biomedical Engineering. This crate is a full independent rewrite and is no longer associated with that course or developed for academic submission purposes.
Citation
Soto Franco A.J., Kalinowski R., Liu A. inferCNAsc: a Python toolkit for copy number inference from single-cell transcriptomes. In preparation (2025).
License
MIT