infercnasc 0.1.1

Copy number alteration inference from scRNA-seq data

inferCNAsc

Copy number alteration (CNA) inference from single-cell RNA-seq data. A Rust core with optional Python bindings via PyO3.

Python

pip install infercnasc

from infercnasc import CNAInferrer
import infercnasc.plot as icplot

# From an AnnData object (Ensembl lookup runs automatically when needed)
inferrer = CNAInferrer.from_anndata(adata)

# From raw arrays (no AnnData dependency required)
inferrer = CNAInferrer(window_size=50).fit(expression_matrix, gene_df)

cnas = inferrer.cna_df()       # DataFrame of detected CNA regions
icplot.cna_matrix(inferrer)    # per-cell CNA heatmap

gene_df is a DataFrame with columns gene, chrom, start, end. Use infercnasc.io.annotate_genes(gene_ids) to fetch these from Ensembl.
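
To make the expected table shape concrete, here is a minimal sketch with one record per gene. It uses plain Python records rather than pandas so it runs without extra dependencies. The gene names and coordinates are placeholders, and validate_gene_records is a hypothetical helper written for this example, not part of infercnasc. The start < end check is likewise an assumption about what a sensible annotation looks like.

```python
REQUIRED_COLUMNS = ("gene", "chrom", "start", "end")

# Illustrative rows only: names and coordinates are placeholders.
gene_records = [
    {"gene": "GENE_A", "chrom": "1", "start": 1_000, "end": 5_000},
    {"gene": "GENE_B", "chrom": "1", "start": 8_000, "end": 12_500},
    {"gene": "GENE_C", "chrom": "2", "start": 300, "end": 2_100},
]


def validate_gene_records(records):
    """Hypothetical helper: return a list of problems found in the records."""
    problems = []
    for i, row in enumerate(records):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing {missing}")
            continue
        # Assumption: a gene's start coordinate precedes its end coordinate.
        if row["start"] >= row["end"]:
            problems.append(f"row {i}: start must be less than end")
    return problems


problems = validate_gene_records(gene_records)
```

With pandas installed, pandas.DataFrame(gene_records) turns these records into a frame with the four columns listed above.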

Rust

[dependencies]
infercnasc = "0.1"

No feature flags are needed for the native Rust API.

use infercnasc::{smooth_expression, find_cnas, assign_cnas_to_cells, InferError};

let smoothed = smooth_expression(&expression, &chroms, window_size)?;
let (gains, losses) = find_cnas(&smoothed, z_score_threshold);
let cnas = assign_cnas_to_cells(
    &gains, &losses, &chroms, &starts, &ends, &gene_names, min_region_size,
);

smooth_expression returns Result<Array2<f64>, InferError>. find_cnas and assign_cnas_to_cells are infallible.

How it works

  1. Gene annotation: gene IDs are mapped to genomic coordinates via the Ensembl REST API (results cached locally with requests-cache).
  2. Smoothing: a sliding-window mean is applied across neighboring genes within each chromosome. The window resets at chromosome boundaries.
  3. CNA calling: z-scores are computed per gene across cells. Genes whose z-score exceeds the threshold are flagged as gains; genes whose z-score falls below the negated threshold are flagged as losses.
  4. Region assembly: consecutive flagged genes on the same chromosome are merged into CNA regions using a run-length scan.
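
Steps 2 to 4 can be sketched in pure Python. This is an illustration under stated assumptions, not the crate's implementation. The function names are hypothetical. The window is assumed to be centered and truncated at chromosome edges, the z-score uses the population standard deviation, losses are taken to be z < -threshold, and genes are assumed to be sorted by chromosome and position.

```python
from statistics import mean, pstdev


def smooth_within_chromosomes(values, chroms, window_size):
    """Sliding-window mean over one cell's genes; windows never cross chromosomes."""
    half = window_size // 2
    smoothed = []
    for i, chrom in enumerate(chroms):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        # Keep only neighbors on the same chromosome as gene i.
        window = [values[j] for j in range(lo, hi) if chroms[j] == chrom]
        smoothed.append(mean(window))
    return smoothed


def z_scores_per_gene(matrix):
    """matrix[cell][gene] -> z-scores, standardizing each gene across cells."""
    n_genes = len(matrix[0])
    stats = []
    for g in range(n_genes):
        column = [row[g] for row in matrix]
        stats.append((mean(column), pstdev(column)))
    # Genes with zero variance get a z-score of 0.0 to avoid division by zero.
    return [
        [(row[g] - stats[g][0]) / stats[g][1] if stats[g][1] > 0 else 0.0
         for g in range(n_genes)]
        for row in matrix
    ]


def merge_runs(flags, chroms, min_region_size=1):
    """Run-length scan: merge consecutive flagged genes on one chromosome.

    Returns (chrom, first_gene_index, last_gene_index) tuples.
    """
    regions, start = [], None
    for i in range(len(flags) + 1):
        in_run = i < len(flags) and flags[i]
        # A run ends at an unflagged gene, a chromosome change, or the end.
        if start is not None and (not in_run or chroms[i] != chroms[start]):
            if i - start >= min_region_size:
                regions.append((chroms[start], start, i - 1))
            start = None
        if in_run and start is None:
            start = i
    return regions


# Toy data: 4 cells x 6 genes; the last cell has elevated chromosome 1 expression.
chroms = ["1", "1", "1", "2", "2", "2"]
expression = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [5.0, 5.0, 5.0, 1.0, 1.0, 1.0],
]
threshold = 1.5

smoothed = [smooth_within_chromosomes(cell, chroms, 3) for cell in expression]
z = z_scores_per_gene(smoothed)
gain_regions = merge_runs([v > threshold for v in z[3]], chroms)
loss_regions = merge_runs([v < -threshold for v in z[3]], chroms)
```

On this toy input the fourth cell yields a single gain region covering the three chromosome 1 genes and no loss regions. Filtering the window by chromosome label, rather than slicing per chromosome, keeps the sketch short at the cost of some speed.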

Authors

Alejandro J. Soto Franco (primary author)

The algorithm design and original Python prototype (v0.2) were co-developed with Raeann Kalinowski and Amy Liu as a final project for 580.447 Computational Stem Cell Biology, Spring 2025, Johns Hopkins University Department of Biomedical Engineering. This crate is a full independent rewrite and is no longer associated with that course or developed for academic submission purposes.

Citation

Soto Franco A.J., Kalinowski R., Liu A. inferCNAsc: a Python toolkit for copy number inference from single-cell transcriptomes. In preparation (2025).

License

MIT