# inferCNAsc
[crates.io](https://crates.io/crates/infercnasc) | [docs.rs](https://docs.rs/infercnasc) | [PyPI](https://pypi.org/project/infercnasc/) | [License](LICENSE) | [CI](https://github.com/alejandro-soto-franco/inferCNAsc/actions) | [Cargo.toml](Cargo.toml)

Copy number alteration (CNA) inference from single-cell RNA-seq data.
A Rust core with optional Python bindings via PyO3.
## Python
```bash
pip install infercnasc
```
```python
from infercnasc import CNAInferrer
import infercnasc.plot as icplot
# Option 1: from an AnnData object (Ensembl lookup runs automatically when needed)
inferrer = CNAInferrer.from_anndata(adata)

# Option 2: from raw arrays (no AnnData dependency required)
inferrer = CNAInferrer(window_size=50).fit(expression_matrix, gene_df)

cnas = inferrer.cna_df()     # DataFrame of detected CNA regions
icplot.cna_matrix(inferrer)  # per-cell CNA heatmap
```
`gene_df` is a DataFrame with columns `gene`, `chrom`, `start`, `end`.
Use `infercnasc.io.annotate_genes(gene_ids)` to fetch these from Ensembl.
## Rust
```toml
[dependencies]
infercnasc = "0.1"
```
No feature flags are needed for the native Rust API.
```rust
use infercnasc::{smooth_expression, find_cnas, assign_cnas_to_cells, InferError};

// The `?` below requires an enclosing function that returns a compatible `Result`.
let smoothed = smooth_expression(&expression, &chroms, window_size)?;
let (gains, losses) = find_cnas(&smoothed, z_score_threshold);
let cnas = assign_cnas_to_cells(
    &gains, &losses, &chroms, &starts, &ends, &gene_names, min_region_size,
);
```
`smooth_expression` returns `Result<Array2<f64>, InferError>`.
`find_cnas` and `assign_cnas_to_cells` are infallible and return their results directly.
## How it works
1. **Gene annotation**: gene IDs are mapped to genomic coordinates via the
Ensembl REST API (results cached locally with `requests-cache`).
2. **Smoothing**: a sliding-window mean is applied across neighboring genes
within each chromosome. The window resets at chromosome boundaries.
3. **CNA calling**: z-scores are computed per gene across cells. Genes above
the threshold are flagged as gains; genes below are flagged as losses.
4. **Region assembly**: consecutive flagged genes on the same chromosome are
merged into CNA regions using a run-length scan.
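
Steps 2 to 4 can be illustrated with a minimal, dependency-free Python sketch. It is not the crate's implementation: the function names (`sliding_mean`, `call_gains_losses`, `merge_runs`) are hypothetical, and the centered window, the population standard deviation, the symmetric `±threshold` cutoff, and the handling of window edges are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def sliding_mean(expression, chroms, window_size):
    """Centered sliding-window mean over genes, per cell.

    expression: list of cells, each a list of per-gene values in genomic order.
    chroms: chromosome label for each gene (same order as the gene axis).
    Assumption: windows are truncated at chromosome boundaries.
    """
    n_genes = len(chroms)
    half = window_size // 2
    smoothed = []
    for cell in expression:
        row = []
        for g in range(n_genes):
            lo, hi = max(0, g - half), min(n_genes, g + half + 1)
            # Keep only neighbors on the same chromosome as gene g.
            window = [cell[j] for j in range(lo, hi) if chroms[j] == chroms[g]]
            row.append(mean(window))
        smoothed.append(row)
    return smoothed


def call_gains_losses(smoothed, threshold):
    """Z-score each gene across cells; return boolean gain and loss matrices."""
    n_cells, n_genes = len(smoothed), len(smoothed[0])
    gains = [[False] * n_genes for _ in range(n_cells)]
    losses = [[False] * n_genes for _ in range(n_cells)]
    for g in range(n_genes):
        column = [smoothed[c][g] for c in range(n_cells)]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue  # constant gene: nothing to flag
        for c in range(n_cells):
            z = (column[c] - mu) / sigma
            gains[c][g] = z > threshold
            losses[c][g] = z < -threshold  # assumed symmetric cutoff
    return gains, losses


def merge_runs(flags, chroms, min_region_size):
    """Run-length scan over one cell's flags.

    Returns (chrom, first_gene_index, last_gene_index) for every run of
    consecutive flagged genes on one chromosome with at least
    min_region_size genes.
    """
    regions = []
    start = None
    for g, flagged in enumerate(flags):
        # Close the open run when the flag drops or the chromosome changes.
        if start is not None and (not flagged or chroms[g] != chroms[start]):
            if g - start >= min_region_size:
                regions.append((chroms[start], start, g - 1))
            start = None
        if flagged and start is None:
            start = g
    if start is not None and len(flags) - start >= min_region_size:
        regions.append((chroms[start], start, len(flags) - 1))
    return regions


# Toy data: 4 cells x 6 genes on two chromosomes; the last cell has
# elevated expression on chromosome "1".
chroms = ["1", "1", "1", "2", "2", "2"]
cells = [
    [1.0, 1.1, 0.9, 1.0, 1.0, 1.1],
    [1.1, 1.0, 1.0, 0.9, 1.1, 1.0],
    [0.9, 1.0, 1.1, 1.0, 0.9, 1.0],
    [3.0, 3.2, 2.9, 1.0, 1.1, 0.9],
]
smoothed = sliding_mean(cells, chroms, window_size=3)
gains, losses = call_gains_losses(smoothed, threshold=1.0)
for cell_gains in gains:
    print(merge_runs(cell_gains, chroms, min_region_size=2))
```

Because the run-length scan closes a run whenever the chromosome label changes, a region can never span two chromosomes, which mirrors the way the smoothing window resets at chromosome boundaries.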
## Authors
**Alejandro J. Soto Franco** (primary author)
The algorithm design and original Python prototype (v0.2) were co-developed with
Raeann Kalinowski and Amy Liu as a final project for 580.447 Computational Stem
Cell Biology, Spring 2025, Johns Hopkins University Department of Biomedical
Engineering. This crate is a full independent rewrite and is no longer associated
with that course or developed for academic submission purposes.
## Citation
> Soto Franco A.J., Kalinowski R., Liu A. *inferCNAsc: a Python toolkit for
> copy number inference from single-cell transcriptomes*. In preparation (2025).
## License
MIT