cyanea-omics
Data structures for genomics, transcriptomics, variant analysis, single-cell biology, and spatial transcriptomics.
434 unit tests + 2 doc tests.
What's Inside
- Genomic coordinates -- GenomicPosition, GenomicInterval, Strand (0-based half-open)
- Interval operations -- IntervalSet with overlap queries, merging, coverage computation
- Interval tree -- augmented BST with O(log n + k) overlap, nearest/preceding/following queries
- Coverage vectors -- RLE-encoded genome-wide depth from intervals
- Expression matrices -- dense features x samples matrix with named rows/columns
- Sparse matrices -- COO-format with CSR conversion for high-dimensional data
- Variant types -- VCF-style Variant with VariantType, Zygosity, filters
- Variant annotation -- coding consequence prediction (missense/nonsense/frameshift/splice), HGVS notation, SIFT-style scoring
- Gene annotations -- Gene/Transcript/Exon hierarchy with biotypes
- AnnData container -- single-cell data with obs/var metadata, obsm/varm embeddings, layers, obsp, uns
- HDF5 I/O -- read/write .h5ad files (dense and CSR sparse, metadata, embeddings)
- Zarr I/O -- read/write Zarr v3 directories (pure Rust, same feature set as h5ad)
- OTU/ASV tables -- abundance tables with rarefaction, filtering, taxonomic collapse
- Network biology -- weighted graphs, degree/betweenness/closeness centrality, Louvain communities
- Haplotype analysis -- EM phasing, haplotype block detection, diversity statistics
- Genome arithmetic -- intersect, union, subtract, complement, closest, window, Jaccard
- Liftover -- UCSC chain file parsing, coordinate remapping between assemblies
- CNV analysis -- CBS segmentation, B-allele frequency, SV breakpoint detection, segment merging
- Methylation -- bisulfite conversion, CpG site calling, DMRs, CpG island detection
- Spatial transcriptomics -- Delaunay/kNN spatial graphs, Moran's I, Geary's C, co-occurrence, ligand-receptor
- Single-cell pipeline -- normalize, HVG, kNN graph, Leiden/Louvain, diffusion map, DPT, PAGA, RNA velocity, markers, Harmony/ComBat/MNN integration
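As an illustration of the coverage-vector idea listed above, the following standalone sketch builds run-length-encoded depth from 0-based half-open intervals with a sweep over start/end events. It does not use this crate's API: the `Run` struct and `rle_coverage` function are hypothetical names chosen for the example, and zero-depth gaps are simply omitted from the output.

```rust
/// One run of constant depth over the half-open range [start, end).
/// Hypothetical type for illustration; not part of cyanea-omics.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub start: u64,
    pub end: u64,
    pub depth: u32,
}

/// Compute run-length-encoded depth from 0-based half-open intervals
/// on a single sequence. Empty intervals (start >= end) are ignored.
pub fn rle_coverage(intervals: &[(u64, u64)]) -> Vec<Run> {
    // Each interval contributes +1 at its start and -1 at its end.
    let mut events: Vec<(u64, i64)> = Vec::with_capacity(intervals.len() * 2);
    for &(start, end) in intervals {
        if start < end {
            events.push((start, 1));
            events.push((end, -1));
        }
    }
    events.sort_unstable();

    let mut runs: Vec<Run> = Vec::new();
    let mut depth: i64 = 0;
    let mut prev: u64 = 0;
    let mut i = 0;
    while i < events.len() {
        let pos = events[i].0;
        // Emit the stretch [prev, pos) if it is covered at all.
        if depth > 0 && pos > prev {
            let d = depth as u32;
            let extend =
                matches!(runs.last(), Some(last) if last.end == prev && last.depth == d);
            if extend {
                // Same depth as the previous run and contiguous: grow it.
                runs.last_mut().unwrap().end = pos;
            } else {
                runs.push(Run { start: prev, end: pos, depth: d });
            }
        }
        // Apply every event that occurs at this position.
        while i < events.len() && events[i].0 == pos {
            depth += events[i].1;
            i += 1;
        }
        prev = pos;
    }
    runs
}
```

For the intervals `(0, 10)` and `(5, 15)` this yields three runs: depth 1 over [0, 5), depth 2 over [5, 10), and depth 1 over [10, 15).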
Quick Start
[dependencies]
cyanea-omics = { version = "0.1", features = ["zarr"] }
use ;
let intervals = vec!;
let set = from;
let merged = set.merge;
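To make the merge step concrete, here is a minimal standalone sketch of merging 0-based half-open intervals. It is not the crate's implementation: `Interval` and `merge_intervals` are hypothetical names, chromosomes and strands are left out, and the choice to merge directly adjacent intervals (such as [100, 200) and [200, 300)) is an assumption of this example.

```rust
/// A 0-based, half-open interval [start, end) on a single sequence.
/// Hypothetical type for illustration; not part of cyanea-omics.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Interval {
    pub start: u64,
    pub end: u64,
}

/// Merge overlapping or directly adjacent intervals into a sorted,
/// non-overlapping list.
pub fn merge_intervals(mut intervals: Vec<Interval>) -> Vec<Interval> {
    // Sorting by start lets a single left-to-right pass find all overlaps.
    intervals.sort_by_key(|iv| (iv.start, iv.end));

    let mut merged: Vec<Interval> = Vec::with_capacity(intervals.len());
    for iv in intervals {
        if let Some(last) = merged.last_mut() {
            // Half-open coordinates: `<=` also joins book-ended intervals.
            if iv.start <= last.end {
                last.end = last.end.max(iv.end);
                continue;
            }
        }
        merged.push(iv);
    }
    merged
}
```

Given [100, 200), [150, 300), and [400, 500), the result is [100, 300) and [400, 500).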
Feature Flags
| Flag | Default | Description |
|---|---|---|
| std | Yes | Standard library support |
| wasm | No | WASM target marker |
| serde | No | Serialize/Deserialize derives |
| h5ad | No | HDF5 .h5ad I/O (requires system HDF5 1.10.x) |
| zarr | No | Zarr v3 directory I/O (pure Rust) |
| single-cell | No | Single-cell analysis pipeline (HVG, clustering, trajectory, markers, integration) |
Modules
| Module | Description |
|---|---|
| genomic | GenomicPosition, GenomicInterval, Strand |
| interval | IntervalSet with overlap, merge, coverage |
| interval_tree | Augmented BST with O(log n + k) overlap queries |
| coverage | RLE coverage vectors |
| expr | Dense expression matrix |
| sparse | COO sparse matrix with CSR conversion |
| variant | Variant, VariantType, Zygosity |
| variant_annotation | Consequence prediction, HGVS notation, splice scoring |
| annotation | Gene, Transcript, Exon hierarchy |
| single_cell | AnnData container with typed metadata |
| h5ad | HDF5 .h5ad reader/writer (feature-gated) |
| zarr | Zarr v3 reader/writer (feature-gated) |
| otu | OTU/ASV abundance tables |
| network | Weighted graphs, centrality, Louvain |
| haplotype | EM phasing, haplotype blocks |
| genome_arithmetic | Intersect, union, subtract, complement, closest, window, Jaccard |
| liftover | UCSC chain file parsing, coordinate liftover |
| cnv | CBS segmentation, BAF, SV breakpoints |
| methylation | CpG sites, DMRs, CpG islands, bisulfite conversion |
| spatial | Spatial neighbor graphs, Moran's I, Geary's C, co-occurrence, ligand-receptor |
| sc_preprocess | HVG, normalize, regress, doublet detection, gene scoring (feature-gated) |
| sc_cluster | kNN graph, Leiden, Louvain, NMI, ARI (feature-gated) |
| sc_trajectory | Diffusion map, DPT, PAGA, RNA velocity (feature-gated) |
| sc_markers | Marker gene detection: t-test, Wilcoxon, logistic regression (feature-gated) |
| sc_integrate | Harmony, ComBat, MNN, kBET/LISI metrics (feature-gated) |
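The sparse module is described as a COO matrix with CSR conversion. As a rough picture of what such a conversion involves, the sketch below uses the usual count, prefix-sum, and scatter passes. The `Coo` and `Csr` structs and the `coo_to_csr` function are hypothetical and do not mirror the crate's actual types. The example assumes every row index is below `n_rows`, leaves duplicate entries unsummed, and keeps column order within a row as given in the input.

```rust
/// Coordinate-format sparse matrix: parallel arrays of (row, col, value).
/// Hypothetical type for illustration; not part of cyanea-omics.
pub struct Coo {
    pub n_rows: usize,
    pub n_cols: usize,
    pub rows: Vec<usize>,
    pub cols: Vec<usize>,
    pub vals: Vec<f64>,
}

/// Compressed sparse row matrix: row `r` owns entries
/// `indptr[r]..indptr[r + 1]` of `indices` and `data`.
pub struct Csr {
    pub n_rows: usize,
    pub n_cols: usize,
    pub indptr: Vec<usize>,
    pub indices: Vec<usize>,
    pub data: Vec<f64>,
}

/// Convert COO to CSR in O(nnz + n_rows) time.
pub fn coo_to_csr(coo: &Coo) -> Csr {
    let nnz = coo.vals.len();

    // Pass 1: count the entries in each row (stored at index row + 1).
    let mut indptr = vec![0usize; coo.n_rows + 1];
    for &r in &coo.rows {
        indptr[r + 1] += 1;
    }
    // Pass 2: prefix sum turns counts into row start offsets.
    for i in 0..coo.n_rows {
        indptr[i + 1] += indptr[i];
    }

    // Pass 3: scatter each triplet into the next free slot of its row.
    let mut next = indptr.clone();
    let mut indices = vec![0usize; nnz];
    let mut data = vec![0.0f64; nnz];
    for k in 0..nnz {
        let r = coo.rows[k];
        let dst = next[r];
        indices[dst] = coo.cols[k];
        data[dst] = coo.vals[k];
        next[r] += 1;
    }

    Csr { n_rows: coo.n_rows, n_cols: coo.n_cols, indptr, indices, data }
}
```

CSR keeps each row's entries contiguous, which is the access pattern needed for per-row work such as iterating over one observation in a row-major count matrix.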