Infer RNA-seq library strand protocol from a BAM file and a BED12 gene model.
Mirrors the algorithm of RSeQC infer_experiment.py (LGPL):
- reads up to sample_size mapped, non-duplicate, non-secondary, non-QC-fail reads with MAPQ ≥ mapq_cut;
- for each read, finds overlapping genes in the BED12 model;
- classifies the read by (read_id, map_strand, gene_strand);
- emits forward (sp1), reverse (sp2), and undetermined fractions.
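The classification and tallying steps above can be sketched as follows. This is a minimal illustration, not the crate's actual code: the names Strand, Mate, Class, Tally and classify are hypothetical. It assumes the key grouping that RSeQC reports (sp1 = "1++,1--,2+-,2-+" or "++,--" for single-end; sp2 = "1+-,1-+,2++,2--" or "+-,-+"). It also assumes that reads overlapping no gene are skipped and that reads overlapping genes on both strands count as undetermined.

```rust
use std::collections::BTreeSet;

// All names below are hypothetical, chosen for this sketch only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Strand {
    Plus,
    Minus,
}

// The read_id component: single-end read, or first/second mate of a pair.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mate {
    Single,
    First,
    Second,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    Sp1,
    Sp2,
    Undetermined,
}

/// Classify one read by (read_id, map_strand, gene_strand).
/// Returns None when no gene overlaps the read (assumed: the read is skipped).
fn classify(mate: Mate, map_strand: Strand, gene_strands: &BTreeSet<Strand>) -> Option<Class> {
    if gene_strands.is_empty() {
        return None;
    }
    if gene_strands.len() > 1 {
        // Overlapping genes on both strands: the strand cannot be resolved.
        return Some(Class::Undetermined);
    }
    let gene = *gene_strands.iter().next().unwrap();
    let same = map_strand == gene;
    let class = match (mate, same) {
        // "++,--" and "1++,1--,2+-,2-+"
        (Mate::Single, true) | (Mate::First, true) | (Mate::Second, false) => Class::Sp1,
        // "+-,-+" and "1+-,1-+,2++,2--"
        _ => Class::Sp2,
    };
    Some(class)
}

/// Running counts of classified reads.
#[derive(Debug, Default, PartialEq)]
struct Tally {
    sp1: u64,
    sp2: u64,
    undetermined: u64,
}

impl Tally {
    fn add(&mut self, class: Class) {
        match class {
            Class::Sp1 => self.sp1 += 1,
            Class::Sp2 => self.sp2 += 1,
            Class::Undetermined => self.undetermined += 1,
        }
    }

    /// (sp1, sp2, undetermined) as fractions of all counted reads,
    /// or None if nothing was counted.
    fn fractions(&self) -> Option<(f64, f64, f64)> {
        let total = self.sp1 + self.sp2 + self.undetermined;
        if total == 0 {
            return None;
        }
        let t = total as f64;
        Some((
            self.sp1 as f64 / t,
            self.sp2 as f64 / t,
            self.undetermined as f64 / t,
        ))
    }
}
```

In this sketch a caller would feed each sampled read through classify, pass every Some result to Tally::add, and call fractions once sampling stops.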
§Origin
This crate is an independent Rust reimplementation based on:
- RSeQC: infer_experiment.py (LGPL-2.1+), Wang et al. 2012 https://doi.org/10.1093/bioinformatics/bts356
- The SAM/BAM format specification (MIT)
- BED12 format specification
- Black-box behaviour testing against RSeQC 5.0.4
License: MIT OR Apache-2.0.
Upstream credit: RSeQC https://rseqc.sourceforge.net/ (LGPL-2.1+).
Structs§
- GeneIndex - Per-chromosome interval tree, mapping genomic positions → gene strands.
- StrandednessResult - Result of strandedness inference.
Enums§
- GeneStrand - Strand of a gene ('+' or '-').
- Protocol - Whether the BAM contains paired-end or single-end reads.
Functions§
- infer_strandedness - Infer strandedness from bam_path using the gene model at bed_path.