Annotate splice junctions from spliced BAM reads against a BED12 gene model.
Mirrors RSeQC junction_annotation.py (LGPL-2.1+):
- extracts every N-op intron from each read's CIGAR string;
- filters by minimum intron length and MAPQ;
- classifies each (chrom, intron_start, intron_end) as known (both splice sites present in the BED12 intron set), partial_novel (one site known), or complete_novel (neither site known);
- counts both per-read events and distinct junctions.
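The extraction and classification steps above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Class`, `Sites`, and `introns_from_cigar` are hypothetical names, coordinates are assumed to be 0-based half-open, and splice-site starts and ends are assumed to be looked up independently. MAPQ filtering is assumed to happen before these functions are called.

```rust
use std::collections::HashSet;

/// Hypothetical classification labels, named after the three classes above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Class {
    Known,
    PartialNovel,
    CompleteNovel,
}

/// Hypothetical stand-in for a known splice-site set (not the crate's type).
#[derive(Default)]
pub struct Sites {
    starts: HashSet<(String, u64)>,
    ends: HashSet<(String, u64)>,
}

impl Sites {
    /// Register one annotated intron (0-based, half-open; an assumption).
    pub fn add_intron(&mut self, chrom: &str, start: u64, end: u64) {
        self.starts.insert((chrom.to_string(), start));
        self.ends.insert((chrom.to_string(), end));
    }

    /// Classify a junction by how many of its two splice sites are known.
    pub fn classify(&self, chrom: &str, start: u64, end: u64) -> Class {
        let start_known = self.starts.contains(&(chrom.to_string(), start));
        let end_known = self.ends.contains(&(chrom.to_string(), end));
        match (start_known, end_known) {
            (true, true) => Class::Known,
            (false, false) => Class::CompleteNovel,
            _ => Class::PartialNovel,
        }
    }
}

/// Walk a textual CIGAR string and return every N-op intron whose length is
/// at least `min_intron`, as (start, end) pairs on the reference.
/// `pos` is the 0-based alignment start of the read.
pub fn introns_from_cigar(pos: u64, cigar: &str, min_intron: u64) -> Vec<(u64, u64)> {
    let mut out = Vec::new();
    let mut ref_pos = pos;
    let mut len: u64 = 0;
    for ch in cigar.chars() {
        if let Some(d) = ch.to_digit(10) {
            // Accumulate the decimal length that precedes each operation.
            len = len * 10 + u64::from(d);
            continue;
        }
        match ch {
            // These operations consume reference bases.
            'M' | 'D' | '=' | 'X' => ref_pos += len,
            // N is a skipped region: record it if long enough, then advance.
            'N' => {
                if len >= min_intron {
                    out.push((ref_pos, ref_pos + len));
                }
                ref_pos += len;
            }
            // I, S, H and P do not consume the reference.
            _ => {}
        }
        len = 0;
    }
    out
}
```

For a read starting at position 100 with CIGAR `10M200N15M`, the sketch reports a single intron spanning 110 to 310, which `Sites::classify` then compares against the annotated starts and ends.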
§Origin
This crate is an independent Rust reimplementation based on:
- RSeQC: junction_annotation.py (LGPL-2.1+), Wang et al. 2012 https://doi.org/10.1093/bioinformatics/bts356
- The SAM/BAM format specification (MIT)
- BED12 format specification
- Black-box behaviour testing against RSeQC 5.0.4
No source code from the GPL/LGPL upstream was used as reference during implementation; the algorithm is derived from the published method, the public format specs, and black-box behavioural testing.
License: MIT OR Apache-2.0.
Upstream credit: RSeQC https://rseqc.sourceforge.net/ (LGPL-2.1+).
Structs§
- JunctionCounts - Summary counts produced by junction annotation.
- KnownSites - Known splice-site sets built from the BED12 intron boundaries.
Enums§
- JunctionClass - Classification of a splice junction relative to the gene model.
Functions§
- annotate_junctions - Annotate splice junctions from bam_path against the BED12 gene model at bed_path.
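As an illustration of how intron boundaries can be derived from a BED12 record (the input that KnownSites is described as being built from), here is a minimal sketch. `bed12_introns` and `parse_list` are hypothetical helpers, not part of this crate's API. The sketch relies only on the standard BED12 layout, where column 2 is chromStart and columns 11 and 12 hold the comma-separated blockSizes and blockStarts, and it treats each gap between consecutive blocks as one intron in 0-based half-open coordinates.

```rust
/// Parse a comma-separated list of integers such as "100,200,300,".
/// A trailing comma, which BED12 files commonly include, is tolerated.
fn parse_list(s: &str) -> Option<Vec<u64>> {
    s.split(',')
        .filter(|t| !t.trim().is_empty())
        .map(|t| t.trim().parse::<u64>().ok())
        .collect()
}

/// Return the chromosome name and the (start, end) of every intron implied
/// by one tab-separated BED12 line, or None if the line is malformed.
pub fn bed12_introns(line: &str) -> Option<(String, Vec<(u64, u64)>)> {
    let fields: Vec<&str> = line.trim_end().split('\t').collect();
    if fields.len() < 12 {
        return None;
    }
    let chrom_start: u64 = fields[1].parse().ok()?;
    let sizes = parse_list(fields[10])?;
    let starts = parse_list(fields[11])?;
    if sizes.len() != starts.len() {
        return None;
    }
    // An intron runs from the end of block i-1 to the start of block i.
    let introns = (1..starts.len())
        .map(|i| {
            (
                chrom_start + starts[i - 1] + sizes[i - 1],
                chrom_start + starts[i],
            )
        })
        .collect();
    Some((fields[0].to_string(), introns))
}
```

A three-block transcript therefore yields two introns, and a single-block record yields none.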