ATGlib
ATGlib is a Rust library for working with genomic transcript data. It reads and writes several file formats, such as GTF, GenePred(ext) and RefGene. It can also generate BED files, FASTA sequences or custom feature sequences.
If you are looking for a ready-made application or command line tool to work with transcripts, GTF files, etc., use ATG instead. It uses ATGlib behind the scenes and provides a simple-to-use interface.
Documentation
The library API is mostly documented inline, and the documentation is available on docs.rs.
Examples
Convert GTF to RefGene
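// The file paths and panic messages below are placeholders; substitute your own.
use atglib::gtf::Reader;
use atglib::refgene::Writer;
use atglib::models::{TranscriptRead, TranscriptWrite};

// Open the GTF input and the RefGene output.
let mut reader = Reader::from_file("path/to/input.gtf")
    .unwrap_or_else(|_| panic!("Error opening input file"));
let mut writer = Writer::from_file("path/to/output.refgene")
    .unwrap_or_else(|_| panic!("Unable to open output file"));

// Parse all transcripts, then write them out in RefGene format.
let transcripts = reader.transcripts()
    .unwrap_or_else(|err| panic!("Error parsing GTF: {}", err));
match writer.write_transcripts(&transcripts) {
    Ok(_) => println!("Success"),
    Err(err) => panic!("Error writing RefGene file: {}", err),
};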
ToDo / Next tasks
- Compare transcripts from two different inputs
- Use Smartstring or Smallstr for gene symbol, transcript name and chromosome
- Parallelize input parsing
- Check if exons can be stored in a smaller Vec
- Use std::mem::replace to move values out of attributes, e.g. in TranscriptBuilder, and remove the Copy/Clone traits (see https://stackoverflow.com/questions/31307680/how-to-move-one-field-out-of-a-struct-that-implements-drop-trait)
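The last item can be illustrated with a minimal sketch. It is not ATGlib code: `DraftBuilder` and `TranscriptDraft` are hypothetical stand-ins for a builder and the struct it produces. The point is only that `std::mem::replace` (or `std::mem::take` for types that implement `Default`) moves a field out through a `&mut` reference without cloning it.

```rust
use std::mem;

/// Hypothetical output type, standing in for a transcript.
#[derive(Debug, PartialEq)]
pub struct TranscriptDraft {
    pub name: String,
    pub gene: String,
}

/// Hypothetical builder that owns its fields until `build` is called.
pub struct DraftBuilder {
    pub name: String,
    pub gene: String,
}

impl DraftBuilder {
    pub fn new(name: &str, gene: &str) -> Self {
        DraftBuilder {
            name: name.to_string(),
            gene: gene.to_string(),
        }
    }

    /// Moves the strings out of the builder instead of cloning them.
    /// The builder is left holding empty strings afterwards.
    pub fn build(&mut self) -> TranscriptDraft {
        TranscriptDraft {
            // Swap in an explicit replacement value and take the old one.
            name: mem::replace(&mut self.name, String::new()),
            // Same effect, using the type's `Default` as the replacement.
            gene: mem::take(&mut self.gene),
        }
    }
}
```

Because nothing is cloned, the builder and the produced struct would no longer need `Clone` for this step. The trade-off is that the builder is left in an emptied state after `build`.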
Known issues
GTF parsing
- NM_001371720.1 has two book-ended exons (155160639-155161619 || 155161620-155162101). During input parsing, book-ended features are merged into one exon
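To make the effect concrete, here is a minimal sketch of how merging book-ended features collapses the two exons above into one. It is an illustration only, not ATGlib's parser: the `Exon` struct and `merge_book_ended` function are hypothetical, and the sketch assumes closed, 1-based coordinates as used in GTF, with overlapping features merged as well.

```rust
/// Hypothetical exon with closed, 1-based coordinates (both ends inclusive).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Exon {
    pub start: u32,
    pub end: u32,
}

/// Merges features that overlap or are book-ended, i.e. where the next
/// feature starts exactly one base after the previous one ends.
pub fn merge_book_ended(mut exons: Vec<Exon>) -> Vec<Exon> {
    // Sort by position so that only neighbours need to be compared.
    exons.sort_by_key(|e| (e.start, e.end));

    let mut merged: Vec<Exon> = Vec::with_capacity(exons.len());
    for exon in exons {
        if let Some(prev) = merged.last_mut() {
            // `prev.end + 1 == exon.start` is the book-ended case;
            // anything smaller is an overlap.
            if exon.start <= prev.end.saturating_add(1) {
                prev.end = prev.end.max(exon.end);
                continue;
            }
        }
        merged.push(exon);
    }
    merged
}
```

With the coordinates from NM_001371720.1, the inputs 155160639-155161619 and 155161620-155162101 come out as the single exon 155160639-155162101, so the boundary between the two annotated exons is lost.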