Crate needletail
Needletail is a crate for quickly and easily parsing FASTA and FASTQ sequences from streams or files, and for manipulating and analysing that data.
A contrived example of how to use it:
    extern crate needletail;
    use needletail::{parse_fastx_file, Sequence, FastxReader};

    fn main() {
        let filename = "tests/data/28S.fasta";

        let mut n_bases = 0;
        let mut n_valid_kmers = 0;
        let mut reader = parse_fastx_file(&filename).expect("valid path/file");
        while let Some(record) = reader.next() {
            let seqrec = record.expect("invalid record");
            // keep track of the total number of bases
            n_bases += seqrec.num_bases();
            // normalize to make sure all the bases are consistently capitalized and
            // that we remove the newlines since this is FASTA
            let norm_seq = seqrec.normalize(false);
            // we make a reverse complemented copy of the sequence first for
            // `canonical_kmers` to draw the complemented sequences from.
            let rc = norm_seq.reverse_complement();
            // now we keep track of the number of AAAAs (or TTTTs via
            // canonicalization) in the file; note we also get the position (i.0;
            // in the event there were `N`-containing kmers that were skipped)
            // and whether the sequence was complemented (i.2) in addition to
            // the canonical kmer (i.1)
            for (_, kmer, _) in norm_seq.canonical_kmers(4, &rc) {
                if kmer == b"AAAA" {
                    n_valid_kmers += 1;
                }
            }
        }
        println!("There are {} bases in your file.", n_bases);
        println!("There are {} AAAAs in your file.", n_valid_kmers);
    }
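To make the canonicalization step in the example above more concrete, here is a minimal standard-library sketch. It does not use needletail and should not be read as its implementation. The helper names `complement`, `reverse_complement`, and `canonical_kmers_sketch` are hypothetical. It assumes uppercase input and the common convention that the canonical form of a kmer is the lexicographically smaller of the kmer and its reverse complement, and it skips any window containing a base other than A, C, G, or T.

```rust
/// Complement a single uppercase nucleotide; anything else becomes `N`.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        _ => b'N',
    }
}

/// Reverse the sequence and complement every base.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&b| complement(b)).collect()
}

/// Hypothetical helper (not needletail's API): returns
/// (position, canonical kmer, was_reverse_complemented) for every
/// window of length `k` that contains only A, C, G, or T.
fn canonical_kmers_sketch(seq: &[u8], k: usize) -> Vec<(usize, Vec<u8>, bool)> {
    let mut out = Vec::new();
    // `windows(0)` would panic, so handle degenerate inputs up front.
    if k == 0 || seq.len() < k {
        return out;
    }
    for (pos, window) in seq.windows(k).enumerate() {
        // Skip windows with ambiguous bases such as `N`.
        if window.iter().any(|&b| complement(b) == b'N') {
            continue;
        }
        let rc = reverse_complement(window);
        // Assumed convention: keep whichever strand sorts first.
        if rc.as_slice() < window {
            out.push((pos, rc, true));
        } else {
            out.push((pos, window.to_vec(), false));
        }
    }
    out
}

fn main() {
    let seq = b"AAAANTTTT";
    let mut n_hits = 0;
    for (_, kmer, _) in canonical_kmers_sketch(seq, 4) {
        if kmer.as_slice() == b"AAAA" {
            n_hits += 1;
        }
    }
    println!("There are {} canonical AAAAs.", n_hits);
}
```

Because skipped windows leave gaps, the position in each tuple is the only reliable way to map a kmer back to the original sequence, which is why the example above mentions it.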
Re-exports
pub use parser::parse_fastx_file;
pub use parser::parse_fastx_reader;
pub use parser::parse_fastx_stdin;
pub use sequence::Sequence;
Modules
bitkmer | Compact binary representations of nucleic acid kmers
errors | The errors needletail can return; these arise only when parsing FASTA/FASTQ files
kmer | Functions for splitting sequences into fixed-width moving windows (kmers) and utilities for dealing with these kmers
parser | Handles all the FASTA/FASTQ parsing
sequence | Generic functions for working with (primarily nucleic acid) sequences
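As an illustration of what a "compact binary representation" of a kmer can look like, the sketch below packs each base into two bits of a `u64`. This is a generic technique, not a description of how the `bitkmer` module is actually implemented. The function names `encode_kmer` and `decode_kmer`, the A=0, C=1, G=2, T=3 mapping, and the 32-base limit are all assumptions made for the example.

```rust
/// Hypothetical helper: pack up to 32 bases into a `u64`, two bits per base.
/// Returns `None` for kmers that are too long or contain a non-ACGT base.
fn encode_kmer(kmer: &[u8]) -> Option<u64> {
    if kmer.len() > 32 {
        return None;
    }
    let mut packed: u64 = 0;
    for &base in kmer {
        // Assumed mapping: A=0, C=1, G=2, T=3.
        let bits: u64 = match base {
            b'A' | b'a' => 0,
            b'C' | b'c' => 1,
            b'G' | b'g' => 2,
            b'T' | b't' => 3,
            _ => return None,
        };
        packed = (packed << 2) | bits;
    }
    Some(packed)
}

/// Hypothetical helper: unpack `k` bases from a value made by `encode_kmer`.
fn decode_kmer(mut packed: u64, k: usize) -> Vec<u8> {
    let mut out = vec![b'A'; k];
    // The last base sits in the lowest two bits, so fill from the end.
    for slot in out.iter_mut().rev() {
        *slot = b"ACGT"[(packed & 0b11) as usize];
        packed >>= 2;
    }
    out
}

fn main() {
    let packed = encode_kmer(b"ACGT").expect("only ACGT bases");
    let round_trip = decode_kmer(packed, 4);
    println!("{:#010b} -> {}", packed, String::from_utf8_lossy(&round_trip));
}
```

The packed value does not record its own length, so `k` has to be carried alongside it; with this mapping the complement of a base is simply `3 - bits`.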
Traits
FastxReader | The main trait, iterator-like, that the FASTA and FASTQ readers implement