Crate seq_io
This library provides an(other) attempt at high-performance FASTA and FASTQ parsing.
There are many similarities to the excellent fastq-rs.
However, the API provides streaming iterators where possible.
Additionally, the sequence length of records in FASTA/FASTQ files
is not limited by the size of the buffer. Instead, the buffer grows until
the record fits, which lets the parsers keep copying to a minimum.
How the buffer grows can be configured (see BufStrategy).
Example FASTQ parser:
This code prints the ID string from each FASTQ record.
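use seq_io::fastq::{Reader, Record};

let mut reader = Reader::from_path("seqs.fastq").unwrap();

// next() yields one Result per record until the input is exhausted
while let Some(record) = reader.next() {
    let record = record.expect("Error reading record");
    // id() fails if the header is not valid UTF-8
    println!("{}", record.id().unwrap());
}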
Example FASTA parser calculating mean sequence length:
The FASTA reader works just the same. One challenge with the FASTA
format is that the sequence can be broken into multiple lines.
Therefore, it is not possible to get a slice to the whole sequence
without copying the data. But it is possible to use seq_lines()
for efficiently iterating over each sequence line:
use seq_io::fasta::{Reader, Record};

let mut reader = Reader::from_path("seqs.fasta").unwrap();

let mut n = 0;
let mut sum = 0;
while let Some(record) = reader.next() {
    let record = record.expect("Error reading record");
    // add up the length of every sequence line of this record
    for s in record.seq_lines() {
        sum += s.len();
    }
    n += 1;
}
println!("mean sequence length of {} records: {:.1} bp", n, sum as f32 / n as f32);
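The trade-off between line-wise iteration and a contiguous sequence can be illustrated without seq_io at all. The following is a minimal std-only sketch; the helpers `trim_cr`, `seq_lines` and `full_seq` are hypothetical names chosen for this illustration and are not the crate's implementation. It assumes the raw sequence part of a record is available as one byte slice that still contains its line breaks.

```rust
/// Hypothetical helper: drops a trailing '\r' so Windows line endings work.
fn trim_cr(line: &[u8]) -> &[u8] {
    match line.last() {
        Some(&b'\r') => &line[..line.len() - 1],
        _ => line,
    }
}

/// Hypothetical helper (not part of seq_io): yields every non-empty sequence
/// line as a slice borrowed from `raw`, so nothing is copied.
fn seq_lines<'a>(raw: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    raw.split(|&b| b == b'\n')
        .map(trim_cr)
        .filter(|line| !line.is_empty())
}

/// Hypothetical helper: builds the contiguous sequence. This is the case
/// that cannot avoid a copy, because the line breaks have to be removed.
fn full_seq(raw: &[u8]) -> Vec<u8> {
    let mut seq = Vec::with_capacity(raw.len());
    for line in seq_lines(raw) {
        seq.extend_from_slice(line);
    }
    seq
}
```

Summing `line.len()` over `seq_lines(raw)` gives the sequence length without allocating, whereas `full_seq(raw)` allocates once per record. That is why the line iterator is the cheaper choice when only per-line information is needed.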
Parallel processing
Functions for parallel processing can be found in the parallel module.
Modules
fasta | Efficient FASTA reading and writing
fastq | Efficient FASTQ reading and writing
parallel | Experiments with parallel processing
Macros
parallel_record_impl |
Structs
DoubleUntil | Buffer size doubles until it reaches
DoubleUntil8M | Buffer size doubles until it reaches 8 MB. Above, it will increase in steps of 8 MB
Traits
BufStrategy | Strategy that decides how a buffer should grow
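The growth rule described for DoubleUntil8M (double up to a limit, then grow in steps of that limit) could be sketched as a plain function. This is only an illustration under stated assumptions: `next_capacity` is a hypothetical name, it does not implement the BufStrategy trait, and the crate's actual code may differ. Returning `None` on arithmetic overflow is a choice made for this sketch.

```rust
/// Hypothetical helper (not seq_io's API): proposes the next buffer size.
/// Below `limit` the size doubles; at or above `limit` it grows by `limit`.
/// Returns `None` if the new size would overflow `usize`.
fn next_capacity(current: usize, limit: usize) -> Option<usize> {
    if current < limit {
        // treat an empty buffer as size 1 so that doubling makes progress
        current.max(1).checked_mul(2)
    } else {
        current.checked_add(limit)
    }
}
```

With a limit of 8 MB, a 4 MB buffer would grow to 8 MB, then to 16 MB, then to 24 MB. Doubling keeps the number of reallocations small for typical records, and the linear phase avoids over-allocating for very long ones.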