Module seq_io::fastq
FASTQ reading and writing
Flavours
There are two flavours of this parser:
- fastq::Reader in this module parses standard single-line FASTQ.
- Reader in fastq::multiline parses multi-line FASTQ as well. This parser runs slightly slower on single-line FASTQ than fastq::Reader.
Example
The following example shows how to use Reader.
```rust
use seq_io::prelude::*; // needed to import necessary traits
use seq_io::fastq::Reader;

let seq = b"@id1 some description
SEQUENCE
+
IIIIIIII
@id2
SEQUENCE
+
IIIIIIII
";

// Construct the reader
let mut reader = Reader::new(&seq[..]);

// We'll write the records back to this vector
let mut output = vec![];

while let Some(result) = reader.next() {
    let rec = result.unwrap();

    // Access the ID and the description parts of the header (separated by a space)
    let id = rec.id().unwrap();
    let desc = rec.desc().transpose().unwrap();
    println!("ID: {}, description: {:?}", id, desc);

    // Print the sequence and quality scores
    println!("seq: {}", std::str::from_utf8(rec.seq()).unwrap());
    println!("qual: {}", std::str::from_utf8(rec.qual()).unwrap());

    // Write the record to 'output'
    rec.write(&mut output).unwrap();
}

// The output is identical
assert_eq!(&seq[..], output.as_slice());
```
The output will be:
ID: id1, description: Some("some description")
seq: SEQUENCE
qual: IIIIIIII
ID: id2, description: None
seq: SEQUENCE
qual: IIIIIIII
As the record returned by the next() method borrows its data from the underlying buffer, it is not possible to use a for loop for iterating. Therefore, we use the while let ... construct.
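To see why a for loop is ruled out, consider a minimal stand-alone sketch of a reader whose next() hands out slices of its own buffer. LineReader, its fields and count_lines are hypothetical and not part of seq_io; they only mimic the borrowing pattern, where the returned item borrows from &mut self and therefore cannot be expressed through the Iterator trait.

```rust
/// Hypothetical reader that yields lines borrowed from its internal buffer.
/// It is not part of seq_io; it only mimics the borrowing pattern.
struct LineReader {
    buf: Vec<u8>,
    pos: usize,
}

impl LineReader {
    fn new(data: &[u8]) -> Self {
        LineReader { buf: data.to_vec(), pos: 0 }
    }

    /// The returned slice borrows from `self`, so it must be dropped
    /// before `next()` can be called again. `Iterator::next` cannot
    /// express this, which is why `for` does not work here.
    fn next(&mut self) -> Option<&[u8]> {
        if self.pos >= self.buf.len() {
            return None;
        }
        let start = self.pos;
        let end = match self.buf[start..].iter().position(|&b| b == b'\n') {
            Some(i) => start + i,
            None => self.buf.len(),
        };
        // Skip past the line terminator (if any) for the next call.
        self.pos = (end + 1).min(self.buf.len());
        Some(&self.buf[start..end])
    }
}

/// Counts the lines by driving the reader with `while let`.
fn count_lines(data: &[u8]) -> usize {
    let mut reader = LineReader::new(data);
    let mut n = 0;
    while let Some(_line) = reader.next() {
        n += 1;
    }
    n
}
```

Each slice returned by next() goes out of scope at the end of the loop body, which is what allows the following call to borrow the reader mutably again.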
Sequence record types
As with fasta::Reader, there are two record types. Both implement the common BaseRecord trait as well as fastq::Record, which provides additional FASTQ-specific methods:
- RefRecord, the type returned by Reader::next(), only remembers the position of the record in the buffer without copying any data.
- OwnedRecord owns its data.
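The difference between the two record types can be sketched with plain standard-library types. PosRecord and OwnedRec below are hypothetical stand-ins, loosely analogous to RefRecord and OwnedRecord; the field layout is an assumption made for illustration and does not describe how seq_io stores records.

```rust
use std::ops::Range;

/// Hypothetical position-only record: it stores where the parts are,
/// not the bytes themselves (loosely analogous to `RefRecord`).
struct PosRecord<'a> {
    buffer: &'a [u8],
    head: Range<usize>,
    seq: Range<usize>,
    qual: Range<usize>,
}

/// Hypothetical owning record (loosely analogous to `OwnedRecord`).
#[derive(Debug, PartialEq)]
struct OwnedRec {
    head: Vec<u8>,
    seq: Vec<u8>,
    qual: Vec<u8>,
}

impl<'a> PosRecord<'a> {
    /// Returns the sequence as a slice of the shared buffer (no copy).
    fn seq(&self) -> &'a [u8] {
        &self.buffer[self.seq.clone()]
    }

    /// Copying happens only here, when an owned record is requested.
    fn to_owned_rec(&self) -> OwnedRec {
        OwnedRec {
            head: self.buffer[self.head.clone()].to_vec(),
            seq: self.buffer[self.seq.clone()].to_vec(),
            qual: self.buffer[self.qual.clone()].to_vec(),
        }
    }
}
```

In this sketch the position-only record is cheap to create but tied to the lifetime of the buffer, while the owned record can outlive it at the cost of three allocations.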
Writing FASTQ
Records can be written to output using BaseRecord::write(). RefRecord additionally has the method write_unchanged, which may be faster.
Data that is not part of a FASTQ record can also be written directly, using the helper functions write and write_iter (see Functions).
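For orientation, the layout of a written single-line record can be sketched with std::io::Write alone. The function write_fastq_record is hypothetical: it is not the crate's write helper and its signature is an assumption. It only shows the four-line layout with UNIX (LF) line endings.

```rust
use std::io::{self, Write};

/// Hypothetical helper (not the crate's `write` function): writes one
/// single-line FASTQ record with UNIX (LF) line endings.
fn write_fastq_record<W: Write>(
    mut out: W,
    head: &[u8],
    seq: &[u8],
    qual: &[u8],
) -> io::Result<()> {
    out.write_all(b"@")?; // start byte of the header line
    out.write_all(head)?;
    out.write_all(b"\n")?;
    out.write_all(seq)?;
    out.write_all(b"\n+\n")?; // separator line
    out.write_all(qual)?;
    out.write_all(b"\n")
}
```

Passing a &mut Vec<u8> as the writer collects the record in memory; any other std::io::Write implementor, such as a buffered file, works the same way.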
Details on parsing and writing
- Like all parsers in this crate, fastq::Reader handles UNIX (LF) and Windows (CRLF) line endings, but not old Mac-style (CR) endings. LF and CRLF may be mixed within the same file.
- FASTQ writing currently always uses UNIX line endings.
- The first non-empty line should start with @, indicating the first header. If not, an error with ErrorKind::InvalidStart is returned.
- Empty lines are allowed before and after records, but not within records.
- Whitespace at the end of header and sequence lines is never removed.
- Empty input will result in None being returned immediately by Reader::next() and in empty iterators for RecordsIter / RecordsIntoIter.
- Reader::next() compares sequence and quality line lengths and returns an error of ErrorKind::UnequalLengths if they differ. It is possible to omit this check by calling Reader::next_unchecked_len(). The lengths can be checked later with Record::check_lengths() or RefRecord::check_lengths_strict().
- The quality line of the last record should either be terminated by a line ending or be non-empty. If not, an error of ErrorKind::UnexpectedEnd is returned.
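Two of the rules above, line-ending handling and the length comparison, can be illustrated with a small stand-alone sketch. Both functions are hypothetical and are not taken from seq_io; in particular, leaving a lone CR in place is an assumption of this sketch, not a statement about the crate's internals.

```rust
/// Hypothetical helper: removes one trailing LF or CRLF, nothing else.
/// A lone CR (old Mac style) is not treated as a line ending, and other
/// trailing whitespace is left untouched.
fn strip_line_ending(line: &[u8]) -> &[u8] {
    match line {
        [rest @ .., b'\r', b'\n'] => rest,
        [rest @ .., b'\n'] => rest,
        _ => line,
    }
}

/// Hypothetical length check, in the spirit of `ErrorKind::UnequalLengths`.
fn lengths_match(seq: &[u8], qual: &[u8]) -> bool {
    seq.len() == qual.len()
}
```

Because only the terminator is removed, a sequence line such as "ACGT \n" keeps its trailing space, and a length comparison against the quality line would count it.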
Error priority
Validity checks are done in the following order:
- Is the start byte correct (@)? If not: InvalidStart.
- Do the header, sequence and separator lines have a line terminator? If not: UnexpectedEnd.
- Is the separator byte correct (+)? If not: InvalidSep.
- Do the quality scores have a line terminator, or are they non-empty? If not: UnexpectedEnd.
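The order of the checks above can be sketched as a stand-alone function over a single record. The enum Kind and the functions take_line and check_record are hypothetical; the variant names merely mirror the ErrorKind names used above. CRLF handling, the length comparison and the empty-input case are left out for brevity, so this is not the crate's implementation.

```rust
/// Hypothetical error kinds named after the `ErrorKind` variants above.
#[derive(Debug, PartialEq)]
enum Kind {
    InvalidStart,
    UnexpectedEnd,
    InvalidSep,
}

/// Takes the next line from `input` at `*pos`. Returns the line without
/// its terminator and whether an LF terminator was found.
fn take_line<'a>(input: &'a [u8], pos: &mut usize) -> (&'a [u8], bool) {
    let rest = &input[*pos..];
    match rest.iter().position(|&b| b == b'\n') {
        Some(i) => {
            *pos += i + 1;
            (&rest[..i], true)
        }
        None => {
            *pos = input.len();
            (rest, false)
        }
    }
}

/// Checks a single record in the order listed above.
fn check_record(input: &[u8]) -> Result<(), Kind> {
    // 1. Start byte
    if input.first() != Some(&b'@') {
        return Err(Kind::InvalidStart);
    }
    let mut pos = 0;
    // 2. Header, sequence and separator lines need a terminator
    let (_head, head_ok) = take_line(input, &mut pos);
    let (_seq, seq_ok) = take_line(input, &mut pos);
    let (sep, sep_ok) = take_line(input, &mut pos);
    if !(head_ok && seq_ok && sep_ok) {
        return Err(Kind::UnexpectedEnd);
    }
    // 3. Separator byte
    if sep.first() != Some(&b'+') {
        return Err(Kind::InvalidSep);
    }
    // 4. Quality line: terminated or non-empty
    let (qual, qual_ok) = take_line(input, &mut pos);
    if !qual_ok && qual.is_empty() {
        return Err(Kind::UnexpectedEnd);
    }
    Ok(())
}
```

In this sketch an input such as "@id\nACGT\n-" yields UnexpectedEnd rather than InvalidSep, because the missing terminator of the separator line is detected before the separator byte is inspected.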
Modules
multiline | FASTQ reading with multi-line FASTQ support. |
Structs
Error | Parsing error |
OwnedRecord | A FASTQ record that owns its data (requires allocations) |
RangeStore | |
Reader | FASTQ parser |
RecordSet | Set of sequence records that owns its buffer and knows the positions of each record. |
RecordSetIter | Iterator over record sets |
RecordsIntoIter | Iterator of |
RecordsIter | Borrowed iterator of |
RefRecord | A FASTQ record that borrows data from a buffer
It implements the traits |
Enums
ErrorKind |
Traits
Record | FASTQ record trait implemented by both |
Functions
write | Helper function for writing data (not necessarily stored in a |
write_iter | Helper function for writing data (not necessarily stored in a |
Type Definitions
Result |