Function illumina_coordinates::parse_sequence_identifier
pub fn parse_sequence_identifier(
text: &str
) -> Result<SequenceIdentifier, IlluminaError>
Parses location information from an Illumina sequence identifier. This implementation is about 3x faster than using a regular expression.
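To illustrate what regex-free parsing can look like, here is a minimal sketch that splits an identifier into seven colon-separated fields after the leading "@". It is not this crate's implementation: split_fields is a hypothetical helper, and it assumes the identifier has exactly the seven-field form documented here, with no trailing description.

```rust
/// Splits an Illumina-style identifier into its seven colon-separated fields.
/// Hypothetical helper for illustration; not part of illumina_coordinates.
fn split_fields(text: &str) -> Option<[&str; 7]> {
    // The leading '@' marks a FASTQ header line and is not part of the first field.
    let body = text.strip_prefix('@')?;
    let mut fields = [""; 7];
    let mut parts = body.split(':');
    for slot in fields.iter_mut() {
        // Fewer than seven fields: give up.
        *slot = parts.next()?;
    }
    // More than seven fields: also reject (assumption of this sketch).
    if parts.next().is_some() {
        return None;
    }
    Some(fields)
}
```

A caller would then convert the individual fields with str::parse, for example fields[1].parse::<u32>() for the run count.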
The fields in the example identifier below have the following meaning:
@M03745:11:000000000-B54L5:1:2108:4127:8949
M03745 ID of the sequencing machine
11 run count for this machine
000000000-B54L5 ID of the flow cell. "B54L5" will be printed on the flow cell in this example
1 lane number. For MiSeqs, there's only one lane
2108 the tile field, made up of three parts:
    the first digit is the side of the chip
    the second digit is the swath. For MiSeqs, this is always 1. For HiSeqs, each lane is two tiles wide: the first pass from left to right is swath one, and the returning pass on the other side of the lane is swath two
    the last two digits are the order of the tile. For MiSeqs, this is a number from 1 to 19
4127 the x-position of the read in the tile, in arbitrary units
8949 the y-position of the read in the tile, in arbitrary units
See https://help.basespace.illumina.com/articles/descriptive/fastq-files/ for more information.
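The four-digit tile field described above can be taken apart with integer arithmetic. The following is a minimal sketch, assuming the field has already been parsed as a number; decode_tile_field is a hypothetical name and not part of this crate's API.

```rust
/// Splits a four-digit tile field such as 2108 into (side, swath, tile).
/// Hypothetical helper for illustration; not part of illumina_coordinates.
fn decode_tile_field(field: u16) -> Option<(u8, u8, u8)> {
    // Assumption of this sketch: the field always has exactly four digits.
    if !(1000..=9999).contains(&field) {
        return None;
    }
    let side = (field / 1000) as u8; // first digit
    let swath = (field / 100 % 10) as u8; // second digit
    let tile = (field % 100) as u8; // last two digits
    Some((side, swath, tile))
}
```

For the example identifier, decode_tile_field(2108) yields side 2, swath 1 and tile 8, which matches the side, swath and tile values asserted in the example for this function.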
Example
extern crate illumina_coordinates;

fn main() {
    let line = "@M03745:11:000000000-B54L5:1:2108:4127:8949";
    let seq_id = illumina_coordinates::parse_sequence_identifier(line).unwrap();
    assert_eq!(seq_id.sequencer_id, "M03745".to_string());
    assert_eq!(seq_id.run_count, 11);
    assert_eq!(seq_id.flow_cell_id, "000000000-B54L5".to_string());
    assert_eq!(seq_id.lane, 1);
    assert_eq!(seq_id.side, 2);
    assert_eq!(seq_id.swath, 1);
    assert_eq!(seq_id.tile, 8);
    assert_eq!(seq_id.x, 4127);
    assert_eq!(seq_id.y, 8949);
}