Function illumina_coordinates::parse_sequence_identifier [] [src]

pub fn parse_sequence_identifier(
    text: &str
) -> Result<SequenceIdentifier, IlluminaError>

Parses location information from an Illumina sequence identifier. This implementation is about 3x faster than using a regular expression.

The fields in the example identifier below have the following meaning: @M03745:11:000000000-B54L5:1:2108:4127:8949

M03745 ID of the sequencing machine

11 run count for this machine

000000000-B54L5 ID of the flow cell. "B54L5" will be printed on the flow cell in this example

1 lane number. For MiSeqs, there's only one lane

2108 the first digit is the side of the chip the second digit is the swath. For MiSeqs, this is always 1. For HiSeqs, each lane is two tiles wide, and the first pass from left-to-right is swath one, then the returning pass on the other side of the lane is swath two the last two digits are the order of the tile. For MiSeqs, this is a number from 1 to 19

4127 the x-position of the read in the tile, in arbitrary units

8949 the y-position of the read in the tile, in arbitrary units

See https://help.basespace.illumina.com/articles/descriptive/fastq-files/ for more information.

Example

extern crate illumina_coordinates;

fn main() {
    let line = "@M03745:11:000000000-B54L5:1:2108:4127:8949";
    let seq_id = illumina_coordinates::parse_sequence_identifier(&line).unwrap();
    assert_eq!(seq_id.sequencer_id, "M03745".to_string());
    assert_eq!(seq_id.run_count, 11);
    assert_eq!(seq_id.flow_cell_id, "000000000-B54L5".to_string());
    assert_eq!(seq_id.lane, 1);
    assert_eq!(seq_id.side, 2);
    assert_eq!(seq_id.swath, 1);
    assert_eq!(seq_id.tile, 8);
    assert_eq!(seq_id.x, 4127);
    assert_eq!(seq_id.y, 8949);
}