Crate kseq

§kseq

kseq is a simple fasta/fastq (fastx) format parser library for Rust. Its main function is to iterate over the records in fastx files (similar to kseq in C). It reads and stores records in a shared buffer, which keeps parsing fast. It supports a plain or gz fastx file, io::stdin, or a fofn (file-of-file-names) file that lists multiple plain or gz fastx files, one per line.
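To illustrate the shared-buffer idea, here is a minimal sketch that uses only the standard library. It is not kseq's implementation: `fasta_stats` is a hypothetical helper that reuses one `String` as the line buffer while scanning FASTA input, so no new buffer is allocated per line.

```rust
use std::io::{self, BufRead};

/// Counts records and total bases in FASTA input, reusing one line buffer.
/// Hypothetical helper for illustration only; not part of kseq.
fn fasta_stats<R: BufRead>(mut reader: R) -> io::Result<(usize, usize)> {
    let mut line = String::new(); // the single shared buffer
    let mut records = 0;
    let mut bases = 0;
    loop {
        line.clear(); // drop the old contents, keep the allocation
        if reader.read_line(&mut line)? == 0 {
            break; // read_line returns 0 bytes at EOF
        }
        let text = line.trim_end();
        if text.starts_with('>') {
            records += 1; // header line starts a new record
        } else {
            bases += text.len(); // sequence line
        }
    }
    Ok((records, bases))
}

fn main() -> io::Result<()> {
    let input = ">seq1 first\nACGT\nAC\n>seq2\nGG\n";
    // A byte slice implements BufRead, so no file is needed here.
    let (records, bases) = fasta_stats(input.as_bytes())?;
    println!("records:{} bases:{}", records, bases); // prints "records:2 bases:8"
    Ok(())
}
```

The trade-off of a shared buffer is that data from one record is overwritten when the next one is read, so anything that must outlive the loop iteration has to be copied out.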

Using kseq is simple: call parse_path to parse a path or parse_reader to parse a reader, then call the iter_record method to get each record.

  • parse_path This function takes a path that implements AsRef<std::path::Path> as input. The path can be a fastx file, - for io::stdin, or a fofn file. It returns a Result type:

    • Ok(T): A struct T with the iter_record method.
    • Err(E): An error E, such as missing input, a file that cannot be opened or read, a wrong fastx format, or an invalid path or file.
  • parse_reader This function takes a reader that implements std::io::Read as input. It returns a Result type:

    • Ok(T): A struct T with the iter_record method.
    • Err(E): An error E, such as missing input, a file that cannot be opened or read, a wrong fastx format, or an invalid path or file.
  • iter_record This method can be called in a loop. It returns a Result<Option<Record>> type:

    • Ok(Some(Record)): A struct Record with methods:

      • head -> &str: get sequence id/identifier
      • seq -> &str: get sequence
      • des -> &str: get sequence description/comment
      • sep -> &str: get separator
      • qual -> &str: get quality scores
      • len -> usize: get sequence length

      Note: des, sep and qual return "" if the Record does not have these attributes.

    • Ok(None): Stream has reached EOF.

    • Err(ParseError): An error ParseError including IO, TruncateFile, InvalidFasta or InvalidFastq errors.
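As a rough picture of how the Record accessors listed above could map onto the text of a FASTQ record, here is a standard-library sketch. `Fields` and `split_fastq` are hypothetical names, not kseq's types. Whether kseq strips the leading `@` or splits the description at the first space is an assumption made here for illustration.

```rust
/// Hypothetical view over one FASTQ record. The field names mirror the
/// accessors listed above, but this is not kseq's Record type.
#[derive(Debug, PartialEq)]
struct Fields<'a> {
    head: &'a str,
    des: &'a str,
    seq: &'a str,
    sep: &'a str,
    qual: &'a str,
}

/// Splits a four-line FASTQ record into its parts.
/// Returns None if a line is missing or the record is malformed.
fn split_fastq(record: &str) -> Option<Fields<'_>> {
    let mut lines = record.lines();
    let header = lines.next()?.strip_prefix('@')?;
    let seq = lines.next()?;
    let sep = lines.next()?;
    let qual = lines.next()?;
    // The separator line must start with '+', and every base needs a quality score.
    if !sep.starts_with('+') || seq.len() != qual.len() {
        return None;
    }
    // Assumption: the id runs up to the first space and the rest is the description.
    // A header without a space yields an empty description.
    let (head, des) = header.split_once(' ').unwrap_or((header, ""));
    Some(Fields { head, des, seq, sep, qual })
}

fn main() {
    let rec = "@read1 sample=A\nACGT\n+\nIIII\n";
    let f = split_fastq(rec).expect("well-formed record");
    println!(
        "head:{} des:{} seq:{} sep:{} qual:{} len:{}",
        f.head, f.des, f.seq, f.sep, f.qual, f.seq.len()
    );
}
```

A FASTA record has no separator or quality line, which corresponds to the "" values described in the note above.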

§Example

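use std::env::args;
use kseq::parse_path;
// To parse a reader instead, also import:
// use std::fs::File;
// use kseq::parse_reader;

fn main(){
	// First command-line argument: a fastx file, a fofn file, or - for stdin.
	let path: String = args().nth(1).expect("missing input path");
	let mut records = parse_path(path).unwrap();
	// let mut records = parse_reader(File::open(path).unwrap()).unwrap();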
	while let Some(record) = records.iter_record().unwrap() {
		println!("head:{} des:{} seq:{} qual:{} len:{}", 
			record.head(), record.des(), record.seq(), 
			record.qual(), record.len());
	}
}
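The example above calls unwrap on every result, which panics on malformed input. The sketch below shows one way to handle all three outcomes of a Result<Option<...>> stream without panicking. It uses only the standard library: `Headers`, `next_header` and `collect_headers` are hypothetical stand-ins over a toy input where every line must be a `>` header, not kseq's types or its ParseError.

```rust
/// Hypothetical stand-in for a record stream; not kseq's type.
struct Headers<'a> {
    lines: std::str::Lines<'a>,
}

impl<'a> Headers<'a> {
    fn new(text: &'a str) -> Self {
        Headers { lines: text.lines() }
    }

    /// Ok(Some(id)) for a record, Ok(None) at EOF, Err(..) on bad input.
    fn next_header(&mut self) -> Result<Option<&'a str>, String> {
        match self.lines.next() {
            None => Ok(None),
            Some(line) => match line.strip_prefix('>') {
                Some(id) => Ok(Some(id)),
                None => Err(format!("expected '>' but found: {}", line)),
            },
        }
    }
}

/// Collects every header, stopping at EOF and propagating the first error.
fn collect_headers(text: &str) -> Result<Vec<&str>, String> {
    let mut stream = Headers::new(text);
    let mut ids = Vec::new();
    // `?` returns early on Err; the loop ends cleanly on Ok(None).
    while let Some(id) = stream.next_header()? {
        ids.push(id);
    }
    Ok(ids)
}

fn main() {
    match collect_headers(">a\n>b\n") {
        Ok(ids) => println!("read {} records", ids.len()),
        Err(e) => eprintln!("parse error: {}", e),
    }
}
```

The same `while let Some(..) = ..?` shape should carry over to a loop around iter_record inside a function that returns a Result, assuming the error type can be converted to the function's error type.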

§Installation

cargo add kseq

§Benchmarking

cargo bench

Modules§

Enums§

  • a reader for a single path or readers for multiple paths

Functions§