§kseq
kseq is a simple fasta/fastq (fastx) format parser library for Rust. Its main function is to iterate over the records in fastx files (similar to kseq in C). It reads and stores records in a shared buffer rather than allocating a new one per record, which keeps parsing fast. It supports plain or gz fastx files, io::stdin, and fofn (file-of-file-names) files, which list multiple plain or gz fastx files, one per line.
Using kseq is simple: call parse_path to parse a path or parse_reader to parse a reader, then call the iter_record method to get each record.
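The shared-buffer idea can be illustrated with the standard library alone. The following is only a minimal sketch, not kseq's actual implementation: the function name fasta_stats is hypothetical, and it handles plain FASTA text only. It reuses one String for every line, so the allocation is kept across the whole stream.

```rust
use std::io::{BufRead, BufReader, Read};

/// Counts FASTA records and total bases while reusing a single line buffer.
/// Hypothetical helper for illustration; it is not part of kseq.
fn fasta_stats<R: Read>(reader: R) -> std::io::Result<(usize, usize)> {
    let mut input = BufReader::new(reader);
    // Allocated once and reused for every line of the stream.
    let mut line = String::new();
    let (mut records, mut bases) = (0usize, 0usize);
    loop {
        // clear() drops the contents but keeps the capacity.
        line.clear();
        if input.read_line(&mut line)? == 0 {
            break; // zero bytes read means EOF
        }
        // Strip the trailing newline (and any other trailing whitespace).
        let text = line.trim_end();
        if text.starts_with('>') {
            records += 1; // header line starts a new record
        } else {
            bases += text.len(); // sequence line, possibly one of several
        }
    }
    Ok((records, bases))
}
```

Any type implementing std::io::Read can be passed in, for example a byte slice such as ">seq1\nACGT\n".as_bytes() or an opened std::fs::File.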
- parse_path: This function takes a path that implements AsRef<std::path::Path> as input. The path can be a fastx file, a dash (-) for io::stdin, or a fofn file. It returns a Result type:
  - Ok(T): A struct T with the iter_record method.
  - Err(E): An error E covering missing input, failure to open or read, wrong fastx format, or invalid path or file errors.
- parse_reader: This function takes a reader that implements std::io::Read as input. It returns a Result type:
  - Ok(T): A struct T with the iter_record method.
  - Err(E): An error E covering missing input, failure to open or read, wrong fastx format, or invalid path or file errors.
- iter_record: This function can be called in a loop. It returns a Result<Option<Record>> type:
  - Ok(Some(Record)): A struct Record with methods:
    - head -> &str: get sequence id/identifier
    - seq -> &str: get sequence
    - des -> &str: get sequence description/comment
    - sep -> &str: get separator
    - qual -> &str: get quality scores
    - len -> usize: get sequence length
    Note: des, sep and qual return "" if the Record doesn't have these attributes.
  - Ok(None): Stream has reached EOF.
  - Err(ParseError): An error ParseError including IO, TruncateFile, InvalidFasta or InvalidFastq errors.
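To make the accessor names concrete, here is a rough sketch of how the four lines of one FASTQ record could map onto head, des, seq, sep and qual. It uses only the standard library. The Fields struct and split_fastq function are hypothetical stand-ins, not kseq types, and the sketch assumes (as in C kseq) that the id ends at the first whitespace of the header, with the rest being the description.

```rust
/// Hypothetical stand-in for the values a record exposes; not the kseq Record type.
#[derive(Debug, PartialEq)]
struct Fields<'a> {
    head: &'a str,
    des: &'a str,
    seq: &'a str,
    sep: &'a str,
    qual: &'a str,
}

/// Splits one four-line FASTQ record into borrowed fields.
/// Returns None if a line is missing, the header lacks '@',
/// the separator lacks '+', or seq and qual differ in length.
fn split_fastq(record: &str) -> Option<Fields<'_>> {
    let mut lines = record.lines();
    let header = lines.next()?.strip_prefix('@')?;
    let seq = lines.next()?;
    let sep = lines.next()?;
    let qual = lines.next()?;
    if !sep.starts_with('+') || seq.len() != qual.len() {
        return None;
    }
    // Assumption: the id ends at the first whitespace; the rest is the description.
    let (head, des) = match header.split_once(char::is_whitespace) {
        Some((id, rest)) => (id, rest.trim_start()),
        // No description present: mirror the "" convention noted above.
        None => (header, ""),
    };
    Some(Fields { head, des, seq, sep, qual })
}
```

For the input "@read1 lane=3\nACGT\n+\nIIII\n", this sketch yields head "read1", des "lane=3", seq "ACGT", sep "+" and qual "IIII". A FASTA record has no separator or quality line, which is why sep and qual come back empty there.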
§Example
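use std::env::args;
use std::fs::File; // only needed for the parse_reader alternative below
use kseq::parse_path;

fn main() {
    // The first command-line argument is the input path.
    let path: String = args().nth(1).unwrap();
    let mut records = parse_path(path).unwrap();
    // Alternative: parse from any reader instead of a path.
    // let mut records = kseq::parse_reader(File::open(path).unwrap()).unwrap();
    // iter_record returns Ok(None) at EOF, which ends the loop.
    while let Some(record) = records.iter_record().unwrap() {
        println!("head:{} des:{} seq:{} qual:{} len:{}",
            record.head(), record.des(), record.seq(),
            record.qual(), record.len());
    }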
}
§Installation
cargo add kseq
§Benchmarking
cargo bench
Modules§
Enums§
- Paths: a reader for a single path or readers for multiple paths
Functions§
- parse_path: parse path to a Reader or Readers
- parse_reader: parse reader to a Reader or Readers