pub struct PeptToLca {
    pub one_on_one: bool,
    pub fst_file: PathBuf,
    pub fst_in_memory: bool,
    pub chunk_size: usize,
}
Maps a FASTA stream of peptides to taxon IDs
The umgap pept2lca command takes one or more amino acid sequences and looks up the corresponding taxon ID in an index file (as built by the umgap buildindex command).
The input is given in FASTA format on standard input. Each FASTA header can be followed by multiple sequences, one per line. In the following example we map tryptic peptides to their lowest common ancestor in the NCBI taxonomy.
$ cat input.fa
>header1
AAALTER
ENFVYLAK
$ umgap pept2lca tryptic-peptides.index < input.fa
>header1
2
3398
By default, sequences not found in the index are ignored. With the -o (--one-on-one) flag, they are mapped to 0 instead.
$ cat input.fa
>header1
NOTATRYPTICPEPTIDE
ENFVYLAK
$ umgap pept2lca -o tryptic-peptides.index < input.fa
>header1
0
3398
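The lookup behaviour shown in the two examples above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name map_peptides and the Index alias are hypothetical, and a plain HashMap stands in for the index file built by umgap buildindex.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the index file: peptide -> taxon ID.
type Index = HashMap<String, u32>;

/// Copies FASTA headers through and replaces each peptide line by its taxon ID.
/// Peptides missing from the index are skipped, or written as 0 when
/// `one_on_one` is set.
fn map_peptides(input: &str, index: &Index, one_on_one: bool) -> String {
    let mut out = String::new();
    for line in input.lines() {
        let line = line.trim();
        if line.is_empty() {
            // Blank lines carry no peptide and are dropped.
            continue;
        }
        if line.starts_with('>') {
            // Header lines are copied unchanged.
            out.push_str(line);
            out.push('\n');
        } else if let Some(taxon) = index.get(line) {
            // Known peptide: emit the taxon ID stored in the index.
            out.push_str(&taxon.to_string());
            out.push('\n');
        } else if one_on_one {
            // Unknown peptide with one-on-one mapping: emit 0.
            out.push_str("0\n");
        }
    }
    out
}
```

With an index holding AAALTER and ENFVYLAK, calling map_peptides with one_on_one set to false drops NOTATRYPTICPEPTIDE from the output, while setting it to true keeps one output line per input peptide.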
Fields§
one_on_one: bool
Map unknown sequences to 0 instead of ignoring them
fst_file: PathBuf
An index that maps peptides to taxon IDs
fst_in_memory: bool
Load index in memory instead of memory mapping the file contents. This makes querying significantly faster, but requires some initialization time.
chunk_size: usize
Number of reads grouped into one chunk. Bigger chunks decrease the overhead caused by multithreading. Because the output order is not necessarily the same as the input order, having a chunk size which is a multiple of 12 (all 6 translations multiplied by the two paired-end reads) will keep FASTA records that originate from the same reads together.
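The remark about multiples of 12 can be checked with a small sketch. It assumes, purely for illustration, that each read contributes a fixed number of consecutive records and that chunks are cut at fixed positions; the helper names label_by_read and reads_kept_together are hypothetical and not part of umgap.

```rust
use std::collections::HashMap;

/// Labels `total` consecutive records with the read they came from,
/// assuming every read contributes `per_read` records in a row.
fn label_by_read(total: usize, per_read: usize) -> Vec<usize> {
    (0..total).map(|i| i / per_read).collect()
}

/// Returns true when no read has its records spread over more than one chunk.
/// `chunk_size` must be non-zero, because `slice::chunks` panics on 0.
fn reads_kept_together(read_ids: &[usize], chunk_size: usize) -> bool {
    // Remembers the first chunk in which each read was seen.
    let mut first_chunk: HashMap<usize, usize> = HashMap::new();
    for (chunk_idx, chunk) in read_ids.chunks(chunk_size).enumerate() {
        for &id in chunk {
            if *first_chunk.entry(id).or_insert(chunk_idx) != chunk_idx {
                // This read already appeared in an earlier chunk.
                return false;
            }
        }
    }
    true
}
```

With 12 records per read, chunk sizes of 12, 24 or 36 keep every read inside a single chunk, whereas a chunk size of 10 splits the first read across two chunks.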
Trait Implementations§
impl StructOpt for PeptToLca
fn from_clap(matches: &ArgMatches<'_>) -> Self
Builds the struct from clap::ArgMatches. It's guaranteed to succeed if matches originates from an App generated by StructOpt::clap called on the same type, otherwise it must panic.
fn from_args() -> Self where Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Calls clap::Error::exit on failure, printing the error message and aborting the program.
fn from_args_safe() -> Result<Self, Error> where Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Unlike StructOpt::from_args, returns clap::Error on failure instead of aborting the program, so calling .exit is up to you.
fn from_iter<I>(iter: I) -> Self
Gets the struct from any iterator such as a Vec of your making. Prints the error message and quits the program in case of failure.