Struct umgap::commands::prot2tryp2lca::ProtToTrypToLca
pub struct ProtToTrypToLca {
pub one_on_one: bool,
pub fst_file: PathBuf,
pub fst_in_memory: bool,
pub chunk_size: usize,
pub pattern: String,
pub min_length: usize,
pub max_length: usize,
pub contains: String,
pub lacks: String,
}
Digests a FASTA stream of peptides and maps all tryptic peptides to taxon IDs.
The umgap prot2tryp2lca command takes one or more peptides, splits them into tryptic peptides, optionally filters them, and outputs their lowest common ancestors. It combines the umgap prot2tryp, umgap filter and umgap pept2lca commands to allow more efficient parallel computing (cf. their documentation for details).
The input is given in FASTA format on standard input, with a single peptide per FASTA header; the peptide may be hard-wrapped with newlines. For each given peptide, the command prints the lowest common ancestor of every tryptic peptide found in it to standard output.
$ cat input.fa
>header1
AYKKAGVSGHVWQSDGITNCLLRGLTRVKEAVANRDSGNGYINKVYYWTVDKRATTRDALDAGVDGIMTNYPDVITDVLN
$ umgap prot2tryp2lca tryptic-lca.index < input.fa
>header1
571525
1
571525
6920
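The digestion step can be illustrated with a minimal sketch. It assumes the classic trypsin rule (cleave after K or R unless the next residue is P) and is not the crate's implementation, which is driven by the configurable pattern field; the function name tryptic_digest is hypothetical.

```rust
/// Splits a protein sequence into tryptic peptides.
///
/// Assumption: the classic trypsin rule is used, i.e. cleave after
/// K or R unless the following residue is P. This is an illustrative
/// stand-in, not the code umgap itself runs.
fn tryptic_digest(protein: &str) -> Vec<String> {
    // Drop whitespace so hard-wrapped FASTA sequences are handled.
    let residues: Vec<char> = protein.chars().filter(|c| !c.is_whitespace()).collect();

    let mut peptides = Vec::new();
    let mut current = String::new();

    for (i, &aa) in residues.iter().enumerate() {
        current.push(aa);
        let is_cleavage_site = aa == 'K' || aa == 'R';
        let blocked_by_proline = residues.get(i + 1) == Some(&'P');
        if is_cleavage_site && !blocked_by_proline {
            // Close the current peptide and start a new one.
            peptides.push(std::mem::take(&mut current));
        }
    }

    // Keep the trailing fragment that does not end in K or R.
    if !current.is_empty() {
        peptides.push(current);
    }
    peptides
}
```

Under these assumptions, calling tryptic_digest on the prefix AYKKAGVSGHVWQSDGITNCLLR of the example sequence yields the peptides AYK, K and AGVSGHVWQSDGITNCLLR.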
Fields§
one_on_one: bool
Map unknown sequences to 0 instead of ignoring them
fst_file: PathBuf
An index that maps tryptic peptides to taxon IDs
fst_in_memory: bool
Load index in memory instead of memory mapping the file contents. This makes querying significantly faster, but requires some initialization time.
chunk_size: usize
Number of reads grouped into one chunk. Bigger chunks decrease the overhead caused by multithreading. Because the output order is not necessarily the same as the input order, having a chunk size which is a multiple of 12 (all 6 translations multiplied by the two paired-end reads) will keep FASTA records that originate from the same reads together.
pattern: String
The cleavage pattern (regex), i.e. the pattern after which the next peptide will be cleaved for tryptic peptides
min_length: usize
Minimum length of tryptic peptides to be mapped
max_length: usize
Maximum length of tryptic peptides to be mapped
contains: String
Amino acid symbols that a peptide must contain to be processed
lacks: String
Amino acid symbols that a peptide may not contain to be processed
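How the filter fields and one_on_one might interact can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: a HashMap stands in for the FST index, contains is read as "every listed symbol must be present", and the helper names keep_peptide and lookup_taxa are hypothetical.

```rust
use std::collections::HashMap;

/// Returns true if a peptide passes the length and composition filters.
///
/// Assumption: `contains` requires every listed symbol to occur in the
/// peptide, and `lacks` rejects the peptide if any listed symbol occurs.
fn keep_peptide(
    peptide: &str,
    min_length: usize,
    max_length: usize,
    contains: &str,
    lacks: &str,
) -> bool {
    let length = peptide.chars().count();
    length >= min_length
        && length <= max_length
        && contains.chars().all(|aa| peptide.contains(aa))
        && !lacks.chars().any(|aa| peptide.contains(aa))
}

/// Looks up taxon IDs for peptides in a stand-in index.
///
/// With `one_on_one` set, unknown peptides map to 0; otherwise they
/// are skipped, so the output may be shorter than the input.
fn lookup_taxa(peptides: &[&str], index: &HashMap<&str, u32>, one_on_one: bool) -> Vec<u32> {
    peptides
        .iter()
        .filter_map(|p| match index.get(*p) {
            Some(&taxon) => Some(taxon),
            None if one_on_one => Some(0),
            None => None,
        })
        .collect()
}
```

For example, with an index holding only AYK, looking up AYK and K gives one taxon ID when one_on_one is false and two entries (the second being 0) when it is true. The length bounds passed to keep_peptide are up to the caller; the sketch does not assume the command's defaults.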
Trait Implementations§
impl Debug for ProtToTrypToLca
impl StructOpt for ProtToTrypToLca
fn from_clap(matches: &ArgMatches<'_>) -> Self
Builds the struct from clap::ArgMatches. It's guaranteed to succeed if matches originates from an App generated by StructOpt::clap called on the same type, otherwise it must panic.
fn from_args() -> Self where Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Calls clap::Error::exit on failure, printing the error message and aborting the program.
fn from_args_safe() -> Result<Self, Error> where Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Unlike StructOpt::from_args, returns clap::Error on failure instead of aborting the program, so calling .exit is up to you.
fn from_iter<I>(iter: I) -> Self
Gets the struct from any iterator such as a Vec of your making. Prints the error message and quits the program in case of failure.