pub struct JoinKmers {
pub taxon_file: PathBuf,
}
Aggregates a TSV stream of peptides and taxon IDs
The umgap joinkmers command takes tab-separated peptides and taxon IDs, aggregates the
taxon IDs where consecutive peptides are equal and outputs a tab-separated triple of peptide,
consensus taxon ID and taxon rank.
The input is given on standard input. If it is sorted on the first column, a complete mapping
from strings to aggregated taxa and their ranks will be written to standard output. It is
meant to be used after an umgap splitkmers and sort, and its output is ideal for
umgap buildindex, but there may be further uses.
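The consecutive-peptide grouping described above can be sketched as follows. This is a minimal illustration, not the implementation used by umgap: the function name group_consecutive is hypothetical, and it only collects the taxon IDs per run of equal peptides without aggregating them. It also shows why unsorted input yields several entries for the same peptide.

```rust
/// Groups consecutive lines that share the same first column.
/// Each line is expected to look like "peptide<TAB>taxon_id".
/// Hypothetical helper for illustration; not part of umgap.
fn group_consecutive<I>(lines: I) -> Vec<(String, Vec<u32>)>
where
    I: IntoIterator<Item = String>,
{
    let mut groups: Vec<(String, Vec<u32>)> = Vec::new();
    for line in lines {
        // Skip lines without a tab or without a numeric taxon ID.
        let Some((peptide, id)) = line.split_once('\t') else { continue };
        let Ok(id) = id.trim().parse::<u32>() else { continue };

        // Only the previous group is inspected, so equal peptides that are
        // not adjacent end up in separate groups.
        let same_as_last = groups
            .last()
            .map_or(false, |(last, _)| last.as_str() == peptide);
        if same_as_last {
            groups.last_mut().unwrap().1.push(id);
        } else {
            groups.push((peptide.to_string(), vec![id]));
        }
    }
    groups
}

fn main() {
    // Sample rows in the same shape as the documented input.
    let sample = [
        "AAAAA\t34924",
        "AAAAA\t30423",
        "AAAAA\t5678",
        "BBBBBB\t48890",
        "BBBBBB\t156563",
    ];
    for (peptide, ids) in group_consecutive(sample.iter().map(|s| s.to_string())) {
        println!("{}\t{:?}", peptide, ids);
    }
}
```

Because only adjacent rows are compared, the grouping needs no lookup table and can run over a stream, which is presumably why the command expects its input to be sorted beforehand.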
The aggregation strategy used in this command to find a consensus taxon is the hybrid approach
of the umgap taxa2agg command, with a 95% factor. This keeps the result close to the lowest
common ancestor, but filters out some outlying taxa.
The taxonomy to be used is passed as an argument to this command. This is a preprocessed version of the NCBI taxonomy.
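To make the term lowest common ancestor concrete, here is a sketch of a plain LCA computation over a toy taxonomy. It is deliberately not the hybrid strategy this command uses: the parent map, the taxon IDs 1 to 5 and the function names are all assumptions made for illustration. It does show how a single outlying taxon can pull a plain LCA up to the root, which is the effect the hybrid approach is described as filtering out.

```rust
use std::collections::HashMap;

/// Returns the lineage of `taxon`, root first.
/// `parent` maps each taxon ID to its parent; the root maps to itself.
fn lineage(parent: &HashMap<u32, u32>, taxon: u32) -> Vec<u32> {
    let mut path = vec![taxon];
    let mut current = taxon;
    while let Some(&p) = parent.get(&current) {
        if p == current {
            break; // reached the root
        }
        path.push(p);
        current = p;
    }
    path.reverse();
    path
}

/// Plain lowest common ancestor: the deepest taxon shared by all lineages.
/// Returns None for an empty slice or when the lineages share no root.
fn lowest_common_ancestor(parent: &HashMap<u32, u32>, taxa: &[u32]) -> Option<u32> {
    let mut lineages = taxa.iter().map(|&t| lineage(parent, t));
    let mut common = lineages.next()?;
    for other in lineages {
        // Keep only the prefix that both lineages have in common.
        let shared = common
            .iter()
            .zip(&other)
            .take_while(|(a, b)| a == b)
            .count();
        common.truncate(shared);
    }
    common.last().copied()
}

fn main() {
    // Hypothetical toy taxonomy: 1 is the root, 2 and 5 are its children,
    // 3 and 4 are children of 2.
    let parent: HashMap<u32, u32> = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 1)]
        .iter()
        .copied()
        .collect();

    // Two closely related taxa agree on taxon 2.
    println!("{:?}", lowest_common_ancestor(&parent, &[3, 4])); // Some(2)
    // One outlier (5) drags the plain LCA up to the root.
    println!("{:?}", lowest_common_ancestor(&parent, &[3, 4, 5])); // Some(1)
}
```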
$ cat input.tsv
AAAAA 34924
AAAAA 30423
AAAAA 5678
BBBBBB 48890
BBBBBB 156563
$ umgap joinkmers taxons.tsv < input.tsv
AAAAA 2759 superkingdom
BBBBBB 9153 family
Fields
taxon_file: PathBuf
An NCBI taxonomy TSV-file as processed by Unipept
Trait Implementations
impl StructOpt for JoinKmers
fn from_clap(matches: &ArgMatches<'_>) -> Self
Builds the struct from clap::ArgMatches. It’s guaranteed to succeed if matches originates
from an App generated by StructOpt::clap called on the same type, otherwise it must panic.
fn from_args() -> Self
where
    Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Calls
clap::Error::exit on failure, printing the error message and aborting the program.
fn from_args_safe() -> Result<Self, Error>
where
    Self: Sized,
Builds the struct from the command line arguments (std::env::args_os). Unlike
StructOpt::from_args, returns clap::Error on failure instead of aborting the program,
so calling .exit is up to you.
fn from_iter<I>(iter: I) -> Self
Gets the struct from any iterator such as a Vec of your making. Prints the error message
and quits the program in case of failure.