pub struct Taxonomy {
    pub taxon_file: PathBuf,
    pub all_ranks: bool,
    pub no_header: bool,
}
Expand description

Includes info in a stream of taxon IDs

The umgap taxonomy command takes one or more taxon IDs as input, searches for them in a taxonomy and outputs more information about them in a TSV format.

The input is given on standard input and may be any sequence of FASTA headers and/or lines containing a single taxon ID. A TSV header is printed to standard output. The FASTA headers (any line starting with a >) are just copied over. Each of the taxon IDs on the other lines is looked up in a taxonomy, and the ID, name and rank of the taxon are written out separated by commas.

A taxonomy file must be passed as argument.

$ cat input.fa
2026807
888268
186802
1598
1883
$ umgap taxonomy taxons.tsv < input.fa
taxon_id	taxon_name	taxon_rank
2026807	Zetaproteobacteria bacterium	species
888268	Dichanthelium oligosanthes	species
186802	Clostridiales	order
1598	Lactobacillus reuteri	species
1883	Streptomyces	genus

The -H flag can be used to suppress the TSV header, for instance when dealing with FASTA input.

$ cat input2.fa
>header1
2026807
888268
186802
1598
1883
$ umgap taxonomy -H taxons.tsv < input2.fa
>header1
2026807	Zetaproteobacteria bacterium	species
888268	Dichanthelium oligosanthes	species
186802	Clostridiales	order
1598	Lactobacillus reuteri	species
1883	Streptomyces	genus

The -a flag can be used to request a complete ranked lineage.

$ cat input3.fa
888268
$ umgap taxonomy -a taxons.tsv < input3.fa
taxon_id	taxon_name	taxon_rank	superkingdom_id	superkingdom_name	kingdom_id	kingdom_name	subkingdom_id	subkingdom_name	superphylum_id	superphylum_name	phylum_id	phylum_name	subphylum_id	subphylum_name	superclass_id	superclass_name	class_id	class_name	subclass_id	subclass_name	infraclass_id	infraclass_name	superorder_id	superorder_name	order_id	order_name	suborder_id	suborder_name	infraorder_id	infraorder_name	parvorder_id	parvorder_name	superfamily_id	superfamily_name	family_id	family_name	subfamily_id	subfamily_name	tribe_id	tribe_name	subtribe_id	subtribe_name	genus_id	genus_name	subgenus_id	subgenus_name	species_group_id	species_group_name	species_subgroup_id	species_subgroup_name	species_id	species_name	subspecies_id	subspecies_name	varietas_id	varietas_name	forma_id	forma_name
888268	Dichanthelium oligosanthes	species	2759	Eukaryota	33090	Viridiplantae					35493	Streptophyta	131221	Streptophytina			3398	Magnoliopsida	1437197	Petrosaviidae					38820	Poales									4479	Poaceae	147369	Panicoideae	147428	Paniceae	1648011	Dichantheliinae	161620	Dichanthelium							888268	Dichanthelium oligosanthes						

Fields§

§taxon_file: PathBuf

An NCBI taxonomy TSV-file as processed by Unipept

§all_ranks: bool

Show the full lineage of a taxon. Ranks below the given taxon whill be empty.

§no_header: bool

Do not output the TSV header

Trait Implementations§

source§

impl Debug for Taxonomy

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl StructOpt for Taxonomy

source§

fn clap<'a, 'b>() -> App<'a, 'b>

Returns clap::App corresponding to the struct.
source§

fn from_clap(matches: &ArgMatches<'_>) -> Self

Builds the struct from clap::ArgMatches. It’s guaranteed to succeed if matches originates from an App generated by StructOpt::clap called on the same type, otherwise it must panic.
source§

fn from_args() -> Self
where Self: Sized,

Builds the struct from the command line arguments (std::env::args_os). Calls clap::Error::exit on failure, printing the error message and aborting the program.
source§

fn from_args_safe() -> Result<Self, Error>
where Self: Sized,

Builds the struct from the command line arguments (std::env::args_os). Unlike StructOpt::from_args, returns clap::Error on failure instead of aborting the program, so calling .exit is up to you.
source§

fn from_iter<I>(iter: I) -> Self
where Self: Sized, I: IntoIterator, <I as IntoIterator>::Item: Into<OsString> + Clone,

Gets the struct from any iterator such as a Vec of your making. Print the error message and quit the program in case of failure. Read more
source§

fn from_iter_safe<I>(iter: I) -> Result<Self, Error>
where Self: Sized, I: IntoIterator, <I as IntoIterator>::Item: Into<OsString> + Clone,

Gets the struct from any iterator such as a Vec of your making. Read more

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

§

impl<T> Pointable for T

§

const ALIGN: usize = _

The alignment of pointer.
§

type Init = T

The type for initializers.
§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.