ncbitaxonomy 0.1.2

Read NCBI Taxonomy Database from files and work with NCBI Taxonomy DB


This is a Rust crate (i.e. library) for working with a local copy of the NCBI Taxonomy database. The database can be downloaded (for example as taxdump.tar.gz) from the NCBI Taxonomy FTP site.
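
As a rough illustration of what the local copy contains, the sketch below splits lines of the dump files into fields using only the Rust standard library. It does not use this crate's API: `split_dmp_line` and `parse_node` are hypothetical helper names, and the sample lines are shortened to their first few columns. It relies on the taxdump layout in which fields are separated by a tab, a vertical bar and a tab, and each line ends with a tab and a vertical bar.

```rust
/// Split one line of nodes.dmp or names.dmp into its fields.
/// Hypothetical helper, not part of the ncbitaxonomy API.
fn split_dmp_line(line: &str) -> Vec<&str> {
    // Drop the line ending, then the trailing "\t|" terminator.
    let line = line.trim_end_matches(&['\r', '\n'][..]);
    let line = line.strip_suffix("\t|").unwrap_or(line);
    line.split("\t|\t").collect()
}

/// Parse (tax_id, parent_tax_id, rank) from a nodes.dmp line.
/// Returns None if a field is missing or an ID is not a number.
fn parse_node(line: &str) -> Option<(u32, u32, String)> {
    let fields = split_dmp_line(line);
    let tax_id = fields.get(0)?.trim().parse::<u32>().ok()?;
    let parent_id = fields.get(1)?.trim().parse::<u32>().ok()?;
    let rank = fields.get(2)?.trim().to_string();
    Some((tax_id, parent_id, rank))
}

fn main() {
    // Shortened sample lines (real lines carry more columns).
    let node_line = "9606\t|\t9605\t|\tspecies\t|\tHS\t|\n";
    let name_line = "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n";

    assert_eq!(
        parse_node(node_line),
        Some((9606, 9605, "species".to_string()))
    );
    assert_eq!(
        split_dmp_line(name_line),
        vec!["9606", "Homo sapiens", "", "scientific name"]
    );
    println!("{:?}", parse_node(node_line));
}
```

Returning `Option` keeps malformed lines from aborting a whole load; a caller could skip or report them as it sees fit.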

Documentation for version 0.1.0 is available at


taxonomy_filter_refseq (new in 0.1.1)

A tool to filter an NCBI RefSeq FASTA file so that only sequences from taxa that descend from a given ancestor taxon are retained.

$ taxonomy_filter_refseq --help
taxonomy_filter_refseq 0.1.2
Peter van Heusden <>
Filter NCBI RefSeq FASTA files by taxonomic lineage


    -h, --help       Prints help information
    -V, --version    Prints version information

    -t, --tax_prefix <TAXONOMY_FILENAME_PREFIX>    String to prepend to names of nodes.dmp and names.dmp

    <INPUT_FASTA>      FASTA file with RefSeq sequences
    <TAXONOMY_DIR>     Directory containing the NCBI taxonomy nodes.dmp and names.dmp files
    <ANCESTOR_NAME>    Name of ancestor to use as ancestor filter
    <OUTPUT_FASTA>     Output FASTA filename (or stdout if omitted)

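The core question behind such a filter is whether a taxon's lineage passes through the named ancestor. A minimal sketch of that check follows, assuming a child-to-parent map built from nodes.dmp. `has_ancestor` and the toy taxon IDs are hypothetical; this is not the tool's actual implementation, and how the tool maps each FASTA record to a taxon is not covered here.

```rust
use std::collections::HashMap;

/// Return true if `ancestor` appears in the lineage of `taxon`.
/// In this sketch a taxon counts as its own ancestor.
/// Assumes `parents` describes a tree (no cycles other than the root).
fn has_ancestor(parents: &HashMap<u32, u32>, taxon: u32, ancestor: u32) -> bool {
    let mut current = taxon;
    loop {
        if current == ancestor {
            return true;
        }
        match parents.get(&current) {
            // nodes.dmp records the root as its own parent, so stop
            // when a node points at itself.
            Some(&parent) if parent != current => current = parent,
            // Reached the root or an unknown taxon without a match.
            _ => return false,
        }
    }
}

fn main() {
    // Toy tree with made-up IDs: 1 is the root, 10 and 20 are its
    // children, and 11 is a child of 10.
    let parents: HashMap<u32, u32> =
        vec![(1, 1), (10, 1), (11, 10), (20, 1)].into_iter().collect();

    assert!(has_ancestor(&parents, 11, 10)); // 11 descends from 10
    assert!(!has_ancestor(&parents, 20, 10)); // 20 is on another branch
    assert!(has_ancestor(&parents, 11, 1)); // everything descends from the root
    println!("lineage checks passed");
}
```

Walking parent links costs time proportional to the depth of the tree per lookup; a filter that checks many records against one ancestor could cache the answer per taxon.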

TODO

  • Clean up non-idiomatic code (e.g. the use of the insert_new_entry bool)
  • Add testing via CI
  • Refactor taxonomy_filter_refseq: move most code to library, add tests