Galah
Galah - Scalable dereplication and MIMAG calculation for metagenome assembled genomes.
Documentation can be found at https://wwood.github.io/galah/.
Galah aims to be a scalable method for dereplicating metagenome-assembled genomes (MAGs) and assessing their quality. Dereplication clusters genomes by average nucleotide identity (ANI) and chooses a single member of each cluster as its representative. Quality assessment assigns each genome a MIMAG quality score based on its completeness, contamination, and the presence of rRNA and tRNA genes.
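To make the two ideas concrete, here is a minimal Python sketch. It is not Galah's implementation: the function names (`mimag_tier`, `dereplicate`), the greedy first-match clustering, the `"unclassified"` label, and the toy ANI values are all illustrative assumptions. The tier thresholds follow the published MIMAG draft categories (high: >90% completeness, <5% contamination, 5S/16S/23S rRNA genes and at least 18 tRNAs; medium: at least 50% completeness, <10% contamination; low: <50% completeness, <10% contamination).

```python
from typing import Callable, Dict, List


def mimag_tier(completeness: float, contamination: float,
               has_5s: bool, has_16s: bool, has_23s: bool,
               trna_count: int) -> str:
    """Return a MIMAG draft tier for one genome (illustrative only)."""
    has_all_rrna = has_5s and has_16s and has_23s
    if (completeness > 90 and contamination < 5
            and has_all_rrna and trna_count >= 18):
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    if contamination < 10:  # completeness is below 50 here
        return "low"
    # Assumption: genomes with >=10% contamination fit no tier.
    return "unclassified"


def dereplicate(genomes: List[str],
                ani: Callable[[str, str], float],
                threshold: float = 95.0) -> Dict[str, List[str]]:
    """Greedy clustering: `genomes` must be ordered best-first.

    Each genome joins the first existing representative it matches
    at >= `threshold` percent ANI; otherwise it becomes a new
    representative. Returns {representative: [members]}.
    """
    clusters: Dict[str, List[str]] = {}
    for genome in genomes:
        for rep in clusters:
            if ani(rep, genome) >= threshold:
                clusters[rep].append(genome)
                break
        else:
            clusters[genome] = [genome]
    return clusters


# Toy pairwise ANI values (made up for illustration).
_TOY_ANI = {
    frozenset(("A", "B")): 98.2,
    frozenset(("A", "C")): 80.1,
    frozenset(("B", "C")): 80.5,
}


def toy_ani(a: str, b: str) -> float:
    return _TOY_ANI[frozenset((a, b))]


if __name__ == "__main__":
    print(dereplicate(["A", "B", "C"], toy_ani))
    print(mimag_tier(95.0, 1.2, True, True, True, 20))
```

In this sketch the input order decides which genome represents a cluster, so sorting genomes by quality first would make the best genome in each cluster its representative.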
Quick install
# Install latest release via conda.
Example usage
For clustering and determining MIMAG quality scores:
For clustering a set of genomes at 95% ANI:
For clustering a set of contigs at 95% ANI:
For determining MIMAG quality scores for a set of genomes with CheckM2, Barrnap, and tRNAscan-SE:
Help
If you have any questions or need help, please open an issue.
License
Galah is developed by the Woodcroft lab at the Centre for Microbiome Research, School of Biomedical Sciences, QUT, with contributions from Samuel Aroney, Antônio Camargo, and Rhys Newell. It is licensed under GPL3 or later.
The source code is available at https://github.com/wwood/galah.
Citation
Aroney, S.T.N., Camargo, A.P., Tyson, G.W. and Woodcroft, B.J. Galah: More scalable dereplication for metagenome assembled genomes. Zenodo (2024). https://doi.org/10.5281/zenodo.13637856