ARGenus
Taxonomic inference of antibiotic resistance genes (ARGs) from metagenomic data using flanking sequence analysis
ARGenus is a bioinformatics tool that simultaneously detects antibiotic resistance genes (ARGs) and identifies their source bacterial genera from metagenomic sequencing data. Unlike existing tools that only detect ARGs, ARGenus provides direct ARG-to-genus linkage through flanking sequence analysis.
Features
- Direct ARG-genus linkage: Identifies the bacterial source of each detected ARG
- Targeted assembly: Efficient processing through read filtering and localized assembly
- SNP verification: Filters false positives by confirming resistance-conferring mutations
- High compression: 450-fold compressed flanking database (27 GB → 60 MB)
- Fast processing: 5-10 minutes per sample with 16 threads
Installation
From crates.io
From source
Dependencies
ARGenus requires external tools in your PATH, including minimap2 (used for read filtering) and MEGAHIT (used for de novo assembly).
Database Setup
ARGenus requires a flanking sequence database for genus classification. Download the pre-built database:
# Download flanking database (approximately 60MB)
Or build from source genomes (requires NCBI RefSeq data):
Usage
Basic usage
Options
argenus run [OPTIONS]
Required:
--r1 <FILE> Forward reads (FASTQ/FASTQ.gz)
--r2 <FILE> Reverse reads (FASTQ/FASTQ.gz)
--db <FILE> ARG database index (.mmi)
--fdb <FILE> Flanking sequence database (.fdb)
--output <FILE> Output TSV file
Optional:
--threads <N> Number of threads [default: 16]
--min-identity <F> Minimum identity for ARG matching [default: 0.8]
--min-coverage <F> Minimum coverage for ARG matching [default: 0.7]
--flank-identity <F> Minimum identity for genus classification [default: 0.9]
--include-wildtype Include wild-type alleles in output
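The options above can be assembled programmatically when scripting many samples. The following is a minimal Python sketch that builds an `argenus run` argument list using only the flags and defaults listed above. The helper name `build_argenus_command` and all file names are hypothetical, chosen for illustration.

```python
import shlex


def build_argenus_command(r1, r2, db, fdb, output, threads=16,
                          min_identity=0.8, min_coverage=0.7,
                          flank_identity=0.9, include_wildtype=False):
    """Return an argv list for `argenus run` built from the documented options."""
    cmd = [
        "argenus", "run",
        "--r1", r1,
        "--r2", r2,
        "--db", db,
        "--fdb", fdb,
        "--output", output,
        "--threads", str(threads),
        "--min-identity", str(min_identity),
        "--min-coverage", str(min_coverage),
        "--flank-identity", str(flank_identity),
    ]
    if include_wildtype:
        cmd.append("--include-wildtype")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_argenus_command(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "arg_db.mmi", "flanking.fdb", "sample.tsv", threads=8,
)
print(shlex.join(cmd))
# To execute the command: subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than a single shell string) avoids quoting problems with file paths that contain spaces.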
Output Format
ARGenus produces a tab-delimited file with the following columns:
| Column | Description |
|---|---|
| sample | Sample identifier |
| gene | ARG gene name |
| drug_class | Antimicrobial drug class |
| genus | Assigned source genus |
| confidence | Classification confidence (mean identity) |
| specificity | Gene-genus association strength |
| identity | ARG sequence identity |
| coverage | ARG sequence coverage |
| contig_length | Length of assembled contig |
| snp_status | SNP verification result |
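A short Python sketch of how this table might be consumed downstream. It assumes the TSV starts with a header row using the column names above and that `confidence`, `specificity`, `identity`, and `coverage` are numeric; neither assumption is confirmed here. The rows in `EXAMPLE_TSV` are made-up illustration data, not real ARGenus output, and the `snp_status` values are placeholders.

```python
import csv
import io

COLUMNS = ["sample", "gene", "drug_class", "genus", "confidence",
           "specificity", "identity", "coverage", "contig_length", "snp_status"]

# Made-up rows for illustration only (assumes a header line is present).
EXAMPLE_ROWS = [
    ["S1", "blaTEM-1", "beta-lactam", "Escherichia", "0.98", "0.75", "0.99", "1.0", "3200", "placeholder"],
    ["S1", "tetM", "tetracycline", "Enterococcus", "0.95", "0.60", "0.97", "0.9", "2800", "placeholder"],
    ["S1", "sul1", "sulfonamide", "Pseudomonas", "0.85", "0.40", "0.92", "0.8", "1900", "placeholder"],
]
EXAMPLE_TSV = "\n".join("\t".join(r) for r in [COLUMNS] + EXAMPLE_ROWS) + "\n"


def load_results(handle):
    """Parse tab-delimited rows into dicts, converting the numeric columns."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for key in ("confidence", "specificity", "identity", "coverage"):
            row[key] = float(row[key])
        row["contig_length"] = int(row["contig_length"])
        rows.append(row)
    return rows


def genes_by_genus(rows, min_confidence=0.9):
    """Group gene names by assigned genus, keeping only confident assignments."""
    grouped = {}
    for row in rows:
        if row["confidence"] >= min_confidence:
            grouped.setdefault(row["genus"], []).append(row["gene"])
    return grouped


rows = load_results(io.StringIO(EXAMPLE_TSV))
summary = genes_by_genus(rows)
print(summary)
```

To read a real output file, replace the `io.StringIO` handle with `open("sample.tsv", newline="")`.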
Workflow
1. Read Filtering: Align reads against ARG database using minimap2
2. De Novo Assembly: Assemble filtered reads with MEGAHIT
3. Contig Extension: Extend contigs using k-mer overlap analysis
4. ARG Detection: Identify ARGs in extended contigs
5. Genus Classification: Classify genus using flanking sequence homology
6. SNP Verification: Confirm resistance-conferring mutations
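To make the genus classification step concrete, here is a toy Python sketch. It is not ARGenus's actual classifier, whose internals are not described here. It assumes that each flanking-sequence hit carries a genus label and an identity value, that hits below the `--flank-identity` threshold are discarded, and that the genus with the highest mean identity wins (the output's `confidence` column is described as a mean identity). The function name and the example hits are hypothetical.

```python
from collections import defaultdict


def classify_genus(flank_hits, min_identity=0.9):
    """Pick the genus whose retained flanking hits have the highest mean identity.

    flank_hits: iterable of (genus, identity) pairs, with identity in [0, 1].
    Returns (genus, mean_identity), or (None, 0.0) if no hit passes the threshold.
    """
    by_genus = defaultdict(list)
    for genus, identity in flank_hits:
        if identity >= min_identity:  # drop weak hits before averaging
            by_genus[genus].append(identity)
    if not by_genus:
        return None, 0.0
    means = {g: sum(vals) / len(vals) for g, vals in by_genus.items()}
    best = max(means, key=means.get)
    return best, means[best]


# Made-up hits for illustration only.
hits = [
    ("Klebsiella", 0.99),
    ("Klebsiella", 0.97),
    ("Escherichia", 0.93),
    ("Salmonella", 0.85),  # below the 0.9 threshold, so ignored
]
genus, confidence = classify_genus(hits)
print(genus, round(confidence, 2))
```

A contig whose flanking regions match no reference above the threshold returns `None`, which corresponds to an ARG that is detected but left without a genus assignment.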
Performance
| Metric | Value |
|---|---|
| Processing time | 5-10 min/sample (16 threads) |
| Genus classification rate | ~73% |
| False positive reduction | ~72% (via SNP filtering) |
| Database size | 60 MB (compressed) |
Comparison with Other Tools
| Feature | ARGenus | RGI | KMA | AMRFinderPlus |
|---|---|---|---|---|
| ARG detection | ✓ | ✓ | ✓ | ✓ |
| Genus classification | ✓ | ✗ | ✗ | ✗ |
| SNP verification | ✓ | ✓ | ✗ | ✓ |
| Targeted assembly | ✓ | ✗ | ✗ | ✗ |
Citation
If you use ARGenus in your research, please cite:
[Citation information to be added upon publication]
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Contact
For questions and feedback, please open an issue on GitHub.