Prodigal (Rust)
Pure Rust rewrite of Prodigal v2.6.3 — a prokaryotic gene prediction tool. Produces byte-identical output to the original C implementation, with no C dependencies.
Because it is written in Rust, Prodigal can be embedded as a library in your own project and built for targets such as WebAssembly.
Installation
Or build from source:
No C compiler, zlib, or other system libraries required.
Library usage
[]
= "0.1"
Metagenomic mode (simplest)
Predict genes using pre-trained models — no training step needed:
use predict_meta;
Single genome mode
Train on the genome first, then predict on individual contigs:
use ;
Batch processing (parallel)
For processing many sequences, MetaPredictor caches the 50 models and evaluates them in parallel:
use MetaPredictor;
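The idea of scoring one sequence against many cached models in parallel and keeping the best can be sketched with the standard library alone. This is an illustration of the pattern, not the `MetaPredictor` implementation: `ToyModel`, `toy_score`, and `best_model` are hypothetical names, and the GC-distance score is a placeholder for whatever scoring the real models perform.

```rust
use std::thread;

/// Hypothetical stand-in for a pre-trained model (not the crate's type).
#[derive(Debug, Clone, PartialEq)]
pub struct ToyModel {
    pub id: usize,
    pub gc_target: f64,
}

/// GC fraction of a sequence, case-insensitive; 0.0 for an empty sequence.
pub fn gc_fraction(seq: &[u8]) -> f64 {
    if seq.is_empty() {
        return 0.0;
    }
    let gc = seq
        .iter()
        .filter(|b| matches!(b.to_ascii_uppercase(), b'G' | b'C'))
        .count();
    gc as f64 / seq.len() as f64
}

/// Placeholder score (assumption): higher when the model's GC target
/// is closer to the GC fraction of the sequence.
fn toy_score(model: &ToyModel, seq: &[u8]) -> f64 {
    -(model.gc_target - gc_fraction(seq)).abs()
}

/// Score `seq` against every model on its own scoped thread and
/// return the model with the highest score, or `None` if there are no models.
pub fn best_model<'m>(models: &'m [ToyModel], seq: &[u8]) -> Option<&'m ToyModel> {
    let scores: Vec<f64> = thread::scope(|s| {
        // Spawn all workers first so they run concurrently, then join them.
        let handles: Vec<_> = models
            .iter()
            .map(|m| s.spawn(move || toy_score(m, seq)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("scoring thread panicked"))
            .collect()
    });
    models
        .iter()
        .zip(scores)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(model, _)| model)
}
```

Scoped threads let the workers borrow the cached models and the sequence without cloning them. One thread per model keeps the sketch short; a real implementation would more likely use a bounded thread pool.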
Custom configuration
use ;
Gene prediction results
Each PredictedGene contains:
| Field | Type | Description |
|---|---|---|
| `begin` | `usize` | 1-indexed start position |
| `end` | `usize` | 1-indexed end position (inclusive) |
| `strand` | `Strand` | Forward or Reverse |
| `start_codon` | `StartCodon` | ATG, GTG, TTG, or Edge |
| `partial` | `(bool, bool)` | Left/right partial (runs off edge) |
| `rbs_motif` | `String` | RBS motif (e.g. "AGGAG" or "None") |
| `rbs_spacer` | `String` | RBS spacer distance |
| `gc_content` | `f64` | GC content of the gene |
| `confidence` | `f64` | Confidence score (50-100) |
| `score` | `f64` | Total score |
| `cscore` | `f64` | Coding potential score |
| `sscore` | `f64` | Start score |
| `rscore` | `f64` | RBS score |
| `uscore` | `f64` | Upstream composition score |
| `tscore` | `f64` | Start codon type score |
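Because `begin` and `end` are 1-indexed and inclusive, turning them into a Rust slice range is an easy place for off-by-one errors. The sketch below shows the conversion on a small stand-in type. `GeneCoords`, `length`, `range`, `is_complete`, and `gene_slice` are hypothetical names used for illustration; only the field names and their meanings come from the table above, and the `Strand` enum here is a local copy, not the crate's type.

```rust
use std::ops::Range;

/// Local stand-in for the `Strand` type named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strand {
    Forward,
    Reverse,
}

/// Hypothetical struct carrying only the coordinate fields of a predicted gene.
#[derive(Debug, Clone)]
pub struct GeneCoords {
    pub begin: usize,          // 1-indexed start position
    pub end: usize,            // 1-indexed end position, inclusive
    pub strand: Strand,
    pub partial: (bool, bool), // (left, right) runs off the contig edge
}

impl GeneCoords {
    /// Length in nucleotides; inclusive coordinates need the `+ 1`.
    /// Assumes `begin <= end`.
    pub fn length(&self) -> usize {
        self.end - self.begin + 1
    }

    /// Equivalent 0-indexed, half-open range for slicing. Assumes `begin >= 1`.
    pub fn range(&self) -> Range<usize> {
        (self.begin - 1)..self.end
    }

    /// True when neither end of the gene runs off the contig.
    pub fn is_complete(&self) -> bool {
        !self.partial.0 && !self.partial.1
    }
}

/// Return the gene's bases as they appear on the forward strand of `contig`,
/// or `None` if the coordinates are invalid or fall outside the contig.
/// No reverse-complementing is done for `Strand::Reverse` genes.
pub fn gene_slice<'a>(contig: &'a [u8], gene: &GeneCoords) -> Option<&'a [u8]> {
    if gene.begin == 0 || gene.begin > gene.end {
        return None;
    }
    contig.get(gene.range())
}
```

For example, a gene with `begin = 3` and `end = 11` covers nine bases and maps to the slice range `2..11`.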
CLI usage
prodigal-rs accepts the same flags as the original prodigal:
# Single genome, GenBank output
# Metagenomic mode, GFF output with protein translations
# All outputs at once
# Generate and reuse a training file
Supports gzip-compressed input files transparently.
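A common way to handle compressed input transparently is to peek at the first two bytes of the stream and check for the gzip magic number (`0x1f 0x8b`, per RFC 1952) before choosing a reader. Whether prodigal-rs detects gzip this way is an assumption; the sketch below only illustrates the detection step, and `looks_gzipped` is a hypothetical helper name.

```rust
use std::io::{self, BufRead};

/// First two bytes of every gzip stream (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Peek at the buffered bytes without consuming them and report whether
/// the stream starts with the gzip magic number.
///
/// Caveat: `fill_buf` may legally return fewer than two bytes on the
/// first call; this sketch treats that case as "not gzipped".
pub fn looks_gzipped<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    let buf = reader.fill_buf()?;
    Ok(buf.starts_with(&GZIP_MAGIC))
}
```

Since nothing is consumed, the same reader can then be passed either to a gzip decoder or straight to the FASTA parser.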
Testing
# Build the C binary first (needed only for comparison tests)
&& &&
28 tests cover: the high-level API (metagenomic prediction, single-genome training + prediction, training save/load, error handling, custom config), byte-identical CLI output vs the original C binary across all output formats and flag combinations, and struct layout verification.
Performance
~2x faster than the original C implementation (gcc -O3) on typical workloads.
License
GPL-3.0 (same as the original Prodigal).
Credits
Original Prodigal by Doug Hyatt, University of Tennessee / Oak Ridge National Lab.
Based on Prodigal commit c1e2d36 (v2.6.3).