gdock
Information-driven protein-protein docking using a genetic algorithm
gdock is a fast protein-protein docking tool written in Rust that uses restraints and energy components to guide the docking process. It combines a genetic algorithm with physics-based scoring to find optimal protein-protein complexes.
A paper describing gdock is currently under review in the Journal of Open Source Software (JOSS).
Features
- Fast: Genetic algorithm with early stopping and elitism
- Information-driven: Uses residue restraints to guide docking
- Flexible scoring: Configurable energy weights (VDW, electrostatics, desolvation, restraints)
- Quality metrics: Optional DockQ calculation when reference structure is provided
- Clustering: FCC-based clustering to group similar solutions
Web Interface
A web interface is available at gdock.org for running docking jobs without installing anything locally.
Quick Start
# Install
# Prepare some input data
# Run docking
Most docking runs complete in ~15 seconds on standard hardware.
Installation
Or build from source:
Requires Rust 1.70 or later.
Usage
gdock has three subcommands: run, score, and restraints.
Docking (run)
Run the full genetic algorithm docking:
With a reference structure for DockQ calculation:
Additional options:
- -o, --output-dir <DIR>: Output directory (default: current directory)
- -n, --nproc <NUM>: Number of processors (default: total - 2)
- --no-clust: Disable clustering
- --w_vdw, --w_elec, --w_desolv, --w_air: Custom energy weights
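The "total - 2" default for --nproc can be illustrated with the standard library alone. This is a minimal sketch and not gdock's actual code: the function names default_nproc and nproc_from_total are hypothetical, and flooring the result at one worker is an assumption about how small machines are handled.

```rust
use std::thread;

/// Pure part of the calculation: total logical CPUs minus two,
/// never dropping below one worker (the floor is an assumption).
fn nproc_from_total(total: usize) -> usize {
    total.saturating_sub(2).max(1)
}

/// Hypothetical helper: worker count to use when --nproc is not given.
fn default_nproc() -> usize {
    let total = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1); // fall back to 1 if the CPU count cannot be queried
    nproc_from_total(total)
}
```

On an 8-core machine this yields 6 workers, leaving two cores free for the rest of the system.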
Scoring (score)
Calculate energy components without running the GA:
Generate restraints (restraints)
Generate restraints from interface contacts in a native structure:
Command reference
$ gdock -h
Fast information-driven protein-protein docking using genetic algorithms
Usage: gdock <COMMAND>
Commands:
run Run the genetic algorithm docking
score Score structures without running the GA
restraints Generate restraints from interface contacts
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
$ gdock run -h
Run the genetic algorithm docking
Usage: gdock run [OPTIONS] --receptor <FILE> --ligand <FILE> --restraints <PAIRS>
Options:
-r, --receptor <FILE> Receptor PDB file
-l, --ligand <FILE> Ligand PDB file
--restraints <PAIRS> Comma-separated restraint pairs receptor:ligand (e.g., 10:45,15:50)
--reference <FILE> Reference PDB file for DockQ calculation
--debug Debug mode: use DockQ as fitness (requires --reference)
-o, --output-dir <DIR> Output directory for results (default: current directory)
--no-clust Disable clustering, output best_by_score and best_by_dockq only
-n, --nproc <NUM> Number of processors to use (default: total - 2)
--w_vdw <WEIGHT> Weight for VDW energy term
--w_elec <WEIGHT> Weight for electrostatic energy term
--w_desolv <WEIGHT> Weight for desolvation energy term
--w_air <WEIGHT> Weight for AIR restraint energy term
-h, --help Print help
$ gdock score -h
Score structures without running the GA
Usage: gdock score [OPTIONS] --receptor <FILE> --ligand <FILE>
Options:
-r, --receptor <FILE> Receptor PDB file
-l, --ligand <FILE> Ligand PDB file
--restraints <PAIRS> Comma-separated restraint pairs receptor:ligand (optional)
--reference <FILE> Reference PDB file for DockQ calculation
--w_vdw <WEIGHT> Weight for VDW energy term
--w_elec <WEIGHT> Weight for electrostatic energy term
--w_desolv <WEIGHT> Weight for desolvation energy term
--w_air <WEIGHT> Weight for AIR restraint energy term
-h, --help Print help
$ gdock restraints -h
Generate restraints from interface contacts
Usage: gdock restraints [OPTIONS] --receptor <FILE> --ligand <FILE>
Options:
-r, --receptor <FILE> Receptor PDB file
-l, --ligand <FILE> Ligand PDB file
--cutoff <ANGSTROMS> Distance cutoff for interface detection (default: 5.0)
-h, --help Print help
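The --cutoff option of gdock restraints implies a distance-based definition of the interface. The sketch below shows one common way to derive residue pairs from such a cutoff; it is an illustration under stated assumptions, not gdock's implementation. The Atom type and both function names are hypothetical, the comparison is taken over all atom pairs, and a distance exactly equal to the cutoff is assumed to count as a contact.

```rust
use std::collections::BTreeSet;

/// One atom: residue number plus Cartesian coordinates in Angstrom.
/// Hypothetical type used only for this illustration.
struct Atom {
    resnum: i32,
    pos: [f64; 3],
}

/// Return the sorted, de-duplicated (receptor, ligand) residue pairs that
/// have at least one atom pair within `cutoff` Angstrom of each other.
fn interface_pairs(receptor: &[Atom], ligand: &[Atom], cutoff: f64) -> Vec<(i32, i32)> {
    // Compare squared distances to avoid a square root per atom pair.
    let cutoff_sq = cutoff * cutoff;
    let mut pairs = BTreeSet::new();
    for r in receptor {
        for l in ligand {
            let d_sq: f64 = r
                .pos
                .iter()
                .zip(l.pos.iter())
                .map(|(a, b)| (a - b).powi(2))
                .sum();
            if d_sq <= cutoff_sq {
                pairs.insert((r.resnum, l.resnum));
            }
        }
    }
    pairs.into_iter().collect()
}

/// Format pairs in the receptor:ligand, comma-separated form that
/// --restraints accepts.
fn format_restraints(pairs: &[(i32, i32)]) -> String {
    pairs
        .iter()
        .map(|(r, l)| format!("{r}:{l}"))
        .collect::<Vec<_>>()
        .join(",")
}
```

The all-against-all loop is quadratic in the number of atoms, which is acceptable for a sketch; a spatial grid or k-d tree would be the usual choice for large complexes.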
Input Format
PDB Files
- Receptor: PDB file containing the receptor protein (single chain)
- Ligand: PDB file containing the ligand protein (single chain)
- Reference (optional): PDB file containing the native complex
Restraints
Comma-separated list of residue pairs in receptor:ligand format:
933:6,936:8,940:42
These indicate which residues should be in contact, based on experimental data or other information sources.
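For readers who want to generate or validate restraint strings programmatically, the format can be parsed in a few lines. This is a minimal sketch rather than gdock's own parser: parse_restraints is a hypothetical name, and residue numbers are assumed to be plain integers without insertion codes or chain identifiers.

```rust
/// Parse a restraint string such as "933:6,936:8,940:42" into
/// (receptor_residue, ligand_residue) pairs.
/// Hypothetical helper; assumes integer residue numbers.
fn parse_restraints(input: &str) -> Result<Vec<(i32, i32)>, String> {
    input
        .split(',')
        .map(|pair| -> Result<(i32, i32), String> {
            let pair = pair.trim();
            // Each entry must contain exactly one receptor:ligand separator.
            let (rec_str, lig_str) = pair
                .split_once(':')
                .ok_or_else(|| format!("expected receptor:ligand, got '{pair}'"))?;
            let rec = rec_str
                .trim()
                .parse::<i32>()
                .map_err(|e| format!("bad receptor residue '{rec_str}': {e}"))?;
            let lig = lig_str
                .trim()
                .parse::<i32>()
                .map_err(|e| format!("bad ligand residue '{lig_str}': {e}"))?;
            Ok((rec, lig))
        })
        .collect()
}
```

Calling parse_restraints("933:6,936:8,940:42") returns the three pairs (933, 6), (936, 8), and (940, 42); a malformed entry such as "933-6" produces an error instead of being silently skipped.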
Output
- model_X.pdb: Cluster representatives (unless --no-clust)
- ranked_X.pdb: Top 5 models ranked by score
- metrics.tsv: Tab-separated file with scores and metrics
Output structures can be visualized with molecular viewers such as PyMOL or ChimeraX.
Algorithm
gdock uses:
- Genetic Algorithm: Population of 150, elitism (top 5), tournament selection
- Energy Function: VDW + Electrostatics + Desolvation + AIR restraints
- Restraints: Flat-bottom potential (0-7 Angstrom) for specified residue pairs
- Early Stopping: Stops when there is no improvement for 10 generations
- Clustering: FCC-based clustering of final population
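Three of the components listed above can be sketched compactly: the weighted energy sum, the flat-bottom restraint, and the early-stopping check. The code below is illustrative only and does not reproduce gdock's source. All type and function names are hypothetical, the quadratic penalty outside the flat region is an assumption (the list only states that the potential is flat between 0 and 7 Angstrom), and lower scores are assumed to be better.

```rust
/// Raw energy terms for one candidate complex (hypothetical type).
struct Energies {
    vdw: f64,
    elec: f64,
    desolv: f64,
    air: f64,
}

/// Per-term weights, mirroring --w_vdw, --w_elec, --w_desolv, --w_air.
struct Weights {
    vdw: f64,
    elec: f64,
    desolv: f64,
    air: f64,
}

/// Total score as a weighted sum of the four terms.
fn total_score(e: &Energies, w: &Weights) -> f64 {
    w.vdw * e.vdw + w.elec * e.elec + w.desolv * e.desolv + w.air * e.air
}

/// Flat-bottom restraint: zero penalty while `distance` lies within
/// [lower, upper]. The quadratic growth outside is an assumption.
fn flat_bottom(distance: f64, lower: f64, upper: f64) -> f64 {
    if distance < lower {
        (lower - distance).powi(2)
    } else if distance > upper {
        (distance - upper).powi(2)
    } else {
        0.0
    }
}

/// Tracks the best score seen so far and counts generations without
/// improvement (lower score = better, an assumption).
struct EarlyStopping {
    best: f64,
    stale: usize,
    patience: usize,
}

impl EarlyStopping {
    fn new(patience: usize) -> Self {
        EarlyStopping { best: f64::INFINITY, stale: 0, patience }
    }

    /// Record the best score of one generation; returns true once
    /// `patience` consecutive generations brought no improvement.
    fn update(&mut self, score: f64) -> bool {
        if score < self.best {
            self.best = score;
            self.stale = 0;
        } else {
            self.stale += 1;
        }
        self.stale >= self.patience
    }
}
```

With a 0-7 Angstrom window, flat_bottom(5.0, 0.0, 7.0) contributes nothing, while a pair 9 Angstrom apart is penalized. An EarlyStopping::new(10) instance returns true on the tenth consecutive generation without a better score.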
Testing
Run the test suite:
The test suite includes 174 tests covering parsing, energy calculations, and algorithm behavior.
Relevant repositories
- gdock-benchmark: Repository containing all scripts and raw data relevant to benchmarking the performance of gdock
- gdock-wasm: WebAssembly bindings used in gdock.org
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests on GitHub.
Before submitting a pull request, please ensure:
- All tests pass (cargo test)
- Code is formatted (cargo fmt)
- Linting passes (cargo clippy)
Citation
If you use gdock in your research, please cite using the Zenodo DOI:
A JOSS paper is currently under review.
License
BSD Zero Clause License. See LICENSE file.