cgkitten
Convert mmCIF/PDB protein structures to a coarse-grained representation.
Installation
Quick start
# Convert structure at pH 7 (default) with MC titration
# Generate topology file alongside the structure
# pH titration scan with terminal plot
Coarse-graining policies
Two policies are available via --cg (default: multi):
- multi: Each amino acid becomes one bead at its centre of mass. Titratable residues (ASP, GLU, HIS, CYS, TYR, LYS, ARG) get an additional bead at the charge centre.
- single: Each amino acid becomes exactly one bead. Titratable residues get unique numbered type names (ASP1, ASP2, ...) to carry distinct charges.
N- and C-terminal charges are always included as separate beads. Water and non-protein molecules are removed; coordinated metal ions are retained. Multi-chain structures are fully supported with per-chain terminal beads.
Convert
The convert subcommand writes coarse-grained structure and topology files.
Shared flags are placed before convert; convert-specific flags follow it.
# Explicit output file (single format)
# Single-bead coarse-graining
# Custom pH and MC sweeps
# Henderson-Hasselbalch only (no MC)
# Pipe from stdin
|
# Custom conditions
# Select specific chains
Output formats
By default, both .pqr and .xyz files are written. Use -o to write a single format:
Force field models
The --model flag selects the coarse-grained force field (default: calvados3).
Use --model none to skip force field parameters entirely.
| Model | Potential | Pair section | Reference |
|---|---|---|---|
calvados3 |
Ashbaugh-Hatch | replace: |
doi:10.1093/database/baz024 |
kimhummer / kh |
Kim-Hummer | replace:/append: |
doi:10.1016/j.jmb.2007.11.063 |
pasquier |
Lennard-Jones | replace:/append: |
doi:10.1016/j.jcis.2022.08.054 |
Pasquier: LJ interactions using Kim-Hummer sigma values and two epsilon classes (open conformation): epsilon_hh = 0.485 kT for hydrophobic-hydrophobic pairs, epsilon = 0.005 kT for all other pairs.
Hydrophobic scaling
The --scale-hydrophobic flag generates pairwise nonbonded overrides for
hydrophobic residue pairs (ALA, ILE, LEU, MET, PHE, PRO, TRP, TYR, VAL):
lambda:<factor>-- scale Ashbaugh-Hatch lambda (Calvados 3 only)epsilon:<factor>-- scale well depth epsilon (all models)
# Scale lambda by 20% (Calvados 3)
# Scale epsilon (works for all models)
For Calvados 3, parameters are mixed using Lorentz-Berthelot combining rules (arithmetic mean for sigma and lambda, geometric mean for epsilon), then the chosen quantity is scaled.
For Kim-Hummer and Pasquier, pairs are partitioned by charge: neutral under
replace: (skips redundant Coulomb), charged under append: (inherits Coulomb
from default).
Topology output
The topology YAML (topology.yaml by default, override with --top) contains
atom types with charge, mass, and force-field-specific fields. Titratable site
types with similar charges (within --merge-tol, default 2%) are merged into a
single type using their mean charge. The file header records the exact command
for reproducibility.
# Custom charge-merging tolerance
pH scan
The scan subcommand plots average net charge as a function of pH in the terminal.
Henderson-Hasselbalch is always shown; when MC is enabled (the default), the
Monte Carlo result is overlaid in a different color. MC pH points are computed
in parallel using rayon.
# Terminal plot
# Single-bead policy
# Henderson-Hasselbalch only
# Custom range and save to file
With -o, scan data is saved to a space-separated file with columns:
# pH Z(HH) Z2(HH) mu(HH) mu2(HH) [Z(MC) Z2(MC) mu(MC) mu2(MC)]
where Z is net charge, Z2 is the second moment, mu is dipole moment (e*A), and
mu2 is the dipole second moment. MC columns are included when --mc > 0.
Shared flags
| Flag | Default | Description |
|---|---|---|
--temperature |
298 | Temperature in Kelvin |
--ionic-strength |
0.1 | Ionic strength in mol/L |
--mc |
10000 | Number of MC titration sweeps (0 = Henderson-Hasselbalch only) |
--cg |
multi |
Coarse-graining policy (multi or single) |
--chain |
all | Select specific chain(s); repeat for multiple |
Monte Carlo titration
Charges are computed using Metropolis Monte Carlo titration by default
(--mc 10000), which captures many-body electrostatic coupling between
titratable sites. Use --mc 0 for the instantaneous Henderson-Hasselbalch
independent-site approximation.
Method
-
Initialization: Protonation states are set from Henderson-Hasselbalch at the given pH, providing a physically reasonable starting configuration.
-
Yukawa electrostatics: Pairwise interactions use a screened Coulomb (Yukawa) potential:
U/kT = lambda_B * q_i * sum_j z_j * exp(-r_ij / lambda_D) / r_ij
where lambda_B is the Bjerrum length (~7.1 A at 298 K in water) and lambda_D is the Debye screening length (~9.6 A at 0.1 M ionic strength). The kernel exp(-r/lambda_D)/r is precomputed once for all bead pairs.
-
Metropolis sweeps: Each sweep proposes N random protonation state changes (one per titratable site on average). A trial move toggles a site's protonation and is accepted with probability:
P = min(1, exp(-dU/kT))
where dU/kT = dq * phi_i +/- ln(10) * (pH - pKa), with phi_i being the electrostatic potential at site i from all other titratable sites plus a constant background from fixed-charge beads (metal ions).
-
Ensemble averages: The mean charge, its second moment, dipole moment, and its second moment are accumulated over all sweeps as true ensemble averages.
Library usage
use ;
let cif_data = read.unwrap;
// Multi-bead (default)
let beads = coarse_grain;
// Single-bead policy
let beads = coarse_grain_with;
// Henderson-Hasselbalch (default)
let result = new.ph.run;
println!;
// Monte Carlo titration
let result = new.ph.mc.run;
let charged_beads = result.apply;
// Serde support (with "serde" feature)
// let calc: ChargeCalc = serde_json::from_str(r#"{"ph": 7.4, "mc": 10000}"#)?;
License
Apache-2.0