Crate pdbtbx
pdbtbx (PDB Toolbox)
A library to work with crystallographic Protein Data Bank (PDB) files. It can parse the main part
of the PDB format (it is in active development, so more will follow). After parsing, the
structure is accessible through an API loosely based on CCTBX [Grosse-Kunstleve, R. W. et al].
The resulting structures can be saved to a valid PDB file for use in other software.
Goals
This library is designed to be a dependable, safe, stable and fast way of handling PDB files in idiomatic Rust. The goal is to be strongly community driven, making it a project that is as useful to everyone as possible while staying true to its core principles.
Why
As Rust is a fairly recent language, it has less support for scientific work than languages
that have been in use much longer (like the ubiquitous Python). I think that using Rust
would have huge benefits over other languages in bigger scientific projects. It is not just
me: more scientists are turning to Rust [Perkel, J. M.]. To help support this movement, I am
writing this library to make more scientific work in Rust possible and to make it easier for
scientists to start using Rust.
How to use it
The following example opens a PDB file (1ubq.pdb), removes all H atoms, calculates the
average B factor (or temperature factor), and prints it. It also saves the resulting PDB
to a file.
    use pdbtbx;

    let (mut pdb, _errors) =
        pdbtbx::open("example-pdbs/1ubq.pdb", pdbtbx::StrictnessLevel::Medium).unwrap();
    pdb.remove_atoms_by(|atom| atom.element() == "H"); // Remove all H atoms
    let mut avg_b_factor = 0.0;
    for atom in pdb.atoms() {
        // Iterate over all atoms in the structure (not the HETATMs)
        avg_b_factor += atom.b_factor();
    }
    avg_b_factor /= pdb.atom_count() as f64;
    println!("The average B factor of the protein is: {}", avg_b_factor);
    pdbtbx::save(pdb, "dump/1ubq.pdb", pdbtbx::StrictnessLevel::Loose);
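The running-sum loop in the example above divides by the atom count, so a structure with no atoms left would produce NaN. A minimal sketch of the same calculation with an explicit guard for the empty case follows. It uses a stand-in `Atom` struct and an `average_b_factor` helper, both hypothetical and defined here only so the snippet compiles on its own; they are not the pdbtbx types.

```rust
/// Stand-in for an atom record; hypothetical, not the pdbtbx `Atom` type.
struct Atom {
    element: &'static str,
    b_factor: f64,
}

/// Average B factor of all non-hydrogen atoms, or `None` if there are none.
fn average_b_factor(atoms: &[Atom]) -> Option<f64> {
    // Drop the H atoms and keep only the B factors.
    let heavy: Vec<f64> = atoms
        .iter()
        .filter(|atom| atom.element != "H")
        .map(|atom| atom.b_factor)
        .collect();
    if heavy.is_empty() {
        None // Avoids the 0.0 / 0.0 = NaN case
    } else {
        Some(heavy.iter().sum::<f64>() / heavy.len() as f64)
    }
}

fn main() {
    let atoms = [
        Atom { element: "C", b_factor: 10.0 },
        Atom { element: "H", b_factor: 99.0 },
        Atom { element: "N", b_factor: 20.0 },
    ];
    match average_b_factor(&atoms) {
        Some(avg) => println!("The average B factor is: {}", avg),
        None => println!("No atoms left to average over"),
    }
}
```

Returning an `Option` makes the caller decide what an empty structure means instead of silently printing NaN.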
PDB Hierarchy
As explained in depth in the documentation of CCTBX, it can be quite hard to properly define
a hierarchy for PDB files that works for all files. This library follows the hierarchy
presented by CCTBX, but renames the residue_group and atom_group constructs. This gives the
following hierarchy, with the main identifying characteristics annotated per level.
Iterating over the PDB Hierarchy
    // Iterating over all levels
    for model in pdb.models() {
        for chain in model.chains() {
            for residue in chain.residues() {
                for conformer in residue.conformers() {
                    for atom in conformer.atoms() {
                        // Do the calculations
                    }
                }
            }
        }
    }

    // Or only over a couple of levels (just like in the example above)
    for residue in pdb.residues() {
        for atom in residue.atoms() {
            // Do the calculations
        }
    }
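Shortcut iterators that skip levels can be thought of as flattening the nested hierarchy. The sketch below models the levels with plain stand-in structs to show that nested loops and a flattened iterator visit the same atoms in the same order. All type, field and function names here (`Structure`, `count_nested`, `example`) are hypothetical and far simpler than the real pdbtbx types; how the crate implements its iterators internally is not shown on this page.

```rust
// Stand-in types for illustration only; not the pdbtbx definitions.
struct Atom { name: &'static str }
struct Conformer { atoms: Vec<Atom> }
struct Residue { conformers: Vec<Conformer> }
struct Chain { residues: Vec<Residue> }
struct Model { chains: Vec<Chain> }
struct Structure { models: Vec<Model> }

impl Structure {
    /// Flattened view over every atom, skipping the intermediate levels.
    fn atoms(&self) -> impl Iterator<Item = &Atom> + '_ {
        self.models
            .iter()
            .flat_map(|m| m.chains.iter())
            .flat_map(|c| c.residues.iter())
            .flat_map(|r| r.conformers.iter())
            .flat_map(|conf| conf.atoms.iter())
    }
}

/// Count atoms by walking every level explicitly.
fn count_nested(s: &Structure) -> usize {
    let mut n = 0;
    for model in &s.models {
        for chain in &model.chains {
            for residue in &chain.residues {
                for conformer in &residue.conformers {
                    for _atom in &conformer.atoms {
                        n += 1;
                    }
                }
            }
        }
    }
    n
}

/// Build a tiny structure: one model, one chain, two residues of two atoms.
fn example() -> Structure {
    let residue = |names: [&'static str; 2]| Residue {
        conformers: vec![Conformer {
            atoms: names.iter().map(|&name| Atom { name }).collect(),
        }],
    };
    Structure {
        models: vec![Model {
            chains: vec![Chain {
                residues: vec![residue(["N", "CA"]), residue(["C", "O"])],
            }],
        }],
    }
}

fn main() {
    let s = example();
    // Both traversals see the same number of atoms.
    assert_eq!(count_nested(&s), s.atoms().count());
    let names: Vec<&str> = s.atoms().map(|a| a.name).collect();
    println!("{:?}", names);
}
```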
References
- [Grosse-Kunstleve, R. W. et al] Grosse-Kunstleve, R. W., Sauter, N. K., Moriarty, N. W., & Adams, P. D. (2002). The Computational Crystallography Toolbox: crystallographic algorithms in a reusable software framework. Journal of Applied Crystallography, 35(1), 126–136. https://doi.org/10.1107/s0021889801017824
- [Perkel, J. M.] Perkel, J. M. (2020). Why scientists are turning to Rust. Nature, 588(7836), 185–186. https://doi.org/10.1038/d41586-020-03382-2
Structs
Atom | A struct to represent a single Atom in a protein |
Chain | A Chain containing multiple Residues |
Conformer | A Conformer of a Residue containing multiple Atoms, analogous to ‘atom_group’ in cctbx |
DatabaseReference | A DatabaseReference containing the cross-reference to a corresponding database sequence for a Chain. |
Model | A Model containing multiple Chains |
MtriX | A transformation expressing non-crystallographic symmetry, used when transformations are required to generate the whole asymmetric subunit |
PDB | A PDB file containing the 3D coordinates of many atoms making up the 3D structure of a protein, but it can also be used for other molecules. |
PDBError | An error surfacing while handling a PDB |
Position | A position in a file for use in parsing/lexing |
Residue | A Residue containing multiple Conformers |
SequenceDifference | A difference between the sequence of the database and the pdb file |
SequencePosition | The position of the sequence for a cross-reference of sequences. |
Symmetry | A Space group of a crystal |
TransformationMatrix | A 3D affine transformation matrix |
UnitCell | A unit cell of a crystal, containing its dimensions and angles |
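As a reminder of what a 3D affine transformation does to a coordinate, the sketch below applies a 3×4 matrix (a 3×3 linear part plus a translation column) to a point. The `Affine` type and its methods are hypothetical stand-ins written for this illustration; they are not the API of `TransformationMatrix`.

```rust
/// Hypothetical stand-in for a 3D affine transformation:
/// three rows of [linear part | translation].
#[derive(Debug, Clone, Copy, PartialEq)]
struct Affine([[f64; 4]; 3]);

impl Affine {
    /// Pure translation by (x, y, z).
    fn translation(x: f64, y: f64, z: f64) -> Self {
        Affine([
            [1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
        ])
    }

    /// Rotation by 90 degrees around the z axis (counter-clockwise).
    fn rotation_z_90() -> Self {
        Affine([
            [0.0, -1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
        ])
    }

    /// Apply the transformation to a point: linear part first, then translation.
    fn apply(&self, p: (f64, f64, f64)) -> (f64, f64, f64) {
        let row = |r: &[f64; 4]| r[0] * p.0 + r[1] * p.1 + r[2] * p.2 + r[3];
        (row(&self.0[0]), row(&self.0[1]), row(&self.0[2]))
    }
}

fn main() {
    let moved = Affine::translation(10.0, 0.0, -1.0).apply((1.0, 2.0, 3.0));
    println!("translated: {:?}", moved);
    let turned = Affine::rotation_z_90().apply((1.0, 0.0, 0.0));
    println!("rotated: {:?}", turned);
}
```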
Enums
Context | An enum to define the context of an error message |
ErrorLevel | This indicates the level of the error, to handle it differently based on the level of the raised error. |
StrictnessLevel | The strictness to operate in; this defines at which ErrorLevel the program should stop execution upon finding an error. |
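One way to picture how the two levels interact: every error carries a severity, and the chosen strictness sets the threshold at which execution stops. The sketch below is a model under stated assumptions, not the pdbtbx implementation. The `Severity` and `Strictness` enums are local stand-ins, their variants are simplified, and the exact threshold mapping is invented for illustration.

```rust
/// Stand-in for an error level, ordered from least to most severe (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Warning,
    Invalidating,
    Breaking,
}

/// Stand-in for a strictness setting (assumed variants).
#[derive(Debug, Clone, Copy)]
enum Strictness {
    Loose,
    Medium,
    Strict,
}

impl Strictness {
    /// Lowest severity that stops execution at this strictness (assumed mapping).
    fn threshold(self) -> Severity {
        match self {
            Strictness::Loose => Severity::Breaking,
            Strictness::Medium => Severity::Invalidating,
            Strictness::Strict => Severity::Warning,
        }
    }

    /// Whether an error of the given severity should stop execution.
    fn fails(self, severity: Severity) -> bool {
        severity >= self.threshold()
    }
}

fn main() {
    for strictness in [Strictness::Loose, Strictness::Medium, Strictness::Strict] {
        for severity in [Severity::Warning, Severity::Invalidating, Severity::Breaking] {
            println!(
                "{:?} + {:?} -> stop: {}",
                strictness,
                severity,
                strictness.fails(severity)
            );
        }
    }
}
```

In this model the most severe level stops execution at every strictness, which matches the description that the `open` functions return an error when they encounter a BreakingError.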
Functions
open | Open an atomic data file, either PDB or mmCIF/PDBx. The correct type is determined from the extension of the file. Returns a PDBError if it finds a BreakingError. Otherwise it returns the PDB with all errors/warnings found while parsing it. |
open_mmcif | Parse the given mmCIF file into a PDB struct. Returns a PDBError if it finds a BreakingError. Otherwise it returns the PDB with all errors/warnings found while parsing it. |
open_pdb | Parse the given file into a PDB struct. Returns a PDBError if it finds a BreakingError. Otherwise it returns the PDB with all errors/warnings found while parsing it. |
open_pdb_raw | Parse the input stream into a PDB struct, to allow for direct streaming from sources such as RCSB.org. Returns a PDBError if it finds a BreakingError. Otherwise it returns the PDB with all errors/warnings found while parsing it. |
save | Save the given PDB struct to the given file. It validates the PDB. It fails if the validation fails with the given … |
save_mmcif | Save the given PDB struct to the given file as mmCIF or PDBx. It validates the PDB. It fails if the validation fails with the given … |
save_mmcif_raw | Save the given PDB struct to the given BufWriter. It does not validate or renumber the PDB, so if either is needed it must be done beforehand. It does change the output format based on the StrictnessLevel given. |
save_pdb | Save the given PDB struct to the given file. It validates the PDB. It fails if the validation fails with the given … |
save_pdb_raw | Save the given PDB struct to the given BufWriter. It does not validate or renumber the PDB, so if either is needed it must be done beforehand. It does change the output format based on the StrictnessLevel given. |
validate | Validate a given PDB file in terms of invariants that should be upheld. It returns PDBErrors with the warning messages. |
validate_pdb | Validates this model specifically for the PDB format |
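`open` is described above as choosing the parser from the file extension. A minimal sketch of such a dispatch is shown below. The `FileKind` enum, the `detect_kind` function, and the accepted extensions (`pdb`, `cif`, `mmcif`, matched case-insensitively) are assumptions made for this illustration and may differ from what the crate actually accepts.

```rust
use std::path::Path;

/// Hypothetical tag for the two supported formats.
#[derive(Debug, PartialEq, Eq)]
enum FileKind {
    Pdb,
    Mmcif,
}

/// Guess the format from the file extension; `None` if it is missing or unknown.
fn detect_kind(filename: &str) -> Option<FileKind> {
    let ext = Path::new(filename)
        .extension()?
        .to_str()?
        .to_ascii_lowercase();
    match ext.as_str() {
        "pdb" => Some(FileKind::Pdb),
        "cif" | "mmcif" => Some(FileKind::Mmcif), // Assumed set of mmCIF extensions
        _ => None,
    }
}

fn main() {
    for name in ["example-pdbs/1ubq.pdb", "1ubq.cif", "notes.txt"] {
        match detect_kind(name) {
            Some(kind) => println!("{} -> {:?}", name, kind),
            None => println!("{} -> unsupported extension", name),
        }
    }
}
```

When the extension is missing or misleading, calling the format-specific functions (`open_pdb`, `open_mmcif`) directly sidesteps the guess.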