nom-pdb 0.0.2

PDB parser implemented with nom

nom-pdb

PDB parser implemented in Rust using nom.

Features

  • Parses structural information and a subset of important metadata.
    • Primary structure
    • Secondary structure (sheets and helices)
    • Coordinates and bonding
  • Handles non-standard residues (support is not yet mature)
  • JSON serialization powered by serde.
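
One way to picture the non-standard residue handling is an enum with a catch-all variant, which would match the `{"Custom": "MSE"}` entries in the example output. The sketch below is only an illustration under that assumption: the `Residue` type and `parse_residue` function are hypothetical names, only a handful of standard residues are listed, and none of this is taken from the nom-pdb API.

```rust
/// Hypothetical residue type: a few standard amino acids plus a catch-all.
/// (Illustrative only; not the type nom-pdb actually defines.)
#[derive(Debug, Clone, PartialEq)]
enum Residue {
    Asp,
    Gly,
    Met,
    Ser,
    /// Any residue name not recognized above, kept verbatim.
    Custom(String),
}

/// Map a PDB residue name (e.g. "MET", "MSE") to a `Residue`.
/// Unknown names fall through to `Residue::Custom` instead of failing.
fn parse_residue(code: &str) -> Residue {
    match code.trim().to_ascii_uppercase().as_str() {
        "ASP" => Residue::Asp,
        "GLY" => Residue::Gly,
        "MET" => Residue::Met,
        "SER" => Residue::Ser,
        other => Residue::Custom(other.to_string()),
    }
}

fn main() {
    println!("{:?}", parse_residue("ASP")); // prints Asp
    println!("{:?}", parse_residue("MSE")); // prints Custom("MSE")
}
```

Keeping the unknown name inside the variant means a modified residue such as selenomethionine can still be reported and serialized, even when the parser has no dedicated variant for it.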

Example (Last Updated 2020-10-17)

cargo run --example read 1a8o
{
  "header": {
    "classification": "VIRAL PROTEIN",
    "deposition_date": "1998-03-27",
    "id_code": "1A8O"
  },
  "title": "HIV CAPSID C-TERMINAL DOMAIN",
  "authors": [
    "T.R.GAMBLE",
    "S.YOO",
    "F.F.VAJDOS",
    "U.K.VON SCHWEDLER",
    "D.K.WORTHYLAKE",
    "H.WANG",
    "J.P.MCCUTCHEON",
    "W.I.SUNDQUIST",
    "C.P.HILL"
  ],
  "experimental_techniques": [
    "XRayDiffraction"
  ],
  "cryst1": {
    "a": 41.98,
    "b": 41.98,
    "c": 88.92,
    "alpha": 90.0,
    "beta": 90.0,
    "gamma": 90.0,
    "lattice_type": "Primitive",
    "space_group": [
      [
        4,
        3
      ],
      [
        2,
        1
      ],
      [
        2,
        1
      ]
    ],
    "z": 8
  },
  "modres": {
    "MSE": {
      "standard_res": "Met",
      "description": "SELENOMETHIONINE",
      "occurence": [
        [
          "A",
          151
        ],
        [
          "A",
          185
        ],
        [
          "A",
          214
        ],
        [
          "A",
          215
        ]
      ]
    }
  },
  "seqres": [
    [
      "A",
      [
        {
          "Custom": "MSE"
        },
        "Asp",
        "Ile",
        "Arg",
        "Gln",
        "Gly",
        "Pro",
        // snip //
      ]
    ]
  ],
  "models": [
    {
      "atoms": [
        {
          "id": 1,
          "name": "N",
          "id1": " ",
          "residue": "Ser",
          "chain": "A",
          "sequence_number": 0,
          "insertion_code": " ",
          "x": -12.138,
          "y": 1.867,
          "z": 20.782,
          "occupancy": 1.0,
          "temperature_factor": 67.46,
          "element": "N",
          "charge": 0,
          "hetatom": false
        },
        // snip //
      ],
      "anisou": [
        // snip //
      ],
      "sheets": [
        {
          "id": "A",
          "strands": [
            {
              "start": [
                "A",
                34
              ],
              "end": [
                "A",
                38
              ],
              "sense": "Unknown"
            },
            // snip //
          ]
        },
        // snip //
      ],
      "helices": [
        // snip //
      ],
      "connect": [
        // snip //
      ]
    }
  ]
}
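
To show where the atom fields in the output above come from, here is a minimal sketch that slices one fixed-width ATOM/HETATM record by column, following the column layout of the PDB format. It uses only the standard library rather than nom, and the `Atom` struct, `field`, `parse_atom`, and `sample_atom_line` are hypothetical names for illustration, not nom-pdb's API. The sample line is built in code from the values in the JSON example; it is not copied from the actual 1A8O file.

```rust
/// Hypothetical subset of the atom fields shown in the JSON output.
#[derive(Debug, PartialEq)]
struct Atom {
    id: u32,
    name: String,
    residue: String,
    chain: char,
    sequence_number: i32,
    x: f64,
    y: f64,
    z: f64,
    occupancy: f64,
    temperature_factor: f64,
    element: String,
    hetatom: bool,
}

/// Slice a fixed-width field (0-based, end-exclusive byte range) and trim it.
/// Returns `None` if the line is too short.
fn field(line: &str, start: usize, end: usize) -> Option<&str> {
    line.get(start..end).map(str::trim)
}

/// Parse one ATOM or HETATM record; any other record type yields `None`.
fn parse_atom(line: &str) -> Option<Atom> {
    let hetatom = match field(line, 0, 6)? {
        "ATOM" => false,
        "HETATM" => true,
        _ => return None,
    };
    Some(Atom {
        id: field(line, 6, 11)?.parse().ok()?,              // columns 7-11
        name: field(line, 12, 16)?.to_string(),             // columns 13-16
        residue: field(line, 17, 20)?.to_string(),          // columns 18-20
        chain: line.get(21..22)?.chars().next()?,           // column 22
        sequence_number: field(line, 22, 26)?.parse().ok()?, // columns 23-26
        x: field(line, 30, 38)?.parse().ok()?,              // columns 31-38
        y: field(line, 38, 46)?.parse().ok()?,              // columns 39-46
        z: field(line, 46, 54)?.parse().ok()?,              // columns 47-54
        occupancy: field(line, 54, 60)?.parse().ok()?,      // columns 55-60
        temperature_factor: field(line, 60, 66)?.parse().ok()?, // columns 61-66
        element: field(line, 76, 78)?.to_string(),          // columns 77-78
        hetatom,
    })
}

/// Build an illustrative ATOM line with the correct column widths.
fn sample_atom_line() -> String {
    format!(
        "{:<6}{:>5} {:<4}{}{:>3} {}{:>4}{}   {:>8.3}{:>8.3}{:>8.3}{:>6.2}{:>6.2}{:>12}",
        "ATOM", 1, " N", ' ', "SER", 'A', 0, ' ',
        -12.138, 1.867, 20.782, 1.0, 67.46, "N"
    )
}

fn main() {
    let line = sample_atom_line();
    println!("{}", line);
    println!("{:?}", parse_atom(&line));
}
```

Column slicing is enough for a single record type, but it does not scale well across the many record types of the format, which is where a combinator library such as nom becomes convenient.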

Notes

References

Roadmap

Note: Priority is, and should be, placed on parsing structural information rather than metadata. Metadata is mostly unstructured free text and usually of little interest to users; when it is of interest, users can examine the PDB file directly.

Title Section

Primary Structure Section

Heterogen Section

Secondary Structure Section

Connectivity Annotation Section

Miscellaneous Features Section

Crystallographic and Coordinate Transformation Section

Coordinate Section

Connectivity Section

Bookkeeping Section