Kira mmCIF
Low-level parser for protein-focused CIF data. The crate reads _atom_site from text mmCIF and BinaryCIF (.bcif) and exposes a stable, Gemmi-inspired API with a deterministic, protein-oriented data contract.
Scope (by design):
- Reads mmCIF (STAR/CIF) and BinaryCIF (.bcif) files.
- Extracts ATOM records from _atom_site only.
- Single model (MODEL 1).
- AltLoc handling: accepts . or A, ignores others.
- Ignores symmetry, assemblies, validation, secondary structure, and other metadata.
Public API
Top-level entry point
use ;
let structure: Structure = read_structure?; // mmCIF
let structure: Structure = read_structure?; // BinaryCIF
Also available:
use ;
let mmcif = read_mmcif_structure?;
let bcif = read_bcif_structure?;
let forced = read_structure_with_format?;
Installation
Add to Cargo.toml:
[]
= "*"
Data model (Gemmi-inspired, Rust-native)
Enums and IDs:
; // 'A'..'Z' or 'a'..'z' => 0..25
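The letter-to-index rule in the comment above ('A'..'Z' or 'a'..'z' => 0..25) can be illustrated with a small standalone sketch. The function names chain_letter_to_index and index_to_chain_letter are hypothetical and are not taken from the crate's API; the sketch only assumes that the mapping is case-insensitive and that anything outside ASCII letters is rejected.

```rust
/// Hypothetical helper (not the crate's API): map a single ASCII letter to a
/// zero-based index, treating upper and lower case identically.
pub fn chain_letter_to_index(c: char) -> Option<u8> {
    if c.is_ascii_alphabetic() {
        // 'A' and 'a' both map to 0, 'Z' and 'z' both map to 25.
        Some(c.to_ascii_uppercase() as u8 - b'A')
    } else {
        None
    }
}

/// Hypothetical inverse: map an index in 0..=25 back to an uppercase letter.
pub fn index_to_chain_letter(i: u8) -> Option<char> {
    if i < 26 {
        Some((b'A' + i) as char)
    } else {
        None
    }
}
```

Returning Option rather than panicking keeps out-of-range input an ordinary, recoverable case for the caller.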
Utility mapping (public methods):
ProteinIR adapter
This is the stable contract for downstream analysis pipelines.
Adapter usage:
use ;
let protein_ir = try_from?;
Errors
Parsing rules (strict by scope)
Required _atom_site fields:
- _atom_site.group_PDB
- _atom_site.label_atom_id
- _atom_site.label_comp_id
- _atom_site.label_asym_id
- _atom_site.label_seq_id
- _atom_site.Cartn_x
- _atom_site.Cartn_y
- _atom_site.Cartn_z
Supported extras (optional):
- _atom_site.label_alt_id (altLoc filter)
- _atom_site.pdbx_PDB_model_num (MODEL filter)
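As a rough illustration of the required/optional split, the sketch below checks a list of column names against the eight required _atom_site fields. The constant REQUIRED_COLUMNS and the function missing_required_columns are hypothetical names for this example; how the crate actually performs this check and reports the failure is not shown in this README.

```rust
/// The eight required fields listed above. The constant name is hypothetical.
pub const REQUIRED_COLUMNS: [&str; 8] = [
    "_atom_site.group_PDB",
    "_atom_site.label_atom_id",
    "_atom_site.label_comp_id",
    "_atom_site.label_asym_id",
    "_atom_site.label_seq_id",
    "_atom_site.Cartn_x",
    "_atom_site.Cartn_y",
    "_atom_site.Cartn_z",
];

/// Hypothetical check: return every required column that is absent from
/// `present`, in the order of the list above. An empty result means the
/// loop has everything the parser needs; optional columns are not checked.
pub fn missing_required_columns(present: &[&str]) -> Vec<&'static str> {
    REQUIRED_COLUMNS
        .iter()
        .copied()
        .filter(|req| !present.iter().any(|p| *p == *req))
        .collect()
}
```

A strict parser would presumably turn a non-empty result into an error instead of guessing at defaults.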
Filtering behavior:
- Only group_PDB == "ATOM" is kept.
- Only model 1 is kept if the model column is present.
- Only altLoc . or A (and ? treated as missing) is kept if the altLoc column is present.
- Non-backbone atoms are ignored (AtomName::from_label_atom_id must match).
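The filtering rules above can be read as a single per-row predicate. The sketch below is an illustration under stated assumptions, not the crate's implementation: the AtomSiteRow struct and keep_row function are hypothetical, and the backbone set is assumed to be N, CA, C, and O, which this README does not confirm.

```rust
/// Hypothetical view of one _atom_site row. `None` in an optional field
/// means the corresponding column is absent from the file.
pub struct AtomSiteRow<'a> {
    pub group_pdb: &'a str,
    pub label_atom_id: &'a str,
    pub label_alt_id: Option<&'a str>,
    pub model_num: Option<u32>,
}

/// Assumption: the accepted backbone atom names are N, CA, C, and O.
pub fn is_backbone(name: &str) -> bool {
    matches!(name, "N" | "CA" | "C" | "O")
}

/// Apply the four filtering rules; a row is kept only if all of them pass.
pub fn keep_row(row: &AtomSiteRow<'_>) -> bool {
    // Rule 1: ATOM records only (HETATM and anything else is dropped).
    let is_atom = row.group_pdb == "ATOM";
    // Rule 2: model 1 only, but only when the model column exists.
    let model_ok = row.model_num.map_or(true, |m| m == 1);
    // Rule 3: altLoc "." or "A", with "?" treated as missing,
    // but only when the altLoc column exists.
    let alt_ok = row
        .label_alt_id
        .map_or(true, |a| matches!(a, "." | "?" | "A"));
    // Rule 4: backbone atoms only.
    is_atom && model_ok && alt_ok && is_backbone(row.label_atom_id)
}
```

Modelling the optional columns as Option keeps "column absent" distinct from "column present with a rejected value", which is the distinction the rules rely on.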
Ordering guarantees:
- Chains preserve original label_asym_id ordering as they appear in the file.
- Residues are sorted by label_seq_id within each chain.
- Atoms are emitted in file order within each residue.
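One way to obtain these three guarantees together is sketched below. It is a minimal illustration, not the crate's code: the function group_atoms and its tuple-based input and output are hypothetical. Chains are kept in a Vec so that first appearance fixes their order, residues go into a BTreeMap keyed by label_seq_id so that they come out sorted, and atom names are pushed in the order they are read.

```rust
use std::collections::BTreeMap;

/// Hypothetical grouping step. Input tuples are
/// (label_asym_id, label_seq_id, label_atom_id) in file order.
/// Output is (chain id, [(seq id, [atom names])]).
pub fn group_atoms(atoms: &[(&str, i32, &str)]) -> Vec<(String, Vec<(i32, Vec<String>)>)> {
    let mut chains: Vec<(String, BTreeMap<i32, Vec<String>>)> = Vec::new();
    for &(chain, seq_id, name) in atoms {
        // Chains: reuse the slot of a chain seen before, otherwise append,
        // so chain order follows first appearance in the file.
        let idx = match chains.iter().position(|(id, _)| id.as_str() == chain) {
            Some(i) => i,
            None => {
                chains.push((chain.to_string(), BTreeMap::new()));
                chains.len() - 1
            }
        };
        // Residues: the BTreeMap keeps keys sorted by label_seq_id.
        // Atoms: pushing preserves file order within each residue.
        chains[idx].1.entry(seq_id).or_default().push(name.to_string());
    }
    chains
        .into_iter()
        .map(|(id, residues)| (id, residues.into_iter().collect()))
        .collect()
}
```

The linear search over chains is adequate for the handful of chains in a typical structure; a file with very many chains would call for an index map alongside the Vec.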
Non-goals
- No secondary structure, bonds, or geometry validation.
- No exposure of CIF/STAR internals.
Example
use ;
let structure = read_structure?;
let protein_ir = try_from?;
println!;
println!;
println!;
# Ok::