chematic
A pure-Rust cheminformatics library targeting RDKit feature parity, with no C/C++ FFI.
Design Goals
- Pure Rust, zero C/C++ FFI: no rdkit-sys, no openbabel bindings. Every algorithm is implemented in safe Rust.
- WASM-compatible and lightweight: core crates compile to wasm32-unknown-unknown without modification. Binary size is in the hundreds of KB range, versus tens of MB for C++ FFI wrappers.
- Domain-specific algorithms: rather than wrapping a generic graph library, chematic implements chemistry-specific algorithms directly: Kekulization, Hückel aromaticity, CIP stereochemistry, SSSR ring perception.
- Reproducible and deterministic: fingerprints use FNV-1a hashing with a fixed invariant ordering. Given the same SMILES input, the same bits are always produced. No RNG, no platform-specific behavior.
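To illustrate why FNV-1a gives reproducible bits, here is a minimal standalone sketch of the 64-bit variant with its standard offset basis and prime. It does not use chematic's API: the names `fnv1a_64`, `bit_index`, and `fingerprint`, and the idea of folding a hash into a bit position with a modulo, are assumptions made for illustration and may differ from how chematic-fp actually works.

```rust
/// 64-bit FNV-1a over a byte slice, using the standard offset basis and prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        // XOR the byte in first, then multiply: this order is what makes it FNV-1a.
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: fold a feature hash into a fixed-width bit position.
pub fn bit_index(feature: &[u8], n_bits: usize) -> usize {
    (fnv1a_64(feature) % n_bits as u64) as usize
}

/// Hypothetical helper: set one bit per feature string.
/// No RNG or per-process seed is involved, so the result depends only on the input.
pub fn fingerprint(features: &[&str], n_bits: usize) -> Vec<bool> {
    let mut bits = vec![false; n_bits];
    for feature in features {
        bits[bit_index(feature.as_bytes(), n_bits)] = true;
    }
    bits
}
```

Unlike Rust's default `HashMap` hasher, which is randomly seeded per process, a fixed hash like this yields the same value on every run and every platform.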
Current Status
Phases 1–4 are complete, including the Phase 4 work on MACCS keys, topological path fingerprints, MCS, and tautomer normalization. Phase 5 is complete for coordinate generation and file I/O. The workspace has 332 tests, all passing.
| Crate | Description | Tests |
|---|---|---|
| chematic-core | Atom, Bond, Molecule, Element, kekulization (no deps) | 30 |
| chematic-smiles | OpenSMILES parser, writer, canonical SMILES | 50 |
| chematic-perception | SSSR (Balducci-Pearlman), Huckel aromaticity | 14 |
| chematic-mol | MOL/SDF V2000+V3000 parser and writer | 36 |
| chematic-depict | 2D SVG depiction (ring+chain templates) | 14 |
| chematic-chem | Descriptors, standardization (salt strip, charge), Murcko scaffold, CIP | 67 |
| chematic-fp | ECFP4/ECFP6, MACCS 166-bit keys, topological path FP, Tanimoto/Dice | 31 |
| chematic-smarts | SMARTS parser, VF2 subgraph isomorphism, MCS | 46 |
| chematic-3d | 3D coordinate generation, PDB/XYZ file formats | 15 |
| chematic-rxn | Reaction SMILES parser and writer | 15 |
| chematic | Umbrella crate with feature flags (all sub-crates) | 1 |
cargo test --workspace # 332 tests, all passing
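The Tanimoto and Dice coefficients listed for chematic-fp compare two fingerprints by counting shared bits. The sketch below shows the two formulas on fingerprints packed into `u64` words. It is standalone and does not reflect chematic's actual types or function names: `popcount`, `common_bits`, `tanimoto`, and `dice` are hypothetical, and returning 0.0 when both fingerprints are empty is one convention among several.

```rust
/// Number of set bits in a fingerprint packed into 64-bit words.
pub fn popcount(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}

/// Number of bits set in both fingerprints (they must have equal length).
pub fn common_bits(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "fingerprints must have the same width");
    a.iter().zip(b).map(|(x, y)| (x & y).count_ones()).sum()
}

/// Tanimoto coefficient: |A ∩ B| / |A ∪ B|.
pub fn tanimoto(a: &[u64], b: &[u64]) -> f64 {
    let shared = common_bits(a, b);
    let union = popcount(a) + popcount(b) - shared;
    if union == 0 {
        0.0 // assumed convention for two empty fingerprints
    } else {
        f64::from(shared) / f64::from(union)
    }
}

/// Dice coefficient: 2 |A ∩ B| / (|A| + |B|).
pub fn dice(a: &[u64], b: &[u64]) -> f64 {
    let shared = common_bits(a, b);
    let total = popcount(a) + popcount(b);
    if total == 0 {
        0.0 // assumed convention for two empty fingerprints
    } else {
        2.0 * f64::from(shared) / f64::from(total)
    }
}
```

For two fingerprints with three bits each and two bits in common, Tanimoto is 2/4 and Dice is 4/6, so Dice is never lower than Tanimoto for the same pair.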
Quick Start
Using the umbrella crate
# Cargo.toml
[dependencies]
chematic = { git = "https://github.com/kent-tokyo/chematic", features = ["smiles", "fp"] }
// Using the umbrella crate
use ;
use ecfp4;
// chematic = { version = "0.1.0", features = ["smiles", "fp"] }
Using individual crates
# Cargo.toml
[]
= { = "https://github.com/kent-tokyo/chematic" }
= { = "https://github.com/kent-tokyo/chematic" }
= { = "https://github.com/kent-tokyo/chematic" }
use ;
use ;
use ;
SMARTS substructure search
use parse;
use ;
let mol = parse.unwrap; // aspirin
let query = parse_smarts.unwrap;
let matches = find_matches;
println!; // 2
Molecular descriptors
use parse;
use ;
let aspirin = parse.unwrap;
println!; // ~180.16
println!; // ~63.6
println!; // true
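For readers unfamiliar with the Lipinski check mentioned in the roadmap, the following standalone sketch applies the rule of five to precomputed descriptor values. It does not call chematic: the `Descriptors` struct, the function names, and the "at most one violation" pass criterion (a common convention) are assumptions, and the LogP and hydrogen-bond counts used for aspirin in the usage note are illustrative values rather than chematic output.

```rust
/// Hypothetical container for the four descriptors used by the rule of five.
pub struct Descriptors {
    pub mol_weight: f64,  // g/mol
    pub logp: f64,        // octanol-water partition coefficient
    pub h_donors: u32,    // hydrogen-bond donors
    pub h_acceptors: u32, // hydrogen-bond acceptors
}

/// Count how many of the four Lipinski thresholds are exceeded.
pub fn lipinski_violations(d: &Descriptors) -> u32 {
    let checks = [
        d.mol_weight > 500.0,
        d.logp > 5.0,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ];
    checks.iter().filter(|&&violated| violated).count() as u32
}

/// Assumed convention: a molecule passes with at most one violation.
pub fn passes_lipinski(d: &Descriptors) -> bool {
    lipinski_violations(d) <= 1
}
```

With aspirin-like values (molecular weight 180.16, LogP around 1.2, one donor, four acceptors) no threshold is exceeded, so the check passes.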
2D depiction
use parse;
use depict_svg;
let caffeine = parse.unwrap;
let svg = depict_svg;
write.unwrap;
Comparison with Other Cheminformatics Libraries
| Feature | chematic | RDKit (rdkit-sys) | OpenBabel FFI | chemcore / purr |
|---|---|---|---|---|
| Language | Pure Rust | Rust + C++ FFI | Rust + C++ FFI | Pure Rust |
| WASM target | Yes | No | No | Partial |
| Binary size (core) | ~500 KB | ~50 MB | ~20 MB | ~200 KB |
| OpenSMILES parser | Full | Full | Full | Partial |
| SMILES writer | Yes | Yes | Yes | No |
| Canonical SMILES | Yes | Yes | Yes | No |
| Kekulization | Yes | Yes | Yes | No |
| Aromaticity perception | Yes (Huckel) | Yes | Yes | Partial |
| Ring perception (SSSR) | Yes | Yes | Yes | No |
| SDF/MOL V2000 | Yes | Yes | Yes | No |
| SDF/MOL V3000 | Yes | Yes | Yes | No |
| 2D depiction (SVG) | Yes | Yes | Yes | No |
| ECFP fingerprints | Yes (ECFP4/6) | Yes | Yes | No |
| SMARTS / substructure search | Yes (VF2) | Yes | Yes | No |
| Molecular descriptors | Yes (MW/LogP/TPSA/...) | Yes | Yes | No |
| 3D coordinate generation | Yes (rule-based) | Yes (ETKDG) | Yes | No |
| PDB/XYZ file formats | Yes | Yes | Yes | No |
| CIP stereochemistry (R/S) | Yes (R/S, E/Z) | Yes | Yes | No |
| MACCS fingerprints | Yes (166-bit keys) | Yes | Yes | No |
| Force field minimization | Yes (rule-based) | Yes (UFF/MMFF) | Yes | No |
| Reaction SMILES/SMIRKS | Yes | Yes | Yes | No |
| Unsafe Rust | None | Extensive | Extensive | None |
| Maintenance (2026) | Active | Active | Minimal | Archived |
Notes:
- The "chematic" column covers both the current implementation and features planned for the final state; see the Roadmap for what is still outstanding.
- Binary sizes are approximate and depend on enabled features.
- chemcore and purr are archived; chematic supersedes their scope.
Roadmap
Phase 1 — Foundation (complete)
Core types, OpenSMILES parse/write, Kekulization, canonical SMILES. 80 tests.
Phase 2 — Molecular Perception (complete)
SSSR, Huckel aromaticity, SDF/MOL V2000+V3000, 2D SVG depiction. 63 tests.
Phase 3 — Chemical Intelligence (complete)
Descriptors (MW, LogP, TPSA, Lipinski), ECFP4/6 fingerprints, SMARTS+VF2, molecular standardization (salt stripping, charge neutralization), Murcko scaffold, CIP R/S and E/Z stereochemistry assignment.
Phase 4 — Similarity and Search (complete)
MACCS 166-bit structural keys ✓, topological path fingerprints ✓, MCS ✓, tautomer normalization ✓.
Phase 5 — 3D Chemistry (partially complete)
Rule-based 3D coordinate generation, PDB/XYZ formats. Remaining: UFF force field minimization.
Phase 6 — RDKit Parity (partially complete)
Reaction SMILES/SMIRKS (chematic-rxn) ✓, umbrella crate with feature flags (chematic) ✓. Remaining: WASM package (npm: chematic), ChEMBL-scale validation.
See tasks/todo.md for the detailed per-task breakdown.
Repository Structure
chematic/
├── Cargo.toml workspace root
├── CHANGELOG.md version history
├── crates/
│ ├── chematic-core/ Atom, Bond, Molecule, Element, kekulization
│ ├── chematic-smiles/ OpenSMILES parser, writer, canonical SMILES
│ ├── chematic-perception/ SSSR ring perception, Huckel aromaticity
│ ├── chematic-mol/ MOL/SDF V2000+V3000 parser and writer
│ ├── chematic-depict/ 2D SVG depiction engine
│ ├── chematic-chem/ Molecular descriptors, standardization, scaffold
│ ├── chematic-fp/ ECFP4/6 fingerprints, Tanimoto/Dice similarity
│ ├── chematic-smarts/ SMARTS parser + VF2 subgraph isomorphism, MCS
│ ├── chematic-3d/ 3D coordinate generation, PDB/XYZ formats
│ ├── chematic-rxn/ Reaction SMILES parser and writer
│ └── chematic/ Umbrella crate with feature flags
└── tasks/
├── todo.md full roadmap checklist (Japanese)
└── lessons.md development lessons learned
Development Commands
License
Licensed under either of Apache License 2.0 or MIT License, at your option.