A compact binary representation for InChI Keys.
This crate provides a space-efficient binary encoding for International Chemical Identifier (InChI) keys, reducing their size from the standard 27-byte ASCII representation to either 9 or 14 bytes. The implementation is based on the work by John Mayfield (NextMove Software): Data Compression of InChI Keys and 2D Coordinates
§InChI Key Format
An InChI key has the format: AAAAAAAAAAAAAA-BBBBBBBBFV-P
- First block (14 chars): Encodes the core molecular constitution
- Second block (8 chars): Encodes whichever additional structural features apply (stereochemistry, isotopic substitution, exact position of mobile hydrogens, metal ligation data)
- Flag (1 char): ‘S’ for standard, ‘N’ for non-standard
- Version (1 char): Currently always ‘A’
- Protonation (1 char): ‘N’ for neutral, or ‘A’-‘M’ for protonated states
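The layout above can be illustrated by splitting a textual key into its five fields. This is a minimal standalone sketch, not part of this crate's API: the `KeyParts` type and the `split_inchi_key` function are hypothetical names introduced only for illustration, and the validation checks nothing beyond length, hyphen positions, and uppercase ASCII letters.

```rust
/// The five fields of a textual InChI key.
/// Hypothetical helper type for illustration; not part of this crate.
#[derive(Debug, PartialEq)]
struct KeyParts<'a> {
    first_block: &'a str,  // 14 chars
    second_block: &'a str, // 8 chars
    flag: char,            // 'S' or 'N'
    version: char,         // currently 'A'
    protonation: char,     // final character after the second hyphen
}

/// Splits `AAAAAAAAAAAAAA-BBBBBBBBFV-P` into its fields.
/// Returns `None` if the overall shape is wrong. Hypothetical helper.
fn split_inchi_key(key: &str) -> Option<KeyParts<'_>> {
    let b = key.as_bytes();
    // 14 + 1 + 10 + 1 + 1 = 27 bytes, with hyphens at offsets 14 and 25.
    if b.len() != 27 || b[14] != b'-' || b[25] != b'-' {
        return None;
    }
    // Every other position must be an uppercase ASCII letter.
    let letters_ok = b
        .iter()
        .enumerate()
        .all(|(i, c)| i == 14 || i == 25 || c.is_ascii_uppercase());
    if !letters_ok {
        return None;
    }
    // The input is pure ASCII at this point, so byte offsets are valid
    // string slice boundaries.
    Some(KeyParts {
        first_block: &key[0..14],
        second_block: &key[15..23],
        flag: b[23] as char,
        version: b[24] as char,
        protonation: b[26] as char,
    })
}

fn main() {
    match split_inchi_key("ZZJLMZYUGLJBSO-UHFFFAOYSA-N") {
        Some(parts) => println!("{parts:?}"),
        None => println!("not a well-formed InChI key"),
    }
}
```

Note that the flag and version characters sit directly after the 8-character second block, which is why the middle group of the key is 10 characters long.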
§Binary Encoding
Standard InChI keys with the common second block UHFFFAOYSA can be packed into just 9 bytes.
All other InChI keys require 14 bytes.
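A quick counting argument shows why these two sizes are plausible. The sketch below is only an upper-bound calculation that treats each block as a string of arbitrary uppercase letters; it does not describe the crate's actual bit layout, which is not specified here. The helpers `bits_for` and `bytes_for` are hypothetical names used for illustration.

```rust
/// Number of bits needed to distinguish `base^symbols` combinations.
/// Hypothetical helper for illustration.
fn bits_for(symbols: u32, base: u128) -> u32 {
    let combos = base.pow(symbols);
    // Bits needed to represent the largest index, `combos - 1`.
    128 - (combos - 1).leading_zeros()
}

/// Rounds a bit count up to whole bytes. Hypothetical helper.
fn bytes_for(bits: u32) -> u32 {
    (bits + 7) / 8
}

fn main() {
    let first = bits_for(14, 26); // 14 letters of the first block
    let second = bits_for(8, 26); // 8 letters of the second block

    println!("first block: {first} bits, {} bytes", bytes_for(first));
    println!(
        "both blocks: {} bits, {} bytes",
        first + second,
        bytes_for(first + second)
    );
}
```

Under this bound the first block alone needs 66 bits, which fits in 9 bytes (72 bits) with 6 bits to spare. Both blocks together need 104 bits, so 14 bytes (112 bits) leave 8 bits for the remaining flag, version, and protonation characters.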
§Optional Features
serde: Enable serialization/deserialization support. When enabled, InChIKey serializes as a string in human-readable formats (JSON, YAML) and uses the compact binary representation in binary formats (bincode, MessagePack).
§Example
use zinchi::InChIKey;
let key: InChIKey = "ZZJLMZYUGLJBSO-UHFFFAOYSA-N".parse().unwrap();
let packed = key.packed_bytes(); // 9 or 14 bytes
let unpacked = InChIKey::unpack_from(&packed).unwrap();
assert_eq!(key, unpacked);
Structs§
- InChIKey - A compact binary representation of an InChI key.
Enums§
- InChIKeyParseError - Error type for InChI key parsing operations.