Module pubchem

PubChem REST API client.

Requires the pubchem Cargo feature.

§Purpose

Enriches a SubstanceIdentifier with structural data fetched from the PubChem PUG REST API:

  • CAS number → SMILES, InChI, InChIKey, IUPAC name, CID
  • IUPAC name → SMILES, CID, …
  • SMILES / InChIKey / InChI → CID + remaining fields
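Each lookup above maps onto a PUG REST compound request. The following is a minimal sketch of how such a request URL could be assembled; it is not this crate's request code, which is not shown on this page. The helper name `property_url` is hypothetical, and resolving CAS numbers through the `name` namespace is an assumption based on common PUG REST practice. No percent-encoding is performed, so the sketch only suits URL-safe identifiers such as CAS numbers, CIDs, and InChIKeys.

```rust
/// Base URL of the PubChem PUG REST service.
const PUG_REST_BASE: &str = "https://pubchem.ncbi.nlm.nih.gov/rest/pug";

/// Hypothetical helper (not part of this crate): builds a compound
/// property request URL of the form
/// `<base>/compound/<namespace>/<identifier>/property/<props>/JSON`.
///
/// `namespace` is a PUG REST input namespace such as "name", "cid",
/// or "inchikey". The identifier is inserted verbatim, without
/// percent-encoding.
pub fn property_url(namespace: &str, identifier: &str, properties: &[&str]) -> String {
    format!(
        "{}/compound/{}/{}/property/{}/JSON",
        PUG_REST_BASE,
        namespace,
        identifier,
        properties.join(",")
    )
}
```

Calling `property_url("name", "1310-73-2", &["InChI", "InChIKey", "IUPACName"])` would request three properties for a CAS number. SMILES and InChI strings contain characters that need escaping, so a real client would have to encode them or send them in a POST body.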

§Usage

use hs_predict::pipeline::HsPipeline;
use hs_predict::pubchem::PubChemClient;
use hs_predict::types::{ProductDescription, SubstanceIdentifier};

let pipeline = HsPipeline::new()
    .with_pubchem(PubChemClient::new());

let mut product = ProductDescription {
    identifier: SubstanceIdentifier::from_cas("1310-73-2"),
    physical_form: None,
    purity_pct: None,
    purity_type: None,
    mixture_components: None,
    intended_use: None,
    additional_context: None,
};

// Enrich: CAS 1310-73-2 → SMILES "[Na+].[OH-]", IUPAC "sodium hydroxide", …
// `enrich` is async: call it from an async context that can propagate errors with `?`.
pipeline.enrich(&mut product).await?;

// Classify as usual; the SMILES is now available, which improves matching.
let prediction = pipeline.classify(&product)?;
println!("{}", prediction.display());

§Rate limiting

PubChem allows up to 5 requests per second without an API key. PubChemClient enforces this limit automatically with an internal token-bucket rate limiter built on the governor crate.
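To illustrate the idea behind the limiter, here is a minimal standard-library token bucket. It is only a sketch: the client delegates this work to governor (which implements the related Generic Cell Rate Algorithm), and the type `TokenBucket` and its methods are hypothetical names introduced for this example. The current time is passed in explicitly so the behaviour is easy to reason about.

```rust
use std::time::{Duration, Instant};

/// Hypothetical token bucket, not the limiter used by `PubChemClient`.
pub struct TokenBucket {
    capacity: f64,       // maximum number of stored tokens (burst size)
    tokens: f64,         // tokens currently available
    refill_per_sec: f64, // tokens added per second
    last_refill: Instant,
}

impl TokenBucket {
    /// Creates a full bucket.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket {
            capacity: capacity as f64,
            tokens: capacity as f64,
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Takes one token if available. Otherwise returns how long the
    /// caller should wait before a token becomes available.
    pub fn try_acquire(&mut self, now: Instant) -> Result<(), Duration> {
        // Credit the tokens earned since the last refill, up to capacity.
        if now > self.last_refill {
            let elapsed = now.duration_since(self.last_refill).as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last_refill = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            let missing = 1.0 - self.tokens;
            Err(Duration::from_secs_f64(missing / self.refill_per_sec))
        }
    }
}
```

With `TokenBucket::new(5, 5.0, now)`, five requests pass immediately and the sixth is told to wait until the bucket has refilled by one token. An async client would typically sleep for the returned duration and retry rather than fail the request.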

§Caching

Responses are cached by PubChem CID using moka, with a 24-hour TTL and a capacity of 1 000 entries. Because the cache key is the CID, a compound looked up by different identifiers (for example, CAS number and InChIKey) occupies a single cache entry after the first fetch.
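One way to picture CID-keyed caching is an alias table that maps each identifier to a CID, in front of a store that holds one entry per CID. The sketch below uses only the standard library and is an assumption about the general shape, not the crate's moka-based implementation; `CidCache` and `Compound` are hypothetical types, and capacity-based eviction is left out for brevity.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the compound data returned by a lookup.
#[derive(Clone, Debug, PartialEq)]
pub struct Compound {
    pub cid: u64,
    pub iupac_name: String,
}

/// Hypothetical CID-keyed cache with a time-to-live.
pub struct CidCache {
    ttl: Duration,
    by_cid: HashMap<u64, (Compound, Instant)>, // one entry per compound
    aliases: HashMap<String, u64>,             // CAS, InChIKey, ... → CID
}

impl CidCache {
    pub fn new(ttl: Duration) -> Self {
        CidCache { ttl, by_cid: HashMap::new(), aliases: HashMap::new() }
    }

    /// Records that `identifier` resolves to `compound` and stores the
    /// compound under its CID, refreshing the timestamp.
    pub fn insert(&mut self, identifier: &str, compound: Compound, now: Instant) {
        self.aliases.insert(identifier.to_string(), compound.cid);
        self.by_cid.insert(compound.cid, (compound, now));
    }

    /// Returns the cached compound if the identifier is known and the
    /// entry is younger than the TTL.
    pub fn get(&self, identifier: &str, now: Instant) -> Option<&Compound> {
        let cid = self.aliases.get(identifier)?;
        let (compound, stored_at) = self.by_cid.get(cid)?;
        if now.saturating_duration_since(*stored_at) < self.ttl {
            Some(compound)
        } else {
            None
        }
    }

    /// Number of distinct compounds stored (not the number of aliases).
    pub fn compound_count(&self) -> usize {
        self.by_cid.len()
    }
}
```

Inserting the same compound under a CAS number and then under an InChIKey leaves `compound_count()` at 1, while both identifiers resolve to it until the TTL (24 hours in the client's case) has elapsed.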

Structs§

PubChemClient
PubChem REST API client with built-in rate limiting and in-memory caching.
PubChemClientBuilder
Builder for PubChemClient.
PubChemCompound
Compound data returned from a successful PubChem lookup.

Enums§

PubChemError
Errors produced by the PubChem API client.