hs-predict
HS (Harmonized System) code prediction for chemical products.
hs-predict uses an Akinator-style interactive session — asking targeted questions one at a time — to collect just enough information to classify your product, then applies a hybrid rule-based engine to produce a six-digit HS 2022 code.
Disclaimer: Predictions are advisory only and must not be used as the sole basis for a customs declaration. Always verify with a qualified trade-compliance expert or the relevant customs authority.
Features
- Akinator-style UX: asks only what is needed; no upfront form to fill in
- Hybrid classification pipeline: mixture GRI → user mapping → static rules → SMILES engine → LLM fallback (priority order)
- Physical-form awareness: same compound, different form = different HS code (e.g. NaOH solid → 2815.11, solution → 2815.12)
- 148-entry static rule table (133 compounds): common industrial chemicals across Chapters 28, 29, 38, 72–81
- SMILES functional-group detection (v0.3): 20 functional groups, organic/inorganic classification, heading-level hint (≤ 0.70 confidence)
- Mixture GRI classification (v0.5): GRI 3a (same chapter), GRI 3b (essential character / dominant component > 50 % w/w), GRI 3c (last heading numerically); special-use routing for pharmaceuticals (Ch. 30), cosmetics (Ch. 33), food preparations (Ch. 21), agrochemicals (heading 38.08)
- Compliance risk flags (v0.5): GrayZone identifies Chapter 28/29/38 boundary cases; RecommendedAction::PriorConsultation signals when an advance ruling (事前教示) should be requested
- Batch processing (v0.5): classify_batch() and classify_batch_with_llm() for multi-product workflows
- IUPAC name → SMILES: auto-resolved via chem-name-resolver
- PubChem enrichment (v0.2, pubchem feature): fills missing identifiers from CAS / IUPAC / SMILES
- LLM integration (v0.4, llm feature): a trait-hook design. Implement LlmClassifier with your HTTP client; the library supplies PromptBuilder (EN/JA), LlmResponse, validation, and MockLlmClassifier for tests
- Japan tariff codes: the 9-digit statistical code (統計品目番号) is included in every result, based on the Customs Tariff Schedules of Japan (実行関税率表) as of 2026-04-01
Quick start
Interactive mode (Akinator-style)
use ;
use HsPipeline;
let mut session = new;
let pipeline = new;
let q = session.start;
println!; // "Please enter a CAS number, IUPAC name, SMILES, or InChIKey"
match session.answer?
# Ok::
Japanese session
use ClassificationSession;
use Language;
let mut session = new_ja; // Japanese prompts
let q = session.start;
println!; // "CAS番号、IUPAC名、SMILES、InChIKey のいずれかを入力してください"
Direct mode (known CAS + physical form)
use HsPipeline;
use ;
let pipeline = new;
let product = ProductDescription ;
let p = pipeline.classify?;
assert_eq!;
assert_eq!;
# Ok::
Classification pipeline
Input: ProductDescription
│
▼
┌──────────────────────────────────────────────────────────┐
│ Priority 0: Mixture GRI classifier (v0.5) │
│ GRI 3a → 3b (>50 % w/w) → 3c; special-use routing │
│ (pharmaceuticals Ch.30 / agrochemicals Ch.38.08 / …) │
└──────────────────────┬───────────────────────────────────┘
│ not a mixture
▼
┌──────────────────────────────────────────────────────────┐
│ Priority 1: User mapping (confidence = 1.0) │
│ pipeline.with_mapping("64-19-7", "291511") │
└──────────────────────┬───────────────────────────────────┘
│ miss
▼
┌──────────────────────────────────────────────────────────┐
│ Priority 2: Static rule table (133 compounds) │
│ CAS + physical form + purity → exact HS subheading │
└──────────────────────┬───────────────────────────────────┘
│ miss
▼
┌──────────────────────────────────────────────────────────┐
│ Priority 3: SMILES functional-group engine (v0.3) │
│ 20 functional groups → heading-level hint (≤ 0.70) │
└──────────────────────┬───────────────────────────────────┘
│ miss / low confidence
▼
┌──────────────────────────────────────────────────────────┐
│ Priority 4: LLM classifier (v0.4, trait hook) │
│ impl LlmClassifier for YourClient { ... } │
└──────────────────────┬───────────────────────────────────┘
│
▼
HsPrediction
{ hs_code, confidence, notes,
gray_zone, recommended_action,
jp_tariff_code, alternatives }
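The priority ordering in the diagram amounts to a first-hit-wins chain. The following is a minimal sketch of that idea only; `Prediction`, `Stage`, `user_mapping`, `static_rules`, and `run_pipeline` are hypothetical names chosen for illustration and are not the crate's API. The two hard-coded rows reuse values that appear elsewhere in this README.

```rust
/// Hypothetical, simplified stand-in for the crate's prediction type.
#[derive(Debug, Clone, PartialEq)]
pub struct Prediction {
    pub hs_code: &'static str,
    pub source: &'static str,
}

/// A stage answers `Some` on a hit and `None` on a miss.
pub type Stage = fn(&str) -> Option<Prediction>;

/// Priority 1: explicit user mapping (the `with_mapping` example in the diagram).
pub fn user_mapping(cas: &str) -> Option<Prediction> {
    (cas == "64-19-7").then(|| Prediction { hs_code: "291511", source: "user mapping" })
}

/// Priority 2: a one-row stand-in for the static rule table (acetone, liquid).
pub fn static_rules(cas: &str) -> Option<Prediction> {
    (cas == "67-64-1").then(|| Prediction { hs_code: "291411", source: "static rule table" })
}

/// Runs the stages in priority order and returns the first hit, if any.
pub fn run_pipeline(stages: &[Stage], cas: &str) -> Option<Prediction> {
    stages.iter().find_map(|stage| stage(cas))
}
```

Because each stage is just a function from input to an optional result, adding or reordering priorities is a matter of changing the slice passed to `run_pipeline`.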
Mixture GRI classification (v0.5)
When ProductDescription::mixture_components is set, the pipeline automatically applies the WCO General Rules for the Interpretation of the Harmonized System (GRI):
| Step | Rule | Condition |
|---|---|---|
| 0 | Special-use routing | Pharmaceutical → Ch. 30; Cosmetic → Ch. 33; Food prep → Ch. 21; Agricultural → heading 38.08 |
| 1 | GRI 3a | All components fall in the same HS chapter → most specific heading |
| 2 | GRI 3b | One component > 50 % w/w → adopt that component's classification |
| 3 | GRI 3b LLM | No dominant component — delegate to LLM if available |
| 4 | GRI 3c | Last heading numerically; confidence 0.40; PriorConsultation recommended |
use ;
use HsPipeline;
let pipeline = new;
let product = ProductDescription ;
let p = pipeline.classify?;
// Agricultural use → 3808.xx (Chapter 38)
assert_eq!;
# Ok::
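The step table can be read as a decision cascade. The sketch below illustrates that cascade under simplifying assumptions: all type and function names are hypothetical, the LLM step (step 3) is skipped, and the same-chapter step reports only the chapter rather than choosing the most specific heading within it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IntendedUse { Industrial, Pharmaceutical, Cosmetic, Food, Agricultural }

/// One mixture component: its own 4-digit heading (e.g. 2905) and weight share.
pub struct Component { pub heading: u16, pub weight_pct: f64 }

#[derive(Debug, PartialEq)]
pub enum Outcome {
    SpecialUse(&'static str), // step 0: routed by intended use
    SameChapter(u16),         // step 1: shared chapter number
    Dominant(u16),            // step 2: heading of the > 50 % w/w component
    LastHeading(u16),         // step 4: numerically last heading
}

pub fn classify_mixture(intended_use: IntendedUse, parts: &[Component]) -> Option<Outcome> {
    // Step 0: special-use routing takes precedence over composition.
    let special = match intended_use {
        IntendedUse::Pharmaceutical => Some("Ch. 30"),
        IntendedUse::Cosmetic => Some("Ch. 33"),
        IntendedUse::Food => Some("Ch. 21"),
        IntendedUse::Agricultural => Some("38.08"),
        IntendedUse::Industrial => None,
    };
    if let Some(target) = special {
        return Some(Outcome::SpecialUse(target));
    }
    // An empty component list cannot be classified.
    let chapter = parts.first()?.heading / 100;
    // Step 1: every component sits in the same chapter.
    if parts.iter().all(|c| c.heading / 100 == chapter) {
        return Some(Outcome::SameChapter(chapter));
    }
    // Step 2: a single component above 50 % w/w gives the essential character.
    if let Some(c) = parts.iter().find(|c| c.weight_pct > 50.0) {
        return Some(Outcome::Dominant(c.heading));
    }
    // Step 3 (LLM delegation) is omitted here. Step 4: last heading numerically.
    parts.iter().map(|c| c.heading).max().map(Outcome::LastHeading)
}
```

A 50/50 blend of a Chapter 29 and a Chapter 28 component has no dominant part, so it falls through to the last-heading rule, which is the case the table marks with confidence 0.40.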
Compliance risk flags (v0.5)
HsPrediction now carries a gray_zone field to identify classification boundary risks:
use ;
let p = pipeline.classify?;
match p.gray_zone
if p.recommended_action == PriorConsultation
| GrayZone variant | Meaning |
|---|---|
| Chapter29vs38 | Organic compound may shift from Ch. 29 to Ch. 38 due to use/presentation |
| Chapter28vs29 | Organometallic borderline: presence of a metal–carbon bond is decisive |
| MixtureEssentialCharacterUnclear | GRI 3c applied (no dominant component); formal ruling advised |
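One way an application might act on these flags is to turn each variant into a reviewer note. This is a sketch only: the enum below is a local mirror of the three variant names listed in this section, and modelling the field as an `Option` is an assumption, not something this README confirms.

```rust
/// Local mirror of the variant names documented above (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GrayZone {
    Chapter29vs38,
    Chapter28vs29,
    MixtureEssentialCharacterUnclear,
}

/// Maps a flag to a short note for a human reviewer.
/// Assumption: an absent flag is modelled as `None`.
pub fn review_note(zone: Option<GrayZone>) -> &'static str {
    match zone {
        None => "no boundary risk flagged",
        Some(GrayZone::Chapter29vs38) => {
            "check whether use or presentation moves the product from Ch. 29 to Ch. 38"
        }
        Some(GrayZone::Chapter28vs29) => "confirm whether a metal-carbon bond is present",
        Some(GrayZone::MixtureEssentialCharacterUnclear) => {
            "GRI 3c fallback was used; consider requesting an advance ruling"
        }
    }
}
```

Matching exhaustively on the enum means a new variant added later produces a compile error here instead of being silently ignored.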
Batch processing (v0.5)
let products: = vec!;
// Synchronous batch (Priorities 0–3)
let results: = pipeline.classify_batch;
// Async batch with LLM fallback (Priority 4)
let results = pipeline.classify_batch_with_llm.await;
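For multi-product workflows it is usually helpful if one unclassifiable product does not abort the whole batch. The sketch below shows that pattern with plain generics; `classify_all`, `split_results`, and `lookup_hs` are hypothetical helpers, and whether the crate's batch methods return per-item results in this shape is an assumption.

```rust
/// Classifies every item independently and keeps one result per item.
pub fn classify_all<T, P, E>(
    items: &[T],
    classify: impl Fn(&T) -> Result<P, E>,
) -> Vec<Result<P, E>> {
    items.iter().map(classify).collect()
}

/// Separates successes from failures while preserving input order in each list.
pub fn split_results<P, E>(results: Vec<Result<P, E>>) -> (Vec<P>, Vec<E>) {
    let mut ok = Vec::new();
    let mut failed = Vec::new();
    for result in results {
        match result {
            Ok(prediction) => ok.push(prediction),
            Err(error) => failed.push(error),
        }
    }
    (ok, failed)
}

/// Two-row stand-in for a real classifier, using codes from this README.
pub fn lookup_hs(cas: &str) -> Result<&'static str, String> {
    match cas {
        "67-64-1" => Ok("291411"), // acetone
        "64-17-5" => Ok("220710"), // ethanol
        other => Err(format!("no static rule for {other}")),
    }
}
```

A caller can then report the failures separately, for example by queueing them for manual review.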
SMILES functional-group detection (v0.3)
When a SMILES string is available (from the user or auto-filled by PubChem), the engine detects the functional groups below and maps each one to a heading hint (a Chapter 29 heading for organic groups, Chapter 28 for inorganic compounds):
| Functional group | HS heading hint | Confidence |
|---|---|---|
| Anhydride | 29.15 | 0.65 |
| Isocyanate | 29.29 | 0.70 |
| Nitrile | 29.26 | 0.70 |
| Epoxide | 29.10 | 0.70 |
| Sulphonic acid | 29.04 | 0.68 |
| Amide | 29.24 | 0.67 |
| Aldehyde | 29.12 | 0.67 |
| Ketone | 29.14 | 0.67 |
| Carboxylic acid | 29.15 | 0.60 |
| Ester | 29.15 | 0.55 |
| Phenol | 29.07 | 0.67 |
| Alcohol | 29.05 | 0.60 |
| Amine | 29.21 | 0.63 |
| Organohalide | 29.03 | 0.65 |
| Ether | 29.09 | 0.63 |
| Thiol / Sulphide | 29.30 | 0.65 |
| Phosphate | 29.20 | 0.62 |
| Nitro | 29.04 | 0.60 |
| Inorganic (no C–C/C–H) | Ch. 28 | 0.55 |
use classify_smiles;
let r = classify_smiles.unwrap; // acetone
assert_eq!; // 29.14 ketone
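To make the mapping from group to heading hint concrete, here is a deliberately naive sketch that matches literal substrings of the SMILES text. It is not how the crate's engine works and is far cruder than real substructure matching: it misses alternative atom orderings (acetone written as `O=C(C)C`), ring closures, and aromatic notation. The function name and rule list are hypothetical; the headings and confidences are copied from three rows of the table above.

```rust
/// (SMILES fragment, group name, heading hint, confidence), in priority order.
const RULES: [(&str, &str, &str, f64); 3] = [
    ("N=C=O", "isocyanate", "29.29", 0.70),
    ("C#N", "nitrile", "29.26", 0.70),
    ("C(=O)C", "ketone", "29.14", 0.67),
];

/// Returns (group, heading, confidence) for the first fragment found in `smiles`.
pub fn heading_hint(smiles: &str) -> Option<(&'static str, &'static str, f64)> {
    RULES
        .iter()
        .find(|(fragment, ..)| smiles.contains(*fragment))
        .map(|&(_, group, heading, confidence)| (group, heading, confidence))
}
```

The ordering of the rule list matters: more specific fragments come first so that a compound carrying several groups receives the hint the author of the list considers decisive.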
LLM integration — design philosophy (v0.4)
Why a trait hook, not a built-in client
HS code errors carry legal and financial consequences. Building an LLM API client directly into the library would:
- Lock users into a specific provider (Anthropic, OpenAI, …)
- Create non-determinism in a compliance context — the same compound might return different codes on different calls
- Add secret management burden to a library (API keys in Cargo.toml?)
- Embed network latency and failure modes into a synchronous classification call
Instead, hs-predict defines a trait. You implement it with whatever HTTP client, model, and prompt customisation your application needs. The library provides the structured input and validates the output.
// v0.4 — implement this trait with your preferred LLM client
use ;
use BoxFuture;
// Attach to the pipeline — no API key stored in the library
let pipeline = new.with_llm;
let prediction = pipeline.classify_with_llm.await?;
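The shape of the trait-hook design can be illustrated without any network code. The sketch below is a simplified, synchronous stand-in: the crate's own trait appears to be asynchronous, and `Classifier`, `LlmAnswer`, `FixedClassifier`, and this `Pipeline` are hypothetical names used only to show how the dependency points from the application to the trait.

```rust
/// Simplified answer type (hypothetical; the crate's response has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct LlmAnswer {
    pub hs_code: String,
}

/// The hook: applications implement this with their own client and model.
pub trait Classifier {
    fn classify(&self, prompt: &str) -> Result<LlmAnswer, String>;
}

/// Deterministic test double that always returns the same code.
pub struct FixedClassifier(pub &'static str);

impl Classifier for FixedClassifier {
    fn classify(&self, _prompt: &str) -> Result<LlmAnswer, String> {
        Ok(LlmAnswer { hs_code: self.0.to_string() })
    }
}

/// The pipeline depends only on the trait, never on a concrete HTTP client.
pub struct Pipeline {
    llm: Option<Box<dyn Classifier>>,
}

impl Pipeline {
    pub fn new() -> Self {
        Self { llm: None }
    }

    /// Attaches a classifier; credentials stay inside the caller's implementation.
    pub fn with_llm(mut self, llm: impl Classifier + 'static) -> Self {
        self.llm = Some(Box::new(llm));
        self
    }

    pub fn classify_with_llm(&self, prompt: &str) -> Result<LlmAnswer, String> {
        match &self.llm {
            Some(llm) => llm.classify(prompt),
            None => Err("no LLM classifier attached".to_string()),
        }
    }
}
```

Swapping providers then means swapping one `impl Classifier`, and tests can run against the fixed double with no API key at all.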
The library provides:
- LlmPrompt: pre-built system prompt + user message (product info + SMILES hints)
- LlmResponse: the expected return type (hs_code, confidence, rationale, alternatives)
- Chapter-consistency validation (LLM code vs. SMILES engine hint)
- MockLlmClassifier under the mock feature for testing
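A chapter-consistency check of the kind listed above could look roughly like the following. This is a sketch under assumptions: the function names are hypothetical, and whether the crate rejects a mismatching answer or only flags it is not stated in this README; here a mismatch is returned as an error.

```rust
/// Extracts the two-digit chapter from a code or hint such as "291411", "29.14" or "Ch. 28".
pub fn chapter_of(code: &str) -> Option<u8> {
    let digits: String = code.chars().filter(|c| c.is_ascii_digit()).take(2).collect();
    if digits.len() == 2 { digits.parse().ok() } else { None }
}

/// True when `code` is exactly six ASCII digits.
pub fn is_six_digit_code(code: &str) -> bool {
    code.len() == 6 && code.bytes().all(|b| b.is_ascii_digit())
}

/// Checks the LLM's code for shape and, when a SMILES hint exists, for chapter agreement.
pub fn validate_llm_code(llm_code: &str, smiles_hint: Option<&str>) -> Result<(), String> {
    if !is_six_digit_code(llm_code) {
        return Err(format!("not a six-digit HS code: {llm_code}"));
    }
    match smiles_hint.and_then(chapter_of) {
        Some(hint_chapter) if Some(hint_chapter) != chapter_of(llm_code) => Err(format!(
            "chapter mismatch: LLM returned {llm_code}, SMILES hint points to chapter {hint_chapter}"
        )),
        _ => Ok(()),
    }
}
```

Comparing only the chapter keeps the check cheap while still catching the most consequential class of disagreement, such as an organic compound being placed in Chapter 28.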
PubChem enrichment (v0.2)
PubChem integration fills in missing identifier fields before classification. This is deterministic, factual data retrieval rather than classification, so it plays a different role from the LLM fallback.
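The "fill in what is missing" behaviour can be sketched as a merge that never overwrites user input. The struct and method below are hypothetical and involve no network access; `fetched` stands for whatever a PubChem lookup returned.

```rust
/// Hypothetical identifier record; every field is optional.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Identifiers {
    pub cas: Option<String>,
    pub iupac_name: Option<String>,
    pub smiles: Option<String>,
    pub inchikey: Option<String>,
}

impl Identifiers {
    /// Fills only the fields that are still `None`; existing values win.
    pub fn enrich_from(self, fetched: Identifiers) -> Identifiers {
        Identifiers {
            cas: self.cas.or(fetched.cas),
            iupac_name: self.iupac_name.or(fetched.iupac_name),
            smiles: self.smiles.or(fetched.smiles),
            inchikey: self.inchikey.or(fetched.inchikey),
        }
    }
}
```

Giving precedence to the user's values keeps enrichment from silently replacing an identifier the user typed deliberately.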
Akinator question flow
Q1: CAS / IUPAC name / SMILES / InChIKey?
│
├─ PubChem lookup (pubchem feature) ────────────────────────────┐
│ │
▼ ▼
Q2: Is this a mixture?
│
├─ Yes ──► Q: How many components?
│ └─ For each component:
│ ├─ Q: Identifier?
│ └─ Q: Weight fraction (w/w%)?
│
└─ No ───► Q3: Physical form?
(Solid / Powder / Granules / Liquid /
Solution / Gas / Foil / Ingot / Unknown)
│
├─ Solution ──► Q: Concentration (w/w%)?
│
▼
Q4: Intended use?
(Industrial / Pharmaceutical / Agricultural /
Food / Cosmetic / Other)
│
├─ No SMILES ──► Q5: Organic or Inorganic?
│ │
│ └─ Organic ──► Q6: Functional groups?
▼
Classification pipeline (Priorities 0–4)
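The flow above is essentially a small state machine in which each answer selects the next question. The sketch below encodes the branches shown in the diagram; all names are hypothetical, the per-component questions are collapsed into a single step, and ending the session after the component questions is an assumption because the diagram does not show what follows them.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Question {
    Identifier,
    IsMixture,
    ComponentCount,
    PhysicalForm,
    Concentration,
    IntendedUse,
    OrganicOrInorganic,
    FunctionalGroups,
    Done,
}

/// Facts gathered so far that decide which branch is taken.
#[derive(Debug, Default)]
pub struct Answers {
    pub is_mixture: bool,
    pub is_solution: bool,
    pub has_smiles: bool,
    pub is_organic: bool,
}

pub fn next_question(current: Question, a: &Answers) -> Question {
    use Question::*;
    match current {
        Identifier => IsMixture,
        IsMixture if a.is_mixture => ComponentCount,
        IsMixture => PhysicalForm,
        ComponentCount => Done, // assumption: per-component questions omitted
        PhysicalForm if a.is_solution => Concentration,
        PhysicalForm | Concentration => IntendedUse,
        IntendedUse if a.has_smiles => Done,
        IntendedUse => OrganicOrInorganic,
        OrganicOrInorganic if a.is_organic => FunctionalGroups,
        OrganicOrInorganic | FunctionalGroups | Done => Done,
    }
}

/// Lists every question asked for a given set of answers, ending with `Done`.
pub fn walk(a: &Answers) -> Vec<Question> {
    let mut path = Vec::new();
    let mut current = Question::Identifier;
    loop {
        path.push(current);
        if current == Question::Done {
            break;
        }
        current = next_question(current, a);
    }
    path
}
```

A product with a known SMILES string skips the last two questions, which is the "ask only what is needed" behaviour in miniature.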
Supported identifiers
| Format | Example | Auto-detected |
|---|---|---|
| CAS number | 1310-73-2 | ✅ |
| IUPAC systematic name | sodium hydroxide | ✅ (fallback) |
| SMILES | [Na+].[OH-] | ✅ |
| InChI | InChI=1S/Na.H2O/h;1H/q+1;/p-1 | ✅ |
| InChIKey | HEMHJVSKTPXQMS-UHFFFAOYSA-M | ✅ |
Only IUPAC systematic names are accepted as text input. Trade names and common aliases (e.g. "caustic soda") are not supported — they cannot be reliably resolved.
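Auto-detection of the identifier format can be approximated with a few structural checks. The sketch below is illustrative and not the crate's detector: it recognises InChI by its prefix, InChIKey by its 14-10-1 uppercase layout, and CAS numbers by their digit grouping plus the standard check digit, and lumps everything else into one bucket because telling SMILES from a name needs real parsing. All names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum IdKind {
    Cas,
    InChI,
    InChIKey,
    SmilesOrName,
}

/// Validates the `NNNNNNN-NN-N` layout and the CAS check digit.
pub fn is_valid_cas(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    if parts.len() != 3 {
        return false;
    }
    let (head, mid, check) = (parts[0], parts[1], parts[2]);
    let all_digits = |p: &str| !p.is_empty() && p.bytes().all(|x| x.is_ascii_digit());
    if !(all_digits(head) && all_digits(mid) && all_digits(check)) {
        return false;
    }
    if !(2..=7).contains(&head.len()) || mid.len() != 2 || check.len() != 1 {
        return false;
    }
    // Weight the digits right to left by 1, 2, 3, ... and compare the sum mod 10.
    let sum: u32 = head
        .bytes()
        .chain(mid.bytes())
        .rev()
        .zip(1u32..)
        .map(|(digit, weight)| u32::from(digit - b'0') * weight)
        .sum();
    sum % 10 == u32::from(check.as_bytes()[0] - b'0')
}

/// Checks the 27-character `XXXXXXXXXXXXXX-XXXXXXXXXX-X` layout.
pub fn is_inchikey(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes.len() == 27
        && bytes[14] == b'-'
        && bytes[25] == b'-'
        && bytes
            .iter()
            .enumerate()
            .all(|(i, c)| i == 14 || i == 25 || c.is_ascii_uppercase())
}

pub fn detect(input: &str) -> IdKind {
    let s = input.trim();
    if s.starts_with("InChI=") {
        IdKind::InChI
    } else if is_inchikey(s) {
        IdKind::InChIKey
    } else if is_valid_cas(s) {
        IdKind::Cas
    } else {
        IdKind::SmilesOrName
    }
}
```

Verifying the check digit catches most single-digit typos in a CAS number before any lookup is attempted.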
Feature flags
| Flag | Enables | Extra dependencies |
|---|---|---|
| (none) | Rule-based + SMILES engine (Priorities 0–3) | — |
| pubchem | PubChem identifier enrichment | reqwest, moka, governor |
| llm | LlmClassifier trait + pipeline Priority 4 | — |
| mock | MockLlmClassifier for unit testing | — |
[dependencies]
hs-predict = { version = "0.5", features = ["pubchem"] }
Example chemicals (static rule table)
| CAS | Substance | Form | HS 2022 |
|---|---|---|---|
| 1310-73-2 | Sodium hydroxide | Solid | 2815.11 |
| 1310-73-2 | Sodium hydroxide | Solution | 2815.12 |
| 7664-93-9 | Sulphuric acid | Any | 2807.00 |
| 7697-37-2 | Nitric acid | ≥ 98% | 2808.10 |
| 7697-37-2 | Nitric acid | < 98% | 2808.90 |
| 7664-41-7 | Ammonia | Gas | 2814.10 |
| 7664-41-7 | Ammonia | Solution | 2814.20 |
| 7429-90-5 | Aluminium | Ingot ≥ 99% | 7601.10 |
| 7429-90-5 | Aluminium | Powder | 7603.10 |
| 7429-90-5 | Aluminium | Foil | 7607.11 |
| 67-56-1 | Methanol | Liquid | 2905.11 |
| 64-17-5 | Ethanol | Liquid | 2207.10 |
| 67-64-1 | Acetone | Liquid | 2914.11 |
133 compounds (148 rule entries) across Chapters 28, 29, 38, 72–81. See src/rules/static_table.rs for the full list.
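Conceptually, a table like this is a lookup keyed on the compound and its physical form. The sketch below shows that idea with a handful of rows copied from the table above; `Form`, `RULES`, and `lookup` are hypothetical names, purity conditions are left out, and the real table in src/rules/static_table.rs may be organised quite differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Form {
    Solid,
    Solution,
    Gas,
    Liquid,
    Powder,
    Foil,
}

/// (CAS number, physical form, six-digit HS 2022 code)
const RULES: &[(&str, Form, &str)] = &[
    ("1310-73-2", Form::Solid, "281511"),
    ("1310-73-2", Form::Solution, "281512"),
    ("7664-41-7", Form::Gas, "281410"),
    ("7664-41-7", Form::Solution, "281420"),
    ("7429-90-5", Form::Powder, "760310"),
    ("7429-90-5", Form::Foil, "760711"),
    ("67-64-1", Form::Liquid, "291411"),
];

/// Returns the code for an exact (CAS, form) match, or `None` on a miss.
pub fn lookup(cas: &str, form: Form) -> Option<&'static str> {
    RULES
        .iter()
        .find(|(rule_cas, rule_form, _)| *rule_cas == cas && *rule_form == form)
        .map(|&(_, _, code)| code)
}
```

Keying on the pair rather than on the CAS number alone is what lets the same compound resolve to different subheadings, as with solid and dissolved sodium hydroxide.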
Roadmap
| Version | Status | Description |
|---|---|---|
| 0.1.0 | ✅ Released | Core rule engine + Akinator session + Japan tariff codes |
| 0.2.0 | ✅ Released | PubChem API integration |
| 0.3.0 | ✅ Released | SMILES functional-group detection (20 groups, Priority 3) |
| 0.4.0 | ✅ Released | LlmClassifier trait hook + PromptBuilder (EN/JA) + MockLlmClassifier + WASM |
| 0.4.1 | ✅ Released | WASM companion crate + Serialize additions |
| 0.5.0 | 🔜 Pending | Mixture GRI 3a/3b/3c · GrayZone · PriorConsultation · 133 compounds · batch · security hardening |
| 0.5.1 | 📋 Planned | npm publish · GitHub Actions CI · WASM tests |
Minimum Supported Rust Version (MSRV)
Rust 1.75.
Contributing
Bug reports, rule-table additions, and PRs are welcome.
For new entries in the static rule table, please cite the HS 2022 nomenclature chapter/note that supports the classification.
License
Licensed under either of:
at your option.