h2m-cli 0.2.2

HTML to Markdown converter.

H2M


Fast, extensible HTML-to-Markdown converter for Rust — CommonMark + GFM, plugin architecture, zero unsafe.

H2M converts HTML into clean Markdown with full CommonMark compliance and GitHub Flavored Markdown extensions. It uses a plugin-based rule system, supports reference-style links and relative URL resolution, and ships with a CLI that can fetch and convert web pages directly.

Quick Start

Install the CLI

Shell (macOS / Linux):

curl -fsSL https://sh.qntx.fun/labs/h2m | sh

PowerShell (Windows):

irm https://sh.qntx.fun/labs/h2m/ps | iex

Or via Cargo:

cargo install h2m-cli

CLI Usage

# Convert a URL directly
h2m https://example.com

# Extract only the article content
h2m --selector article https://blog.example.com/post

# Local file with GFM + referenced links, save to file
h2m --gfm --link-style referenced page.html -o output.md

# Pipe from stdin
curl -s https://example.com | h2m --selector main

# All formatting options
h2m --gfm --heading-style setext --strong underscores --fence tilde page.html
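
The --link-style referenced option above emits reference-style links instead of inline ones. As a rough illustration of how full-style numbering ([text][1]) can work, here is a minimal, std-only sketch. ReferenceLinks and its methods are hypothetical names invented for this example, and reusing a number for a repeated URL is an assumption, not a description of h2m's internals.

```rust
/// Hypothetical collector for full reference-style links (not h2m's API).
#[derive(Default)]
struct ReferenceLinks {
    /// URLs in first-seen order; the position + 1 is the reference number.
    urls: Vec<String>,
}

impl ReferenceLinks {
    /// Returns the inline part, e.g. "[text][1]", and records the URL.
    /// Assumption: a URL seen before reuses its existing number.
    fn link(&mut self, text: &str, url: &str) -> String {
        let index = match self.urls.iter().position(|u| u == url) {
            Some(i) => i,
            None => {
                self.urls.push(url.to_string());
                self.urls.len() - 1
            }
        };
        format!("[{text}][{}]", index + 1)
    }

    /// Renders the definitions, one per line, e.g. "[1]: https://example.com".
    fn definitions(&self) -> String {
        self.urls
            .iter()
            .enumerate()
            .map(|(i, url)| format!("[{}]: {}", i + 1, url))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

In this sketch a converter would call link() while walking the document and append definitions() at the end of the output.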

Library Usage

// One-liner with CommonMark defaults
let md = h2m::convert("<h1>Hello</h1><p>World</p>");
assert_eq!(md, "# Hello\n\nWorld");

// Full control with builder
use h2m::{Converter, Options};
use h2m::plugins::Gfm;
use h2m::rules::CommonMark;

let converter = Converter::builder()
    .options(Options::default())
    .use_plugin(CommonMark)
    .use_plugin(Gfm)
    .domain("example.com")
    .build();

let md = converter.convert(r#"<a href="/about">About</a>"#);
assert_eq!(md, "[About](http://example.com/about)");
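
The Design section describes the built Converter as immutable and Send + Sync, checked by a compile-time assertion. The sketch below shows one common way to write such an assertion and to share an immutable value across threads. It uses only the standard library. FrozenConverter is a hypothetical stand-in type, not h2m's Converter, and convert_in_parallel is an illustrative helper.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for a converter that is immutable after build.
struct FrozenConverter {
    prefix: String,
}

impl FrozenConverter {
    /// Takes &self only, so concurrent calls cannot mutate shared state.
    fn convert(&self, text: &str) -> String {
        format!("{}{}", self.prefix, text)
    }
}

/// Compiles only when T is both Send and Sync.
const fn assert_send_sync<T: Send + Sync>() {}

// Evaluated at compile time; the build fails if the bound stops holding.
const _: () = assert_send_sync::<FrozenConverter>();

/// Converts each input on its own thread, sharing one converter via Arc.
/// Results are returned in the same order as the inputs.
fn convert_in_parallel(converter: Arc<FrozenConverter>, inputs: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            let converter = Arc::clone(&converter);
            thread::spawn(move || converter.convert(&input))
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

If a field that is not thread-safe (for example an Rc) were added to the stand-in type, the const assertion would turn that into a compile error rather than a runtime surprise.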

Design

  • CommonMark compliant — headings, paragraphs, emphasis, strong, code blocks, links, images, lists, blockquotes, horizontal rules, line breaks
  • GFM extensions — tables (with column alignment), strikethrough, task lists
  • Reference-style links — full ([text][1]), collapsed ([text][]), and shortcut ([text]) styles
  • Domain resolution — resolve relative URLs to absolute via the url crate (WHATWG compliant)
  • Plugin architecture — extend with custom rules via the Rule trait; register with Converter::builder().use_plugin()
  • Keep / Remove — selectively preserve raw HTML tags or strip them entirely
  • CSS selector extraction — CLI --selector flag to convert only matching elements
  • Zero-copy fast paths: Cow<str> for escaping and whitespace normalization; no allocation when input needs no transformation
  • Send + Sync: Converter is immutable after build, safe to share across threads (compile-time assertion)
  • Strict linting — Clippy pedantic + nursery + correctness (deny), zero warnings
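
To make the zero-copy fast path from the list above concrete, here is a minimal sketch of Cow-based escaping using only the standard library. escape_markdown and needs_escape are hypothetical helpers written for this illustration, and the set of escaped characters is a simplified assumption. Neither reflects h2m's actual escaping rules.

```rust
use std::borrow::Cow;

/// Characters this sketch treats as needing a backslash escape.
/// Illustrative subset only; real Markdown escaping is context-sensitive.
fn needs_escape(c: char) -> bool {
    matches!(c, '\\' | '*' | '_' | '`' | '[' | ']')
}

/// Hypothetical helper (not part of h2m's API).
/// Borrows the input when nothing needs escaping and allocates a new
/// String only when at least one character does.
fn escape_markdown(input: &str) -> Cow<'_, str> {
    // Fast path: no special character found, so return the input as-is.
    let first = match input.find(needs_escape) {
        Some(index) => index,
        None => return Cow::Borrowed(input),
    };

    // Slow path: copy the clean prefix, then escape the remainder.
    let mut out = String::with_capacity(input.len() + 8);
    out.push_str(&input[..first]);
    for c in input[first..].chars() {
        if needs_escape(c) {
            out.push('\\');
        }
        out.push(c);
    }
    Cow::Owned(out)
}
```

Callers can treat the result as a &str in both cases, so the allocation cost is paid only for text that actually contains special characters.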

Conversion Examples

Input HTML:

<h1>Title</h1>
<p>A <strong>bold</strong> and <em>italic</em> paragraph with <a href="https://example.com">a link</a>.</p>
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
<pre><code class="language-rust">fn main() {}</code></pre>

Output Markdown:

# Title

A **bold** and *italic* paragraph with [a link](https://example.com).

- First item
- Second item

```rust
fn main() {}
```

Custom Rules

Extend the converter with your own rules by implementing the Rule trait:

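use h2m::{Converter, Rule, Action, Context};
use h2m::rules::CommonMark;
use scraper::ElementRef;

// Custom rule that renders <mark> elements as ==highlighted== text.
struct HighlightRule;

impl Rule for HighlightRule {
    // HTML tag names this rule handles.
    fn tags(&self) -> &'static [&'static str] { &["mark"] }

    // Wrap the content passed in with == markers.
    fn apply(&self, content: &str, _el: &ElementRef<'_>, _ctx: &mut Context) -> Action {
        Action::Replace(format!("=={content}=="))
    }
}

// Build a converter with the CommonMark rule set.
let converter = Converter::builder()
    .use_plugin(CommonMark)
    .build();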
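
For intuition about how a tags() method can drive dispatch, the following std-only sketch maps tag names to rules and falls back to passing content through unchanged. MiniRule, Highlight, and Registry are hypothetical simplifications made up for this example. The real Rule trait shown above also receives the element and a context, and h2m's internal dispatch may differ.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a conversion rule (hypothetical, not h2m's trait).
trait MiniRule {
    /// HTML tag names this rule handles.
    fn tags(&self) -> &'static [&'static str];
    /// Produces the Markdown for an element given its inner content.
    fn apply(&self, content: &str) -> String;
}

struct Highlight;

impl MiniRule for Highlight {
    fn tags(&self) -> &'static [&'static str] {
        &["mark"]
    }

    fn apply(&self, content: &str) -> String {
        format!("=={content}==")
    }
}

/// Owns the rules and indexes them by tag name.
#[derive(Default)]
struct Registry {
    rules: Vec<Box<dyn MiniRule>>,
    by_tag: HashMap<&'static str, usize>,
}

impl Registry {
    /// Registers a rule for every tag it declares.
    /// In this sketch, a later registration overrides an earlier one.
    fn register(&mut self, rule: Box<dyn MiniRule>) {
        let index = self.rules.len();
        for &tag in rule.tags() {
            self.by_tag.insert(tag, index);
        }
        self.rules.push(rule);
    }

    /// Applies the matching rule, or returns the content unchanged.
    fn render(&self, tag: &str, content: &str) -> String {
        match self.by_tag.get(tag) {
            Some(&index) => self.rules[index].apply(content),
            None => content.to_string(),
        }
    }
}
```

Keeping the rules in a Vec and storing indices in the map lets one rule serve several tags without shared ownership.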

License

Licensed under either of:

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project shall be dual-licensed as above, without any additional terms or conditions.


A QNTX open-source project.

Code is law. We write both.