html2markdown 0.2.0

HTML to Markdown converter using AST-to-AST transformation

html2markdown

Crates.io | License: MIT | Rust: 1.80+

Ports the architecture and test cases from hast-util-to-mdast (transformer) and mdast-util-to-markdown (serializer).

Usage

Add to your Cargo.toml:

[dependencies]
html2markdown = "0.2"

Then convert an HTML string:

let md = html2markdown::convert("<h1>Hello</h1><p>World</p>");
assert_eq!(md, "# Hello\n\nWorld");

With options

use html2markdown::{convert_with, Options, HeadingStyle};

let opts = Options::new().with_heading_style(HeadingStyle::Setext);

let md = convert_with("<h1>Hello</h1><p>World</p>", &opts);
assert_eq!(md, "Hello\n=====\n\nWorld");

What it handles

  • Headings, paragraphs, blockquotes, lists (ordered, unordered, task lists)
  • Inline formatting: bold, italic, strikethrough, code
  • Links, images, and reference-style links
  • Tables (with alignment)
  • Code blocks (fenced) with language hints
  • Horizontal rules, line breaks
  • Nested structures and edge cases from 130 fixture tests
  • Context-sensitive escaping to prevent false Markdown syntax
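To illustrate what context-sensitive escaping means, here is a minimal sketch. It is not the crate's implementation: `escape_text` and its `at_line_start` parameter are hypothetical names, and the rules are deliberately reduced (for example, ordered-list markers such as `1.` are not handled, and `_` is always escaped even where a real serializer might leave it alone).

```rust
/// Hypothetical helper, not part of html2markdown's API: escape plain text
/// so that it is not misread as Markdown syntax.
/// `at_line_start` tells the function whether `text` begins a new line.
fn escape_text(text: &str, at_line_start: bool) -> String {
    let mut out = String::with_capacity(text.len());
    for (i, ch) in text.char_indices() {
        let needs_escape = match ch {
            // Treated as significant anywhere in inline text (simplified).
            '*' | '_' | '`' | '[' | ']' | '\\' => true,
            // Only significant as the first character of a line
            // (heading, list item, or blockquote marker).
            '#' | '-' | '+' | '>' => at_line_start && i == 0,
            _ => false,
        };
        if needs_escape {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}
```

With this sketch, `escape_text("# not a heading", true)` yields `\# not a heading`, while `escape_text("C# is a language", true)` is returned unchanged because the `#` does not start the line.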

Architecture

The conversion is a two-phase pipeline:

  1. HTML tree -> MDAST — html5ever parses the HTML into a DOM, then element handlers transform each node into typed Markdown AST nodes. Whitespace is normalized during this phase.

  2. MDAST -> Markdown string — the AST is serialized with configurable formatting (heading style, bullet character, list indent, emphasis marker) and context-sensitive escaping.

The two phases are independent: the transformer knows nothing about string formatting, and the serializer knows nothing about HTML.
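The separation between the two phases can be sketched in a few lines. This is an illustration under stated assumptions, not the crate's code: `Element`, `Node`, `Style`, `transform`, and `serialize` are hypothetical names, the hand-built `Element` slice stands in for the DOM that html5ever would produce, and only `h1`, `h2`, and `p` are handled.

```rust
// Stand-in for a parsed HTML element (hypothetical; the real tree comes from html5ever).
struct Element<'a> {
    tag: &'a str,
    text: &'a str,
}

// Heavily reduced Markdown AST (hypothetical node set).
#[derive(Debug, PartialEq)]
enum Node {
    Heading { depth: u8, text: String },
    Paragraph(String),
}

// Serializer option (hypothetical; mirrors the idea of a heading style setting).
#[derive(Clone, Copy)]
enum Style {
    Atx,
    Setext,
}

// Phase 1: element tree -> AST. Knows nothing about output formatting.
fn transform(elements: &[Element<'_>]) -> Vec<Node> {
    elements
        .iter()
        .filter_map(|el| {
            // Normalize whitespace: trim and collapse runs to a single space.
            let text = el.text.split_whitespace().collect::<Vec<_>>().join(" ");
            match el.tag {
                "h1" => Some(Node::Heading { depth: 1, text }),
                "h2" => Some(Node::Heading { depth: 2, text }),
                "p" => Some(Node::Paragraph(text)),
                _ => None, // other tags are dropped in this sketch
            }
        })
        .collect()
}

// Phase 2: AST -> Markdown string. Knows nothing about HTML.
fn serialize(nodes: &[Node], style: Style) -> String {
    let blocks: Vec<String> = nodes
        .iter()
        .map(|node| match (node, style) {
            // Setext underlines exist only for heading depths 1 and 2.
            (Node::Heading { depth, text }, Style::Setext) if *depth <= 2 => {
                let underline = if *depth == 1 { "=" } else { "-" };
                format!("{}\n{}", text, underline.repeat(text.chars().count()))
            }
            (Node::Heading { depth, text }, _) => {
                format!("{} {}", "#".repeat(*depth as usize), text)
            }
            (Node::Paragraph(text), _) => text.clone(),
        })
        .collect();
    // Block-level nodes are separated by one blank line.
    blocks.join("\n\n")
}
```

Because `serialize` takes only the AST and a style, the same `Vec<Node>` can be rendered as ATX or Setext headings without re-running `transform`.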

Optional features

Feature   Description
tracing   Enable debug/trace logging (zero-cost when disabled)

Enable it in Cargo.toml:

html2markdown = { version = "0.2", features = ["tracing"] }

Benchmarks

Throughput comparison (MiB/s, higher is better):

Fixture   Rust   html2md (Rust)   Go     hast (JS)   turndown (JS)
article   68.5   58.3             29.3   4.1         15.8
table     21.1   17.9             21.7   1.8         ERR
lists     19.8   21.3             18.0   1.7         4.6
code      62.4   55.7             43.2   5.2         15.2
large     48.7   43.3             28.8   3.1         ERR

Measured on Apple M4 Max, Rust 1.93, Go 1.25, Node 22, macOS 15.7.
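For readers unfamiliar with the unit, the sketch below shows how a MiB/s figure is derived from an input size and a wall-clock time. It assumes the table's numbers are input bytes divided by elapsed time; `mib_per_sec` is a hypothetical helper and is not taken from the benchmark harness.

```rust
use std::time::Duration;

/// Throughput in MiB/s for `bytes` of input processed in `elapsed`.
/// Hypothetical helper for illustration only.
fn mib_per_sec(bytes: usize, elapsed: Duration) -> f64 {
    const MIB: f64 = 1024.0 * 1024.0; // 1 MiB = 2^20 bytes
    (bytes as f64 / MIB) / elapsed.as_secs_f64()
}
```

For example, converting a 1 MiB document in 500 ms corresponds to 2 MiB/s.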

Reproduce:

cargo bench                  # Criterion benchmarks
./benches/compare.sh         # full cross-language comparison table

License

MIT