html2markdown
HTML to Markdown converter using AST-to-AST transformation.
Ports the architecture and test cases from hast-util-to-mdast (transformer) and mdast-util-to-markdown (serializer).
Usage
Add to your Cargo.toml:
[dependencies]
html2markdown = "0.1"
let md = convert;
assert_eq!;
With options
use ;
let opts = Options ;
let md = convert_with;
assert_eq!;
What it handles
- Headings, paragraphs, blockquotes, lists (ordered, unordered, task lists)
- Inline formatting: bold, italic, strikethrough, code
- Links, images, and reference-style links
- Tables (with alignment)
- Code blocks (fenced) with language hints
- Horizontal rules, line breaks
- Nested structures and edge cases from 130 fixture tests
- Context-sensitive escaping to prevent false Markdown syntax
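To make the last item concrete, here is a minimal sketch of what context-sensitive escaping can look like. It is not this crate's implementation: `escape_text` and `escape_line` are hypothetical helpers, and the rule set is deliberately small and assumed for illustration. The idea is that some characters always need a backslash in running text, while others (`#`, `>`, list markers) only matter at the start of a line.

```rust
/// Escape plain text so it cannot be misread as Markdown syntax.
/// Hypothetical helper for illustration; not part of the crate's API.
fn escape_text(text: &str) -> String {
    text.split('\n')
        .map(escape_line)
        .collect::<Vec<_>>()
        .join("\n")
}

/// Escape a single line (assumed rule set, far smaller than a real serializer's).
fn escape_line(line: &str) -> String {
    let mut out = String::with_capacity(line.len() + 4);
    let mut rest = line;

    // Position-dependent rules: these markers only matter at the start of a line.
    // This over-escapes slightly (e.g. "#tag"), which is harmless.
    if rest.starts_with('#') || rest.starts_with('>') {
        out.push('\\');
    } else if rest.starts_with("- ") || rest.starts_with("+ ") {
        out.push('\\');
    } else {
        // Ordered list marker: ASCII digits followed by ". ".
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits > 0 && rest[digits..].starts_with(". ") {
            out.push_str(&rest[..digits]);
            out.push('\\'); // escape the dot, not the digits
            rest = &rest[digits..];
        }
    }

    // Position-independent rules: these characters are escaped wherever they appear.
    for ch in rest.chars() {
        if matches!(ch, '\\' | '*' | '_' | '`' | '[' | ']') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}
```

With these rules, `escape_text("# title")` yields `\# title`, while `escape_text("C# is fine")` is returned unchanged because the `#` is not at the start of the line.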
Architecture
The conversion is a two-phase pipeline:
- HTML tree -> MDAST: html5ever parses the HTML into a DOM, then element handlers transform each node into typed Markdown AST nodes. Whitespace is normalized during this phase.
- MDAST -> Markdown string: the AST is serialized with configurable formatting (heading style, bullet character, list indent, emphasis marker) and context-sensitive escaping.
The two phases are independent: the transformer knows nothing about string formatting, and the serializer knows nothing about HTML.
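The separation described above can be sketched with a toy, self-contained model. Everything in it is an assumption made for illustration: the `Html`, `Md`, and `Style` types and the `transform`, `serialize`, and `example` functions are hypothetical and do not mirror the crate's real types. The hand-built `Html` tree stands in for the DOM that html5ever would produce.

```rust
/// Toy stand-in for a parsed HTML tree.
enum Html {
    Element { tag: &'static str, children: Vec<Html> },
    Text(&'static str),
}

/// Toy Markdown AST with only a handful of node types.
#[derive(Debug, PartialEq)]
enum Md {
    Heading { depth: u8, children: Vec<Md> },
    Paragraph(Vec<Md>),
    Emphasis(Vec<Md>),
    Text(String),
}

/// Hypothetical serializer settings; only the emphasis marker is modeled.
struct Style {
    emphasis: char,
}

/// Collapse each run of whitespace into a single space.
fn collapse_ws(s: &str) -> String {
    let mut out = String::new();
    let mut prev_space = false;
    for ch in s.chars() {
        if ch.is_whitespace() {
            if !prev_space {
                out.push(' ');
            }
            prev_space = true;
        } else {
            out.push(ch);
            prev_space = false;
        }
    }
    out
}

/// Phase 1: HTML tree -> Markdown AST. No output formatting is decided here.
fn transform(node: &Html) -> Vec<Md> {
    match node {
        Html::Text(t) => {
            let s = collapse_ws(t);
            if s.is_empty() { vec![] } else { vec![Md::Text(s)] }
        }
        Html::Element { tag, children } => {
            let kids: Vec<Md> = children.iter().flat_map(transform).collect();
            match *tag {
                "h1" => vec![Md::Heading { depth: 1, children: kids }],
                "h2" => vec![Md::Heading { depth: 2, children: kids }],
                "p" => vec![Md::Paragraph(kids)],
                "em" | "i" => vec![Md::Emphasis(kids)],
                _ => kids, // unknown elements are unwrapped
            }
        }
    }
}

/// Serialize inline content; the emphasis marker comes from `Style`.
fn serialize_inline(nodes: &[Md], style: &Style) -> String {
    nodes
        .iter()
        .map(|n| match n {
            Md::Text(t) => t.clone(),
            Md::Emphasis(c) => {
                format!("{m}{}{m}", serialize_inline(c, style), m = style.emphasis)
            }
            // Block nodes nested inline are flattened to their children.
            Md::Heading { children, .. } | Md::Paragraph(children) => {
                serialize_inline(children, style)
            }
        })
        .collect()
}

/// Phase 2: Markdown AST -> string. Nothing here knows about HTML.
fn serialize(blocks: &[Md], style: &Style) -> String {
    let parts: Vec<String> = blocks
        .iter()
        .map(|b| match b {
            Md::Heading { depth, children } => format!(
                "{} {}",
                "#".repeat(*depth as usize),
                serialize_inline(children, style)
            ),
            Md::Paragraph(children) => serialize_inline(children, style),
            other => serialize_inline(std::slice::from_ref(other), style),
        })
        .collect();
    parts.join("\n\n") + "\n"
}

/// Run both phases on a tree equivalent to
/// `<div><h1>Title</h1><p>Hello   <em>world</em></p></div>`.
fn example(style: &Style) -> String {
    let html = Html::Element {
        tag: "div",
        children: vec![
            Html::Element { tag: "h1", children: vec![Html::Text("Title")] },
            Html::Element {
                tag: "p",
                children: vec![
                    Html::Text("Hello   "),
                    Html::Element { tag: "em", children: vec![Html::Text("world")] },
                ],
            },
        ],
    };
    serialize(&transform(&html), style)
}
```

Because the AST is the only thing shared between the two functions, changing `Style { emphasis: '*' }` to `Style { emphasis: '_' }` changes the output without touching `transform`.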
Optional features
| Feature | Description |
|---|---|
| `tracing` | Enable debug/trace logging (zero-cost when disabled) |
html2markdown = { version = "0.1", features = ["tracing"] }
License
MIT