supermarkdown
High-performance HTML to Markdown converter with full GitHub Flavored Markdown support. Written in Rust, available for Node.js and as a native Rust crate.
Features
- Fast - Written in Rust with O(n) algorithms, significantly faster than JavaScript alternatives
- Full GFM Support - Tables with alignment, strikethrough, autolinks, fenced code blocks
- Accurate - Handles malformed HTML gracefully via html5ever
- Configurable - Multiple heading styles, link styles, custom selectors
- Zero Dependencies - Single native binary, no JavaScript runtime overhead
- Cross-Platform - Pre-built binaries for Windows, macOS, and Linux (x64 & ARM64)
- TypeScript Ready - Full type definitions included
- Async Support - Non-blocking conversion for large documents
Installation
npm install @vakra-dev/supermarkdown
Quick Start
import { convert } from "@vakra-dev/supermarkdown";
const html = `
<h1>Hello World</h1>
<p>This is a <strong>test</strong> with a <a href="https://example.com">link</a>.</p>
`;
const markdown = convert(html);
console.log(markdown);
// # Hello World
//
// This is a **test** with a [link](https://example.com).
Usage
Basic Conversion
import { convert } from "@vakra-dev/supermarkdown";
const markdown = convert("<h1>Title</h1><p>Paragraph</p>");
With Options
import { convert } from "@vakra-dev/supermarkdown";
const markdown = convert(html, {
  headingStyle: "setext",
  linkStyle: "referenced",
});
Async Conversion
For large documents, use convertAsync to avoid blocking the main thread:
import { convertAsync } from "@vakra-dev/supermarkdown";
const markdown = await convertAsync(largeHtml);
// Process multiple documents in parallel
const results = await Promise.all(documents.map((doc) => convertAsync(doc)));
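When a batch is large, it can help to cap how many conversions are in flight at once. The following is a minimal sketch of that idea, not part of the supermarkdown API: `convertAllBounded`, `HtmlConverter`, and the `limit` parameter are hypothetical names, and `convertFn` stands in for `convertAsync`.

```typescript
// Hypothetical helper: any async HTML-to-Markdown function can be passed in.
type HtmlConverter = (html: string) => Promise<string>;

// Converts all documents with at most `limit` conversions running concurrently.
// Results are returned in the same order as the input documents.
async function convertAllBounded(
  docs: string[],
  convertFn: HtmlConverter,
  limit = 4,
): Promise<string[]> {
  const results: string[] = new Array<string>(docs.length);
  const queue = docs.map((html, index) => ({ html, index }));

  // Each worker pulls the next job from the shared queue until it is empty.
  async function worker(): Promise<void> {
    for (let job = queue.shift(); job !== undefined; job = queue.shift()) {
      results[job.index] = await convertFn(job.html);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, docs.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With the real package, a call would presumably look like `convertAllBounded(pages, (html) => convertAsync(html), 8)`.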
API Reference
convert(html, options?)
Converts HTML to Markdown synchronously.
Parameters:
- html (string) - The HTML string to convert
- options (object, optional) - Conversion options
Returns: string - The converted Markdown
convertAsync(html, options?)
Converts HTML to Markdown asynchronously.
Parameters:
- html (string) - The HTML string to convert
- options (object, optional) - Conversion options
Returns: Promise<string> - Resolves to the converted Markdown
Options
| Option | Type | Default | Description |
|---|---|---|---|
| `headingStyle` | `'atx' \| 'setext'` | `'atx'` | ATX uses a `#` prefix, Setext uses underlines |
| `linkStyle` | `'inline' \| 'referenced'` | `'inline'` | Inline: `[text](url)`, Referenced: `[text][1]` |
| `codeFence` | `` '`' \| '~' `` | `` '`' `` | Character for fenced code blocks |
| `bulletMarker` | `'-' \| '*' \| '+'` | `'-'` | Character for unordered list items |
| `baseUrl` | `string` | `undefined` | Base URL for resolving relative links |
| `excludeSelectors` | `string[]` | `[]` | CSS selectors for elements to exclude |
| `includeSelectors` | `string[]` | `[]` | CSS selectors to force keep (overrides excludes) |
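For orientation, the options table can be read as a TypeScript shape along the following lines. This is an illustrative sketch only: the package ships its own type definitions, and `ConvertOptionsSketch`, `SKETCH_DEFAULTS`, and `withDefaults` are hypothetical names that do not come from the library.

```typescript
// Hypothetical mirror of the options table; not the package's actual typings.
interface ConvertOptionsSketch {
  headingStyle?: "atx" | "setext";
  linkStyle?: "inline" | "referenced";
  codeFence?: "`" | "~";
  bulletMarker?: "-" | "*" | "+";
  baseUrl?: string;
  excludeSelectors?: string[];
  includeSelectors?: string[];
}

// Defaults as listed in the table (baseUrl defaults to undefined, so it is omitted).
const SKETCH_DEFAULTS: ConvertOptionsSketch = {
  headingStyle: "atx",
  linkStyle: "inline",
  codeFence: "`",
  bulletMarker: "-",
  excludeSelectors: [],
  includeSelectors: [],
};

// Fills in any option the caller left out with the documented default.
function withDefaults(options: ConvertOptionsSketch = {}): ConvertOptionsSketch {
  return { ...SKETCH_DEFAULTS, ...options };
}
```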
Supported Elements
Block Elements
| HTML | Markdown |
|---|---|
| `<h1>` - `<h6>` | `#` headings or setext underlines |
| `<p>` | Paragraphs with blank lines |
| `<blockquote>` | `>` quoted blocks (supports nesting) |
| `<ul>`, `<ol>` | `-` or `1.` lists (supports the start attribute) |
| `<pre><code>` | Fenced code blocks with language detection |
| `<table>` | GFM tables with alignment and captions |
| `<hr>` | `---` horizontal rules |
| `<dl>`, `<dt>`, `<dd>` | Definition lists |
| `<details>`, `<summary>` | Collapsible sections |
| `<figure>`, `<figcaption>` | Images with captions |
Inline Elements
| HTML | Markdown |
|---|---|
| `<a>` | `[text](url)`, `[text][ref]`, or `<url>` (autolink) |
| `<img>` | `![alt](src)` |
| `<strong>`, `<b>` | `**bold**` |
| `<em>`, `<i>` | `*italic*` |
| `<code>` | `` `code` `` (handles nested backticks) |
| `<del>`, `<s>`, `<strike>` | `~~strikethrough~~` |
| `<sub>` | `<sub>subscript</sub>` |
| `<sup>` | `<sup>superscript</sup>` |
| `<br>` | Line breaks |
HTML Passthrough
Elements without Markdown equivalents are preserved as HTML:
- `<kbd>` - Keyboard input
- `<mark>` - Highlighted text
- `<abbr>` - Abbreviations (preserves the title attribute)
- `<samp>` - Sample output
- `<var>` - Variables
Advanced Features
Table Alignment
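Extracts alignment from the align attribute or the text-align style:
<table>
  <tr>
    <th align="left">Left</th>
    <th align="center">Center</th>
    <th style="text-align: right">Right</th>
  </tr>
</table>
Output:
| Left | Center | Right |
| :--- | :---: | ---: |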
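The alignment extraction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `CellAttrs`, `alignmentOf`, and `delimiterRow` are hypothetical names, and letting the align attribute take precedence over the style is an assumption.

```typescript
// Hypothetical cell shape: only the two attributes relevant to alignment.
interface CellAttrs {
  align?: string;
  style?: string;
}

// Reads alignment from the align attribute, falling back to a text-align declaration.
// Assumption: the attribute wins when both are present.
function alignmentOf(cell: CellAttrs): string {
  const styleMatch = /text-align\s*:\s*(left|center|right)/i.exec(cell.style ?? "");
  const fromStyle = styleMatch?.[1] ?? "";
  return (cell.align ?? fromStyle).toLowerCase();
}

// Builds the GFM delimiter row, using colons to mark each column's alignment.
function delimiterRow(cells: CellAttrs[]): string {
  const markers = cells.map((cell) => {
    switch (alignmentOf(cell)) {
      case "left":
        return ":---";
      case "center":
        return ":---:";
      case "right":
        return "---:";
      default:
        return "---";
    }
  });
  return `| ${markers.join(" | ")} |`;
}
```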
Ordered List Start
Respects the start attribute on ordered lists:
<ol start="5">
  <li>Fifth item</li>
  <li>Sixth item</li>
</ol>
Output:
5. Fifth item
6. Sixth item
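As a rough sketch of the numbering rule, assuming a missing or non-numeric start attribute falls back to 1 (`renderOrderedList` is a hypothetical helper, not a library function):

```typescript
// Hypothetical helper: numbers list items beginning at the parsed start attribute.
function renderOrderedList(items: string[], startAttr?: string): string {
  const parsed = Number.parseInt(startAttr ?? "", 10);
  // Assumption: fall back to 1 when the attribute is absent or not a number.
  const start = Number.isNaN(parsed) ? 1 : parsed;
  return items.map((item, offset) => `${start + offset}. ${item}`).join("\n");
}
```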
Autolinks
When a link's text matches its URL or email, autolink syntax is used:
<a href="https://example.com">https://example.com</a>
<a href="mailto:test@example.com">test@example.com</a>
Output:
<https://example.com>
<test@example.com>
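The autolink decision can be illustrated with a small sketch. It assumes the comparison ignores a leading `mailto:` in the href and surrounding whitespace in the link text. `renderLink` is a hypothetical name, and the real converter may apply additional rules.

```typescript
// Hypothetical helper: chooses between autolink and inline link syntax.
function renderLink(href: string, text: string): string {
  // For email links, compare the text against the address without the scheme.
  const target = href.startsWith("mailto:") ? href.slice("mailto:".length) : href;
  if (text.trim() === target) {
    return `<${target}>`;
  }
  return `[${text}](${href})`;
}
```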
Code Block Language Detection
Automatically detects language from class names:
- `language-*` (e.g., `language-rust`)
- `lang-*` (e.g., `lang-python`)
- `highlight-*` (e.g., `highlight-go`)
- `hljs-*` (highlight.js classes, excluding token classes like `hljs-keyword`)
- Bare language names (e.g., `javascript`, `python`) as fallback
<pre><code class="language-rust">fn main() {}</code></pre>
Output:
```rust
fn main() {}
```
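One way to picture the class-name lookup is the sketch below. It is not the library's code: `detectLanguage` is a hypothetical name, and both the token-class list and the known-language list are small illustrative subsets chosen for this example.

```typescript
// Assumption: illustrative subsets only; a real converter would know many more.
const HLJS_TOKEN_CLASSES = new Set(["keyword", "string", "comment", "number", "title"]);
const KNOWN_LANGUAGES = new Set(["javascript", "typescript", "python", "rust", "go"]);

// Returns the language named by a class attribute, or undefined if none is found.
function detectLanguage(classAttr: string): string | undefined {
  const classes = classAttr.split(/\s+/).filter((cls) => cls.length > 0);

  // Prefixed forms are checked first.
  for (const cls of classes) {
    for (const prefix of ["language-", "lang-", "highlight-"]) {
      if (cls.startsWith(prefix) && cls.length > prefix.length) {
        return cls.slice(prefix.length);
      }
    }
    if (cls.startsWith("hljs-")) {
      const rest = cls.slice("hljs-".length);
      // Skip highlight.js token classes such as hljs-keyword.
      if (rest.length > 0 && !HLJS_TOKEN_CLASSES.has(rest)) {
        return rest;
      }
    }
  }

  // Fallback: a bare class that matches a known language name.
  for (const cls of classes) {
    const lower = cls.toLowerCase();
    if (KNOWN_LANGUAGES.has(lower)) {
      return lower;
    }
  }
  return undefined;
}
```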
Code blocks containing backticks automatically use more backticks as delimiters.
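The delimiter rule follows CommonMark: the fence must be longer than any backtick run inside the code. A minimal sketch of that calculation, where `fenceCodeBlock` is a hypothetical helper rather than a library function:

```typescript
// Hypothetical helper: wraps code in a backtick fence long enough to be unambiguous.
function fenceCodeBlock(code: string, language = ""): string {
  // Find every run of consecutive backticks in the code.
  const runs: string[] = code.match(/`+/g) ?? [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  // Use at least three backticks, and always one more than the longest run.
  const fence = "`".repeat(Math.max(3, longest + 1));
  return `${fence}${language}\n${code}\n${fence}`;
}
```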
Line Number Handling
Line number gutters are automatically stripped from code blocks. Elements with these class patterns are skipped:
- `gutter`
- `line-number`
- `line-numbers`
- `lineno`
- `linenumber`
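A check of this kind might look like the sketch below. It assumes exact, case-insensitive class-name matches. The library may instead match these patterns as substrings, and `isLineNumberElement` is a hypothetical name.

```typescript
// Class names from the list above.
const LINE_NUMBER_CLASSES = new Set([
  "gutter",
  "line-number",
  "line-numbers",
  "lineno",
  "linenumber",
]);

// Returns true when any class on the element marks it as a line-number gutter.
// Assumption: whole class names are compared, ignoring case.
function isLineNumberElement(classAttr: string): boolean {
  return classAttr
    .split(/\s+/)
    .some((cls) => LINE_NUMBER_CLASSES.has(cls.toLowerCase()));
}
```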
URL Encoding
Spaces and parentheses in URLs are automatically percent-encoded:
// <a href="https://example.com/path (1)">link</a>
// → [link](https://example.com/path%20%281%29)
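The encoding step can be sketched as below, assuming only spaces and parentheses are rewritten, as described above. `encodeLinkDestination` is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: percent-encodes the characters that would break
// the (url) part of a Markdown link, leaving everything else untouched.
function encodeLinkDestination(url: string): string {
  return url
    .replace(/ /g, "%20")
    .replace(/\(/g, "%28")
    .replace(/\)/g, "%29");
}
```

Unlike `encodeURI`, this leaves an already-encoded URL alone apart from these three characters.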
Selector-Based Filtering
Remove unwanted elements like navigation, ads, or sidebars:
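const markdown = convert(html, {
  excludeSelectors: ["nav", ".sidebar", ".ads"],
  includeSelectors: [".main-content"],
});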
Limitations
Some HTML features cannot be fully represented in Markdown:
| Feature | Behavior |
|---|---|
| Table colspan/rowspan | Content placed in first cell |
| Nested tables | Inner tables converted inline |
| Form elements | Skipped |
| iframe/video/audio | Skipped (no standard Markdown equivalent) |
| CSS styling | Ignored (except text-align for tables) |
| Empty elements | Removed from output |
Rust Usage
Add to your Cargo.toml:
[dependencies]
supermarkdown = "0.0.2"
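use supermarkdown::{convert, convert_with_options, HeadingStyle, Options};
// Basic conversion
let markdown = convert("<h1>Hello</h1>");
// With options
let options = Options::new()
    .heading_style(HeadingStyle::Setext)
    .exclude_selectors(vec!["nav".to_string()]);
let markdown = convert_with_options("<h1>Hello</h1>", &options);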
Performance
supermarkdown is designed for high performance:
- Single-pass parsing - O(n) HTML traversal
- Pre-computed metadata - List indices and CSS selectors computed in one pass
- Zero-copy where possible - Minimal string allocations
- Native code - No JavaScript runtime overhead
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
# Clone the repository
# Run tests
# Build Node.js bindings
License
MIT License - see LICENSE for details.