html-to-markdown-cli 2.19.1

Command-line interface for html-to-markdown - high-performance HTML to Markdown converter

html-to-markdown

High-performance HTML → Markdown conversion powered by Rust. It ships as a Rust crate, Python package, PHP extension, Ruby gem, Elixir Rustler NIF, Node.js bindings, WebAssembly, and a standalone CLI, with identical rendering behavior across all runtimes.

Key Features

  • Blazing Fast – Rust-powered core delivers 10–80× faster conversion than pure Python alternatives (150–280 MB/s)
  • Polyglot – Native bindings for Rust, Python, TypeScript/Node.js, Ruby, PHP, Go, Java, C#, and Elixir
  • Smart Conversion – Handles complex documents including nested tables, code blocks, task lists, and hOCR output from OCR tools
  • Metadata Extraction – Extract document metadata (title, description, headers, links, images, structured data) alongside conversion
  • Visitor Pattern – Custom callbacks for domain-specific dialects, content filtering, URL rewriting, and accessibility validation
  • Highly Configurable – Control heading styles, code block fences, list formatting, whitespace handling, and HTML sanitization
  • Tag Preservation – Keep specific HTML tags unconverted when markdown isn't expressive enough
  • Secure by Default – Built-in HTML sanitization prevents malicious content
  • Consistent Output – Identical markdown rendering across all language bindings

Try the Live Demo →

Installation

Each language binding provides comprehensive documentation with installation instructions, examples, and best practices. Choose your platform to get started:

Scripting Languages:

  • Python – PyPI package, metadata extraction, visitor pattern, CLI included
  • Ruby – RubyGems package, RBS type definitions, Steep checking
  • PHP – Composer package + PIE extension, PHP 8.2+, PHPStan level 9
  • Elixir – Hex package, Rustler NIF bindings, Elixir 1.19+

JavaScript/TypeScript:

  • Node.js / TypeScript – Native NAPI-RS bindings for Node.js/Bun (the fastest option), plus WebAssembly for browsers/Deno

Compiled Languages:

  • Go – Go module with FFI bindings, automatic library download
  • Java – Maven Central, Panama Foreign Function & Memory API, Java 24+
  • C# – NuGet package, .NET 8.0+, P/Invoke FFI bindings

Native:

  • Rust – Core library, flexible feature flags, zero-copy APIs

Command-Line:

  • CLI – Cross-platform binary via cargo install html-to-markdown-cli or Homebrew

Metadata Extraction

Extract comprehensive metadata during conversion: title, description, headers, links, images, and structured data (JSON-LD, Microdata, RDFa). Use cases: SEO extraction, table-of-contents generation, link validation, accessibility auditing, and content migration.

Metadata Extraction Guide →
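To make the kinds of metadata listed above concrete, here is a minimal sketch that collects a title, headers, links, and images from an HTML string using only Python's standard library. It does not use html-to-markdown and is not its API: the names `MetadataCollector` and `extract_metadata` and the shape of the returned dictionary are assumptions made for illustration. See the guide for the real interface.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MetadataCollector(HTMLParser):
    """Illustrative only: collects title, headers, links, and images."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "headers": [], "links": [], "images": []}
        self._capture = None  # tag whose text is currently being buffered
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in HEADING_TAGS:
            # Start buffering the text content of this element.
            self._capture = tag
            self._buffer = []
        elif tag == "a" and attrs.get("href"):
            self.metadata["links"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.metadata["images"].append(
                {"src": attrs["src"], "alt": attrs.get("alt") or ""}
            )

    def handle_data(self, data):
        if self._capture is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        text = "".join(self._buffer).strip()
        if tag == "title":
            self.metadata["title"] = text
        else:
            self.metadata["headers"].append({"level": int(tag[1]), "text": text})
        self._capture = None


def extract_metadata(html):
    """Hypothetical helper: parse `html` and return the collected metadata."""
    collector = MetadataCollector()
    collector.feed(html)
    collector.close()
    return collector.metadata
```

The `headers` list, with its levels, is the sort of data a table-of-contents generator would consume, and the `links` list is what a link validator would iterate over.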

Visitor Pattern

Customize HTML→Markdown conversion with callbacks for specific elements. Intercept links, images, headings, lists, and more. Use cases: domain-specific Markdown dialects (Obsidian, Notion), content filtering, URL rewriting, accessibility validation, and analytics.

Visitor Pattern Guide →
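The callback idea can be sketched with a toy converter written against Python's standard library. This is not html-to-markdown's visitor API: `Visitor`, `TinyConverter`, `tiny_convert`, and `WikiLinkVisitor` are hypothetical names, and the converter handles only headings, paragraphs, and links. A callback returns replacement Markdown, or `None` to accept the default rendering.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class Visitor:
    """Hypothetical callback interface: return a string to override, or None."""

    def visit_link(self, href, text):
        return None

    def visit_heading(self, level, text):
        return None


class TinyConverter(HTMLParser):
    """Toy converter: renders headings and links, passes other text through."""

    def __init__(self, visitor):
        super().__init__()
        self.visitor = visitor
        self.out = []    # finished output chunks
        self.stack = []  # open frames: (tag, attrs, collected text parts)

    def _emit(self, text):
        # Text goes into the innermost open frame, or straight to the output.
        if self.stack:
            self.stack[-1][2].append(text)
        else:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "a" or tag in HEADINGS:
            self.stack.append((tag, dict(attrs), []))

    def handle_data(self, data):
        self._emit(data)

    def handle_endtag(self, tag):
        if tag == "p":
            self._emit("\n\n")
            return
        if not self.stack or self.stack[-1][0] != tag:
            return
        _, attrs, parts = self.stack.pop()
        text = "".join(parts).strip()
        if tag == "a":
            href = attrs.get("href") or ""
            rendered = self.visitor.visit_link(href, text)
            if rendered is None:
                rendered = f"[{text}]({href})"  # default link rendering
        else:
            level = int(tag[1])
            rendered = self.visitor.visit_heading(level, text)
            if rendered is None:
                rendered = "#" * level + " " + text  # default ATX heading
            rendered += "\n\n"
        self._emit(rendered)


def tiny_convert(html, visitor=None):
    converter = TinyConverter(visitor or Visitor())
    converter.feed(html)
    converter.close()
    return "".join(converter.out).strip() + "\n"


class WikiLinkVisitor(Visitor):
    """Renders relative links as Obsidian-style wiki links; keeps absolute URLs."""

    def visit_link(self, href, text):
        if "://" in href:
            return None  # fall back to the default rendering
        return f"[[{href.removesuffix('.html')}|{text}]]"
```

Calling `tiny_convert(html, WikiLinkVisitor())` rewrites only relative links; everything else falls through to the default rendering, which is what keeps dialect-specific logic out of the converter itself.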

Performance

Rust-powered core delivers 150–280 MB/s throughput (10–80× faster than pure Python alternatives). Includes benchmarking tools, memory profiling, streaming strategies, and optimization tips.

Performance Guide →
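A throughput figure in MB/s is input size divided by conversion time. The sketch below shows one way to measure it for any converter callable. It is not the project's benchmarking tool: `mb_per_s` and `throughput_mb_per_s` are hypothetical helpers, and treating 1 MB as 1,000,000 bytes is an assumption, since the text above does not say which definition its figures use.

```python
import time


def mb_per_s(size_bytes, seconds):
    """Throughput in MB/s, assuming 1 MB = 1,000,000 bytes."""
    return size_bytes / seconds / 1_000_000


def throughput_mb_per_s(convert, html, repeats=5):
    """Best-of-N throughput of `convert(html)`, measured on UTF-8 input size."""
    size_bytes = len(html.encode("utf-8"))
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        convert(html)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very fast calls or coarse timers.
    return mb_per_s(size_bytes, max(best, 1e-9))
```

Taking the best of several runs reduces noise from other processes. For a fair comparison between two converters, pass both the same input string and the same `repeats` value.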

Tag Preservation

Keep specific HTML tags unconverted when Markdown isn't expressive enough. Useful for tables, SVG, custom elements, or when you need mixed HTML/Markdown output.

See language-specific documentation for preserveTags configuration.

Security

Built-in HTML sanitization prevents XSS attacks and malicious content. It is powered by ammonia with safe defaults and is configurable via sanitize options.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines on:

  • Setting up the development environment
  • Running tests locally (coverage: Rust 95%+, language bindings 80%+)
  • Submitting pull requests
  • Reporting issues

All contributions must follow code quality standards enforced via pre-commit hooks (prek).

License

MIT License – see LICENSE for details. You can use html-to-markdown freely in both commercial and closed-source products, with no copyleft (viral) requirements. The only condition is that the copyright and license notice be kept in copies of the software.