rushdown
A Markdown parser written in Rust: fast, easy to extend, and standards-compliant.
rushdown is compliant with CommonMark 0.31.2 & GitHub Flavored Markdown[^gfm-support].
[^gfm-support]: rushdown does not support Disallowed Raw HTML.
Motivation
I needed a Markdown parser that met the following requirements:
- Written in Rust
- Compliant with CommonMark
- Fast
- Extensible from outside the crate
- AST-based
In short, I wanted something like goldmark written in Rust. However, no existing library satisfied these requirements.
Features
- Standards-compliant. rushdown is fully compliant with the latest CommonMark specification.
- Extensible. Do you want to add a @username mention syntax to Markdown? You can easily do so in rushdown. You can add your own AST nodes, parsers for block-level elements, parsers for inline-level elements, transformers for paragraphs, transformers for the whole AST structure, and renderers.
- Performance. rushdown is one of the fastest CommonMark parsers written in Rust, compared with pulldown-cmark, comrak, and markdown-rs.
- Robust. rushdown is tested with cargo fuzz.
- Built-in extensions. rushdown ships with GFM extensions.
Benchmark
You can run this benchmark with make bench.
rushdown builds a clean, extensible AST and is fully compliant with CommonMark, while remaining one of the fastest CommonMark parser implementations written in Rust.
rushdown-cached time: 3.1845 ms
rushdown time: 3.3427 ms
markdown-rs time: 89.692 ms
comrak time: 4.2451 ms
pulldown-cmark time: 6.0037 ms
cmark time: 3.6439 ms
goldmark time: 5.6161 ms
Security
By default, rushdown does not render raw HTML or potentially dangerous URLs. If you need more control over untrusted content, use an HTML sanitizer such as ammonia.
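The exact rule rushdown applies to decide that a URL is potentially dangerous is not described here. As a rough illustration only, the sketch below shows one approach commonly used by Markdown renderers: reject script-capable schemes and allow data: URLs only for a few image types. The function name is_dangerous_url and both scheme lists are assumptions made for this example, not rushdown's API.

```rust
/// Hypothetical helper: returns true when a link destination should not be
/// rendered as-is. The scheme lists are assumptions, not rushdown's rules.
fn is_dangerous_url(url: &str) -> bool {
    // Schemes are case-insensitive, so compare on a lowercased copy.
    let lower = url.trim_start().to_ascii_lowercase();

    // A few image payloads are treated as harmless data: URLs.
    const SAFE_DATA_PREFIXES: [&str; 4] = [
        "data:image/png;",
        "data:image/gif;",
        "data:image/jpeg;",
        "data:image/webp;",
    ];
    if SAFE_DATA_PREFIXES.iter().any(|p| lower.starts_with(p)) {
        return false;
    }

    // Everything else under these schemes is rejected.
    const BLOCKED_SCHEMES: [&str; 4] = ["javascript:", "vbscript:", "file:", "data:"];
    BLOCKED_SCHEMES.iter().any(|s| lower.starts_with(s))
}
```

A check like this is only a first line of defense; it does not replace a sanitizer when the input is untrusted.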
Installation
Add the dependency to your Cargo.toml:
[dependencies]
rushdown = "x.y.z"
CommonMark requires parsers to handle HTML entities correctly, but this needs a large map from entity names to their corresponding Unicode code points. If you don't need this feature, you can disable it by adding the following line to your Cargo.toml:
rushdown = { version = "x.y.z", default-features = false, features = ["std"] }
In this case, the parser will only support numeric character references and some predefined entities (like &amp;, &lt;, &gt;, &quot;, etc.).
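To make "numeric character references" concrete, here is a minimal standalone sketch of how such a reference can be decoded under the CommonMark rules (1 to 7 decimal digits, or 1 to 6 hexadecimal digits after x or X; U+0000 and invalid code points become U+FFFD). The function decode_numeric_ref is a hypothetical helper written for this example and is not part of rushdown.

```rust
/// Decode a numeric character reference at the start of `s`, such as
/// `&#35;` or `&#x22;`. Returns the character and the number of bytes
/// consumed, or None if `s` does not start with a valid reference.
fn decode_numeric_ref(s: &str) -> Option<(char, usize)> {
    let rest = s.strip_prefix("&#")?;
    // Hex references start with 'x' or 'X' and allow 1-6 digits;
    // decimal references allow 1-7 digits.
    let (digits, radix, max_len, prefix_len) = match rest.as_bytes().first().copied()? {
        b'x' | b'X' => (&rest[1..], 16u32, 6usize, 3usize),
        _ => (rest, 10u32, 7usize, 2usize),
    };
    let end = digits.find(';')?;
    let body = &digits[..end];
    if body.is_empty() || body.len() > max_len || !body.chars().all(|c| c.is_digit(radix)) {
        return None;
    }
    let code = u32::from_str_radix(body, radix).ok()?;
    // U+0000 and invalid code points map to the replacement character.
    let ch = if code == 0 {
        '\u{FFFD}'
    } else {
        char::from_u32(code).unwrap_or('\u{FFFD}')
    };
    Some((ch, prefix_len + end + 1))
}
```

Named entities such as &amp;copy; need the large lookup table mentioned above, which is why they are the part that can be switched off.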
rushdown can also be used in no_std environments. To enable this feature, add the following line to your Cargo.toml:
rushdown = { version = "x.y.z", default-features = false, features = ["no-std"] }
Usage
Basic Usage
Render Markdown (CommonMark, without GFM) to an HTML string:
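use rushdown::markdown_to_html_string;
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a **Markdown** document.";
// Assumed argument order: output buffer first, then the Markdown source.
match markdown_to_html_string(&mut output, input) {
    Ok(_) => println!("{}", output),
    Err(e) => eprintln!("error: {:?}", e),
}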
Render Markdown with GFM extensions to an HTML string:
use Write;
use ;
let markdown_to_html = new_markdown_to_html;
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a ~~Markdown~~ document.";
match markdown_to_html
You can use a subset of the GFM extensions:
use Write;
use ;
let markdown_to_html = new_markdown_to_html;
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a **Markdown** document.";
match markdown_to_html
Parser options
| Option | Default value | Description |
|---|---|---|
| attributes | false | Whether to parse attributes. |
| auto_heading_ids | false | Whether to automatically generate heading IDs. |
| without_default_parsers | false | Whether to disable default parsers. |
| arena | ArenaOptions::default() | Options for the arena allocator. |
| escaped_space | false | If true, a backslash-escaped half-width space (0x20) will not trigger parsers. |
| id_generator | None (BasicNodeIdGenerator) | An ID generator for generating node IDs. |
Currently, only headings support attributes. Attributes are still being discussed in the CommonMark forum, so this syntax may change in the future.
heading {#id .className attrName=attrValue}
============
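To illustrate what the attribute block in the heading example above carries, the following sketch splits a block such as {#id .className attrName=attrValue} into name/value pairs. It is a simplified, hypothetical parser written for this example (parse_attributes is not a rushdown function) and it assumes whitespace-separated tokens, so quoted values containing spaces are not handled.

```rust
/// Parse a `{#id .class key=value}` block into (name, value) pairs.
/// Returns None if the braces are missing or a token is not recognized.
fn parse_attributes(block: &str) -> Option<Vec<(String, String)>> {
    let inner = block.trim().strip_prefix('{')?.strip_suffix('}')?;
    let mut attrs: Vec<(String, String)> = Vec::new();
    let mut classes: Vec<&str> = Vec::new();

    for token in inner.split_whitespace() {
        if let Some(id) = token.strip_prefix('#') {
            // `#foo` is shorthand for id="foo".
            attrs.push(("id".to_string(), id.to_string()));
        } else if let Some(class) = token.strip_prefix('.') {
            // `.foo` entries are merged into one class attribute.
            classes.push(class);
        } else if let Some((key, value)) = token.split_once('=') {
            attrs.push((key.to_string(), value.trim_matches('"').to_string()));
        } else {
            return None;
        }
    }
    if !classes.is_empty() {
        attrs.push(("class".to_string(), classes.join(" ")));
    }
    Some(attrs)
}
```

With the example input, the result holds an id, the attrName pair, and a single class entry, in that order.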
Arena options
| Option | Default value | Description |
|---|---|---|
| initial_size | 1024 | The initial capacity of the arena. |
GFM Parser options
| Option | Default value | Description |
|---|---|---|
| linkify | LinkifyOptions::default() | Options for the linkify extension. |
Linkify options
| Option | Default value | Description |
|---|---|---|
| allowed_protocols | ["http", "https", "ftp", "mailto"] | A list of allowed protocols for linkification. |
| url_scanner | default function | A function that scans a string for URLs. |
| www_scanner | default function | A function that scans a string for www links. |
| email_scanner | default function | A function that scans a string for email addresses. |
HTML Renderer options
| Option | Default value | Description |
|---|---|---|
| hard_wrap | false | Renders soft line breaks as hard line breaks (<br />). |
| xhtml | false | Whether to render HTML in XHTML style. |
| allows_unsafe | false | Whether to allow rendering raw HTML and potentially dangerous URLs. |
| escaped_space | false | Indicates that a backslash-escaped half-width space (0x20) should not be rendered. |
| attribute_filters | default filters | A list of filters for rendering attributes as HTML tag attributes. |
Customize Task list item rendering
GFM does not define in detail how task list items should be rendered.
You can customize the rendering of task list items by implementing a function:
use ;
let markdown_to_html = new_markdown_to_html_string;
let input = r#"
- [ ] Item
- [x] Item
"#;
let mut output = String::new();
match markdown_to_html
AST
rushdown builds a clean AST structure that is easy to traverse and manipulate. The AST is built on top of an arena allocator, which allows for efficient memory management and fast node access.
Each node belongs to a specific type and kind.
- Node
  - has a type_data: node type (block or inline) specific data
  - has a kind_data: node kind (e.g. Text, Paragraph) specific data
  - has a parent, first_child, next_sibling...: relationships
These macros can be used to access node data.
- matches_kind! - Helper macro to match kind data.
- as_type_data! - Helper macro to downcast type data.
- as_type_data_mut! - Helper macro to downcast mutable type data.
- as_kind_data! - Helper macro to downcast kind data.
- as_kind_data_mut! - Helper macro to downcast mutable kind data.
- matches_extension_kind! - Helper macro to match extension kind.
- as_extension_data! - Helper macro to downcast extension data.
- as_extension_data_mut! - Helper macro to downcast mutable extension data.
*kind* and *type* macros are defined for rushdown builtin nodes.
*extension* macros are defined for extension nodes.
Nodes are stored in an arena for efficient memory management and access.
Each node is identified by a NodeRef, which contains the index and unique ID of the node.
You can get and manipulate nodes using the Arena and its methods.
use *;
use ;
use Segment;
let mut arena = new;
let source = "Hello, World!";
let doc_ref = arena.new_node;
let paragraph_ref = arena.new_node;
let seg = new;
as_type_data_mut!.append_source_line;
let text_ref = arena.new_node;
paragraph_ref.append_child;
doc_ref.append_child;
assert_eq!;
assert_eq!;
assert_eq!;
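The idea of a node reference that carries both an index and a unique ID can be sketched without rushdown at all. The toy arena below is an assumption-laden illustration (Arena, Node, and NodeRef here are local stand-ins, not rushdown's types): the index gives direct access to a slot, and the ID lets the arena reject a stale reference after that slot has been reused.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeRef {
    index: usize, // slot position inside the arena
    id: u64,      // unique per node, never reused
}

#[derive(Debug)]
struct Node {
    id: u64,
    text: String,
    parent: Option<NodeRef>,
    children: Vec<NodeRef>,
}

#[derive(Default)]
struct Arena {
    slots: Vec<Option<Node>>,
    free: Vec<usize>,
    next_id: u64,
}

impl Arena {
    fn new_node(&mut self, text: &str) -> NodeRef {
        let id = self.next_id;
        self.next_id += 1;
        let node = Node { id, text: text.to_string(), parent: None, children: Vec::new() };
        // Reuse a freed slot when one is available.
        let index = match self.free.pop() {
            Some(i) => {
                self.slots[i] = Some(node);
                i
            }
            None => {
                self.slots.push(Some(node));
                self.slots.len() - 1
            }
        };
        NodeRef { index, id }
    }

    /// Index lookup, then an ID check that rejects stale references.
    fn get(&self, r: NodeRef) -> Option<&Node> {
        self.slots.get(r.index)?.as_ref().filter(|n| n.id == r.id)
    }

    fn get_mut(&mut self, r: NodeRef) -> Option<&mut Node> {
        self.slots.get_mut(r.index)?.as_mut().filter(|n| n.id == r.id)
    }

    fn append_child(&mut self, parent: NodeRef, child: NodeRef) -> bool {
        if self.get(parent).is_none() || self.get(child).is_none() {
            return false;
        }
        self.get_mut(child).unwrap().parent = Some(parent);
        self.get_mut(parent).unwrap().children.push(child);
        true
    }

    /// Frees the slot. For brevity this sketch does not detach the node
    /// from its parent's child list.
    fn remove(&mut self, r: NodeRef) -> bool {
        if self.get(r).is_none() {
            return false;
        }
        self.slots[r.index] = None;
        self.free.push(r.index);
        true
    }
}
```

Because a NodeRef is a small Copy value rather than a borrow, it can be stored and passed around freely while the arena keeps ownership of every node.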
Walking the AST: you cannot mutate the AST while walking it. If you want to mutate the AST, collect the node refs during the walk and mutate them afterwards.
The md_ast! macro can be used to build an AST more easily.
use Result;
use Error;
use ;
use *;
use md_ast;
use matches_kind;
let mut arena = default;
let doc_ref = md_ast!;
let mut target: = None;
walk.ok;
assert_eq!;
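The collect-then-mutate advice for walking can be shown with a tiny standalone tree. This is a sketch under stated assumptions: Node, Kind, walk, and uppercase_text_nodes below are local definitions invented for the example, and plain indices stand in for node refs. The walk only needs shared access, so mutation is deferred until the walk has finished.

```rust
#[derive(Debug, PartialEq)]
enum Kind {
    Document,
    Paragraph,
    Text(String),
}

struct Node {
    kind: Kind,
    children: Vec<usize>, // indices into the node slice
}

/// Depth-first walk that only reads the tree.
fn walk(nodes: &[Node], index: usize, visit: &mut dyn FnMut(usize, &Node)) {
    let node = &nodes[index];
    visit(index, node);
    for &child in &node.children {
        walk(nodes, child, visit);
    }
}

/// Uppercases every text node and returns how many were changed.
fn uppercase_text_nodes(nodes: &mut [Node], root: usize) -> usize {
    // Phase 1: walk with shared access and remember the targets.
    let mut targets: Vec<usize> = Vec::new();
    walk(nodes, root, &mut |i: usize, node: &Node| {
        if matches!(node.kind, Kind::Text(_)) {
            targets.push(i);
        }
    });

    // Phase 2: the walk is over, so mutable access is available again.
    for &i in &targets {
        if let Kind::Text(s) = &mut nodes[i].kind {
            *s = s.to_uppercase();
        }
    }
    targets.len()
}
```

Splitting the work into two phases keeps the borrow checker satisfied and also avoids surprises such as visiting a node that was inserted or removed halfway through the traversal.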
Extending rushdown
See tests/extension.rs and override_renderer.rs for examples of how to extend rushdown.
You can extend rushdown by implementing AST nodes, custom block/inline parsers, transformers, and renderers.
The key point of rushdown extensibility is 'dynamic parser/renderer constructor injection'.
You can add parsers and renderers like the following:
fn user_mention_parser_extension() -> impl ParserExtension {
ParserExtensionFn::new(|p: &mut Parser| {
p.add_inline_parser(
UserMentionParser::new,
NoParserOptions, // no options for this parser
PRIORITY_EMPHASIS + 100,
);
})
}
fn user_mention_html_renderer_extension<'cb, W>(
options: UserMentionOptions,
) -> impl RendererExtension<'cb, W>
where
W: TextWrite + 'cb,
{
RendererExtensionFn::new(move |r: &mut Renderer<'cb, W>| {
r.add_node_renderer(UserMentionHtmlRenderer::with_options, options);
})
}
UserMentionParser::new is a constructor function that returns a UserMentionParser instance. rushdown will call this function with the necessary arguments.
Parser/transformer constructor functions can take these arguments if needed, in any order:
- rushdown::parser::Options
- parser options defined by the user
- Rc<RefCell<rushdown::parser::ContextKeyRegistry>>
HTML renderer constructor functions can take these arguments if needed, in any order:
- rushdown::renderer::html::Options
- renderer options defined by the user
- Rc<RefCell<rushdown::renderer::ContextKeyRegistry>>
- Rc<RefCell<rushdown::renderer::NodeKindRegistry>>
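How a library can call a constructor whose arguments are optional and may come in any order is not spelled out here. One way to get that behavior in Rust is a trait implemented for functions of several arities, where each argument type knows how to fetch itself from a shared context. The sketch below shows that general technique only; every name in it (Injector, Inject, Constructor, build, MentionParser, MentionRenderer, and the local Options and Registry types) is hypothetical, and rushdown's internals may work quite differently.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// All types below are local stand-ins, not rushdown types.
#[derive(Clone, Default)]
struct Options {
    hard_wrap: bool,
}

#[derive(Default)]
struct Registry {
    keys: Vec<String>,
}

/// Everything the host is able to hand to a constructor.
struct Injector {
    options: Options,
    registry: Rc<RefCell<Registry>>,
}

/// A value that can be pulled out of the injector by its type.
trait Inject {
    fn inject(from: &Injector) -> Self;
}

impl Inject for Options {
    fn inject(from: &Injector) -> Self {
        from.options.clone()
    }
}

impl Inject for Rc<RefCell<Registry>> {
    fn inject(from: &Injector) -> Self {
        Rc::clone(&from.registry)
    }
}

/// Implemented for functions of arity 0, 1, and 2. `Args` is only a marker
/// that keeps the three blanket impls from overlapping.
trait Constructor<Args, T> {
    fn construct(&self, from: &Injector) -> T;
}

impl<F, T> Constructor<(), T> for F
where
    F: Fn() -> T,
{
    fn construct(&self, _from: &Injector) -> T {
        self()
    }
}

impl<F, T, A> Constructor<(A,), T> for F
where
    F: Fn(A) -> T,
    A: Inject,
{
    fn construct(&self, from: &Injector) -> T {
        self(A::inject(from))
    }
}

impl<F, T, A, B> Constructor<(A, B), T> for F
where
    F: Fn(A, B) -> T,
    A: Inject,
    B: Inject,
{
    fn construct(&self, from: &Injector) -> T {
        self(A::inject(from), B::inject(from))
    }
}

/// The host side: accepts any supported constructor and calls it.
fn build<Args, T>(ctor: impl Constructor<Args, T>, from: &Injector) -> T {
    ctor.construct(from)
}

struct MentionParser {
    hard_wrap: bool,
    registry: Rc<RefCell<Registry>>,
}

impl MentionParser {
    // Asks for both values, registry first.
    fn new(registry: Rc<RefCell<Registry>>, options: Options) -> Self {
        MentionParser { hard_wrap: options.hard_wrap, registry }
    }
}

struct MentionRenderer {
    hard_wrap: bool,
}

impl MentionRenderer {
    // Asks for the options only.
    fn with_options(options: Options) -> Self {
        MentionRenderer { hard_wrap: options.hard_wrap }
    }
}
```

With these definitions, build(MentionParser::new, &injector) and build(MentionRenderer::with_options, &injector) both compile, and each constructor receives only what its signature asks for. The argument order does not matter because every argument is resolved by type rather than by position.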
Extensions
- rushdown-footnote: A footnote extension for rushdown.
- rushdown-meta: A meta(YAML frontmatter) extension for rushdown.
- rushdown-emoji: An emoji extension for rushdown.
- rushdown-highlighting: A syntax highlight extension for rushdown.
- rushdown-diagram: A diagram visualization(e.g. MermaidJS) extension for rushdown.
- rushdown-fenced-div: A fenced div extension for rushdown.
Donation
BTC: 1NEDSyUmo4SMTDP83JJQSWi1MvQUGGNMZB
GitHub Sponsors is also welcome.
License
MIT
Author
Yusuke Inuzuka