rushdown 0.11.1

A 100% CommonMark-compatible GitHub Flavored Markdown parser and renderer
# rushdown
[![Tests](https://github.com/yuin/rushdown/actions/workflows/test.yml/badge.svg)](https://github.com/yuin/rushdown/actions/workflows/test.yml) [![Docs](https://docs.rs/rushdown/badge.svg)](https://docs.rs/rushdown) [![Crates.io](https://img.shields.io/crates/v/rushdown.svg?maxAge=2592000)](https://crates.io/crates/rushdown) ![Coverage](https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/yuin/3c122e76a86b680d04700e14b3161f04/raw/rushdown-coverage.json)

A Markdown parser written in Rust. Fast, easy to extend, standards-compliant.

rushdown is compliant with CommonMark 0.31.2 & [GitHub Flavored Markdown](https://github.github.com/gfm/)[^gfm-support].

[^gfm-support]: rushdown does not support [Disallowed Raw HTML](https://github.github.com/gfm/#disallowed-raw-html-extension-).

## Motivation
I needed a Markdown parser that met the following requirements:

- Written in Rust
- Compliant with CommonMark
- Fast
- Extensible from the outside of the crate
- AST-based

In short, I wanted something like [goldmark](https://github.com/yuin/goldmark) written in Rust. However, no existing library satisfied these requirements.

## Features
- **Standards-compliant.**  rushdown is fully compliant with the latest [CommonMark](https://commonmark.org/) specification.
- **Extensible.**  Do you want to add a `@username` mention syntax to Markdown?
  You can easily do so in rushdown. You can add your AST nodes,
  parsers for block-level elements, parsers for inline-level elements,
  transformers for paragraphs, transformers for the whole AST structure, and
  renderers.
- **Performance.**  rushdown is one of the fastest CommonMark parsers written in Rust, compared with [pulldown-cmark](https://docs.rs/pulldown-cmark/latest/pulldown_cmark/), [comrak](https://docs.rs/comrak/latest/comrak/), and [markdown-rs](https://docs.rs/markdown/latest/markdown/).
- **Robust.**  rushdown is tested with `cargo fuzz`.
- **Built-in extensions.**  rushdown ships with GFM extensions.

## Benchmark
You can run this benchmark with `make bench`.

rushdown builds a clean, extensible AST and is fully compliant with CommonMark, while being one of the fastest CommonMark parser implementations written in Rust.

```text
rushdown-cached         time: 3.1845 ms
rushdown                time: 3.3427 ms
markdown-rs             time: 89.692 ms
comrak                  time: 4.2451 ms
pulldown-cmark          time: 6.0037 ms
cmark                   time: 3.6439 ms
goldmark                time: 5.6161 ms
```

## Security
By default, rushdown does not render raw HTML or potentially-dangerous URLs.
If you need more control over untrusted content, it is recommended that you
use an HTML sanitizer such as [ammonia](https://docs.rs/ammonia/latest/ammonia/).
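
To illustrate what a "potentially-dangerous URL" check can look like, here is a minimal sketch modelled loosely on the scheme filter used by some CommonMark reference implementations. `is_potentially_dangerous_url` is a hypothetical helper written for this example; it is not rushdown's API, and rushdown's actual rules may differ.

```rust
/// Hypothetical helper, not part of rushdown: returns true when a link
/// destination starts with a scheme that is commonly treated as unsafe.
fn is_potentially_dangerous_url(url: &str) -> bool {
    const UNSAFE_SCHEMES: [&str; 4] = ["javascript:", "vbscript:", "file:", "data:"];
    // A few image data URLs are commonly allowed (an assumption of this sketch).
    const SAFE_DATA_PREFIXES: [&str; 4] = [
        "data:image/png",
        "data:image/gif",
        "data:image/jpeg",
        "data:image/webp",
    ];

    // Scheme names are case-insensitive, so compare in lowercase.
    let lower = url.trim_start().to_ascii_lowercase();
    if !UNSAFE_SCHEMES.iter().any(|s| lower.starts_with(*s)) {
        return false;
    }
    !SAFE_DATA_PREFIXES.iter().any(|p| lower.starts_with(*p))
}

fn main() {
    for url in ["https://example.com", "JavaScript:alert(1)", "data:image/png;base64,AAAA"] {
        println!("{} -> dangerous: {}", url, is_potentially_dangerous_url(url));
    }
}
```

A scheme filter like this only covers link destinations; it is not a substitute for sanitizing the rendered HTML when raw HTML is allowed.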

## Installation
Add the dependency to your `Cargo.toml`:

```toml
[dependencies]
rushdown = "x.y.z"
```

CommonMark requires parsers to handle [HTML entities](https://spec.commonmark.org/0.31.2/#entity-and-numeric-character-references) correctly, but this needs a large table that maps entity names to their corresponding Unicode code points. If you don't need this feature, you can disable it by adding the following line to your `Cargo.toml`:

```toml
rushdown = { version = "x.y.z", default-features = false, features = ["std"] }
```

In this case, the parser will only support numeric character references and some predefined entities (like `&amp;`, `&lt;`, `&gt;`, `&quot;`, etc.).
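
To make this reduced mode concrete, here is a minimal, self-contained sketch of decoding numeric character references plus a handful of predefined entities without a full entity table. `decode_reference` is a hypothetical helper written for illustration; it is not rushdown's API and does not claim to mirror its implementation or its exact set of predefined entities.

```rust
/// Hypothetical helper, not part of rushdown: decodes the body of one
/// reference, i.e. the text between `&` and `;`.
fn decode_reference(body: &str) -> Option<char> {
    if let Some(num) = body.strip_prefix('#') {
        // Numeric reference: `#35` (decimal) or `#x23` / `#X23` (hexadecimal).
        let (digits, radix) = match num.strip_prefix(|c: char| c == 'x' || c == 'X') {
            Some(hex) => (hex, 16),
            None => (num, 10),
        };
        // CommonMark allows 1-7 decimal digits or 1-6 hexadecimal digits.
        let max_len = if radix == 16 { 6 } else { 7 };
        if digits.is_empty()
            || digits.len() > max_len
            || !digits.chars().all(|c| c.is_digit(radix))
        {
            return None;
        }
        let code = u32::from_str_radix(digits, radix).ok()?;
        // CommonMark replaces U+0000 and invalid code points with U+FFFD.
        return Some(match code {
            0 => '\u{FFFD}',
            c => char::from_u32(c).unwrap_or('\u{FFFD}'),
        });
    }
    // A small fixed set of named entities instead of the full table.
    match body {
        "amp" => Some('&'),
        "lt" => Some('<'),
        "gt" => Some('>'),
        "quot" => Some('"'),
        _ => None,
    }
}

fn main() {
    for body in ["#35", "#x22", "amp", "copy"] {
        println!("&{}; -> {:?}", body, decode_reference(body));
    }
}
```

In this sketch a named entity outside the fixed set, such as `copy`, is simply not decoded, which is the trade-off for dropping the large lookup table.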

rushdown can also be used in `no_std` environments. To enable this feature, add the following line to your `Cargo.toml`:

```toml
rushdown = { version = "x.y.z", default-features = false, features = ["no-std"] }
```

## Usage
### Basic Usage

Render Markdown (CommonMark, without GFM) to an HTML string:

```rust
use rushdown::markdown_to_html_string;
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a **Markdown** document.";
match markdown_to_html_string(&mut output, input) {
    Ok(_) => {
        println!("HTML output:\n{}", output);
    }
    Err(e) => {
        println!("Error: {:?}", e);
    }
}
```

Render Markdown with GFM extensions to an HTML string:

```rust
use core::fmt::Write;
use rushdown::{
    new_markdown_to_html,
    parser::{self, ParserExtension, GfmOptions},
    renderer::html::{self, RendererExtension},
    Result,
};

let markdown_to_html = new_markdown_to_html(
    parser::Options::default(),
    html::Options::default(),
    parser::gfm(GfmOptions::default()),
    html::NO_EXTENSIONS,
);
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a ~~Markdown~~ document.";
match markdown_to_html(&mut output, input) {
    Ok(_) => {
        println!("HTML output:\n{}", output);
    }
    Err(e) => {
        println!("Error: {:?}", e);
    }
}
```

You can use a subset of the GFM extensions:

```rust
use core::fmt::Write;
use rushdown::{
    new_markdown_to_html,
    parser::{self, ParserExtension},
    renderer::html::{self, RendererExtension},
    Result,
};

let markdown_to_html = new_markdown_to_html(
    parser::Options::default(),
    html::Options::default(),
    parser::gfm_table().and(parser::gfm_task_list_item()),
    html::NO_EXTENSIONS,
);
let mut output = String::new();
let input = "# Hello, World!\n\nThis is a **Markdown** document.";
match markdown_to_html(&mut output, input) {
    Ok(_) => {
        println!("HTML output:\n{}", output);
    }
    Err(e) => {
        println!("Error: {:?}", e);
    }
}
```


### Parser options

| Option | Default value | Description |
| --- | --- | --- |
| `attributes` | `false` | Whether to parse attributes. |
| `auto_heading_ids` | `false` | Whether to automatically generate heading IDs. |
| `without_default_parsers` | `false` | Whether to disable default parsers. |
| `arena` | `ArenaOptions::default()` | Options for the arena allocator. |
| `escaped_space` | `false` | If true, a `\`-escaped space (0x20) will not trigger parsers. |
| `id_generator` | `None` (`BasicNodeIdGenerator`) | An ID generator for generating node IDs. |

Currently, only headings support attributes.
Attributes are still being discussed in the [CommonMark forum](https://talk.commonmark.org/t/consistent-attribute-syntax/272), so this syntax may change in the future.

```markdown
## heading ## {#id .className attrName=attrValue class="class1 class2"}

## heading {#id .className attrName=attrValue class="class1 class2"}

heading {#id .className attrName=attrValue}
============
```
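
As a rough illustration of how such an attribute block breaks down, the sketch below splits the text between `{` and `}` into name/value pairs. `split_tokens` and `parse_attributes` are hypothetical helpers, not rushdown functions, and how rushdown itself merges repeated names such as `class` is not modelled here.

```rust
/// Splits an attribute body on whitespace, keeping quoted values together.
fn split_tokens(body: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for c in body.chars() {
        match c {
            // Quotes delimit a value but are not kept in the token.
            '"' => in_quotes = !in_quotes,
            c if c.is_whitespace() && !in_quotes => {
                if !current.is_empty() {
                    tokens.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}

/// Hypothetical parser: `#x` becomes `id=x`, `.x` becomes `class=x`,
/// and `name=value` is kept as written. Pairs are returned in source order.
fn parse_attributes(body: &str) -> Vec<(String, String)> {
    split_tokens(body)
        .into_iter()
        .filter_map(|token| {
            if let Some(id) = token.strip_prefix('#') {
                Some(("id".to_string(), id.to_string()))
            } else if let Some(class) = token.strip_prefix('.') {
                Some(("class".to_string(), class.to_string()))
            } else {
                let (name, value) = token.split_once('=')?;
                Some((name.to_string(), value.to_string()))
            }
        })
        .collect()
}

fn main() {
    let body = "#id .className attrName=attrValue class=\"class1 class2\"";
    for (name, value) in parse_attributes(body) {
        println!("{} = {}", name, value);
    }
}
```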

#### Arena options

| Option | Default value | Description |
| --- | --- | --- |
| `initial_size` | `1024` | The initial capacity of the arena. |

### GFM Parser options

| Option | Default value | Description |
| --- | --- | --- |
| `linkify` | `LinkifyOptions::default()` | Options for linkify extension. |

#### Linkify options

| Option | Default value | Description |
| --- | --- | --- |
| `allowed_protocols` | `["http", "https", "ftp", "mailto"]` | A list of allowed protocols for linkification. |
| `url_scanner` | default function | A function that scans a string for URLs. |
| `www_scanner` | default function | A function that scans a string for www links. |
| `email_scanner` | default function | A function that scans a string for email addresses. |
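
The idea behind an allowed-protocol list can be sketched without rushdown at all. `has_allowed_protocol` is a hypothetical helper for illustration only; it does not reproduce rushdown's scanners, and the case-insensitive comparison is an assumption of this sketch.

```rust
/// Hypothetical helper, not part of rushdown: checks whether the scheme of a
/// candidate link (the text before the first `:`) is in the allow-list.
fn has_allowed_protocol(candidate: &str, allowed: &[&str]) -> bool {
    match candidate.split_once(':') {
        Some((scheme, rest)) => {
            // Require something after the colon and a known scheme.
            !rest.is_empty() && allowed.iter().any(|p| p.eq_ignore_ascii_case(scheme))
        }
        None => false,
    }
}

fn main() {
    let allowed = ["http", "https", "ftp", "mailto"];
    for candidate in ["https://example.com", "mailto:a@example.com", "javascript:alert(1)"] {
        println!("{} -> {}", candidate, has_allowed_protocol(candidate, &allowed));
    }
}
```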

### HTML Renderer options

| Option | Default value | Description |
| --- | --- | --- |
| `hard_wrap` | `false` | Renders soft line breaks as hard line breaks (`<br />`). |
| `xhtml` | `false` | Whether to render HTML in XHTML style. |
| `allows_unsafe` | `false` | Whether to allow rendering raw HTML and potentially-dangerous URLs. |
| `escaped_space` | `false` | If true, a `\`-escaped space (0x20) is not rendered. |
| `attribute_filters` | default filters | A list of filters for rendering attributes as HTML tag attributes. |

#### Customize Task list item rendering
[GFM](https://github.github.com/gfm/#task-list-items-extension-) does not define in detail how task list items should be rendered.

You can customize the rendering of task list items by implementing a function:

```rust
use rushdown::{
    ast, new_markdown_to_html_string,
    parser::{self, GfmOptions},
    renderer,
    renderer::html,
};

let markdown_to_html = new_markdown_to_html_string(
    parser::Options::default(),
    html::Options::default(),
    parser::gfm(GfmOptions::default()),
    html::paragraph_renderer(html::ParagraphRendererOptions {
        render_task_list_item: Some(Box::new(
            |w: &mut String,
             pr: &html::ParagraphRenderer<String>,
             source: &str,
             arena: &ast::Arena,
             node_ref: ast::NodeRef,
             ctx: &mut renderer::Context| {
                // do stuff
                Ok(())
            },
        )),
        ..Default::default()
    }),
);
let input = r#"
- [ ] Item
- [x] Item
"#;
let mut output = String::new();
match markdown_to_html(&mut output, input) {
    Ok(_) => {
        println!("HTML output:\n{}", output);
    }
    Err(e) => {
        println!("Error: {:?}", e);
    }
}
```

## AST
rushdown builds a clean AST structure that is easy to traverse and manipulate. The AST is built on top of an arena allocator, which allows for efficient memory management and fast node access.

Each node belongs to a specific type and kind.

- Node
   - has a `type_data`: node type(block or inline) specific data
   - has a `kind_data`: node kind(e.g. Text, Paragraph) specific data
   - has a `parent`, `first_child`, `next_sibling`... : relationships

These macros can be used to access node data.

- `matches_kind!` - Helper macro to match kind data.
- `as_type_data!` - Helper macro to downcast type data.
- `as_type_data_mut!` - Helper macro to downcast mutable type data.
- `as_kind_data!` - Helper macro to downcast kind data.
- `as_kind_data_mut!` - Helper macro to downcast mutable kind data.
- `matches_extension_kind!` - Helper macro to match extension kind.
- `as_extension_data!` - Helper macro to downcast extension data.
- `as_extension_data_mut!` - Helper macro to downcast mutable extension data.

`*kind*` and `*type*` macros are defined for rushdown built-in nodes.
`*extension*` macros are defined for [extension](#extending-rushdown) nodes.

Nodes are stored in an arena for efficient memory management and access.
Each node is identified by a `NodeRef`, which contains the index and unique ID of the node.
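
To show why a reference might carry both an index and an ID, here is a toy arena that is independent of rushdown. The `Arena`, `Slot`, and `NodeRef` types in this sketch are illustrative stand-ins, and using the ID to reject stale references is an assumption about the general technique, not a statement about rushdown's internals.

```rust
/// Illustrative reference type; rushdown's real `NodeRef` differs.
#[derive(Clone, Copy, Debug, PartialEq)]
struct NodeRef {
    index: usize, // slot position inside the arena
    id: u64,      // unique ID, used here to detect stale references
}

struct Slot {
    id: u64,
    text: String,
}

#[derive(Default)]
struct Arena {
    slots: Vec<Option<Slot>>,
    next_id: u64,
}

impl Arena {
    /// Stores a node and hands back a cheap, copyable reference to it.
    fn new_node(&mut self, text: &str) -> NodeRef {
        let id = self.next_id;
        self.next_id += 1;
        self.slots.push(Some(Slot { id, text: text.to_string() }));
        NodeRef { index: self.slots.len() - 1, id }
    }

    /// Returns the node text only if the slot still holds the same node.
    fn get(&self, r: NodeRef) -> Option<&str> {
        match self.slots.get(r.index)? {
            Some(slot) if slot.id == r.id => Some(slot.text.as_str()),
            _ => None,
        }
    }

    /// Empties the slot; existing references to it stop resolving.
    fn remove(&mut self, r: NodeRef) {
        if self.get(r).is_some() {
            self.slots[r.index] = None;
        }
    }
}

fn main() {
    let mut arena = Arena::default();
    let paragraph = arena.new_node("paragraph");
    let text = arena.new_node("text");
    arena.remove(paragraph);
    println!("{:?} {:?}", arena.get(paragraph), arena.get(text));
}
```

Because a reference is just two integers, it can be copied freely and stored in parent, child, and sibling links without borrowing the arena.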

You can get and manipulate nodes using the `Arena` and its methods.

```rust
use rushdown::ast::*;
use rushdown::{as_type_data_mut, as_type_data, as_kind_data};
use rushdown::text::Segment;

let mut arena = Arena::new();
let source = "Hello, World!";
let doc_ref = arena.new_node(Document::new());
let paragraph_ref = arena.new_node(Paragraph::new());
let seg = Segment::new(0, source.len());
as_type_data_mut!(&mut arena[paragraph_ref], Block).append_source_line(seg);
let text_ref = arena.new_node(Text::new(seg));
paragraph_ref.append_child(&mut arena, text_ref);
doc_ref.append_child(&mut arena, paragraph_ref);

assert_eq!(arena[paragraph_ref].first_child().unwrap(), text_ref);
assert_eq!(
    as_kind_data!(&arena[text_ref], Text).str(source),
    "Hello, World!"
);
assert_eq!(
    as_type_data!(&arena[paragraph_ref], Block)
        .source()
        .first()
        .unwrap()
        .str(source),
    "Hello, World!"
);
```

Walking the AST: you cannot mutate the AST while walking it. If you want to mutate the AST, collect the node refs during the walk and mutate them afterwards.
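
The collect-then-mutate pattern can be sketched with a toy tree that does not depend on rushdown. `Node`, `walk`, and `uppercase_matching` here are illustrative names, not rushdown's types or functions; the point is only the two-phase borrow structure.

```rust
/// Illustrative node type; not rushdown's `Node`.
struct Node {
    text: String,
    children: Vec<usize>, // indices into the arena slice
}

/// Depth-first walk that only needs shared access to the arena.
fn walk(arena: &[Node], index: usize, visit: &mut dyn FnMut(usize, &Node)) {
    visit(index, &arena[index]);
    for &child in &arena[index].children {
        walk(arena, child, visit);
    }
}

/// Uppercases every node whose text contains `needle` and returns their indices.
fn uppercase_matching(arena: &mut [Node], root: usize, needle: &str) -> Vec<usize> {
    // Phase 1: walk with a shared borrow and only collect the targets.
    let mut targets = Vec::new();
    walk(arena, root, &mut |index: usize, node: &Node| {
        if node.text.contains(needle) {
            targets.push(index);
        }
    });
    // Phase 2: the walk is over, so mutable access is allowed again.
    for &index in &targets {
        arena[index].text.make_ascii_uppercase();
    }
    targets
}

fn main() {
    let mut arena = vec![
        Node { text: "doc".into(), children: vec![1, 2] },
        Node { text: "keep".into(), children: vec![] },
        Node { text: "test".into(), children: vec![] },
    ];
    let hits = uppercase_matching(&mut arena, 0, "test");
    println!("{:?} {}", hits, arena[2].text);
}
```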

The `md_ast!` macro can be used to build an AST more easily.

```rust
use core::result::Result;
use core::error::Error;
use core::fmt::{self, Display, Formatter};
use rushdown::ast::*;
use rushdown::md_ast;
use rushdown::matches_kind;

#[derive(Debug)]
enum UserError { SomeError(&'static str) }

impl Error for UserError {}

impl Display for UserError {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        match self { UserError::SomeError(msg) => write!(f, "UserError: {}", msg) }
    }
}

let mut arena = Arena::default();
let doc_ref = md_ast!(&mut arena, Document::new() => {
    Blockquote::new() => {
        Paragraph::new(); { |node: &mut Node| {
            node.attributes_mut().insert("class", "paragraph".into());
        } } => {
            Text::new("Hello, World!")
        },
        Paragraph::new() => {
            Text::new("This is a test.")
        }
    }
});

let mut target: Option<NodeRef> = None;

walk(&arena, doc_ref, &mut |arena: &Arena,
                            node_ref: NodeRef,
                            entering: bool| -> Result<WalkStatus, UserError> {
    if entering {
        if let Some(fc) = arena[node_ref].first_child() {
            if let KindData::Text(t) = &arena[fc].kind_data() {
                if t.str("").contains("test") {
                    target = Some(node_ref);
                }
                if t.str("").contains("error") {
                    return Err(UserError::SomeError("Some error occurred"));
                }
            }
        }
    }
    Ok(WalkStatus::Continue)
}).ok();
assert_eq!(target, Some(arena[arena[doc_ref].first_child().unwrap()].last_child().unwrap()));
```


## Extending rushdown <a name="extending-rushdown"></a>
See `tests/extension.rs` and `override_renderer.rs` for examples of how to extend rushdown.

You can extend rushdown by implementing AST nodes, custom block/inline parsers, transformers, and renderers.

The key point of rushdown extensibility is 'dynamic parser/renderer constructor injection'.

You can add parsers and renderers like the following:

```text
fn user_mention_parser_extension() -> impl ParserExtension {
    ParserExtensionFn::new(|p: &mut Parser| {
        p.add_inline_parser(
            UserMentionParser::new,
            NoParserOptions, // no options for this parser
            PRIORITY_EMPHASIS + 100,
        );
    })
}

fn user_mention_html_renderer_extension<'cb, W>(
    options: UserMentionOptions,
) -> impl RendererExtension<'cb, W>
where
    W: TextWrite + 'cb,
{
    RendererExtensionFn::new(move |r: &mut Renderer<'cb, W>| {
        r.add_node_renderer(UserMentionHtmlRenderer::with_options, options);
    })
}
```

`UserMentionParser::new` is a constructor function that returns a `UserMentionParser` instance. rushdown will call this function with the necessary arguments.

Parser and transformer constructor functions can take these arguments if needed, in any order:

- `rushdown::parser::Options`
- parser options defined by the user
- `Rc<RefCell<rushdown::parser::ContextKeyRegistry>>`

HTML renderer constructor functions can take these arguments if needed, in any order:

- `rushdown::renderer::html::Options`
- renderer options defined by the user
- `Rc<RefCell<rushdown::renderer::ContextKeyRegistry>>`
- `Rc<RefCell<rushdown::renderer::NodeKindRegistry>>`
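
The general idea of constructor injection can be sketched in plain Rust: the registry stores constructor functions rather than finished objects, and calls them later once the final options are known. Everything in this sketch (`Options`, `InlineParser`, `Registry`, and the parsers) is a hypothetical stand-in, not a rushdown type. It does not model passing arguments in any order, and building parsers in ascending priority order is an assumption.

```rust
/// Illustrative stand-ins; none of these are rushdown types.
struct Options {
    strict: bool,
}

trait InlineParser {
    fn name(&self) -> String;
}

struct MentionParser {
    strict: bool,
}

impl MentionParser {
    // A constructor function that receives what it needs from the registry.
    fn new(options: &Options) -> Self {
        MentionParser { strict: options.strict }
    }
}

impl InlineParser for MentionParser {
    fn name(&self) -> String {
        format!("mention(strict={})", self.strict)
    }
}

struct EmphasisParser;

impl InlineParser for EmphasisParser {
    fn name(&self) -> String {
        "emphasis".to_string()
    }
}

type Constructor = Box<dyn Fn(&Options) -> Box<dyn InlineParser>>;

#[derive(Default)]
struct Registry {
    entries: Vec<(u32, Constructor)>,
}

impl Registry {
    /// Stores a constructor together with its priority; nothing is built yet.
    fn add_inline_parser<P, F>(&mut self, ctor: F, priority: u32)
    where
        P: InlineParser + 'static,
        F: Fn(&Options) -> P + 'static,
    {
        let boxed: Constructor =
            Box::new(move |o: &Options| -> Box<dyn InlineParser> { Box::new(ctor(o)) });
        self.entries.push((priority, boxed));
    }

    /// Calls every stored constructor, lowest priority value first.
    fn build(&self, options: &Options) -> Vec<Box<dyn InlineParser>> {
        let mut order: Vec<&(u32, Constructor)> = self.entries.iter().collect();
        order.sort_by_key(|entry| entry.0);
        order.into_iter().map(|(_, ctor)| ctor(options)).collect()
    }
}

fn main() {
    let mut registry = Registry::default();
    registry.add_inline_parser(MentionParser::new, 600);
    registry.add_inline_parser(|_: &Options| EmphasisParser, 500);
    for parser in registry.build(&Options { strict: true }) {
        println!("{}", parser.name());
    }
}
```

Deferring construction this way lets an extension be registered before the final options exist, which is the property the constructor-injection design relies on.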

## Extensions

- [rushdown-footnote](https://crates.io/crates/rushdown-footnote): A footnote extension for rushdown.
- [rushdown-meta](https://crates.io/crates/rushdown-meta): A meta (YAML frontmatter) extension for rushdown.
- [rushdown-emoji](https://crates.io/crates/rushdown-emoji): An emoji extension for rushdown.
- [rushdown-highlighting](https://crates.io/crates/rushdown-highlighting): A syntax highlighting extension for rushdown.
- [rushdown-diagram](https://crates.io/crates/rushdown-diagram): A diagram visualization (e.g. MermaidJS) extension for rushdown.

## Donation
BTC: 1NEDSyUmo4SMTDP83JJQSWi1MvQUGGNMZB

GitHub Sponsors are also welcome.

## License
MIT

## Author
Yusuke Inuzuka