markdown-syntax
A no_std + alloc Rust crate that parses Markdown source into an owned AST and serializes the AST back to canonical Markdown — with opt-in, safe-by-default HTML rendering behind the html feature.
At a glance
- AST-first: parse returns an owned enum tree over alloc; the output verbs live on the Document you hold.
- Tolerant: problems are collected as diagnostics, never thrown; parse is infallible.
- Maximal default dialect: GFM + footnotes + math + frontmatter + wikilinks + directives + extra inline marks, out of the box.
- Lean core: zero runtime dependencies, no_std + alloc, MSRV 1.82.
Install
cargo add markdown-syntax
For the opt-in HTML renderer:
cargo add markdown-syntax --features html
Or in Cargo.toml:
[dependencies]
markdown-syntax = "0.1"
Quickstart
use markdown_syntax::parse;
// `parse` is infallible and returns a `ParseOutput { document, diagnostics }`.
let output = parse;
assert!;
// Serialize the AST back to canonical Markdown (this is the fallible step).
let markdown = output.document.to_markdown()?;
assert_eq!;
# Ok::
parse is infallible — the output verbs live on the Document you hold.
Common tasks
parse is the one obvious path. When you need to narrow the dialect, read diagnostics, walk the tree, or render HTML, each task is one small snippet below.
- Pick a dialect (presets)
- Tune one construct (builder)
- Walk the AST
- Handle diagnostics (tolerant vs strict)
- Customize serialization
- Source positions (optional)
- Build an AST by hand
Pick a dialect (presets)
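use markdown_syntax::SyntaxOptions;
// Sample input for the snippet.
let input = "# Title\n\nHello.";
// Named presets each build a `SyntaxOptions`; call `.parse` to run them.
let cm = SyntaxOptions::commonmark().parse(input);
let gfm = SyntaxOptions::gfm().parse(input);
let mdx = SyntaxOptions::mdx().parse(input);
// `parse(input)` is exactly `SyntaxOptions::default().parse(input)`, the
// maximal non-MDX dialect.
let default = SyntaxOptions::default().parse(input);
let _ = (cm, gfm, mdx, default);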
commonmark / gfm / mdx are the named presets; default == the maximal non-MDX dialect, and parse(input) is sugar for SyntaxOptions::default().parse(input). See SyntaxOptions.
Tune one construct (builder)
use markdown_syntax::{Construct, SyntaxOptions};
// Tune a preset with the typo-proof `Construct` builder (grouped constructs
// such as `Math`, `Footnotes`, `Directives` flip every flag in the group).
let no_math = default.disable.parse;
let with_wikilinks = commonmark
.enable
.enable
.parse;
let _ = ;
Construct is a typo-proof front door over the full Constructs flag set. Grouped constructs (Math, Footnotes, Directives) flip a whole family at once, and Wikilinks is the one parameterized variant.
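To illustrate the idea of a grouped toggle flipping a whole family of flags, here is a minimal self-contained sketch. Every name in it (`Toggle`, `Flags`, `Options`, `maximal`, and the field names) is invented for the example and is an assumption, not the crate's `Construct` / `Constructs` API; it only shows the builder pattern the paragraph above describes.

```rust
// Hypothetical stand-ins: these names and fields are invented for this sketch
// and are not the crate's `Construct` / `Constructs` definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Toggle {
    Math,          // grouped: covers inline and block math
    Strikethrough, // single flag
}

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Flags {
    inline_math: bool,
    block_math: bool,
    strikethrough: bool,
}

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct Options {
    flags: Flags,
}

impl Options {
    /// Preset with every flag on.
    fn maximal() -> Self {
        Options {
            flags: Flags { inline_math: true, block_math: true, strikethrough: true },
        }
    }

    /// Turn a toggle (and every flag in its group) on.
    fn enable(self, toggle: Toggle) -> Self {
        self.set(toggle, true)
    }

    /// Turn a toggle (and every flag in its group) off.
    fn disable(self, toggle: Toggle) -> Self {
        self.set(toggle, false)
    }

    fn set(mut self, toggle: Toggle, on: bool) -> Self {
        match toggle {
            Toggle::Math => {
                self.flags.inline_math = on;
                self.flags.block_math = on;
            }
            Toggle::Strikethrough => self.flags.strikethrough = on,
        }
        self
    }
}
```

Because the builder takes an enum rather than a string or a raw field name, a misspelled construct fails at compile time, which is presumably what "typo-proof" refers to.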
Walk the AST
use ;
let document = parse.document;
for block in &document.children
document.children is a Vec<Block>; block content (like Paragraph.children) is a Vec<Inline>. See the ast module, Block, and Inline.
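As a rough picture of what walking such a tree looks like, the sketch below defines simplified stand-in `Block` and `Inline` enums and a recursive text collector. The variants, fields, and the `plain_text` helper are assumptions made for this example; the crate's real enums have more variants and fields, so treat this as the general pattern rather than its API.

```rust
// Simplified stand-in types for illustration only; the real `Block` and
// `Inline` enums in the crate have more variants and fields.
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Text(String),
    Emphasis(Vec<Inline>),
    Code(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Block {
    Heading { depth: u8, children: Vec<Inline> },
    Paragraph { children: Vec<Inline> },
    BlockQuote { children: Vec<Block> },
}

/// Append the visible text of inline nodes to `out`, recursing into nested marks.
fn inline_text(nodes: &[Inline], out: &mut String) {
    for node in nodes {
        match node {
            Inline::Text(s) | Inline::Code(s) => out.push_str(s),
            Inline::Emphasis(children) => inline_text(children, out),
        }
    }
}

/// Collect one string of plain text per leaf block (hypothetical helper).
fn plain_text(blocks: &[Block]) -> Vec<String> {
    let mut lines = Vec::new();
    for block in blocks {
        match block {
            Block::Heading { children, .. } | Block::Paragraph { children } => {
                let mut line = String::new();
                inline_text(children, &mut line);
                lines.push(line);
            }
            // Container blocks hold more blocks, so recurse.
            Block::BlockQuote { children } => lines.extend(plain_text(children)),
        }
    }
    lines
}
```

An exhaustive `match` over an owned enum tree means the compiler flags any variant the walker forgets to handle.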
Handle diagnostics (tolerant vs strict)
use ;
// Tolerant parse: problems are collected, never thrown.
let output = default.parse;
for diagnostic in &output.diagnostics
// `parse_strict` promotes any error-severity diagnostic (or a config conflict)
// to a hard `Err`.
match default.parse_strict
span is Option<Span> because a hand-built node may lack a source location. Parser diagnostics, AST validation, and serializer/HTML pre-validation are three separate domains that share one Diagnostic type.
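The tolerant/strict split can be sketched independently of the crate. In the example below, `Severity`, this `Diagnostic` struct, and the toy checks in `lint_tolerant` / `lint_strict` are all assumptions invented for illustration; the point is only that the strict entry point reuses the tolerant one and promotes any error-severity diagnostic to `Err`.

```rust
// Stand-in types for illustration; the crate's `Diagnostic` has its own fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Warning,
    Error,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Diagnostic {
    severity: Severity,
    message: String,
    /// Half-open byte range; `None` when there is no source location.
    span: Option<(usize, usize)>,
}

/// Tolerant pass: never fails, only records problems it notices.
fn lint_tolerant(source: &str) -> Vec<Diagnostic> {
    let mut diagnostics = Vec::new();
    let mut offset = 0; // byte offset of the current line start
    for line in source.split('\n') {
        let span = Some((offset, offset + line.len()));
        if line.ends_with(' ') {
            diagnostics.push(Diagnostic {
                severity: Severity::Warning,
                message: "trailing whitespace".to_string(),
                span,
            });
        }
        // Toy rule: an odd number of backticks means an unclosed code span.
        if line.matches('`').count() % 2 == 1 {
            diagnostics.push(Diagnostic {
                severity: Severity::Error,
                message: "unclosed code span".to_string(),
                span,
            });
        }
        offset += line.len() + 1; // +1 for the '\n' separator
    }
    diagnostics
}

/// Strict pass: same checks, but any error-severity diagnostic becomes `Err`.
fn lint_strict(source: &str) -> Result<Vec<Diagnostic>, Vec<Diagnostic>> {
    let diagnostics = lint_tolerant(source);
    if diagnostics.iter().any(|d| d.severity == Severity::Error) {
        Err(diagnostics)
    } else {
        Ok(diagnostics)
    }
}
```

Warnings still come back in the `Ok` case, so a strict caller does not lose the non-fatal diagnostics.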
Customize serialization
use ;
// `SerializeOptions` is #[non_exhaustive]: mutate a default rather than using a
// struct literal.
let mut options = SerializeOptions::default();
options.line_ending = CrLf;
options.final_newline = false;
let markdown = parse.document.to_markdown_with?;
assert_eq!;
# Ok::
Because SerializeOptions is #[non_exhaustive], external code cannot struct-literal-construct it (even with ..Default::default(), E0639) — mutate a default() instead.
Source positions (optional)
use ;
let source = "# Title\n\nHello.";
let document = parse(source).document;
let index = LineIndex::new(source);
// Spans are absolute, half-open UTF-8 byte ranges; `None` for hand-built nodes.
if let Some = document.children.first
Spans are absolute half-open UTF-8 byte ranges, None for hand-built nodes. LineIndex turns a Span into 1-based LinePosition line/column.
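For readers curious how a byte offset becomes a 1-based line/column, one common approach is to record the byte offset of every line start and binary-search it. The `LineStarts` type below is a hypothetical stand-in written for this example, not the crate's `LineIndex`; in particular it counts columns in bytes, and the README does not say whether the crate counts bytes or characters.

```rust
/// Hypothetical stand-in for a line index: byte offsets of every line start.
struct LineStarts {
    starts: Vec<usize>,
}

impl LineStarts {
    /// Scan the source once and remember where each line begins.
    fn new(source: &str) -> Self {
        let mut starts = vec![0];
        for (i, byte) in source.bytes().enumerate() {
            if byte == b'\n' {
                starts.push(i + 1);
            }
        }
        LineStarts { starts }
    }

    /// Map an absolute byte offset to a 1-based (line, column) pair.
    /// Columns are counted in bytes in this sketch.
    fn position(&self, offset: usize) -> (usize, usize) {
        // Number of line starts that are <= offset; at least 1 because starts[0] == 0.
        let line = self.starts.partition_point(|&start| start <= offset);
        let line_start = self.starts[line - 1];
        (line, offset - line_start + 1)
    }
}
```

With the sample source `"# Title\n\nHello."`, the paragraph occupies the half-open byte range 9..15, which this sketch maps to line 3, columns 1 through 7 (the end column is exclusive).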
Build an AST by hand
The prelude imports the common surface in one line:
use markdown_syntax::prelude::*;
let document = Document ;
// Hand-built nodes carry no span.
assert_eq!;
assert_eq!;
HTML rendering (opt-in)
The HTML renderer ships behind the non-default html feature and is safe by default: it validates the AST first, escapes raw HTML, blanks dangerous link/image protocols, and disables task-list checkboxes.
cargo add markdown-syntax --features html
// Requires `--features html`; the default doctest build has no html feature,
// so this block is `rust,ignore`.
use ;
let document = parse.document;
// Default is safe: raw HTML is escaped, dangerous link/image protocols blanked.
let safe: = document.to_html;
assert!;
// `HtmlOptions` is #[non_exhaustive]: mutate a default to opt into raw HTML.
let mut options = HtmlOptions::default();
options.allow_dangerous_html = true;
options.safe_raw_html_form = OmitPlaceholder;
let _ = document.to_html_with;
See HtmlOptions. docs.rs builds with the html feature enabled, so the renderer's API is fully documented there.
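To make "safe by default" concrete, here is a small standalone sketch of the two ideas named above: escaping raw HTML and blanking link targets with an unexpected protocol. The function names and the allow-list (`http`, `https`, `mailto`) are assumptions chosen for the example; the README does not state which protocols the crate treats as dangerous, and this is not its implementation.

```rust
/// Escape the characters that are significant in HTML text and attributes.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Return the URL unchanged unless it has a scheme outside a small allow-list,
/// in which case return an empty string. The allow-list is an assumption.
fn sanitize_url(url: &str) -> &str {
    const ALLOWED: [&str; 3] = ["http", "https", "mailto"];
    match url.split_once(':') {
        // A colon that follows '/', '?' or '#' belongs to a path, query or
        // fragment, so the URL is relative and has no scheme.
        Some((scheme, _)) if !scheme.contains(&['/', '?', '#'][..]) => {
            if ALLOWED.iter().any(|allowed| scheme.eq_ignore_ascii_case(allowed)) {
                url
            } else {
                ""
            }
        }
        _ => url,
    }
}
```

An allow-list is used here rather than a deny-list so that unknown schemes are blanked by default.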
Dialects & constructs reference
| Preset | .parse builder | Membership note |
|---|---|---|
| commonmark | SyntaxOptions::commonmark() | CommonMark core only |
| gfm | SyntaxOptions::gfm() | CommonMark + tables, task lists, strikethrough, autolinks, footnotes |
| mdx | SyntaxOptions::mdx() | MDX JSX/expressions/ESM on; raw HTML off |
| default (== max) | SyntaxOptions::default() / parse | Maximal non-MDX dialect (see below) |
underline (__text__) is off in default because it would override CommonMark strong; MDX is off by default and conflicts with raw HTML; wikilinks default to title-after-pipe. For the full Construct (~21 variants) and Constructs (~33 fields) surface, see Construct and Constructs on docs.rs.
Cargo features:
| Feature | Default | What it adds |
|---|---|---|
| default | [] (empty) | Byte-stable no_std + alloc core: parser, AST, serializer, validation, Span/LineIndex, prelude. Zero runtime deps. |
| html | off | Opt-in, additive, safe-by-default to_html / to_html_with and the html module. Stays no_std + alloc, zero runtime deps. |
How it works
- AST-first public API: parse produces an owned Document; parser event streams and internal block operations are private, not v1 compatibility surfaces.
- Owned enum tree over alloc types.
- Optional source spans: half-open absolute byte ranges on every node, None for hand-built nodes; line/column derived via LineIndex.
- Tolerant by default: diagnostics are collected, not thrown.
Scope & limitations
In scope — the maximal default dialect: GFM (tables, task lists, strikethrough, literal/relaxed autolinks, alerts), footnotes (incl. inline), inline + block math, frontmatter (--- / +++), wikilinks (title-after-pipe default), the extra inline marks (insert ++, highlight ==, subscript ~, superscript ^, spoiler ||, shortcodes :tada:), description lists, and the :name / ::name / :::name directive family.
Non-goals:
- underline (__text__) is off by default: it would override CommonMark strong.
- MDX (JSX / expressions / ESM) is off by default and conflicts with raw HTML.
- Raw HTML and MDX are represented only as Markdown syntax nodes — no HTML rendering/sanitization, no MDX evaluation, no syntax highlighting, and no DOM post-processing in the default build.
- The serializer performs no HTML safety filtering and does not preserve byte-for-byte authoring style from a bare AST.
- Validation is conservative and does not prove every semantic invariant of a hand-written AST.
- Directives (:name / ::name / :::name) are a distinct family and are never MDX.
Compatibility
no_std + alloc (crate root is #![no_std] + extern crate alloc). Default features are empty; the opt-in html feature also stays no_std + alloc. Zero runtime dependencies. MSRV 1.82 (edition 2021).
Contributing & conformance
Tests live in tests/. AST→HTML correctness is measured against vendored CommonMark/GFM oracles; observe the current numbers with cargo test --features html --test html_conformance -- --nocapture.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.