markdown-ppp
markdown-ppp is a feature-rich, flexible, and lightweight Rust library for parsing and processing Markdown documents.
It provides a clean, well-structured Abstract Syntax Tree (AST) for parsed documents, making it suitable for pretty-printing, analyzing, transforming, or rendering Markdown.
✨ Features
- Markdown Parsing — Full Markdown parsing support with strict AST structure.
- Pretty-printing and processing — Build, modify, and reformat Markdown easily.
- Render to HTML — Convert Markdown AST to HTML.
- Render to LaTeX — Convert Markdown AST to LaTeX with configurable styles.
- AST Transformation — Comprehensive toolkit for modifying, querying, and transforming parsed documents.
- GitHub Alerts — Native support for GitHub-style markdown alerts ([!NOTE], [!TIP], [!WARNING], etc.).
- Modular design — You can disable parsing entirely and use only the AST types.
📦 Installation
Add the crate using Cargo:
cargo add markdown-ppp
If you want only the AST definitions without parsing functionality, disable default features manually:
[dependencies]
markdown-ppp = { version = "0.1.0", default-features = false }
🛠 Usage
Parsing Markdown
The main entry point for parsing is the parse_markdown function, available at:
markdown_ppp::parser::parse_markdown
Example:
use markdown_ppp::parser::parse_markdown;
use markdown_ppp::ast::Document;
use std::rc::Rc;
MarkdownParserState
The MarkdownParserState controls parsing behavior and can be customized.
You can create a default state easily:
use markdown_ppp::parser::*;
let state = MarkdownParserState::default();
Alternatively, you can configure it manually by providing a MarkdownParserConfig:
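use markdown_ppp::parser::config::*;
let config = MarkdownParserConfig::default()
    .with_block_blockquote_behavior(ElementBehavior::Ignore);
// `input` is the Markdown source as a &str
let ast = parse_markdown(MarkdownParserState::with_config(config), input)?;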
This allows you to control how certain Markdown elements are parsed or ignored.
🧩 Customizing the parsing behavior
You can control how individual Markdown elements are parsed at a fine-grained level using the MarkdownParserConfig API.
Each element type (block-level or inline-level) can be configured with an ElementBehavior.
These behaviors can be set via builder-style methods on the config. For example, to skip parsing of thematic breaks and transform blockquotes:
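use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Block;
let config = MarkdownParserConfig::default()
    // Skip thematic breaks
    .with_block_thematic_break_behavior(ElementBehavior::Skip)
    // Transform blockquotes with a mapping function (not shown here)
    .with_block_blockquote_behavior(ElementBehavior::Map(/* mapping function */));
// `input` is the Markdown source as a &str
let ast = parse_markdown(MarkdownParserState::with_config(config), input)?;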
This mechanism allows you to override, filter, or completely redefine how each Markdown element is treated during parsing, giving you deep control over the resulting AST.
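To make the idea of per-element behaviors concrete, here is a minimal, self-contained sketch that does not use markdown-ppp at all. The `Behavior` enum, the `Config` struct, and the `apply_behavior` and `process` functions are hypothetical stand-ins, not the crate's `ElementBehavior` or `MarkdownParserConfig`; the sketch only illustrates how a per-element setting can keep, drop, or rewrite a node.

```rust
// Illustrative only: hypothetical types, not markdown-ppp's API.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    ThematicBreak,
    BlockQuote(String),
    Paragraph(String),
}

/// What to do with one kind of element.
#[derive(Clone, Copy)]
enum Behavior {
    Keep,
    Discard,
    Map(fn(Block) -> Block),
}

/// One behavior per configurable element kind.
struct Config {
    thematic_break: Behavior,
    blockquote: Behavior,
}

/// Look up the behavior for this block's kind and apply it.
fn apply_behavior(config: &Config, block: Block) -> Option<Block> {
    let behavior = match &block {
        Block::ThematicBreak => config.thematic_break,
        Block::BlockQuote(_) => config.blockquote,
        Block::Paragraph(_) => Behavior::Keep,
    };
    match behavior {
        Behavior::Keep => Some(block),
        Behavior::Discard => None,
        Behavior::Map(f) => Some(f(block)),
    }
}

/// Example mapping function: turn a blockquote into a plain paragraph.
fn quote_to_paragraph(block: Block) -> Block {
    match block {
        Block::BlockQuote(text) => Block::Paragraph(text),
        other => other,
    }
}

/// Apply the configured behaviors to every block, dropping discarded ones.
fn process(config: &Config, blocks: Vec<Block>) -> Vec<Block> {
    blocks
        .into_iter()
        .filter_map(|block| apply_behavior(config, block))
        .collect()
}

fn main() {
    let config = Config {
        thematic_break: Behavior::Discard,
        blockquote: Behavior::Map(quote_to_paragraph),
    };
    let blocks = vec![
        Block::Paragraph("intro".to_string()),
        Block::ThematicBreak,
        Block::BlockQuote("quoted".to_string()),
    ];
    println!("{:?}", process(&config, blocks));
}
```

In this toy version the behaviors run after parsing; the crate applies its behaviors during parsing, as described above.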
Registering custom parsers
You can also register your own custom block-level or inline-level parsers by providing parser functions via configuration. These parsers are executed before the built-in ones and can be used to support additional syntax or override behavior.
To register a custom block parser:
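use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Block;
use std::{cell::RefCell, rc::Rc};
use nom::IResult;
// The wrapped parser function is not shown here.
let custom_block: CustomBlockParserFn = Rc::new(/* parser function */);
let config = MarkdownParserConfig::default()
    .with_custom_block_parser(custom_block);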
Similarly, to register a custom inline parser:
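use markdown_ppp::parser::config::*;
use markdown_ppp::ast::Inline;
use std::{cell::RefCell, rc::Rc};
use nom::IResult;
// The wrapped parser function is not shown here.
let custom_inline: CustomInlineParserFn = Rc::new(/* parser function */);
let config = config.with_custom_inline_parser(custom_inline);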
This extensibility allows you to integrate domain-specific syntax and behaviors into the Markdown parser while reusing the base logic and AST structure provided by markdown-ppp.
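The "custom parsers run before the built-in ones" ordering can be sketched without the crate. Everything below is hypothetical: `BlockParser`, `parse_note`, `builtin_paragraph`, and `parse_line` are illustrative names, and the parsers return `Option` rather than nom's `IResult` to keep the example dependency-free.

```rust
// Illustrative only: a toy line parser, not markdown-ppp's parser types.
#[derive(Debug, PartialEq)]
enum Block {
    Paragraph(String),
    Note(String),
}

/// A parser either recognizes the line and returns a block, or declines.
type BlockParser = fn(&str) -> Option<Block>;

/// Hypothetical custom syntax: lines starting with "::note ".
fn parse_note(line: &str) -> Option<Block> {
    line.strip_prefix("::note ")
        .map(|rest| Block::Note(rest.to_string()))
}

/// Stand-in for the built-in fallback: everything is a paragraph.
fn builtin_paragraph(line: &str) -> Block {
    Block::Paragraph(line.to_string())
}

/// Try each custom parser in order; fall back to the built-in one.
fn parse_line(line: &str, custom: &[BlockParser]) -> Block {
    custom
        .iter()
        .find_map(|parser| parser(line))
        .unwrap_or_else(|| builtin_paragraph(line))
}

fn main() {
    let custom: [BlockParser; 1] = [parse_note];
    for line in ["::note remember this", "plain text"] {
        println!("{:?}", parse_line(line, &custom));
    }
}
```

Because the custom parsers are consulted first, they can both add new syntax and shadow input that the fallback would otherwise accept.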
📄 AST structure
The complete Markdown Abstract Syntax Tree (AST) is defined inside the module markdown_ppp::ast.
The Document struct represents the root node, and from there you can traverse the full tree of blocks and inlines, such as headings, paragraphs, lists, emphasis, and more.
You can use the AST independently without the parsing functionality by disabling default features.
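As a rough picture of what traversing such a tree looks like, the sketch below defines a deliberately tiny, hypothetical AST and walks it to collect heading text. The `Document`, `Block`, and `Inline` types here are simplified stand-ins written for this example; the real definitions in markdown_ppp::ast are richer and may be shaped differently.

```rust
// Illustrative only: simplified stand-ins for the types in markdown_ppp::ast.
#[allow(dead_code)]
#[derive(Debug)]
enum Inline {
    Text(String),
    Emphasis(Vec<Inline>),
}

#[allow(dead_code)]
#[derive(Debug)]
enum Block {
    Heading { level: u8, content: Vec<Inline> },
    Paragraph(Vec<Inline>),
    List(Vec<Vec<Block>>),
}

#[derive(Debug)]
struct Document {
    blocks: Vec<Block>,
}

/// Append the plain text of a run of inlines to `out`.
fn inline_text(inlines: &[Inline], out: &mut String) {
    for inline in inlines {
        match inline {
            Inline::Text(text) => out.push_str(text),
            Inline::Emphasis(children) => inline_text(children, out),
        }
    }
}

/// Walk blocks depth-first, collecting (level, text) for every heading.
fn collect_headings(blocks: &[Block], out: &mut Vec<(u8, String)>) {
    for block in blocks {
        match block {
            Block::Heading { level, content } => {
                let mut text = String::new();
                inline_text(content, &mut text);
                out.push((*level, text));
            }
            Block::Paragraph(_) => {}
            Block::List(items) => {
                for item in items {
                    collect_headings(item, out);
                }
            }
        }
    }
}

fn sample() -> Document {
    Document {
        blocks: vec![
            Block::Heading {
                level: 1,
                content: vec![
                    Inline::Text("Hello ".to_string()),
                    Inline::Emphasis(vec![Inline::Text("world".to_string())]),
                ],
            },
            Block::Paragraph(vec![Inline::Text("Intro.".to_string())]),
            Block::List(vec![vec![Block::Heading {
                level: 2,
                content: vec![Inline::Text("Nested".to_string())],
            }]]),
        ],
    }
}

fn main() {
    let doc = sample();
    let mut headings = Vec::new();
    collect_headings(&doc.blocks, &mut headings);
    for (level, text) in &headings {
        println!("h{level}: {text}");
    }
}
```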
🔄 AST Transformation
The ast_transform module provides a comprehensive toolkit for modifying, querying, and transforming parsed Markdown documents. This feature is disabled by default and must be enabled via the ast-transform feature.
Quick Start
Enable the feature in your Cargo.toml:
[dependencies]
markdown-ppp = { version = "2.4.0", features = ["ast-transform"] }
Then use the transformation API:
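use markdown_ppp::parser::{parse_markdown, MarkdownParserState};
use markdown_ppp::ast_transform::*;
let state = MarkdownParserState::new();
// `input` is the Markdown source as a &str
let doc = parse_markdown(state, input).unwrap();
// Transform all text to uppercase
let doc = doc.transform_text(|text| text.to_uppercase());
// Remove empty elements and normalize whitespace
let doc = doc.remove_empty_text().normalize_whitespace();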
Transformation Patterns
The module provides several patterns for working with the AST:
1. Convenience Methods - High-level transformations
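use markdown_ppp::ast_transform::*;
// The closures passed to each transformation are not shown here.
let doc = doc
    .transform_text(/* text closure */)
    .transform_image_urls(/* URL closure */)
    .transform_link_urls(/* URL closure */)
    .remove_empty_paragraphs()
    .normalize_whitespace();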
2. Visitor Pattern - Read-only analysis
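use markdown_ppp::ast_transform::*;
// LinkCollector is a user-defined visitor; its definition is not shown here.
let mut collector = LinkCollector { /* fields */ };
doc.visit_with(&mut collector);
println!(/* report the collected links */);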
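For readers unfamiliar with the pattern, here is a self-contained sketch of a read-only visitor that collects link URLs. The `Visitor` trait, the miniature `Inline` and `Document` types, and the `urls` field are all hypothetical and written for this example; the crate's own visitor trait and AST may differ in names and shape.

```rust
// Illustrative only: a miniature AST and visitor, not markdown-ppp's types.
#[allow(dead_code)]
enum Inline {
    Text(String),
    Emphasis(Vec<Inline>),
    Link { url: String, children: Vec<Inline> },
}

struct Document {
    paragraphs: Vec<Vec<Inline>>,
}

trait Visitor {
    /// Called for every inline node; the default just descends.
    fn visit_inline(&mut self, inline: &Inline) {
        self.walk_inline(inline);
    }

    /// Recurse into the children of `inline`.
    fn walk_inline(&mut self, inline: &Inline) {
        match inline {
            Inline::Text(_) => {}
            Inline::Emphasis(children) | Inline::Link { children, .. } => {
                for child in children {
                    self.visit_inline(child);
                }
            }
        }
    }
}

#[derive(Default)]
struct LinkCollector {
    urls: Vec<String>,
}

impl Visitor for LinkCollector {
    fn visit_inline(&mut self, inline: &Inline) {
        // Record the URL, then keep walking so nested links are found too.
        if let Inline::Link { url, .. } = inline {
            self.urls.push(url.clone());
        }
        self.walk_inline(inline);
    }
}

impl Document {
    fn visit_with(&self, visitor: &mut impl Visitor) {
        for inline in self.paragraphs.iter().flatten() {
            visitor.visit_inline(inline);
        }
    }
}

fn sample() -> Document {
    Document {
        paragraphs: vec![
            vec![
                Inline::Text("See ".to_string()),
                Inline::Link {
                    url: "https://a.example".to_string(),
                    children: vec![Inline::Text("A".to_string())],
                },
            ],
            vec![Inline::Emphasis(vec![Inline::Link {
                url: "https://b.example".to_string(),
                children: vec![],
            }])],
        ],
    }
}

fn main() {
    let doc = sample();
    let mut collector = LinkCollector::default();
    doc.visit_with(&mut collector);
    println!("Found {} links: {:?}", collector.urls.len(), collector.urls);
}
```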
3. Query API - Find elements by conditions
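use markdown_ppp::ast::{Block, Inline};
use markdown_ppp::ast_transform::Query;
// Find all autolinks
let autolinks = doc.find_all_inlines(|inline| matches!(inline, Inline::Autolink(_)));
// Count code blocks
let code_count = doc.count_blocks(|block| matches!(block, Block::CodeBlock(_)));
// Find first heading
let first_heading = doc.find_first_block(|block| matches!(block, Block::Heading(_)));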
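Predicate-based queries of this kind reduce to a recursive search. The sketch below shows one way that could work on a toy block tree; the `Block` enum and the free functions are hypothetical and only borrow their names from the methods above for readability.

```rust
// Illustrative only: a toy block tree, not markdown-ppp's Query trait.
#[derive(Debug, PartialEq)]
enum Block {
    Heading(u8, String),
    CodeBlock(String),
    Paragraph(String),
    BlockQuote(Vec<Block>),
}

/// Collect every block matching `pred`, in document order (pre-order).
fn find_all_blocks<'a>(blocks: &'a [Block], pred: &dyn Fn(&Block) -> bool) -> Vec<&'a Block> {
    let mut found = Vec::new();
    for block in blocks {
        if pred(block) {
            found.push(block);
        }
        // Descend into container blocks.
        if let Block::BlockQuote(children) = block {
            found.extend(find_all_blocks(children, pred));
        }
    }
    found
}

fn count_blocks(blocks: &[Block], pred: &dyn Fn(&Block) -> bool) -> usize {
    find_all_blocks(blocks, pred).len()
}

fn find_first_block<'a>(blocks: &'a [Block], pred: &dyn Fn(&Block) -> bool) -> Option<&'a Block> {
    find_all_blocks(blocks, pred).into_iter().next()
}

fn sample() -> Vec<Block> {
    vec![
        Block::Paragraph("intro".to_string()),
        Block::CodeBlock("fn main() {}".to_string()),
        Block::BlockQuote(vec![
            Block::Heading(2, "Quoted heading".to_string()),
            Block::CodeBlock("let x = 1;".to_string()),
        ]),
        Block::Heading(1, "Later heading".to_string()),
    ]
}

fn main() {
    let blocks = sample();
    let code_count = count_blocks(&blocks, &|b: &Block| matches!(b, Block::CodeBlock(_)));
    let first_heading = find_first_block(&blocks, &|b: &Block| matches!(b, Block::Heading(..)));
    println!("code blocks: {code_count}, first heading: {first_heading:?}");
}
```

A production implementation would likely stop at the first match instead of collecting everything; the sketch favors brevity.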
4. Custom Transformers - Advanced modifications
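use markdown_ppp::ast_transform::*;
// A user-defined transformer type is assumed here; its definition is not shown.
let doc = doc.transform_with(/* custom transformer */);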
5. Pipeline Builder - Complex transformations
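use markdown_ppp::ast_transform::TransformPipeline;
// The arguments to each step are not shown here.
let result = TransformPipeline::new()
    .transform_text(/* text closure */)
    .transform_image_urls(/* URL closure */)
    .when(/* condition and conditional steps */)
    .normalize_whitespace()
    .remove_empty_paragraphs()
    .apply(doc);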
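The builder idea (queue up steps, optionally add some under a condition, then apply them in order) can be sketched in a few lines. This is a toy model under stated assumptions: `Doc`, `Pipeline`, and the signature of `when` are hypothetical and do not describe the crate's `TransformPipeline`.

```rust
// Illustrative only: a toy document and pipeline, not markdown-ppp's TransformPipeline.
#[derive(Debug, PartialEq)]
struct Doc {
    paragraphs: Vec<String>,
}

type Step = Box<dyn Fn(Doc) -> Doc>;

struct Pipeline {
    steps: Vec<Step>,
}

impl Pipeline {
    fn new() -> Self {
        Pipeline { steps: Vec::new() }
    }

    /// Queue an arbitrary document-to-document step.
    fn step<F>(mut self, f: F) -> Self
    where
        F: Fn(Doc) -> Doc + 'static,
    {
        self.steps.push(Box::new(f));
        self
    }

    /// Queue a step that rewrites the text of every paragraph.
    fn transform_text<F>(self, f: F) -> Self
    where
        F: Fn(&str) -> String + 'static,
    {
        self.step(move |mut doc| {
            doc.paragraphs = doc.paragraphs.iter().map(|p| f(p.as_str())).collect();
            doc
        })
    }

    /// Queue a step that drops paragraphs containing only whitespace.
    fn remove_empty_paragraphs(self) -> Self {
        self.step(|mut doc| {
            doc.paragraphs.retain(|p| !p.trim().is_empty());
            doc
        })
    }

    /// Extend the pipeline with `build` only when `condition` holds.
    fn when<F>(self, condition: bool, build: F) -> Self
    where
        F: FnOnce(Pipeline) -> Pipeline,
    {
        if condition {
            build(self)
        } else {
            self
        }
    }

    /// Run the queued steps in the order they were added.
    fn apply(&self, doc: Doc) -> Doc {
        self.steps.iter().fold(doc, |doc, step| step(doc))
    }
}

fn main() {
    let shout = true;
    let pipeline = Pipeline::new()
        .transform_text(|t| t.trim().to_string())
        .when(shout, |p| p.transform_text(|t| t.to_uppercase()))
        .remove_empty_paragraphs();
    let doc = Doc {
        paragraphs: vec!["  hello ".to_string(), "   ".to_string(), "world".to_string()],
    };
    println!("{:?}", pipeline.apply(doc));
}
```

Storing steps as boxed closures keeps the builder reusable: the same pipeline value can be applied to many documents.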
Available Transformations
- Text transformations: transform_text, transform_code, transform_html
- URL transformations: transform_image_urls, transform_link_urls, transform_autolink_urls
- Filtering: remove_empty_paragraphs, remove_empty_text, filter_blocks
- Normalization: normalize_whitespace
- Custom: transform_with, transform_if
🖨️ Pretty-printing (AST → Markdown)
You can convert an AST (Document) back into a formatted Markdown string using the render_markdown function from the printer module.
This feature is enabled by default via the printer feature.
Basic example
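use markdown_ppp::printer::render_markdown;
use markdown_ppp::printer::config::Config;
use markdown_ppp::ast::Document;
// Assume you already have a parsed or constructed Document
let document = Document::default();
// Render it back to a Markdown string with default configuration
let markdown_output = render_markdown(&document, Config::default());
println!("{}", markdown_output);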
This will format the Markdown with a default line width of 80 characters.
Customizing output width
You can control the maximum width of lines in the generated Markdown by customizing the Config:
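use markdown_ppp::printer::render_markdown;
use markdown_ppp::printer::config::Config;
use markdown_ppp::ast::Document;
// Set a custom maximum width, e.g., 120 characters
let config = Config::default().with_width(120);
let markdown_output = render_markdown(&document, config);
println!("{}", markdown_output);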
This is useful if you want to control wrapping behavior or generate more compact or expanded Markdown documents.
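To illustrate what a width limit means in practice, here is a minimal greedy word-wrap sketch. It is a swapped-in simplification for illustration, not the algorithm the printer uses, and the `wrap` function is a hypothetical helper that only handles plain text.

```rust
/// Greedily pack whitespace-separated words into lines of at most `width`
/// characters. A single word longer than `width` gets a line of its own.
fn wrap(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        let current_len = current.chars().count();
        // Would adding this word (plus a separating space) overflow the line?
        if !current.is_empty() && current_len + 1 + word_len > width {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

fn main() {
    for line in wrap("the quick brown fox jumps", 10) {
        println!("{line}");
    }
}
```

A narrower width produces more, shorter lines; a wider one keeps more words on each line.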
🖨️ Pretty-printing (AST → HTML)
You can convert an AST (Document) into an HTML string using the render_html function from the html_printer module.
This feature is enabled by default via the html-printer feature.
Basic example
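use markdown_ppp::html_printer::render_html;
use markdown_ppp::html_printer::config::Config;
use markdown_ppp::ast::Document;
use markdown_ppp::parser::{parse_markdown, MarkdownParserState};
let config = Config::default();
// `input` is the Markdown source as a &str
let ast = parse_markdown(MarkdownParserState::default(), input)
    .unwrap();
println!("{}", render_html(&ast, config));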
📄 LaTeX Rendering (AST → LaTeX)
You can convert an AST (Document) into LaTeX format using the render_latex function from the latex_printer module.
This feature is disabled by default and must be enabled via the latex-printer feature.
Basic example
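use markdown_ppp::latex_printer::render_latex;
use markdown_ppp::latex_printer::config::Config;
use markdown_ppp::ast::*;
// Any parsed or constructed Document works; an empty one is used here
let doc = Document::default();
let config = Config::default();
let latex_output = render_latex(&doc, config);
println!("{}", latex_output);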
Configuration Options
The LaTeX printer supports various configuration options for different output styles:
Table Styles
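use markdown_ppp::latex_printer::config::{Config, TableStyle};
// Use booktabs for professional tables
let config = Config::default().with_table_style(TableStyle::Booktabs);
// Use longtabu for tables that span multiple pages
let config = Config::default().with_table_style(TableStyle::Longtabu);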
Code Block Styles
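use markdown_ppp::latex_printer::config::{Config, CodeBlockStyle};
// Use minted for syntax highlighting (requires minted package)
let config = Config::default().with_code_block_style(CodeBlockStyle::Minted);
// Use listings package for code blocks
let config = Config::default().with_code_block_style(CodeBlockStyle::Listings);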
Custom Width
// Set a custom width, e.g., 120 characters
let config = Config::default().with_width(120);
let latex_output = render_latex(&doc, config);
🔧 Optional features
| Feature | Description |
|---|---|
| parser | Enables Markdown parsing support. Enabled by default. |
| printer | Enables AST → Markdown string conversion. Enabled by default. |
| html-printer | Enables AST → HTML string conversion. Enabled by default. |
| latex-printer | Enables AST → LaTeX string conversion. Disabled by default. |
| ast-transform | Enables AST transformation, query, and visitor functionality. Disabled by default. |
| ast-serde | Adds Serialize and Deserialize traits to all AST types via serde. Disabled by default. |
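If you only need the AST types without parsing functionality, you can add the crate without default features:
cargo add markdown-ppp --no-default-features
If you want to disable Markdown generation (AST → Markdown string conversion), turn off default features and re-enable the ones you need, leaving out printer:
cargo add markdown-ppp --no-default-features --features parser,html-printer
To enable LaTeX output support:
cargo add markdown-ppp --features latex-printer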
📚 Documentation
📝 License
Licensed under the MIT License.