Extract Markdown code blocks from Markdown documents parsed with
pulldown-cmark.
pulldown-cmark already exposes the fenced code block info string through
Event::Start(Tag::CodeBlock(CodeBlockKind::Fenced(info_string))). This
crate builds on top of that lower-level event stream and returns complete,
ready-to-use code block records:
- fenced or indented block kind
- language parsed from the first info string word
- raw info string
- remaining attributes as a raw string or token iterator
- code block source text
- byte range covering the whole block
- zero-based line range covering the whole block
- indentation before the opening marker
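The split between language and remaining attributes can be pictured with a small standard-library sketch. This is an illustration only, not the crate's implementation: `split_info_string` is a hypothetical helper, and the assumption that tokens are separated by plain whitespace comes from the example values in this documentation.

```rust
/// Hypothetical helper (not part of the crate): split a fenced block's info
/// string into the language (first word) and the remaining attribute text.
fn split_info_string(info: &str) -> (Option<&str>, Option<&str>) {
    let trimmed = info.trim();
    if trimmed.is_empty() {
        // No info string at all: neither a language nor attributes.
        return (None, None);
    }
    match trimmed.split_once(char::is_whitespace) {
        // `trimmed` has no trailing whitespace, so the text after the first
        // word is non-empty once leading whitespace is dropped.
        Some((language, rest)) => (Some(language), Some(rest.trim_start())),
        // A single word: language only.
        None => (Some(trimmed), None),
    }
}

fn main() {
    let (language, attributes) = split_info_string("rust runnable key=value");
    assert_eq!(language, Some("rust"));
    assert_eq!(attributes, Some("runnable key=value"));

    // The attribute text can then be walked token by token.
    let tokens: Vec<&str> = attributes.unwrap_or("").split_whitespace().collect();
    assert_eq!(tokens, ["runnable", "key=value"]);
}
```

Keeping the raw attribute string alongside a token view means callers that need exact spacing and callers that only need word lookups can both be served.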
§Example
use pulldown_cmark_codeblock::{code_blocks, CodeBlockKind};
let markdown = "# Title\n\n```rust runnable key=value\nfn main() {}\n```\n";
let block = code_blocks(markdown).next().unwrap();
assert!(matches!(block.kind, CodeBlockKind::Fenced(_)));
assert_eq!(block.language.as_deref(), Some("rust"));
assert_eq!(block.info_string, "rust runnable key=value");
assert_eq!(block.attributes.as_deref(), Some("runnable key=value"));
assert_eq!(block.attributes().collect::<Vec<_>>(), ["runnable", "key=value"]);
assert_eq!(block.source, "fn main() {}\n");
assert_eq!(block.line_range, 2..5);
§API
Use code_blocks for the concise iterator API:
use pulldown_cmark_codeblock::code_blocks;
let markdown = "Before\n\n```rust\nfn main() {}\n```\n";
let blocks = code_blocks(markdown).collect::<Vec<_>>();
assert_eq!(blocks.len(), 1);
assert_eq!(blocks[0].language.as_deref(), Some("rust"));
Use CodeBlockExtractor::from_markdown when you prefer to construct the
iterator explicitly:
use pulldown_cmark_codeblock::CodeBlockExtractor;
let markdown = "```rust\nfn main() {}\n```\n";
let blocks = CodeBlockExtractor::from_markdown(markdown).collect::<Vec<_>>();
assert_eq!(blocks[0].source, "fn main() {}\n");
Each extracted CodeBlock exposes:
- CodeBlock::kind: CodeBlockKind::Fenced or CodeBlockKind::Indented
- CodeBlock::language: first info string word for fenced blocks
- CodeBlock::info_string: complete fenced block info string
- CodeBlock::attributes: remaining info string after the language
- CodeBlock::source: code block body
- CodeBlock::byte_range: byte range covering opening marker, body, and closing marker
- CodeBlock::line_range: zero-based line range covering the whole block
- CodeBlock::indent: whitespace indentation before the opening marker
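How a byte range relates to a zero-based line range can be sketched with the standard library alone. The helper below, `line_range_of`, is hypothetical and is not claimed to be how the crate computes `line_range`. It assumes a half-open range whose end is one past the last line touched, which agrees with the `2..5` value in the example on this page.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the crate): zero-based, half-open range
/// of the lines touched by `byte_range` within `text`.
fn line_range_of(text: &str, byte_range: Range<usize>) -> Range<usize> {
    let bytes = text.as_bytes();
    let newlines = |slice: &[u8]| slice.iter().filter(|&&b| b == b'\n').count();

    // The line index of an offset is the number of newlines before it.
    let start = newlines(&bytes[..byte_range.start]);
    let inside = newlines(&bytes[byte_range.start..byte_range.end]);

    // A range that ends right after a newline has already counted its last
    // line; otherwise the trailing partial line counts as one more.
    let ends_on_newline = bytes[..byte_range.end].last() == Some(&b'\n');
    let extra = if byte_range.is_empty() || ends_on_newline { 0 } else { 1 };

    start..start + inside + extra
}

fn main() {
    let markdown = "# Title\n\n```rust runnable key=value\nfn main() {}\n```\n";
    // Byte offset of the opening fence, through the end of the closing fence line.
    let start = markdown.find("```").unwrap();
    assert_eq!(line_range_of(markdown, start..markdown.len()), 2..5);
}
```

A half-open range keeps the line count equal to `end - start`, so the three-line block above spans lines 2, 3, and 4.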
CodeBlock also provides helper methods:
use pulldown_cmark_codeblock::code_blocks;
let markdown = "```rust a b c\nfn main() {}\n```\n";
let block = code_blocks(markdown).next().unwrap();
assert!(block.is_fenced());
assert!(block.has_info_word("rust"));
assert!(block.has_attribute("b"));
assert_eq!(block.attributes().collect::<Vec<_>>(), ["a", "b", "c"]);
Indented code blocks are also returned. They do not have an info string, language, or attributes:
use pulldown_cmark_codeblock::{code_blocks, CodeBlockKind};
let markdown = "Before\n\n    indented\n\nAfter\n";
let block = code_blocks(markdown).next().unwrap();
assert!(matches!(block.kind, CodeBlockKind::Indented));
assert_eq!(block.language, None);
assert_eq!(block.info_string, "");
assert_eq!(block.attributes, None);
assert_eq!(block.source, "indented\n");
Structs§
- CodeBlock - A code block extracted from a Markdown document.
- CodeBlockExtractor - Iterator over Markdown code blocks.
Enums§
- CodeBlockKind - Code block kind.
Functions§
- code_blocks - Returns an iterator over Markdown code blocks.