pub struct MarkdownParser { /* private fields */ }
A tree-sitter based markdown parser.
Provides structured parsing of markdown documents with heading hierarchy extraction, content block identification, and diagnostic reporting. The parser is designed to be resilient to malformed input while providing detailed structural information.
§Parsing Strategy
The parser uses tree-sitter’s markdown grammar to:
- Build a complete syntax tree of the document
- Walk the tree to identify heading nodes and their levels
- Extract content blocks between headings
- Build hierarchical table of contents structure
- Generate diagnostics for quality issues
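The heading and block steps above can be illustrated without tree-sitter. The following is a minimal sketch, assuming a simplified line-based scan of ATX headings rather than the crate's actual syntax-tree walk. `Block`, `atx_heading`, and `split_into_blocks` are hypothetical names, not part of blz_core, and the scan does not skip fenced code blocks the way a real grammar would.

```rust
/// Hypothetical stand-in for a heading block; not the blz_core type.
#[derive(Debug, PartialEq)]
pub struct Block {
    pub level: usize,      // 1 for `#`, 2 for `##`, and so on
    pub title: String,
    pub start_line: usize, // 1-based line of the heading itself
    pub end_line: usize,   // 1-based last line of the block (inclusive)
}

/// Returns the heading level and title if `line` is an ATX heading.
fn atx_heading(line: &str) -> Option<(usize, &str)> {
    let hashes = line.chars().take_while(|&c| c == '#').count();
    if hashes == 0 || hashes > 6 {
        return None;
    }
    // `#` is one byte in UTF-8, so the char count is a valid byte offset.
    let rest = &line[hashes..];
    if rest.is_empty() {
        Some((hashes, ""))
    } else if rest.starts_with(' ') {
        Some((hashes, rest.trim()))
    } else {
        None // e.g. `#hashtag` is not a heading
    }
}

/// Splits `text` into one block per heading, each running until the
/// line before the next heading (or the end of the document).
pub fn split_into_blocks(text: &str) -> Vec<Block> {
    let mut blocks: Vec<Block> = Vec::new();
    let mut total_lines = 0;
    for (idx, line) in text.lines().enumerate() {
        let line_no = idx + 1;
        total_lines = line_no;
        if let Some((level, title)) = atx_heading(line) {
            // Close the previous block just before this heading.
            if let Some(prev) = blocks.last_mut() {
                prev.end_line = line_no - 1;
            }
            blocks.push(Block {
                level,
                title: title.to_string(),
                start_line: line_no,
                end_line: line_no,
            });
        }
    }
    // The final block extends to the end of the document.
    if let Some(last) = blocks.last_mut() {
        last.end_line = total_lines;
    }
    blocks
}
```

A table of contents could then be derived from the `level` values of consecutive blocks. How the real parser treats content before the first heading is not covered by this sketch.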
§Reusability
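Parser instances can be reused for multiple documents. Because parse() takes &mut self and the internal tree-sitter parser maintains mutable state across parse operations, a single instance cannot be used by several threads at once without external synchronization.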
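Since parse() takes &mut self, callers on different threads need exclusive access to a shared instance. The following is a minimal sketch of one way to arrange that with `Arc<Mutex<_>>`. `StubParser` is a hypothetical stand-in with the same `&mut self` call shape, used here so the example compiles without the crate. It is not the blz_core type, and its "parsing" only counts heading lines.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in mirroring the `&mut self` shape of `parse`.
pub struct StubParser {
    parses: usize, // mutable state carried across calls
}

impl StubParser {
    pub fn new() -> Self {
        StubParser { parses: 0 }
    }

    /// Counts lines that start with `#`; stands in for real parsing.
    pub fn parse(&mut self, text: &str) -> usize {
        self.parses += 1;
        text.lines().filter(|l| l.starts_with('#')).count()
    }

    pub fn parses(&self) -> usize {
        self.parses
    }
}

/// Parses every document on its own thread while sharing one parser.
/// Returns the per-document results (in input order) and the number
/// of parse calls the shared parser has seen.
pub fn parse_all_shared(docs: Vec<String>) -> (Vec<usize>, usize) {
    let parser = Arc::new(Mutex::new(StubParser::new()));

    let handles: Vec<_> = docs
        .into_iter()
        .map(|doc| {
            let parser = Arc::clone(&parser);
            thread::spawn(move || {
                // The lock gives this thread exclusive `&mut` access.
                let mut guard = parser.lock().expect("parser mutex poisoned");
                guard.parse(&doc)
            })
        })
        .collect();

    let counts: Vec<usize> = handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect();

    let total = parser.lock().expect("parser mutex poisoned").parses();
    (counts, total)
}
```

The mutex serializes parsing, so this pattern saves memory rather than time. Creating one parser per worker thread is the usual alternative when throughput matters.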
§Memory Management
The parser automatically manages memory for syntax trees and intermediate structures. Large documents may temporarily use significant memory during parsing, but this is released after the parse() method returns.
Implementations§
impl MarkdownParser
pub fn new() -> Result<Self>
Create a new markdown parser instance.
Initializes the tree-sitter parser with the markdown grammar. This operation may fail if the tree-sitter language cannot be loaded properly.
§Returns
Returns a new parser instance ready for use.
§Errors
Returns an error if:
- The tree-sitter markdown language cannot be loaded
- The parser cannot be initialized with the markdown grammar
- System resources are insufficient for parser creation
§Examples
use blz_core::{MarkdownParser, Result};
// Create a new parser
let mut parser = MarkdownParser::new()?;
// Parser is now ready to parse markdown content
let result = parser.parse("# Hello World\n\nContent here.")?;
assert!(!result.heading_blocks.is_empty());
§Resource Usage
Creating a parser allocates approximately 1-2MB of memory for the grammar and internal structures. This overhead is amortized across multiple parse operations.
pub fn parse(&mut self, text: &str) -> Result<ParseResult>
Parse markdown text into structured components.
Performs complete analysis of the markdown document, extracting heading hierarchy, content blocks, table of contents, and generating diagnostics for any issues found.
§Arguments
- text - The markdown content to parse (UTF-8 string)
§Returns
Returns a ParseResult containing:
- Structured heading blocks with content and line ranges
- Hierarchical table of contents
- Diagnostic messages for any issues found
- Line count and other metadata
§Errors
Returns an error if:
- The text cannot be parsed by tree-sitter (very rare)
- Memory is exhausted during parsing of extremely large documents
- Internal parsing structures cannot be built
Note: Most malformed markdown will not cause errors but will generate diagnostics.
§Examples
use blz_core::{MarkdownParser, Result};
let mut parser = MarkdownParser::new()?;
// Parse simple markdown
let result = parser.parse(r#"
This is an introduction section.
# Getting Started
Here's how to get started:
1. First step
2. Second step
## Prerequisites
You'll need these tools.
"#)?;
// Check the results
// The parser creates one block per heading, holding the content up to the next heading
assert!(result.heading_blocks.len() >= 2); // At least Getting Started and Prerequisites
assert!(!result.toc.is_empty());
// Line count represents total lines in the document
assert!(result.line_count > 0);
// Look for any parsing issues
for diagnostic in &result.diagnostics {
println!("{:?}: {}", diagnostic.severity, diagnostic.message);
}
§Performance Guidelines
- Documents up to 1MB: Parse in under 50ms
- Documents up to 10MB: Parse in under 500ms
- Very large documents: Consider streaming or chunking for better UX
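For the very large document case, one option is to split the input at top-level headings and parse each piece separately. The following is a minimal sketch under that assumption. `chunk_by_top_level_heading` is a hypothetical helper, not a blz_core API. It treats any line beginning with `# ` as a split point and does not account for such lines inside fenced code blocks.

```rust
/// Splits `text` into slices that each begin at a top-level (`# `) heading.
/// Any content before the first such heading becomes its own leading chunk.
/// The chunks borrow from `text` and concatenate back to the original.
pub fn chunk_by_top_level_heading(text: &str) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut chunk_start = 0; // byte offset where the current chunk begins
    let mut offset = 0; // byte offset of the current line

    // `split_inclusive` keeps the trailing '\n', so line lengths sum to text.len().
    for line in text.split_inclusive('\n') {
        if line.starts_with("# ") && offset > chunk_start {
            chunks.push(&text[chunk_start..offset]);
            chunk_start = offset;
        }
        offset += line.len();
    }

    if chunk_start < text.len() {
        chunks.push(&text[chunk_start..]);
    }
    chunks
}
```

Each chunk could then be passed to parse() in turn with a single reused parser. Line ranges reported for a chunk would presumably be relative to that chunk, so a caller would need to add the chunk's starting line to map them back to the full document.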
§Memory Usage
Memory usage during parsing is approximately:
- Small documents (< 100KB): ~2x document size
- Large documents (> 1MB): ~1.5x document size
- Peak usage occurs during tree traversal and structure building
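The multipliers above can be turned into a rough budgeting helper. This is a minimal sketch, not a measurement: `estimated_peak_bytes` is a hypothetical function, the 1024-based units are an assumption, and the 100KB to 1MB range is not specified above, so the sketch conservatively applies the 2x figure there.

```rust
const KB: usize = 1024; // assumption: binary units
const MB: usize = 1024 * KB;

/// Rough peak-memory estimate in bytes for parsing a document of
/// `doc_len` bytes, based on the approximate multipliers listed above.
pub fn estimated_peak_bytes(doc_len: usize) -> usize {
    if doc_len > MB {
        // Large documents: roughly 1.5x the document size.
        doc_len + doc_len / 2
    } else {
        // Small documents: roughly 2x. The 100KB..=1MB range is not
        // documented; using 2x there is this sketch's own assumption.
        doc_len * 2
    }
}
```

A caller might compare the estimate against a memory budget before deciding whether to parse a document in one pass.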
Auto Trait Implementations§
impl Freeze for MarkdownParser
impl RefUnwindSafe for MarkdownParser
impl Send for MarkdownParser
impl Sync for MarkdownParser
impl Unpin for MarkdownParser
impl UnwindSafe for MarkdownParser
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> Downcast for T where T: Any
fn into_any(self: Box<T>) -> Box<dyn Any>
Convert Box<dyn Trait> (where Trait: Downcast) to Box<dyn Any>. Box<dyn Any> can then be further downcast into Box<ConcreteType> where ConcreteType implements Trait.
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Convert Rc<Trait> (where Trait: Downcast) to Rc<Any>. Rc<Any> can then be further downcast into Rc<ConcreteType> where ConcreteType implements Trait.
fn as_any(&self) -> &(dyn Any + 'static)
Convert &Trait (where Trait: Downcast) to &Any. This is needed since Rust cannot generate &Any’s vtable from &Trait’s.
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
Convert &mut Trait (where Trait: Downcast) to &mut Any. This is needed since Rust cannot generate &mut Any’s vtable from &mut Trait’s.
impl<T> DowncastSync for T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.