[−][src]Crate html_parser
Html parser
WIP - work in progress, use at your own risk
A simple and general purpose html parser, using Pest.
What is it not
- It's not a high-performance browser-grade parser
- It's not 100% complient with html
- It's not a parser that includes node selection and dom manipulation
If your requirements matches any of the above, then you're most likely looking for crates described below:
Features
- Parse html document
- Parse html fragments
- Parse custom, non-standard, elements
- Doesn't include comments in the AST
- Removed dangling elements
Examples
Parse html document
use html_parser::HtmlParser; fn main() { let html = r#" <!doctype html> <html lang="en"> <head> <meta charset="utf-8"> <title>Html parser</title> </head> <body> <h1 id="a" class="b c">Hello world</h1> </h1> <!-- dangling nodes are removed --> </body> </html>"#; assert!(HtmlParser::parse(html).is_ok()); }
Parse html fragment
use html_parser::HtmlParser; fn main() { let html = "<div id=cat />"; assert!(HtmlParser::parse(html).is_ok()); }
Print to json
use html_parser::{HtmlParser, Result}; fn main() -> Result<()> { let html = "<div id=cat />"; let json = HtmlParser::parse(html)?.to_json_pretty()?; println!("{}", json); Ok(()) }
Contributions
I would love to get some feedback if you find my little project useful. Please feel free to highlight issues with my code or submit a PR in case you want to improve it.
Structs
Element | |
HtmlParser |
Enums
AstVariant | |
ElementVariant | |
Node |
Type Definitions
Result |
|