Error-tolerant HTML parser.
This module implements an error-tolerant HTML 4.01 parser, similar to
libxml2’s HTMLparser.c. Unlike the strict XML parser, this parser handles
common HTML patterns that are technically malformed:
- Missing closing tags (auto-closed based on HTML content model rules)
- Unquoted attribute values (<div class=main>)
- Void elements that never need closing (<br>, <img>, <hr>, etc.)
- Case-insensitive tag name matching
- Bare & characters (not just &amp;)
- Missing doctype
- Boolean attributes without values (<input disabled>)
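To make a few of these leniency rules concrete, here is a minimal, self-contained sketch of void-element detection with case-insensitive matching, plus lenient attribute parsing that accepts unquoted values and boolean attributes. It is an illustration only and is not taken from xmloxide: the names VOID_ELEMENTS, is_void_element, and parse_attributes are hypothetical, and the crate's real implementation may work quite differently. The void-element list assumes the elements declared EMPTY in the HTML 4.01 DTDs.

```rust
/// Elements declared EMPTY in the HTML 4.01 DTDs (assumption: the strict,
/// transitional, and frameset DTDs combined). Hypothetical helper data,
/// not part of xmloxide.
const VOID_ELEMENTS: [&str; 13] = [
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
];

/// Returns true if `name` is a void element, ignoring ASCII case.
fn is_void_element(name: &str) -> bool {
    VOID_ELEMENTS.iter().any(|v| v.eq_ignore_ascii_case(name))
}

/// Leniently parses the attribute portion of a start tag.
/// Accepts quoted values, unquoted values, and boolean attributes
/// (returned with `None` as the value). Names are lowercased.
fn parse_attributes(input: &str) -> Vec<(String, Option<String>)> {
    let chars: Vec<char> = input.chars().collect();
    let mut attrs = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        // Skip whitespace between attributes.
        while i < chars.len() && chars[i].is_whitespace() {
            i += 1;
        }
        if i >= chars.len() {
            break;
        }
        // Read the attribute name up to whitespace or '='.
        let start = i;
        while i < chars.len() && !chars[i].is_whitespace() && chars[i] != '=' {
            i += 1;
        }
        let name: String = chars[start..i]
            .iter()
            .collect::<String>()
            .to_ascii_lowercase();
        if name.is_empty() {
            // Stray '=' with no name in front of it: skip it and move on.
            i += 1;
            continue;
        }
        // Look past optional whitespace for an '='.
        let mut j = i;
        while j < chars.len() && chars[j].is_whitespace() {
            j += 1;
        }
        if j < chars.len() && chars[j] == '=' {
            j += 1;
            while j < chars.len() && chars[j].is_whitespace() {
                j += 1;
            }
            let value: String = if j < chars.len() && (chars[j] == '"' || chars[j] == '\'') {
                // Quoted value: read up to the matching quote (or end of input).
                let quote = chars[j];
                j += 1;
                let vstart = j;
                while j < chars.len() && chars[j] != quote {
                    j += 1;
                }
                let v: String = chars[vstart..j].iter().collect();
                if j < chars.len() {
                    j += 1; // consume the closing quote
                }
                v
            } else {
                // Unquoted value: read up to the next whitespace.
                let vstart = j;
                while j < chars.len() && !chars[j].is_whitespace() {
                    j += 1;
                }
                chars[vstart..j].iter().collect()
            };
            attrs.push((name, Some(value)));
            i = j;
        } else {
            // Boolean attribute: no '=' follows the name.
            attrs.push((name, None));
        }
    }
    attrs
}
```

In this sketch, is_void_element("BR") and is_void_element("img") both return true, while parse_attributes applied to the text class=main disabled yields the pair ("class", Some("main")) followed by ("disabled", None).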
The parser produces the same Document tree structure as the XML parser.
§Examples
use xmloxide::html::parse_html;
let doc = parse_html("<p>Hello <b>world</b>").unwrap();
let root = doc.root_element().unwrap();
assert_eq!(doc.node_name(root), Some("html"));
Modules§
- entities: HTML named character references.
Structs§
- HtmlParseOptions: Options controlling HTML parser behavior.
Functions§
- parse_html: Parses an HTML string into a Document with default options.
- parse_html_with_options: Parses an HTML string into a Document with the given options.