HTML text extractor (RFC-005 §5; RFC-044 §16.3 resource limits).
Strips HTML tags with a simple state-machine parser and preserves
visible text content. Block-level elements produce paragraph
boundaries. <h1>–<h6> headings populate heading_path.
Security: no JavaScript execution, no external resource loading, no DOM construction. Pure text extraction only (RFC-015 §15).
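The approach described above can be illustrated with a minimal sketch. This is not the module's actual implementation: the names `Paragraph`, `extract`, `handle_tag`, `flush`, and `normalize`, the list of block-level tags, and the choice to route heading text only into `heading_path` are all assumptions made for illustration. The sketch also skips entity decoding, comment handling, `<script>`/`<style>` suppression, and the resource limits the real extractor would need.

```rust
/// Hypothetical output type: one paragraph plus the headings above it.
#[derive(Debug, PartialEq)]
pub struct Paragraph {
    pub heading_path: Vec<String>,
    pub text: String,
}

/// The two parser states: inside visible text, or inside `<...>`.
#[derive(Clone, Copy)]
enum State {
    Text,
    Tag,
}

/// Returns Some(level) for "h1".."h6", otherwise None.
fn heading_level(name: &str) -> Option<usize> {
    let b = name.as_bytes();
    if b.len() == 2 && b[0] == b'h' && (b'1'..=b'6').contains(&b[1]) {
        Some((b[1] - b'0') as usize)
    } else {
        None
    }
}

/// Assumed (incomplete) set of tags that end the current paragraph.
fn is_block(name: &str) -> bool {
    matches!(name, "p" | "div" | "br" | "li" | "ul" | "ol" | "tr" | "table" | "pre")
        || heading_level(name).is_some()
}

/// Collapses runs of whitespace and trims both ends.
fn normalize(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Emits the buffered text as a paragraph if it is non-empty.
fn flush(buf: &mut String, path: &[String], out: &mut Vec<Paragraph>) {
    let text = normalize(buf);
    buf.clear();
    if !text.is_empty() {
        out.push(Paragraph { heading_path: path.to_vec(), text });
    }
}

/// Reacts to one complete tag (the text between '<' and '>').
fn handle_tag(raw: &str, buf: &mut String, path: &mut Vec<String>, out: &mut Vec<Paragraph>) {
    let closing = raw.starts_with('/');
    let name = raw
        .trim_start_matches('/')
        .chars()
        .take_while(|c| c.is_ascii_alphanumeric())
        .collect::<String>()
        .to_ascii_lowercase();
    if !is_block(&name) {
        return; // inline tags are dropped; their text stays in the buffer
    }
    match (heading_level(&name), closing) {
        (Some(level), false) => {
            // Opening heading: end the paragraph, drop deeper or equal headings.
            flush(buf, path, out);
            path.truncate(level - 1);
        }
        (Some(_), true) => {
            // Closing heading: the buffered text is the heading title.
            let title = normalize(buf);
            buf.clear();
            if !title.is_empty() {
                path.push(title);
            }
        }
        _ => flush(buf, path, out),
    }
}

/// Strips tags in a single pass; nothing is executed, fetched, or built into a DOM.
pub fn extract(html: &str) -> Vec<Paragraph> {
    let mut out = Vec::new();
    let mut path: Vec<String> = Vec::new();
    let mut buf = String::new();
    let mut tag = String::new();
    let mut state = State::Text;
    for c in html.chars() {
        match state {
            State::Text => {
                if c == '<' {
                    tag.clear();
                    state = State::Tag;
                } else {
                    buf.push(c);
                }
            }
            State::Tag => {
                if c == '>' {
                    state = State::Text;
                    handle_tag(&tag, &mut buf, &mut path, &mut out);
                } else {
                    tag.push(c);
                }
            }
        }
    }
    flush(&mut buf, &path, &mut out);
    out
}
```

In this sketch, calling `extract` on `<h1>Guide</h1><p>Hello <b>world</b></p>` yields a single paragraph with text `Hello world` and a `heading_path` of `["Guide"]`. Because the parser only toggles between two states and copies characters, it never interprets scripts or follows URLs, which is the property the security note relies on.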