docspec-html-reader
A streaming reader that converts HTML into a DocSpec event stream.
See the main DocSpec repository for documentation, architecture, and the event protocol.
Supported Elements
- Paragraphs (<p>) - Emits exactly: StartDocument, StartParagraph, Text, EndParagraph, EndDocument
Out of Scope (silently dropped)
All other HTML elements are silently ignored. Text content inside inline elements
(e.g., <strong>, <em>) is preserved as Text events, but the formatting
structure is dropped.
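The behaviour described above can be illustrated with a toy model. This is a minimal sketch, not the crate's implementation: the `Event` enum is a hypothetical stand-in for the type defined in docspec_core, `toy_events` is an invented helper, and the naive tag scanning handles neither entities nor comments nor malformed markup. Whether the real reader merges adjacent text runs into a single Text event is not stated here; the sketch emits one Text event per run.

```rust
/// Hypothetical stand-in for the DocSpec event type (the real one lives in docspec_core).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    StartDocument,
    StartParagraph,
    Text(String),
    EndParagraph,
    EndDocument,
}

/// Toy converter (an assumption for illustration only): keeps `<p>` structure,
/// drops every other tag, and keeps text that appears inside a paragraph.
pub fn toy_events(html: &str) -> Vec<Event> {
    let mut events = vec![Event::StartDocument];
    let mut in_paragraph = false;
    let mut rest = html;

    while !rest.is_empty() {
        if let Some(after_lt) = rest.strip_prefix('<') {
            // A tag: read up to the closing '>' and look only at its name.
            let end = match after_lt.find('>') {
                Some(i) => i,
                None => break, // unterminated tag: stop scanning
            };
            let name = after_lt[..end]
                .split_whitespace()
                .next()
                .unwrap_or("")
                .to_ascii_lowercase();
            match name.as_str() {
                "p" => {
                    in_paragraph = true;
                    events.push(Event::StartParagraph);
                }
                "/p" => {
                    in_paragraph = false;
                    events.push(Event::EndParagraph);
                }
                _ => {} // any other element is dropped without an event
            }
            rest = &after_lt[end + 1..];
        } else {
            // A text run: everything up to the next tag.
            let end = rest.find('<').unwrap_or(rest.len());
            if in_paragraph {
                events.push(Event::Text(rest[..end].to_string()));
            }
            rest = &rest[end..];
        }
    }

    events.push(Event::EndDocument);
    events
}
```

Calling `toy_events("<p>Hello <strong>world</strong></p>")` yields the paragraph events with two Text events between them ("Hello " and "world"); the `<strong>` tags themselves leave no trace in the output.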
Streaming Guarantee
HtmlReader streams its source via html5gum::IoReader's 16 KB sliding-window
buffer. Memory usage is constant regardless of document size — the document need
not fit in memory. Both from_str and from_reader use this streaming path
internally.
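To make the constant-memory idea concrete, here is a minimal sketch of reading a source through one fixed-size buffer using only the standard library. It does not reproduce html5gum's internals: `scan_in_chunks` and `CHUNK_SIZE` are hypothetical names, and counting `<` bytes merely stands in for real tokenizing work.

```rust
use std::io::{self, Read};

/// Illustrative buffer size, chosen to mirror the 16 KB figure above.
const CHUNK_SIZE: usize = 16 * 1024;

/// Reads `source` to the end through a single fixed-size buffer and returns
/// (total bytes read, number of '<' bytes seen). The buffer never grows, so
/// memory use does not depend on the length of the input.
pub fn scan_in_chunks<R: Read>(mut source: R) -> io::Result<(u64, u64)> {
    let mut buf = [0u8; CHUNK_SIZE];
    let mut total = 0u64;
    let mut tag_opens = 0u64;

    loop {
        let n = match source.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        total += n as u64;
        // Process only the bytes filled by this read, then reuse the buffer.
        tag_opens += buf[..n].iter().filter(|&&b| b == b'<').count() as u64;
    }

    Ok((total, tag_opens))
}
```

Any `Read` implementation works as the source, for example `scan_in_chunks("<p>Hi</p>".as_bytes())` or a `std::fs::File`. The design point is that each chunk is fully processed before the buffer is refilled, so nothing accumulates as the input grows.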
Quick Start
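    use docspec_html_reader::HtmlReader; // path assumed from the crate name
    let mut reader = HtmlReader::from_str("<p>Hello</p>"); // illustrative input
    while let Some(event) = reader.next_event()? {
        // Handle `event` here.
    }
    # Ok::<(), Box<dyn std::error::Error>>(()) // error type assumed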
From a file or any Read + Seek source:
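    use std::fs::File;
    use docspec_html_reader::HtmlReader; // path assumed from the crate name
    let file = File::open("input.html")?; // hypothetical path
    let mut reader = HtmlReader::from_reader(file)?;
    while let Some(event) = reader.next_event()? {
        // Handle `event` here.
    }
    # Ok::<(), Box<dyn std::error::Error>>(()) // error type assumed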
See Also
- MANIFESTO.md: philosophy and values
- ARCHITECTURE.md: pipeline design, event model decisions, and pointers to the in-code event reference
- docspec_core on docs.rs: every event variant, field, and well-formedness rule