docspec-docx-reader
Streaming DOCX to DocSpec event stream reader.
See the main DocSpec repository for documentation, architecture, and the event protocol.
Supported
- Paragraphs (<w:p>) and direct text (<w:t> inside <w:r>)
- Emits exactly: StartDocument, StartParagraph, Text, EndParagraph, EndDocument
- Compression: Stored and Deflated only
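As a rough illustration of how a consumer might fold this five-event sequence into paragraph strings, here is a minimal sketch. The `Event` enum below is a hypothetical stand-in written for this example; the real event type is defined by DocSpec (see EVENTS.md) and may carry different payloads. `collect_paragraphs` is likewise an illustrative helper, not part of this crate.

```rust
/// Hypothetical stand-in for the five events listed above. The real
/// event type comes from DocSpec and may differ in shape.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    StartDocument,
    StartParagraph,
    Text(String),
    EndParagraph,
    EndDocument,
}

/// Collects the text of each paragraph from an event sequence.
/// Text events outside a paragraph are ignored.
pub fn collect_paragraphs<I>(events: I) -> Vec<String>
where
    I: IntoIterator<Item = Event>,
{
    let mut paragraphs = Vec::new();
    // Buffer for the paragraph currently being read, if any.
    let mut current: Option<String> = None;

    for event in events {
        match event {
            Event::StartParagraph => current = Some(String::new()),
            Event::Text(text) => {
                if let Some(buf) = current.as_mut() {
                    buf.push_str(&text);
                }
            }
            Event::EndParagraph => {
                if let Some(buf) = current.take() {
                    paragraphs.push(buf);
                }
            }
            Event::StartDocument | Event::EndDocument => {}
        }
    }
    paragraphs
}
```

Because the function accepts any `IntoIterator`, it works on events pulled one at a time as well as on a prebuilt `Vec`, so only the paragraph being assembled and the collected results are held in memory.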
Out of Scope (silently dropped)
- Run styling (<w:rPr>, bold, italic, underline, etc.)
- Line and page breaks (<w:br>)
- Tabs (<w:tab>)
- Headings (any <w:pStyle> value; every paragraph is StartParagraph)
- Tables (<w:tbl>, <w:tr>, <w:tc>)
- Lists
- Hyperlinks (<w:hyperlink>)
- Drawings and images (<w:drawing>, <w:pict>)
- Structured document tags (<w:sdt>)
- Comments, footnotes, headers, footers
- Document metadata
- Tracked changes (<w:ins>, <w:del>, <w:moveFrom>, <w:moveTo>)
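Since these elements are dropped without warning, a caller may want to check a document for them before converting. The sketch below is one possible pre-flight check and is not part of this crate: `OUT_OF_SCOPE` and `out_of_scope_tags` are hypothetical names, and the scan is a plain string search over XML already held in memory, so it is only suitable for small inputs or tests.

```rust
/// Element names from the out-of-scope list above (hypothetical constant).
const OUT_OF_SCOPE: &[&str] = &[
    "w:rPr", "w:br", "w:tab", "w:pStyle", "w:tbl", "w:tr", "w:tc",
    "w:hyperlink", "w:drawing", "w:pict", "w:sdt",
    "w:ins", "w:del", "w:moveFrom", "w:moveTo",
];

/// Returns the out-of-scope element names that occur in `xml`,
/// in the order they appear in `OUT_OF_SCOPE`.
pub fn out_of_scope_tags(xml: &str) -> Vec<&'static str> {
    OUT_OF_SCOPE
        .iter()
        .copied()
        .filter(|name| contains_element(xml, name))
        .collect()
}

/// True if `xml` contains an opening tag named exactly `name`.
/// The character after the name must end the name, so that
/// `<w:tblPr>` is not mistaken for `<w:tbl>`.
fn contains_element(xml: &str, name: &str) -> bool {
    let open = format!("<{name}");
    let mut rest = xml;
    while let Some(pos) = rest.find(&open) {
        let after = &rest[pos + open.len()..];
        match after.chars().next() {
            Some(c) if c == '>' || c == '/' || c.is_whitespace() => return true,
            _ => rest = after,
        }
    }
    false
}
```

The check is deliberately coarse. For example, tab stop definitions inside paragraph properties also use the `<w:tab>` element name, so a hit means "worth inspecting", not "content will definitely be lost".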
Streaming Guarantee
DocxReader streams document.xml event by event using constant memory regardless
of document size. Only _rels/.rels (a few hundred bytes) is fully read into memory
to discover the document target path.
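To make the target-path discovery concrete, the sketch below looks up the main document part in the contents of _rels/.rels. It is an illustration under assumptions, not the crate's implementation: `document_target` and `attribute` are hypothetical helpers, the scan uses plain string matching where a real reader would likely use an XML parser, and it assumes unprefixed `Relationship` elements with double-quoted attributes and the Transitional relationship type URI.

```rust
/// Relationship type that marks the main document part in a
/// Transitional OOXML package.
const OFFICE_DOCUMENT_REL: &str =
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

/// Returns the package-relative path of the main document part,
/// e.g. "word/document.xml", or `None` if no such relationship exists.
pub fn document_target(rels_xml: &str) -> Option<String> {
    // Each piece after a '<' begins with an element name (or "/", "?").
    for element in rels_xml.split('<').skip(1) {
        if !element.starts_with("Relationship ") {
            continue;
        }
        if attribute(element, "Type") == Some(OFFICE_DOCUMENT_REL) {
            // Targets may be written with a leading '/'; strip it so the
            // result can be used as a ZIP entry name.
            return attribute(element, "Target")
                .map(|target| target.trim_start_matches('/').to_string());
        }
    }
    None
}

/// Extracts the value of a double-quoted attribute from one element.
fn attribute<'a>(element: &'a str, name: &str) -> Option<&'a str> {
    let needle = format!(" {name}=\"");
    let start = element.find(&needle)? + needle.len();
    let end = element[start..].find('"')? + start;
    Some(&element[start..end])
}
```

Reading this one small part in full keeps the lookup simple, while the much larger document part it points to can still be streamed.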
Quick Start
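    // Import path assumed from the crate name.
    use docspec_docx_reader::DocxReader;

    // "document.docx" is a placeholder path.
    let mut reader = DocxReader::from_path("document.docx")?;
    while let Some(event) = reader.next_event()? {
        // Handle each event as it arrives.
    }
    # Ok::<(), Box<dyn std::error::Error>>(())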
See Also
- MANIFESTO.md — philosophy and values
- EVENTS.md — event types and well-formedness rules