Page/line intermediate representation produced by the poppler XML parser and consumed by reconstruction. Coordinates are pdftohtml’s integer pixel units, top-left origin.
Structs§
- Fragment
- One <text> fragment from pdftohtml, already a visual line or part of one.
- Line
- A merged visual line (one or more fragments at the same height).
- Page
- Span
- A styled run of text within a line fragment.
Enums§
- ColumnMode
- Column handling requested on the CLI.
- DocBlock
- A reconstructed, reading-ordered document block ready for XHTML emission.
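
As an illustration of how fragments at the same height might be merged into a Line, here is a minimal sketch. The field names (`top`, `left`, `text`, `fragments`) and the helper `merge_fragments` are assumptions made for this example and are not taken from the module; the real structs may carry different fields, and the real merge may tolerate small vertical offsets rather than requiring exact equality.

```rust
// Hypothetical, simplified stand-ins for this module's `Fragment` and `Line`.
// Coordinates follow the module description: integer pixels, top-left origin.
#[derive(Debug, Clone, PartialEq)]
struct Fragment {
    top: i32,  // distance from the top edge of the page
    left: i32, // distance from the left edge of the page
    text: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Line {
    top: i32,
    fragments: Vec<Fragment>,
}

impl Line {
    /// Join the fragment texts with single spaces, left to right.
    fn text(&self) -> String {
        self.fragments
            .iter()
            .map(|f| f.text.as_str())
            .collect::<Vec<_>>()
            .join(" ")
    }
}

/// Group fragments that share the same `top` into lines (assumed rule:
/// exact equality). Lines come out top to bottom, fragments left to right.
fn merge_fragments(mut fragments: Vec<Fragment>) -> Vec<Line> {
    // With a top-left origin, a smaller `top` is higher on the page.
    fragments.sort_by_key(|f| (f.top, f.left));

    let mut lines: Vec<Line> = Vec::new();
    for frag in fragments {
        let same_line = lines.last().map_or(false, |l| l.top == frag.top);
        if same_line {
            // `same_line` is only true when a last line exists.
            lines.last_mut().unwrap().fragments.push(frag);
        } else {
            lines.push(Line { top: frag.top, fragments: vec![frag] });
        }
    }
    lines
}
```

Sorting by `(top, left)` first keeps the grouping to a single pass, since fragments of one line end up adjacent regardless of the order in which the parser emitted them.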