Module parser

PDF parsing module.

Structs

Column
A detected column in the page layout.
DetectedTable
A detected table region with its content.
FontStatistics
Font statistics for heading detection.
LayoutAnalyzer
Layout analyzer for extracting structured text from PDF pages.
ParseOptions
Options for parsing PDF documents.
PdfParser
PDF document parser.
TableDetector
Detects tables in a list of text spans.
TableDetectorConfig
Table detector configuration.
TableRowData
A row of text spans in a table.
TextBlock
A text block (paragraph, heading, etc.).
TextLine
A text line composed of multiple spans on the same baseline.
TextSpan
A text span with position and style information.
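
The relationship between spans and lines described above (a line is a set of spans sharing a baseline) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `SketchSpan`, its fields, `group_into_lines`, and the tolerance parameter are hypothetical and are not the crate's actual `TextSpan`/`TextLine` definitions or its grouping logic.

```rust
/// Hypothetical stand-in for a positioned text span. The field names are
/// assumptions for this sketch, not the crate's real `TextSpan` fields.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchSpan {
    pub text: String,
    /// Horizontal position of the span's left edge.
    pub x: f32,
    /// Vertical position of the span's baseline (PDF-style, y grows upward).
    pub baseline_y: f32,
}

/// Groups spans into lines: spans whose baseline lies within `tolerance`
/// of the first span of the current line are placed on that line.
/// Lines are returned top to bottom, spans within a line left to right.
pub fn group_into_lines(mut spans: Vec<SketchSpan>, tolerance: f32) -> Vec<Vec<SketchSpan>> {
    // Highest baseline first (top of page), then left to right.
    spans.sort_by(|a, b| {
        b.baseline_y
            .total_cmp(&a.baseline_y)
            .then(a.x.total_cmp(&b.x))
    });

    let mut lines: Vec<Vec<SketchSpan>> = Vec::new();
    for span in spans {
        // Compare against the first span of the current line so that small
        // baseline differences cannot accumulate across a long line.
        let start_new_line = match lines.last() {
            Some(line) => (line[0].baseline_y - span.baseline_y).abs() > tolerance,
            None => true,
        };
        if start_new_line {
            lines.push(vec![span]);
        } else if let Some(line) = lines.last_mut() {
            line.push(span);
        }
    }

    // Spans on one line may differ slightly in baseline, so restore
    // strict left-to-right order inside each line.
    for line in &mut lines {
        line.sort_by(|a, b| a.x.total_cmp(&b.x));
    }
    lines
}
```

A tolerance of roughly a fraction of the font size is a plausible starting point, since superscripts and mixed fonts can shift baselines slightly; the value actually used by `LayoutAnalyzer`, if any, is not stated on this page.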

Enums

BlockType
Type of text block.
ErrorMode
Error handling mode during parsing.
ExtractMode
What content to extract from the document.