
Module chunks

Chunk types — atomic units of extracted content.

Structs§

ImageChunk
Image bounding box — actual pixel data extracted at output time.
LineArtChunk
Vector graphic — collection of line segments forming bullets, decorations, etc.
LineChunk
Line segment — used for table border detection.
TextChunk
Atomic text fragment — one font run in the PDF content stream.
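
To make the roles of the four chunk types concrete, here is a minimal sketch of how such types might be modelled and how line segments could be filtered as table-border candidates. Every name and field below (`BBox`, `LineSketch`, `ChunkSketch`, `count_ruling_lines`) is hypothetical and chosen for illustration only; this page does not show the real field layout of `TextChunk`, `ImageChunk`, `LineChunk`, or `LineArtChunk`.

```rust
/// Hypothetical axis-aligned bounding box in page coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox {
    pub x0: f64,
    pub y0: f64,
    pub x1: f64,
    pub y1: f64,
}

/// Hypothetical stand-in for a line segment chunk (fields are assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LineSketch {
    pub start: (f64, f64),
    pub end: (f64, f64),
}

impl LineSketch {
    /// True when both endpoints share (almost) the same y coordinate.
    pub fn is_horizontal(&self, tol: f64) -> bool {
        (self.start.1 - self.end.1).abs() <= tol
    }

    /// True when both endpoints share (almost) the same x coordinate.
    pub fn is_vertical(&self, tol: f64) -> bool {
        (self.start.0 - self.end.0).abs() <= tol
    }
}

/// Hypothetical sum type covering the four kinds of chunk listed above.
#[derive(Debug, Clone, PartialEq)]
pub enum ChunkSketch {
    /// One font run of text with its position.
    Text { text: String, bbox: BBox },
    /// Only the bounding box; pixel data would be fetched at output time.
    Image { bbox: BBox },
    /// A single segment, a candidate for a table border.
    Line(LineSketch),
    /// A group of segments forming a bullet or decoration.
    LineArt { segments: Vec<LineSketch>, bbox: BBox },
}

/// Counts line chunks that are axis-aligned within `tol`, which is one
/// plausible first filter before table border detection.
pub fn count_ruling_lines(chunks: &[ChunkSketch], tol: f64) -> usize {
    let mut count = 0;
    for chunk in chunks {
        if let ChunkSketch::Line(line) = chunk {
            if line.is_horizontal(tol) || line.is_vertical(tol) {
                count += 1;
            }
        }
    }
    count
}
```

Keeping the image variant to a bounding box mirrors the description above, where pixel data is only extracted at output time. Whether the real module uses an enum or four independent structs is not stated here.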

Constants§

LINE_ART_SIZE_EPSILON
Size comparison tolerance for line art classification.
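
A size tolerance of this kind is typically used to compare floating-point dimensions without requiring exact equality. The sketch below shows one way such a comparison could work; the constant value `1.0`, the names `SIZE_EPSILON_SKETCH`, `approx_eq`, and `looks_like_bullet`, and the "roughly square and near an expected size" rule are all assumptions for illustration, not the actual value or classification logic behind `LINE_ART_SIZE_EPSILON`.

```rust
/// Hypothetical tolerance; the real value of LINE_ART_SIZE_EPSILON is not
/// given on this page.
pub const SIZE_EPSILON_SKETCH: f64 = 1.0;

/// Returns true when `a` and `b` differ by at most `eps`.
pub fn approx_eq(a: f64, b: f64, eps: f64) -> bool {
    (a - b).abs() <= eps
}

/// Assumed rule: a graphic is bullet-like when its width and height are
/// roughly equal and its width is close to an expected bullet size.
pub fn looks_like_bullet(width: f64, height: f64, expected: f64) -> bool {
    approx_eq(width, height, SIZE_EPSILON_SKETCH)
        && approx_eq(width, expected, SIZE_EPSILON_SKETCH)
}
```

An absolute tolerance is the simplest choice when sizes are in a fixed unit such as PDF points; a relative tolerance would be the alternative if graphics vary widely in scale.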