Table detection: lattice, stream, and explicit strategies. Table detection types and pipeline.
This module provides the configuration types, data structures, and orchestration for detecting tables in PDF pages using Lattice, Stream, or Explicit strategies.
Structs§
- Cell
- A detected table cell.
- ExplicitLines
- User-provided line coordinates for Explicit strategy.
- Intersection
- An intersection point between horizontal and vertical edges.
- Table
- A detected table.
- TableFinder
- Orchestrator for the table detection pipeline.
- TableFinderDebug
- Intermediate results from the table detection pipeline.
- TableQuality
- Quality metrics for a detected table.
- TableSettings
- Configuration for table detection.
Enums§
- Strategy
- Strategy for table detection.
Functions§
- cells_to_tables
- Group adjacent cells into distinct tables.
- edges_to_intersections
- Find all intersection points between horizontal and vertical edges.
- explicit_lines_to_edges
- Convert user-provided explicit line coordinates into edges.
- extract_text_for_cells
- Extract text content for each cell by finding characters within the cell bbox.
- intersections_to_cells
- Construct rectangular cells from a grid of intersection points.
- join_edge_group
- Merge overlapping or adjacent collinear edge segments.
- snap_edges
- Snap nearby parallel edges to aligned positions.
- words_to_edges_stream
- Generate synthetic edges from text alignment patterns for the Stream strategy.
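To make two of the steps above concrete, here is a minimal, self-contained sketch of merging collinear segments and then finding horizontal/vertical crossings. It is not this module's implementation: the `Segment` type, the `join_collinear` and `intersections` functions, and the `tolerance` parameters are all hypothetical names chosen for illustration, and the module's real signatures for `join_edge_group` and `edges_to_intersections` may differ.

```rust
/// Hypothetical axis-aligned segment (not the module's edge type).
/// `pos` is the fixed coordinate (y for a horizontal segment, x for a
/// vertical one); `start..end` is the span along the other axis.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Segment {
    pub pos: f64,
    pub start: f64,
    pub end: f64,
}

/// Merge collinear segments whose spans overlap or whose gap is at most
/// `tolerance`. Assumes every input already shares the same `pos`.
pub fn join_collinear(mut segs: Vec<Segment>, tolerance: f64) -> Vec<Segment> {
    // Sort by span start so one left-to-right pass is enough.
    segs.sort_by(|a, b| a.start.total_cmp(&b.start));
    let mut out: Vec<Segment> = Vec::new();
    for s in segs {
        if let Some(last) = out.last_mut() {
            if s.start <= last.end + tolerance {
                // Overlapping or adjacent: extend the previous segment.
                last.end = last.end.max(s.end);
                continue;
            }
        }
        out.push(s);
    }
    out
}

/// Return an (x, y) point for every horizontal/vertical pair that cross,
/// allowing each segment to fall short of the other by `tolerance`.
pub fn intersections(
    horizontal: &[Segment],
    vertical: &[Segment],
    tolerance: f64,
) -> Vec<(f64, f64)> {
    let mut points = Vec::new();
    for h in horizontal {
        for v in vertical {
            // The vertical's x must lie within the horizontal's x-span...
            let x_ok = v.pos >= h.start - tolerance && v.pos <= h.end + tolerance;
            // ...and the horizontal's y within the vertical's y-span.
            let y_ok = h.pos >= v.start - tolerance && h.pos <= v.end + tolerance;
            if x_ok && y_ok {
                points.push((v.pos, h.pos));
            }
        }
    }
    points
}
```

In this sketch, merging runs before intersection detection so that a ruling line drawn as several short strokes is treated as one edge. The nested loop is quadratic in the number of edges, which is usually acceptable for a single page but is a simplification rather than a claim about how the module does it.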