Layout parsing utilities.
This module provides utilities for layout analysis, including sorting layout boxes and associating OCR results with layout regions. The implementation follows established approaches.
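As a rough illustration of what associating OCR results with layout regions can involve, the sketch below collects the indices of OCR boxes that intersect at least one layout region. It is a minimal sketch under stated assumptions, not this module's implementation: the `Rect` type, the `overlapping_ocr_indices` function, and the plain "any positive intersection" rule are all hypothetical and chosen only for the example.

```rust
/// Hypothetical axis-aligned rectangle (not a type from this module).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
}

impl Rect {
    /// Area of the intersection with `other`, or 0.0 if the boxes do not overlap.
    fn intersection_area(&self, other: &Rect) -> f32 {
        let w = self.x2.min(other.x2) - self.x1.max(other.x1);
        let h = self.y2.min(other.y2) - self.y1.max(other.y1);
        if w > 0.0 && h > 0.0 {
            w * h
        } else {
            0.0
        }
    }
}

/// Hypothetical helper: indices of `ocr_boxes` that intersect any of `regions`.
/// Assumption: any positive intersection counts as an association.
fn overlapping_ocr_indices(ocr_boxes: &[Rect], regions: &[Rect]) -> Vec<usize> {
    ocr_boxes
        .iter()
        .enumerate()
        .filter(|(_, b)| regions.iter().any(|r| b.intersection_area(r) > 0.0))
        .map(|(i, _)| i)
        .collect()
}
```

A real association step would likely also record which region each OCR box belongs to and apply a minimum-overlap rule; the sketch only shows the index-selection idea.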
Structs§
- LayoutBox - Layout box with bounding box and label for sorting/processing purposes.
- LayoutOCRAssociation - Result of associating OCR boxes with layout regions.
- OverlapRemovalResult - Result of overlap removal.
Functions§
- associate_ocr_with_layout - Associate OCR results with layout regions.
- combine_rectangles_kmeans - Combines rectangles into at most target_n rectangles using KMeans-style clustering on box centers.
- get_overlap_boxes_idx - Get indices of OCR boxes that overlap with layout regions.
- get_overlap_removal_indices - Removes overlapping layout blocks, returning only indices to remove.
- reconcile_table_cells - Reconciles structure recognition cells with detected cells.
- remove_overlap_blocks - Removes overlapping layout blocks based on overlap ratio threshold.
- reprocess_table_cells_with_ocr - Reprocesses detected table cell boxes using OCR boxes to better match the structure model’s expected cell count.
- sort_layout_boxes - Sort layout boxes in reading order with column detection.
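Removing overlaps by an overlap ratio threshold, as several of the functions above describe, can be sketched as follows. This is only an illustration under assumptions, not the signatures or logic of the functions listed here: the `BBox` type, the `overlap_ratio` and `indices_to_remove` helpers, the choice of "intersection area divided by the smaller box's area" as the ratio, and the rule of dropping the smaller box are all hypothetical.

```rust
/// Hypothetical axis-aligned box (not a type from this module).
#[derive(Debug, Clone, Copy)]
struct BBox {
    x1: f32,
    y1: f32,
    x2: f32,
    y2: f32,
}

impl BBox {
    /// Box area, clamped so degenerate boxes have area 0.0.
    fn area(&self) -> f32 {
        (self.x2 - self.x1).max(0.0) * (self.y2 - self.y1).max(0.0)
    }

    /// Area shared with `other`, or 0.0 if the boxes do not overlap.
    fn intersection_area(&self, other: &BBox) -> f32 {
        let w = self.x2.min(other.x2) - self.x1.max(other.x1);
        let h = self.y2.min(other.y2) - self.y1.max(other.y1);
        if w > 0.0 && h > 0.0 {
            w * h
        } else {
            0.0
        }
    }
}

/// Assumed ratio definition: intersection area over the smaller box's area.
fn overlap_ratio(a: &BBox, b: &BBox) -> f32 {
    let smaller = a.area().min(b.area());
    if smaller <= 0.0 {
        0.0
    } else {
        a.intersection_area(b) / smaller
    }
}

/// Returns ascending indices of boxes to drop. For each pair whose ratio
/// exceeds `threshold`, the smaller box is marked (assumed tie-break: keep
/// the earlier box when areas are equal).
fn indices_to_remove(boxes: &[BBox], threshold: f32) -> Vec<usize> {
    let mut remove = vec![false; boxes.len()];
    for i in 0..boxes.len() {
        for j in (i + 1)..boxes.len() {
            if remove[i] || remove[j] {
                continue;
            }
            if overlap_ratio(&boxes[i], &boxes[j]) > threshold {
                let drop = if boxes[i].area() >= boxes[j].area() { j } else { i };
                remove[drop] = true;
            }
        }
    }
    remove
        .iter()
        .enumerate()
        .filter_map(|(i, &r)| r.then_some(i))
        .collect()
}
```

Returning indices rather than a filtered list lets the caller apply the same removal to any parallel data (labels, scores) kept alongside the boxes.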