Module layout_utils

Layout parsing utilities.

This module provides utilities for layout analysis, including sorting layout boxes and associating OCR results with layout regions. The implementation follows established approaches.

Structs

LayoutBox
Layout box with bounding box and label for sorting/processing purposes.
LayoutOCRAssociation
Result of associating OCR boxes with layout regions.
OverlapRemovalResult
Result of overlap removal.
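
To illustrate what associating OCR boxes with a layout region can look like, here is a minimal sketch. It is not this module's implementation: `Rect`, `Association`, `associate`, and the `min_coverage` parameter are hypothetical names introduced for the example, and the rule used (an OCR box belongs to a region when enough of its own area lies inside it) is an assumption.

```rust
/// Hypothetical axis-aligned box: (x1, y1) is the top-left corner,
/// (x2, y2) the bottom-right corner. Not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x1: f32,
    pub y1: f32,
    pub x2: f32,
    pub y2: f32,
}

impl Rect {
    /// Area of the box; degenerate boxes have area 0.
    pub fn area(&self) -> f32 {
        (self.x2 - self.x1).max(0.0) * (self.y2 - self.y1).max(0.0)
    }

    /// Area of the intersection with `other` (0 when the boxes are disjoint).
    pub fn intersection_area(&self, other: &Rect) -> f32 {
        let w = (self.x2.min(other.x2) - self.x1.max(other.x1)).max(0.0);
        let h = (self.y2.min(other.y2) - self.y1.max(other.y1)).max(0.0);
        w * h
    }
}

/// Hypothetical result type: indices of OCR boxes inside the region,
/// and indices of those left outside.
#[derive(Debug, Default, PartialEq)]
pub struct Association {
    pub matched: Vec<usize>,
    pub unmatched: Vec<usize>,
}

/// Assigns each OCR box to the region when at least `min_coverage`
/// (a fraction in 0.0..=1.0) of the OCR box's area lies inside it.
pub fn associate(ocr_boxes: &[Rect], region: &Rect, min_coverage: f32) -> Association {
    let mut result = Association::default();
    for (i, b) in ocr_boxes.iter().enumerate() {
        let area = b.area();
        // Zero-area boxes can never be covered, so they stay unmatched.
        let coverage = if area > 0.0 {
            b.intersection_area(region) / area
        } else {
            0.0
        };
        if coverage >= min_coverage {
            result.matched.push(i);
        } else {
            result.unmatched.push(i);
        }
    }
    result
}
```

Measuring coverage against the OCR box's own area, rather than the region's, keeps small text lines inside a large region from being rejected just because they are small.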

Functions

associate_ocr_with_layout
Associate OCR results with layout regions.
combine_rectangles_kmeans
Combines rectangles into at most target_n rectangles using KMeans-style clustering on box centers.
get_overlap_boxes_idx
Get indices of OCR boxes that overlap with layout regions.
get_overlap_removal_indices
Identifies overlapping layout blocks and returns only the indices of the blocks to remove.
reconcile_table_cells
Reconciles structure recognition cells with detected cells.
remove_overlap_blocks
Removes overlapping layout blocks based on overlap ratio threshold.
reprocess_table_cells_with_ocr
Reprocesses detected table cell boxes using OCR boxes to better match the structure model’s expected cell count.
sort_layout_boxes
Sort layout boxes in reading order with column detection.
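
The overlap-removal functions listed above are described in terms of an overlap ratio threshold. The sketch below shows one way such a filter could work. It is not this module's implementation: `Block`, `overlap_removal_indices`, and `remove_overlaps` are hypothetical names. Defining the ratio as intersection area over the smaller block's area, and dropping the smaller block of each overlapping pair, are both assumptions.

```rust
/// Hypothetical layout block: (x1, y1) top-left, (x2, y2) bottom-right.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Block {
    pub x1: f32,
    pub y1: f32,
    pub x2: f32,
    pub y2: f32,
}

fn area(b: &Block) -> f32 {
    (b.x2 - b.x1).max(0.0) * (b.y2 - b.y1).max(0.0)
}

fn intersection(a: &Block, b: &Block) -> f32 {
    let w = (a.x2.min(b.x2) - a.x1.max(b.x1)).max(0.0);
    let h = (a.y2.min(b.y2) - a.y1.max(b.y1)).max(0.0);
    w * h
}

/// Returns the indices (ascending) of blocks to drop. Two blocks overlap when
/// intersection / smaller_area exceeds `threshold`; the smaller block of the
/// pair is dropped, and on equal areas the later one is dropped.
pub fn overlap_removal_indices(blocks: &[Block], threshold: f32) -> Vec<usize> {
    let mut removed = vec![false; blocks.len()];
    for i in 0..blocks.len() {
        for j in (i + 1)..blocks.len() {
            if removed[i] {
                break; // block i is already gone, nothing left to compare
            }
            if removed[j] {
                continue;
            }
            let smaller = area(&blocks[i]).min(area(&blocks[j]));
            if smaller <= 0.0 {
                continue; // skip degenerate blocks to avoid dividing by zero
            }
            let ratio = intersection(&blocks[i], &blocks[j]) / smaller;
            if ratio > threshold {
                if area(&blocks[i]) < area(&blocks[j]) {
                    removed[i] = true;
                } else {
                    removed[j] = true;
                }
            }
        }
    }
    (0..blocks.len()).filter(|&i| removed[i]).collect()
}

/// Convenience wrapper that returns the surviving blocks instead of indices.
pub fn remove_overlaps(blocks: &[Block], threshold: f32) -> Vec<Block> {
    let drop = overlap_removal_indices(blocks, threshold);
    blocks
        .iter()
        .enumerate()
        .filter(|(i, _)| !drop.contains(i))
        .map(|(_, b)| *b)
        .collect()
}
```

Keeping an index-only variant separate from the filtering wrapper lets callers apply the same removals to parallel data, such as labels or scores stored alongside the boxes.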