- Uses ONNX models to analyze the layout of PDF documents or images and identify a set of layout regions.
- Uses the model yolov8n_layout_general6.
- Supported categories: ["Text", "Title", "Figure", "Table", "Caption", "Equation"].
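
As a rough illustration of how raw detections could be turned into labeled layout regions, here is a minimal Python sketch. It is not the project's actual code: the `LayoutRegion` type, the `to_regions` helper, the `(class_id, score, box)` tuple format, and the `min_score` threshold are all hypothetical, and it assumes class indices follow the order of the category list above, which the model's metadata should be checked against.

```python
from dataclasses import dataclass

# Category names as listed above. ASSUMPTION: the model's class indices
# follow this order; verify against the model's own metadata.
CATEGORIES = ["Text", "Title", "Figure", "Table", "Caption", "Equation"]


@dataclass
class LayoutRegion:
    """Hypothetical container for one detected layout region."""
    label: str
    score: float
    box: tuple  # (x1, y1, x2, y2) in pixels, origin at top-left


def to_regions(detections, min_score=0.5):
    """Convert (class_id, score, box) tuples into labeled regions.

    Detections below `min_score` or with an unknown class index are skipped.
    """
    regions = []
    for class_id, score, box in detections:
        if score < min_score:
            continue
        if not 0 <= class_id < len(CATEGORIES):
            continue
        regions.append(LayoutRegion(CATEGORIES[class_id], score, tuple(box)))
    return regions
```

Keeping the label as a string rather than an index makes later steps, such as title-based grouping, easier to read.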
Sort the layout regions into human reading order to extract structured text content.
The sorting algorithm uses the positions of text blocks to identify and organize the hierarchical relationships between regions (e.g., titles and body text). By sorting and filtering the regions, it recognizes associations between columns, titles, and text in complex document layouts and produces a structured layout tree.
- Remove small overlapping regions.
- Sort text blocks by Y-coordinate, prioritizing vertical layouts.
- Build a multi-level layout structure containing text blocks based on title recognition.
- Handle multi-column layouts by classifying text blocks into columns.
- Process and store regions (e.g., titles, text blocks) along with their hierarchical relationships.
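
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the `Region` type and all function names are hypothetical, columns are assumed to be a fixed number of equal-width strips, the overlap threshold of 0.8 is arbitrary, and full-width regions (such as a title spanning both columns) are not treated specially.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical layout region; box is (x1, y1, x2, y2), origin top-left."""
    label: str
    box: tuple

    @property
    def area(self):
        x1, y1, x2, y2 = self.box
        return max(0, x2 - x1) * max(0, y2 - y1)


def overlap_area(a, b):
    """Area of the intersection of two regions (0 if they do not overlap)."""
    w = min(a.box[2], b.box[2]) - max(a.box[0], b.box[0])
    h = min(a.box[3], b.box[3]) - max(a.box[1], b.box[1])
    return max(0, w) * max(0, h)


def remove_small_overlaps(regions, ratio=0.8):
    """Drop a region when at least `ratio` of it lies inside a larger kept region."""
    kept = []
    for r in sorted(regions, key=lambda r: r.area, reverse=True):
        if r.area == 0:
            continue
        if all(overlap_area(r, k) / r.area < ratio for k in kept):
            kept.append(r)
    return kept


def assign_columns(regions, page_width, n_columns=2):
    """Classify regions into equal-width columns by horizontal centre."""
    columns = [[] for _ in range(n_columns)]
    col_width = page_width / n_columns
    for r in regions:
        cx = (r.box[0] + r.box[2]) / 2
        idx = max(0, min(int(cx // col_width), n_columns - 1))
        columns[idx].append(r)
    return columns


def reading_order(regions, page_width, n_columns=2):
    """Filter overlaps, then read each column top to bottom, left to right."""
    ordered = []
    filtered = remove_small_overlaps(regions)
    for column in assign_columns(filtered, page_width, n_columns):
        ordered.extend(sorted(column, key=lambda r: r.box[1]))
    return ordered


def build_tree(ordered):
    """Attach each non-title region to the most recent preceding title."""
    tree = []
    node = {"title": None, "children": []}
    for r in ordered:
        if r.label == "Title":
            if node["title"] or node["children"]:
                tree.append(node)
            node = {"title": r, "children": []}
        else:
            node["children"].append(r)
    if node["title"] or node["children"]:
        tree.append(node)
    return tree
```

A short usage example on a two-column page that is 1000 pixels wide:

```python
page = [
    Region("Text", (600, 60, 1000, 300)),
    Region("Title", (0, 0, 400, 50)),
    Region("Text", (10, 70, 100, 100)),   # nested inside the block below
    Region("Text", (0, 60, 400, 200)),
    Region("Title", (600, 0, 1000, 50)),
]
tree = build_tree(reading_order(page, page_width=1000))
for node in tree:
    print(node["title"].box, [child.box for child in node["children"]])
```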
A collection of open-source layout analysis models.
This algorithm is suitable only for standard horizontal or vertical reading formats and cannot handle non-standard layouts such as newspapers.