Layout4j

Layout Analysis

Layout4j uses ONNX models to analyze the layout of PDF documents or images and returns a collection of layout regions.
The model used is yolov8n_layout_general6.
Supported categories: ["Text", "Title", "Figure", "Table", "Caption", "Equation"].
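
To make the category list concrete, here is a minimal sketch of how a detected region and its label could be represented. The class names LayoutLabels and LayoutRegion are hypothetical and not taken from Layout4j's source, and the sketch assumes that the model's class indices follow the order of the list above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not taken from Layout4j's source.
final class LayoutLabels {
    // Category order copied from the list above. Assumption: the model's
    // class index i corresponds to the i-th entry of this list.
    static final List<String> CATEGORIES = Collections.unmodifiableList(
            Arrays.asList("Text", "Title", "Figure", "Table", "Caption", "Equation"));

    private LayoutLabels() {}

    // Returns the label for a class index, or "Unknown" when it is out of range.
    static String labelOf(int classIndex) {
        if (classIndex < 0 || classIndex >= CATEGORIES.size()) {
            return "Unknown";
        }
        return CATEGORIES.get(classIndex);
    }
}

// Hypothetical value type for one detected region (pixel coordinates).
final class LayoutRegion {
    final String label;
    final float score;
    final float x1, y1, x2, y2;

    LayoutRegion(int classIndex, float score, float x1, float y1, float x2, float y2) {
        this.label = LayoutLabels.labelOf(classIndex);
        this.score = score;
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
    }
}
```

Keeping the label lookup in one place means that a different model with a different category order would only require changing the CATEGORIES list.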

Region Sorting

Sort the layout regions according to human reading order to extract structured text content.
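
As a rough illustration of reading-order sorting for a single-column page, the sketch below groups regions into rows by their top edge and then orders each row from left to right. It is a simplified stand-in rather than Layout4j's actual algorithm; the names Box and ReadingOrder and the rowTolerance parameter are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical region type: an id plus the top-left corner of its box.
final class Box {
    final String id;
    final float x, y;

    Box(String id, float x, float y) {
        this.id = id;
        this.x = x;
        this.y = y;
    }
}

final class ReadingOrder {
    private ReadingOrder() {}

    // Orders boxes top to bottom, then left to right within a row.
    // Boxes whose top edges lie within rowTolerance of the first box
    // in the current row are treated as belonging to that row.
    static List<Box> sort(List<Box> boxes, float rowTolerance) {
        List<Box> byTop = new ArrayList<>(boxes);
        byTop.sort(Comparator.comparingDouble((Box b) -> b.y));

        List<Box> ordered = new ArrayList<>();
        List<Box> row = new ArrayList<>();
        for (Box b : byTop) {
            if (!row.isEmpty() && b.y - row.get(0).y > rowTolerance) {
                flush(row, ordered);
            }
            row.add(b);
        }
        flush(row, ordered);
        return ordered;
    }

    // Sorts the finished row by x and appends it to the output.
    private static void flush(List<Box> row, List<Box> out) {
        row.sort(Comparator.comparingDouble((Box b) -> b.x));
        out.addAll(row);
        row.clear();
    }
}
```

Grouping into rows first, instead of using one comparator with a tolerance, keeps the ordering consistent: a tolerance-based comparator is not transitive and can make List.sort throw an IllegalArgumentException.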

Implementation Approach

The sorting algorithm uses the positions of text blocks to identify and organize the hierarchical relationships between regions (e.g., titles and text). By sorting and filtering the regions, it recognizes associations between columns, titles, and text in complex document layouts and produces a structured layout tree.
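
The idea of a layout tree can be sketched as follows, assuming the regions are already in reading order. This is a simplified two-level version (root, titles, content) rather than the project's implementation, and the names Node and LayoutTree are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node: a region label, its text, and child regions.
final class Node {
    final String label;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String label, String text) {
        this.label = label;
        this.text = text;
    }
}

final class LayoutTree {
    private LayoutTree() {}

    // Builds a tree from regions that are already in reading order.
    // Each "Title" becomes a child of the root, and every following
    // non-title region is attached to the most recent title. Regions
    // that appear before the first title are attached to the root.
    static Node build(List<Node> orderedRegions) {
        Node root = new Node("Root", "");
        Node current = root;
        for (Node region : orderedRegions) {
            if ("Title".equals(region.label)) {
                root.children.add(region);
                current = region;
            } else {
                current.children.add(region);
            }
        }
        return root;
    }
}
```

A deeper hierarchy would need some notion of title level (for example, derived from font size or numbering), which this sketch deliberately leaves out.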

Core Features

Remove small overlapping regions.
Sort text blocks by Y-coordinate, prioritizing vertical layouts.
Build a multi-level layout structure containing text blocks based on title recognition.
Handle multi-column layouts by classifying text blocks into columns.
Process and store regions (e.g., titles, text blocks) along with their hierarchical relationships.
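
Two of the features above, removing small overlapping regions and classifying blocks into columns, can be sketched as shown below. This is one plausible approach under stated assumptions, not the code used by Layout4j: the names Rect and RegionFilters, the minCoverage threshold, and the rule that columns are separated by a horizontal gap are all choices made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical axis-aligned box with corners (x1, y1) and (x2, y2).
final class Rect {
    final float x1, y1, x2, y2;

    Rect(float x1, float y1, float x2, float y2) {
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
    }

    float area() {
        return Math.max(0f, x2 - x1) * Math.max(0f, y2 - y1);
    }

    float intersectionArea(Rect o) {
        float w = Math.min(x2, o.x2) - Math.max(x1, o.x1);
        float h = Math.min(y2, o.y2) - Math.max(y1, o.y1);
        return (w > 0f && h > 0f) ? w * h : 0f;
    }
}

final class RegionFilters {
    private RegionFilters() {}

    // Drops a region when a larger region covers more than minCoverage
    // (a fraction between 0 and 1) of its area. Zero-area regions are kept.
    static List<Rect> removeSmallOverlaps(List<Rect> regions, float minCoverage) {
        List<Rect> kept = new ArrayList<>();
        for (Rect r : regions) {
            boolean covered = false;
            for (Rect other : regions) {
                if (other != r && r.area() > 0f && other.area() > r.area()
                        && r.intersectionArea(other) / r.area() > minCoverage) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                kept.add(r);
            }
        }
        return kept;
    }

    // Groups regions into columns from left to right. A region starts a new
    // column when its left edge lies at or beyond the right edge of the
    // current column. Each column is then sorted by its top Y-coordinate.
    static List<List<Rect>> splitIntoColumns(List<Rect> regions) {
        List<Rect> byLeft = new ArrayList<>(regions);
        byLeft.sort(Comparator.comparingDouble((Rect r) -> r.x1));

        List<List<Rect>> columns = new ArrayList<>();
        float columnRight = Float.NEGATIVE_INFINITY;
        for (Rect r : byLeft) {
            if (columns.isEmpty() || r.x1 >= columnRight) {
                columns.add(new ArrayList<>());
                columnRight = r.x2;
            } else {
                columnRight = Math.max(columnRight, r.x2);
            }
            columns.get(columns.size() - 1).add(r);
        }
        for (List<Rect> column : columns) {
            column.sort(Comparator.comparingDouble((Rect r) -> r.y1));
        }
        return columns;
    }
}
```

Note that a full-width region, such as a title spanning both columns, would merge everything into a single column under this rule, so a real implementation would need to treat such regions separately before splitting.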

Reference Projects

RapidLayout

A collection of open-source layout analysis models.

GapTree_Sort_Algorithm

This algorithm is suitable only for standard horizontal or vertical reading formats and cannot handle non-standard layouts such as newspapers.

CharleyXu/layout4j
