This is the official repository for our CVPR 2024 paper RoDLA: Benchmarking the Robustness of Document Layout Analysis Models. For more results and benchmarking details, please visit our project homepage.
We introduce RoDLA, a large-scale benchmark for evaluating the robustness of Document Layout Analysis (DLA) models. RoDLA contains 450,000+ documents with diverse layouts and contents, together with a set of evaluation metrics that make different DLA models directly comparable. We hope RoDLA can serve as a standard benchmark for the robustness evaluation of DLA models.
- Perturbation Benchmark Dataset
- PubLayNet-P
- DocLayNet-P
- M6Doc-P
- Perturbation Generation and Evaluation Code
- RoDLA Model Checkpoints
- RoDLA Model Training Code
- RoDLA Model Evaluation Code
1. Clone the repository
git clone https://github.com/yufanchen96/RoDLA.git
cd RoDLA
2. Create a conda virtual environment
# create virtual environment
conda create -n RoDLA python=3.7 -y
conda activate RoDLA
3. Install benchmark dependencies
- Install Basic Dependencies
pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 torchaudio==0.10.2+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
pip install -U openmim
mim install mmcv-full==1.5.0
pip install timm==0.6.11 mmdet==2.28.1
pip install Pillow==9.5.0
pip install opencv-python termcolor yacs pyyaml scipy
- Install ocrodeg Dependencies
git clone https://github.com/NVlabs/ocrodeg.git
cd ./ocrodeg
pip install -e .
- Compile CUDA operators
cd ./model/ops_dcnv3
sh ./make.sh
python test.py
- Alternatively, install the operators from pre-built `.whl` files
Download the RoDLA dataset from Google Drive to the desired root directory.
Alternatively, prepare the perturbed dataset yourself as follows:
cd ./perturbation
python apply_perturbation.py \
--dataset_dir ./publaynet/val \
--json_dir ./publaynet/val.json \
--dataset_name PubLayNet-P \
--output_dir ./PubLayNet-P \
--pert_method all \
--background_folder ./background \
--metric all
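Conceptually, the script applies a perturbation to each page image and then scores its severity with image-quality metrics such as PSNR. The sketch below is a minimal illustration of that idea with a synthetic speckle perturbation; it is not the repository's actual implementation:

```python
import numpy as np

def speckle(img: np.ndarray, severity: float = 0.2, seed: int = 0) -> np.ndarray:
    """Multiplicative speckle noise: img * (1 + n), with n ~ N(0, severity)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, severity, img.shape)
    return np.clip(img * (1.0 + noise), 0, 255).astype(np.uint8)

def psnr(clean: np.ndarray, noisy: np.ndarray) -> float:
    """Peak signal-to-noise ratio between the clean and perturbed image."""
    mse = np.mean((clean.astype(np.float64) - noisy.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

page = np.full((64, 64), 255, dtype=np.uint8)  # blank white "page"
perturbed = speckle(page, severity=0.2)
print(round(psnr(page, perturbed), 1))  # lower PSNR = stronger perturbation
```

Higher severity levels in the benchmark correspond to stronger distortions and hence lower image-quality scores.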
After dataset preparation, the perturbed dataset structure will be:
.desired_root
└── PubLayNet-P
├── Background
│ ├── Background_1
│ │ ├── psnr.json
│ │ ├── ms_ssim.json
│ │ ├── cw_ssim.json
│ │ ├── val.json
│ │ ├── val
│ │ │ ├── PMC538274_00004.jpg
...
│ ├── Background_2
...
├── Rotation
...
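A quick sanity check that each severity folder contains the expected entries can be sketched as follows (illustrative only; the file names are taken from the tree above):

```python
import os
import tempfile

# Entries expected in every <Perturbation>_<level> folder, per the tree above
EXPECTED = ["psnr.json", "ms_ssim.json", "cw_ssim.json", "val.json", "val"]

def missing_entries(severity_dir: str) -> list:
    """Return the expected entries that are absent from a severity folder."""
    return [n for n in EXPECTED if not os.path.exists(os.path.join(severity_dir, n))]

# Demo on a throwaway directory mimicking PubLayNet-P/Background/Background_1
with tempfile.TemporaryDirectory() as root:
    sev = os.path.join(root, "Background", "Background_1")
    os.makedirs(os.path.join(sev, "val"))
    for name in ["psnr.json", "ms_ssim.json", "val.json"]:
        open(os.path.join(sev, name), "w").close()
    print(missing_entries(sev))  # → ['cw_ssim.json']
```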
cd ./model
python -u test.py configs/publaynet/rodla_internimage_xl_publaynet.py \
checkpoint_dir/rodla_internimage_xl_publaynet.pth \
--work-dir result/rodla_internimage_publaynet/Speckle_1 \
--eval bbox \
--cfg-options data.test.ann_file='PubLayNet-P/Speckle/Speckle_1/val.json' \
data.test.img_prefix='PubLayNet-P/Speckle/Speckle_1/val/'
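A full robustness evaluation repeats this command for every perturbation and severity level; only `ann_file` and `img_prefix` change. A small helper can enumerate the targets (illustrative; the perturbation names and the 1–3 severity levels below are assumptions based on the dataset tree):

```python
from itertools import product

PERTURBATIONS = ["Background", "Rotation", "Speckle"]  # subset, for illustration
LEVELS = [1, 2, 3]  # assumed severity levels

def eval_targets(root: str = "PubLayNet-P"):
    """Yield (ann_file, img_prefix) pairs for every perturbation/severity pair."""
    for pert, lvl in product(PERTURBATIONS, LEVELS):
        sub = f"{root}/{pert}/{pert}_{lvl}"
        yield f"{sub}/val.json", f"{sub}/val/"

for ann, img in eval_targets():
    print(ann, img)
```

Each yielded pair can then be passed to `--cfg-options` exactly as in the command above.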
- Modify the configuration file under configs/_base_/datasets to specify the dataset path
- Run the following command to train the model with 4 GPUs
sh dist_train.sh configs/publaynet/rodla_internimage_xl_2x_publaynet.py 4
If you find this code useful for your research, please consider citing:
@inproceedings{chen2024rodla,
title={RoDLA: Benchmarking the Robustness of Document Layout Analysis Models},
author={Yufan Chen and Jiaming Zhang and Kunyu Peng and Junwei Zheng and Ruiping Liu and Philip Torr and Rainer Stiefelhagen},
booktitle={CVPR},
year={2024}
}