GitHub - 209-Tongji/OBSD: [ACL 2024 Best Paper] Deciphering Oracle Bone Language with Diffusion Models

OBSD: Deciphering Oracle Bone Language with Diffusion Models

Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu

News

2024.8.14 🚀 OBSD has been selected as the ACL 2024 Best Paper.
2024.7.15 🚀 OBSD has been selected as an ACL 2024 Oral.
2024.5.16 🚀 OBSD has been accepted to the ACL 2024 main conference.
2024.2.15 🚀 Source code for OBSD is released.

Oracle Bone Script Decipher.

Welcome to OBSD. The paper has been published, and a demo is available.
The dataset is available here.

In data/modern_kanji.zip, we also provide images of the modern Chinese characters that correspond to the Oracle Bone Script characters. If you wish to use your own data, you can run data/process.py to process it.

Data preparation

You can split the dataset into training and test sets however you like, then arrange them in the following layout. The image file names in the input folder and the target folder must correspond one to one: the input folder stores OBS images, and the target folder stores the matching modern Chinese character images.

Your_dataroot/
├── train/  (training set)
│   ├── input/
│   │   ├── train_安_1.png (OBS image)
│   │   ├── train_安_2.png 
│   │   ├── train_北_1.png
│   │   └── train_北_2.png
│   └── target/
│       ├── train_安_1.png (Modern Chinese Character image)
│       ├── train_安_2.png 
│       ├── train_北_1.png 
│       └── train_北_2.png 
│
└── test/   (test set)
    ├── input/
    │   ├── test_1.png  (OBS image)
    │   └── test_2.png
    └── target/
        ├── test_1.png  (Modern Chinese Character image)
        └── test_2.png
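Since training relies on the one-to-one file name correspondence described above, it can help to check the pairing before starting a run. The following is a minimal sketch, not part of the repository: the helper name `find_unpaired` is hypothetical, and it assumes all images are `.png` files placed directly inside `input/` and `target/`.

```python
from pathlib import Path


def find_unpaired(split_dir):
    """Return (only_in_input, only_in_target) file names for one split directory."""
    split_dir = Path(split_dir)
    # Collect the image file names found in each of the two sibling folders.
    input_names = {p.name for p in (split_dir / "input").glob("*.png")}
    target_names = {p.name for p in (split_dir / "target").glob("*.png")}
    # Names present on only one side have no counterpart and break the pairing.
    only_in_input = sorted(input_names - target_names)
    only_in_target = sorted(target_names - input_names)
    return only_in_input, only_in_target
```

Calling `find_unpaired("/Your_dataroot/train")` and `find_unpaired("/Your_dataroot/test")` should return two empty lists each when the layout is consistent.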

You also need to modify the following paths in configs.yaml.

data:
    train_data_dir: '/Your_dataroot/train/' # path to directory of train data
    test_data_dir: '/Your_dataroot/test/'   # path to directory of test data
    test_save_dir: 'Your_project_path/OBS_Diffusion/result' # path to directory of test output
    val_save_dir: 'Your_project_path/OBS_Diffusion/validation/'    # path to directory of validation during training
    tensorboard: 'Your_project_path/OBS_Diffusion/logs' # path to directory of training information

training:
    resume: '/Your_save_root/diffusion_model'  # path to pretrained model
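A wrong path in this configuration usually only shows up once training or testing is already running. As a rough pre-flight check, the sketch below takes the `data` section as a plain dictionary (assumed to have been loaded from configs.yaml with a YAML parser of your choice), verifies that the dataset folders exist, and creates the output folders. The helper name `prepare_dirs` is hypothetical and not part of the repository, and whether the training script creates these folders on its own is not assumed here.

```python
from pathlib import Path


def prepare_dirs(data_cfg):
    """Check dataset folders and create output folders; return a list of problems."""
    problems = []
    # Dataset directories must already exist and contain input/ and target/.
    for key in ("train_data_dir", "test_data_dir"):
        root = Path(data_cfg[key])
        for sub in ("input", "target"):
            if not (root / sub).is_dir():
                problems.append(f"{key}: missing folder {root / sub}")
    # Output directories are created up front so later writes do not fail.
    for key in ("test_save_dir", "val_save_dir", "tensorboard"):
        Path(data_cfg[key]).mkdir(parents=True, exist_ok=True)
    return problems
```

An empty list means every dataset folder was found; otherwise each entry names the configuration key and the folder that is missing.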

Train

Environment Configuration

git clone https://github.com/guanhaisu/OBSD.git
cd OBS_Diffusion

conda create -n OBSD python=3.9
conda activate OBSD
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt

Start training

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 train_diffusion.py

You can monitor the training process with TensorBoard.

tensorboard --logdir ./logs

Test

CUDA_VISIBLE_DEVICES=0 python eval_diffusion.py

If you want to refine the generated character results, you can run the following script. Make sure to update the file paths in the script first.

CUDA_VISIBLE_DEVICES=0 python refine.py

You can find the FontDiffuser weights here (Google Drive).
