
Switch to Discriminative Image Captioning

The code for our paper, Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning (WACV 2023). The methods implemented here provide a switch to discriminative image captioning: given off-the-shelf captioning models trained with reinforcement learning, our methods enable them to describe the characteristic details of input images with only lightweight fine-tuning.

Acknowledgment

The code is based on self-critical.pytorch. We thank the authors of that repository, the original neuraltalk2, and the awesome PyTorch team.

Setup

git clone https://github.com/ukyh/switch_disc_caption.git
cd switch_disc_caption
git submodule update --init --recursive

conda create --name switch_disc_cap python=3.6
conda activate switch_disc_cap

pip install -r requirements.txt
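As a quick sanity check that the environment installed correctly (this assumes PyTorch is pulled in by requirements.txt, as the code depends on it):

python -c "import torch; print(torch.__version__)"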

Downloads

  1. Follow the instructions in data/README.md to download and preprocess the data.
  2. Follow the instructions in coco-caption/README.md to download the evaluation tools.
  3. Download pre-trained models from MODEL_ZOO.md. We used Att2in+self_critical (att2in_scst), UpDown+self_critical (updown_scst), and Transformer+self_critical (trans_scst) for the experiments in our paper. To run the scripts in expt_scripts, the downloaded models must be placed as follows (see the sketch after this list):
./saved_models/
  ├── att2in_scst/
  │     ├── model-best.pth
  │     └── infos_a2i2_sc-best.pkl
  ├── updown_scst/
  │     ├── model-best.pth
  │     └── infos_tds_sc-best.pkl
  └── trans_scst/
        ├── model-best.pth
        └── infos_trans_scl-best.pkl
  4. (Optional: not necessary if you just want to try our fine-tuning)
    If you want to train RL models in this repo, build the cache for calculating CIDEr scores:
python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk.json --output_pkl data/coco-train --split train
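As a rough sketch of step 3, assuming the pre-trained models were downloaded and unpacked under ~/Downloads (the source paths below are hypothetical; adjust them to wherever you saved the files), the required layout can be created like this:

# Hypothetical source paths; repeat the pattern for updown_scst and trans_scst.
mkdir -p saved_models/att2in_scst saved_models/updown_scst saved_models/trans_scst
cp ~/Downloads/att2in_scst/model-best.pth saved_models/att2in_scst/
cp ~/Downloads/att2in_scst/infos_a2i2_sc-best.pkl saved_models/att2in_scst/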

Run

Fine-Tuning

Run sh expt_scripts/[SELECT_SCRIPT].sh.
This produces a fine-tuned model under saved_models and .json output files (for the MS COCO Karpathy val/test splits) under eval_results.
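For example (the script name below is hypothetical; check expt_scripts for the actual file names):

ls expt_scripts/                   # list the available fine-tuning scripts
sh expt_scripts/updown_scst_ft.sh  # substitute a script that actually exists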

We have released the fine-tuned models and output files here.

Evaluation

Evaluation uses the output files under eval_results. Use the following repositories/scripts for evaluation on each metric.
NOTE: DO NOT use the files starting with tmpeval_, as the decoding settings of those outputs (beam size and BP decoding) are not specified correctly.
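As a quick way to pick out the valid output files in a POSIX shell, skipping the tmpeval_ ones:

ls eval_results/ | grep -v '^tmpeval_'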

Reference

If you find this repo useful, please consider citing (no obligation at all):

@inproceedings{honda2023switch,
  title={Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning},
  author={Honda, Ukyo and Watanabe, Taro and Matsumoto, Yuji},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2023}
}

@article{luo2018discriminability,
  title={Discriminability objective for training descriptive captions},
  author={Luo, Ruotian and Price, Brian and Cohen, Scott and Shakhnarovich, Gregory},
  journal={arXiv preprint arXiv:1803.04376},
  year={2018}
}
