Context-Aware Visual Policy Network for Sequence-Level Image Captioning

This repository contains the code for the following papers:

Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu. Context-Aware Visual Policy Network for Sequence-Level Image Captioning. In ACM MM, 2018. (PDF)
Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu. Context-Aware Visual Policy Network for Fine-Grained Image Captioning. In TPAMI, 2019. (Extended journal version. PDF)

Installation

Install Python 3 (Anaconda recommended).
Install PyTorch v1.0 or higher:

pip3 install torch torchvision

Clone the repository together with its submodules, then enter the root directory:

git clone --recursive https://github.com/daqingliu/CAVP.git && cd CAVP

Install requirements for evaluation metrics:

apt install default-jdk
cd coco-caption && bash get_stanford_models.sh && cd ..

Download Data

Download the image features (TSV files extracted with bottom-up-attention) into the data directory and unzip them.
Convert the TSV files to npz files that the dataloader can read:

python misc/convert_tsv_to_npz.py

Download the COCO annotations (h5 and json) into the data directory.
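
For orientation, the TSV-to-npz conversion step can be sketched as follows. This is a minimal illustration, not the contents of misc/convert_tsv_to_npz.py. It assumes the column layout commonly used by the bottom-up-attention feature release (image_id, image_w, image_h, num_boxes, then base64-encoded float32 boxes and features). The function names, the output file naming, and the npz keys `boxes` and `feat` are hypothetical.

```python
import base64
import csv
import os

import numpy as np

# Assumed column layout of the bottom-up-attention TSV files.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def tsv_to_arrays(tsv_file):
    """Yield (image_id, boxes, features) for each row of an open TSV file."""
    # Feature rows are far longer than the default csv field limit.
    csv.field_size_limit(2**31 - 1)
    reader = csv.DictReader(tsv_file, delimiter="\t", fieldnames=FIELDNAMES)
    for row in reader:
        num_boxes = int(row["num_boxes"])
        # Both arrays are stored as base64-encoded raw float32 buffers.
        boxes = np.frombuffer(base64.b64decode(row["boxes"]), dtype=np.float32)
        feats = np.frombuffer(base64.b64decode(row["features"]), dtype=np.float32)
        yield (
            int(row["image_id"]),
            boxes.reshape(num_boxes, 4),
            feats.reshape(num_boxes, -1),
        )


def convert(tsv_path, out_dir):
    """Write one compressed npz file per image (hypothetical naming scheme)."""
    os.makedirs(out_dir, exist_ok=True)
    with open(tsv_path, newline="") as f:
        for image_id, boxes, feats in tsv_to_arrays(f):
            out_path = os.path.join(out_dir, f"{image_id}.npz")
            np.savez_compressed(out_path, boxes=boxes, feat=feats)
```

Storing one file per image lets a dataloader fetch a single image's features without parsing the whole TSV. The script shipped in misc may use a different layout.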

Training and Evaluation

Run the training script, then the evaluation script:

bash run_train.sh
bash run_eval.sh

Citation

@article{zha2019context,
  title={Context-aware visual policy network for fine-grained image captioning},
  author={Zha, Zheng-Jun and Liu, Daqing and Zhang, Hanwang and Zhang, Yongdong and Wu, Feng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2019},
}

Acknowledgements

Part of this repository is built upon self-critical.pytorch.
