PyTorch code for our paper "X-GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering" (MM 2021).
This implementation is based on LXMERT. We thank its authors for their pioneering work.
pip install -r requirements.txt
Please see data/README.md for details and download the data from here. The expected directory layout is:
├── data
│ ├── gqa_imgfeat
│ │ ├── testdev_all_obj36_adj_v2.h5
│ │ ├── testdev_all_obj36.h5
│ │ ├── testdev_all_obj36_info.json
│ │ ├── testdev_head_obj36.h5
│ │ ├── testdev_head_obj36_info.json
│ │ ├── testdev_tail_obj36.h5
│ │ ├── testdev_tail_obj36_info.json
│ │ ├── train_obj36_adj_v2.h5
│ │ ├── train_obj36.h5
│ │ ├── train_obj36_info.json
│ │ ├── val_all_obj36_adj_v2.h5
│ │ ├── val_all_obj36.h5
│ │ ├── val_all_obj36_info.json
│ │ ├── val_tail_obj36_adj_v2.h5
│ │ ├── val_tail_obj36.h5
│ │ └── val_tail_obj36_info.json
│ ├── gqa_ood
│ │ ├── answer_embeds.pickle
│ │ ├── testdev_all.json
│ │ ├── testdev_head.json
│ │ ├── testdev_tail.json
│ │ ├── train.json
│ │ ├── trainval_ans2label.json
│ │ ├── trainval_label2ans.json
│ │ ├── val_all.json
│ │ ├── val_head.json
│ │ └── val_tail.json
│ ├── lxmert
│ │ └── all_ans.json
│ ├── mscoco_imgfeat
│ │ ├── dev_test_obj36_adj_v2.h5
│ │ ├── dev_test_obj36.h5
│ │ ├── dev_test_obj36_info.json
│ │ ├── test_obj36_adj_v2.h5
│ │ ├── test_obj36.h5
│ │ ├── test_obj36_info.json
│ │ ├── train_obj36_adj_v2.h5
│ │ ├── train_obj36.h5
│ │ ├── train_obj36_info.json
│ │ ├── val_obj36_adj.h5
│ │ ├── val_obj36.h5
│ │ └── val_obj36_info.json
│ ├── vqa
│ │ ├── minival.json
│ │ ├── nominival.json
│ │ ├── train.json
│ │ ├── trainval_ans2label.json
│ │ └── trainval_label2ans.json
│ └── vqacpv2
│   ├── dev_test_annotations.json
│   ├── test_annotations.json
│   ├── train_annotations.json
│   ├── trainval_ans2label.json
│   ├── trainval_label2ans.json
│   └── val_annotations.json
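Before training, it can help to confirm that the downloaded files match the layout above. The following is a minimal sketch of such a check, not part of the repository: the names `EXPECTED_FILES` and `find_missing_files` are hypothetical, the data root is assumed to be `data`, and only a subset of the files in the tree is listed (extend the dictionary as needed).

```python
from pathlib import Path

# Hypothetical helper, not part of the X-GGM code base.
# A subset of the files from the directory tree above, keyed by subdirectory.
EXPECTED_FILES = {
    "gqa_imgfeat": ["train_obj36.h5", "train_obj36_info.json", "train_obj36_adj_v2.h5"],
    "gqa_ood": ["train.json", "trainval_ans2label.json", "trainval_label2ans.json",
                "answer_embeds.pickle"],
    "lxmert": ["all_ans.json"],
    "mscoco_imgfeat": ["train_obj36.h5", "train_obj36_info.json", "train_obj36_adj_v2.h5"],
    "vqa": ["train.json", "trainval_ans2label.json", "trainval_label2ans.json"],
    "vqacpv2": ["train_annotations.json", "trainval_ans2label.json",
                "trainval_label2ans.json"],
}


def find_missing_files(data_root="data"):
    """Return a sorted list of expected paths that do not exist under data_root."""
    root = Path(data_root)
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            path = root / subdir / name
            if not path.is_file():
                missing.append(str(path))
    return sorted(missing)


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All checked files are present.")
```

Run it from the repository root. An empty result only means the listed files exist. It does not validate their contents.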
Please see the parameter settings and running details in the scripts.
- For VQA-CP v2
bash script/vqacpv2.sh
- For GQA-OOD
bash script/gqa_ood.sh
If you find our code helpful for your research, please cite:
@inproceedings{jiang2021x,
title={X-GGM: Graph generative modeling for out-of-distribution generalization in visual question answering},
author={Jiang, Jingjing and Liu, Ziyi and Liu, Yifan and Nan, Zhixiong and Zheng, Nanning},
booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
pages={199--208},
year={2021}
}