Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models

This repo contains the Food-500 Cap dataset for our paper: Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models (ACM MM 2023).

We provide the descriptions in two files (finetune_data.json and evaluation_data.json). The images in Food-500 Cap come from ISIA Food-500; you can download the images from here.

Note:

In our paper, we use all of the data to evaluate the VLMs. To make the dataset easier to use and results easier to compare, we have divided it into a train split (finetune_data.json, 19760 pairs) and a test split (evaluation_data.json, 4940 pairs).
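
As a quick sanity check after downloading, the split sizes above can be compared against the files on disk. The following is a minimal sketch, not part of the released code: the helper names `count_entries` and `check_splits` are hypothetical, and it assumes that the top-level JSON value in each file is a list or dict with one entry per image-caption pair, which this README does not specify.

```python
import json
from pathlib import Path

# Split sizes as stated in this README.
EXPECTED_PAIRS = {
    "finetune_data.json": 19760,
    "evaluation_data.json": 4940,
}


def count_entries(path):
    """Return the number of top-level entries in a JSON file.

    Assumption: the top-level value is a list or dict holding
    one entry per image-caption pair.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return len(data)


def check_splits(data_dir="."):
    """Compare each split file's entry count with the README numbers.

    Returns a dict mapping file name to True (count matches),
    False (count differs), or None (file not found).
    """
    report = {}
    for name, expected in EXPECTED_PAIRS.items():
        file_path = Path(data_dir) / name
        if not file_path.exists():
            report[name] = None
            continue
        report[name] = count_entries(file_path) == expected
    return report


if __name__ == "__main__":
    total = sum(EXPECTED_PAIRS.values())
    for name, n in EXPECTED_PAIRS.items():
        print(f"{name}: {n} pairs ({n / total:.0%} of {total})")
    print(check_splits())
```

If the files are organized differently (for example, one entry per image with several captions each), the counts will not match and `count_entries` would need to be adapted.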

Citation

@article{ma2023food,
  title={Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models},
  author={Ma, Zheng and Pan, Mianzhi and Wu, Wenhan and Cheng, Kanzhi and Zhang, Jianbing and Huang, Shujian and Chen, Jiajun},
  journal={arXiv preprint arXiv:2308.03151},
  year={2023}
}
