[ACCV 2022] An official implementation of the paper MGTR: End-to-end Mutual Gaze Detection with Transformer.
- Python >= 3.6 (Anaconda recommended)
- PyTorch >= 1.7.1
- TorchVision >= 0.8.2
- NVIDIA GPU + CUDA
- opencv-python >= 4.5.1
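To sanity-check that an environment satisfies these requirements, a minimal sketch like the one below can help; the standalone check script is our own illustration and not part of this repo, and the version pins simply mirror the list above.

```python
# check_env.py -- illustrative helper, not part of the MGTR repo.
import sys

import torch
import torchvision
import cv2

assert sys.version_info >= (3, 6), "Python >= 3.6 is required"
print("PyTorch:", torch.__version__)             # expected >= 1.7.1
print("TorchVision:", torchvision.__version__)   # expected >= 0.8.2
print("OpenCV:", cv2.__version__)                # expected >= 4.5.1
print("CUDA available:", torch.cuda.is_available())  # an NVIDIA GPU + CUDA is required
```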
| Model | AVA-LAEO | UCO-LAEO |
|-------|----------|----------|
| MGTR  | 66.2     | 64.8     |
- Clone this GitHub repo.

```bash
git clone git@github.com:Gmbition/MGTR.git
cd MGTR
```
- Download the Mutual Gaze Datasets from Baidu Drive and put the annotation JSON files into `./data`.
- Download our trained model from here and move it to `./data/mgtr_pretrained` (you need to create this new `mgtr_pretrained` folder).
- Run testing for MGTR.

```bash
python3 test.py --backbone=resnet50 --batch_size=1 --log_dir=./ --model_path=your_model_path
```
- The visualization results (if `save_image = True` is set) will be stored in `./log`.
We annotate each mutual gaze instance in an image as a dict, and the annotations are stored in `./data`. There are four annotation JSON files, for AVA-LAEO and UCO-LAEO training and testing respectively. The format of one mutual gaze instance annotation is as follows:
```
{
    "file_name": "scene_name/image.jpg",
    "width": width of the image,
    "height": height of the image,
    "gt_bboxes": [{"tag": 1,
                   "box": a list containing the [x, y, w, h] of the box},
                  ...],
    "laeo": [{"person_1": the idx of person 1,
              "person_2": the idx of person 2,
              "interaction": whether they are looking at each other}]
}
```
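As a concrete illustration, the sketch below loads one annotation file and prints the mutual gaze pairs per image. The file name `ava_test.json` and the assumption that `person_1`/`person_2` index into `gt_bboxes` are our guesses based on the format above, not guaranteed repo behavior; adjust the path to whichever of the four JSON files you downloaded.

```python
# parse_annotations.py -- illustrative sketch, not part of the MGTR repo.
# Assumes an annotation file such as ./data/ava_test.json (name is a guess)
# whose entries follow the dict format described above.
import json

with open("./data/ava_test.json", "r") as f:
    annotations = json.load(f)  # assumed to be a list of per-image dicts

for ann in annotations:
    boxes = [b["box"] for b in ann["gt_bboxes"]]  # each box is [x, y, w, h]
    for pair in ann["laeo"]:
        i, j = pair["person_1"], pair["person_2"]  # assumed to index into gt_bboxes
        if pair["interaction"]:  # truthy when the two people look at each other
            print(f'{ann["file_name"]}: person {i} {boxes[i]} <-> person {j} {boxes[j]}')
```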
Please cite us if this work is helpful to you.
```bibtex
@inproceedings{guo2022mgtr,
  title={MGTR: End-to-End Mutual Gaze Detection with Transformer},
  author={Guo, Hang and Hu, Zhengxi and Liu, Jingtai},
  booktitle={Proceedings of the Asian Conference on Computer Vision},
  pages={1590--1605},
  year={2022}
}
```
We sincerely thank the cool work by very cool people 😎: DETR and HoiTransformer.