We are excited to announce that our work has been published as a new article. Read on to learn more about our latest findings and advancements.
Developed by Isabel Jiménez-Velasco, Rafael Muñoz-Salinas, and Manuel J. Marín-Jiménez.
Figure 1: Touch codes in Proxemics. Images showing the six specific "touch codes" that were studied in this work.
Proxemics is a branch of anthropology that studies how humans use personal space as a means of nonverbal communication, that is, how people interact. Because of the physical contact between people, proxemics recognition in images involves occlusions and ambiguities that complicate the recognition process. In recent years, several works have proposed different methods and models to address this problem, and the rapid advancement of powerful deep learning techniques has led to novel approaches. We therefore propose Proxemics-Net, a new model that allows us to study the performance of two state-of-the-art deep learning architectures, ConvNeXt and Vision Transformers (ViT), as backbones for classifying different types of proxemics in still images. Experiments on the existing Proxemics dataset show that these deep learning models do help in the proxemics recognition problem, since we considerably outperform the existing state of the art, with the ConvNeXt architecture being the best-performing backbone.
Figure 2: Our Proxemics-Net model. It takes as input the individual branch of each person ('p0_branch' and 'p1_branch', in blue) and the 'pair branch' (in red). All branches use the same type of backbone (ConvNeXt or ViT). The outputs of the three branches are merged in a concatenation layer and passed through a fully connected layer that predicts the proxemics classes of the input sample.
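To make the architecture in Figure 2 more concrete, below is a minimal PyTorch-style sketch of a three-branch model of this kind. It is only an illustrative approximation, not the actual training code of this repository: the timm backbone names, the globally pooled features, and the single linear classification head are assumptions on our part.

```python
# Minimal sketch of a three-branch Proxemics-Net-style model (illustrative only).
# Assumptions: timm backbones, globally pooled features, a single linear classifier.
import torch
import torch.nn as nn
import timm


class ProxemicsNetSketch(nn.Module):
    def __init__(self, backbone_name="convnext_base", num_classes=6, pretrained=True):
        super().__init__()
        # Three branches of the same backbone type: person 0, person 1 and the pair crop.
        self.p0_branch = timm.create_model(backbone_name, pretrained=pretrained, num_classes=0)
        self.p1_branch = timm.create_model(backbone_name, pretrained=pretrained, num_classes=0)
        self.pair_branch = timm.create_model(backbone_name, pretrained=pretrained, num_classes=0)
        feat_dim = self.pair_branch.num_features
        # Concatenate the three feature vectors and predict the proxemics classes.
        self.classifier = nn.Linear(3 * feat_dim, num_classes)

    def forward(self, p0, p1, pair):
        feats = torch.cat(
            [self.p0_branch(p0), self.p1_branch(p1), self.pair_branch(pair)], dim=1
        )
        return self.classifier(feats)  # one logit per touch code


if __name__ == "__main__":
    model = ProxemicsNetSketch(pretrained=False)
    x = torch.randn(2, 3, 224, 224)  # dummy RGB crops
    print(model(x, x, x).shape)  # torch.Size([2, 6])
```

Since a pair of people can exhibit more than one touch code at the same time, this sketch assumes a multi-label setup in which the logits would be trained with a loss such as torch.nn.BCEWithLogitsLoss.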
Model | HH | HS | SS | HT | HE | ES | mAP (a) | mAP (b) |
---|---|---|---|---|---|---|---|---|
Yang et al. | 37 | 29 | 50 | 61 | 38 | 34 | 42 | 38 |
Chu et al. | 41.2 | 35.4 | 62.2 | - | 43.9 | 55 | - | 46.6 |
Jiang et al. | 59.7 | 52 | 53.9 | 33.2 | 36.1 | 36.2 | 45.2 | 47.5 |
Our ViT | 48.2 | 57.6 | 50.4 | 76.6 | 57.6 | 52.8 | 57.2 | 53.3 |
Our ConvNeXt_Base | 56.9 | 53.4 | 61.4 | 83.4 | 68.7 | 58.8 | 63.8 | 59.8 |
Our ConvNeXt_Large | 62.4 | 56.7 | 62.4 | 86.4 | 68.8 | 67.9 | 67.4 | 63.8 |
Table 1: Comparison of our three best models with the existing state of the art (per-class AP and mAP, in %).
In this table, two mAP values (in %) are compared: mAP(a) is the mAP explained in the previous sections (the mean of the AP values of the six types of proxemics), and mAP(b) is the mean of the AP values excluding the Hand-Torso (HT) class, as done in Chu et al.
Looking at the table, we can see that our three proposed models (which use three input branches) perform best in both comparisons (mAP(a) and mAP(b)), with the models that use the ConvNeXt network as backbone achieving the highest mAP values (67.4% vs. 45.2% for mAP(a) and 63.8% vs. 47.5% for mAP(b), the best previous results, both from Jiang et al.). Thus, we outperform the existing state of the art by a significant margin, with improvements of 22.2 points in mAP(a) and 16.3 points in mAP(b).
Therefore, these results demonstrate that the two state-of-the-art deep learning models (ConvNeXt and Vision Transformers) do help in the proxemics recognition problem since, using only RGB information, they considerably improve on the results obtained by all competing models.
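As a quick sanity check of how the two mAP figures in Table 1 are obtained, the snippet below recomputes mAP(a) and mAP(b) from the per-class APs of our ConvNeXt_Large row. Small differences with respect to the reported values can appear because the per-class APs in the table are already rounded.

```python
# Recompute mAP(a) and mAP(b) from the per-class APs of the ConvNeXt_Large row in Table 1.
ap = {"HH": 62.4, "HS": 56.7, "SS": 62.4, "HT": 86.4, "HE": 68.8, "ES": 67.9}

map_a = sum(ap.values()) / len(ap)                      # mean over the six classes
map_b = sum(v for k, v in ap.items() if k != "HT") / 5  # excluding Hand-Torso (HT)

print(f"mAP(a) = {map_a:.1f}%")  # ~67.4%
print(f"mAP(b) = {map_b:.1f}%")  # ~63.6% (63.8% reported; the per-class APs are rounded)
```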
- base_model_main/: Main directory for the base model.
- dataset/: Directory containing the code necessary for dataset preprocessing (a simplified sketch of this step is shown right after this list).
- img/: Directory containing the images of this work.
- test/: Directory containing code and resources related to model testing.
- train/: Directory containing code and resources related to model training.
- dataset_proxemics_IbPRIA.zip: ZIP file containing the preprocessed dataset.
- requirements.txt: File specifying the necessary dependencies for the project.
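As a rough idea of what the preprocessing in dataset/ produces, the sketch below builds the three RGB inputs expected by Proxemics-Net (the two individual crops and the pair crop) from an image and two person bounding boxes. The function name, the (x1, y1, x2, y2) box format, and the 224x224 target size are our own assumptions for illustration; the actual preprocessing code is the one shipped in dataset/.

```python
# Hypothetical sketch of the pair preprocessing: given one image and the bounding
# boxes of two people, produce the three RGB crops used as input (p0, p1, pair).
# The (x1, y1, x2, y2) box format and the 224x224 target size are assumptions.
from PIL import Image


def make_branch_inputs(image_path, box_p0, box_p1, size=(224, 224)):
    img = Image.open(image_path).convert("RGB")

    # Individual crops, one per person.
    p0 = img.crop(box_p0).resize(size)
    p1 = img.crop(box_p1).resize(size)

    # Pair crop: the union of both person boxes.
    pair_box = (
        min(box_p0[0], box_p1[0]),
        min(box_p0[1], box_p1[1]),
        max(box_p0[2], box_p1[2]),
        max(box_p0[3], box_p1[3]),
    )
    pair = img.crop(pair_box).resize(size)
    return p0, p1, pair
```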
To install the necessary dependencies to run this project, you can use the following command:
conda create --name <env> --file requirements.txt
To use the preprocessed dataset, you must first unzip the dataset_proxemics_IbPRIA.zip
file and place it two directories above the current directory. You can use the following command:
unzip dataset_proxemics_IbPRIA.zip -d ../
To use the pre-trained ConvNeXt models that we have selected as a backbone to train our Proxemics-Net models, you need to download them from the following locations:
- Pre-trained Base model: Download here (350MB)
- Pre-trained Large model: Download here (800MB)
Once downloaded, you need to unzip them and place them one level above, i.e., in ../premodels/.
To train and test a new model, you should access the base_model_main
directory and execute the following command lines based on the type of model you want to use:
- ViT backbone
  - Full Model (3 Branches)
    python3 base_model_main_ViT.py --datasetDIR <DIR dataset/> --outModelsDIR <DIR where you'll save the model> --b <batchsize> --set <set1/set2> --lr <learningRate>
  - Only Pair RGB
    python3 base_model_main_ViT.py --datasetDIR <DIR dataset/> --outModelsDIR <DIR where you'll save the model> --b <batchsize> --set <set1/set2> --lr <learningRate> --onlyPairRGB
- ConvNeXt backbone
  - Full Model (3 Branches)
    python3 base_model_main_convNext.py --datasetDIR <DIR dataset/> --outModelsDIR <DIR where you'll save the model> --modeltype <base/large> --b <batchsize> --set <set1/set2> --lr <learningRate>
  - Only Pair RGB
    python3 base_model_main_convNext.py --datasetDIR <DIR dataset/> --outModelsDIR <DIR where you'll save the model> --modeltype <base/large> --b <batchsize> --set <set1/set2> --lr <learningRate> --onlyPairRGB
Be sure to adjust the values between <...> with the specific paths and configurations required for your project.
Here are two of the Proxemics-Net models that we have trained.
- A model with ConvNeXt Large as backbone. This is the model with the best results (see the SOTA table above). It was trained with the RGB information of the individual crops and the pair crop (the three input branches). Download here (4.45GB)
- A model with ConvNeXt Base as backbone. This model was trained only with the RGB information of the pair crop. Download here (1.13GB)
🤩 You can test these models in the Google Colab Demo we have prepared for you.
If you find Proxemics-Net useful in your work, please consider citing the following BibTeX entry:
@InProceedings{jimenezVelasco2023,
author = "Jiménez, I. and Muñoz, R. and Marín, M. J.",
title = "Proxemics-Net: Automatic Proxemics Recognition in Images",
booktitle = "Pattern Recogn. Image Anal.",
year = "2023",
pages = "402-413",
note= "IbPRIA 2023",
doi = "10.1007/978-3-031-36616-1_32"
}