The code for my undergraduate thesis.
- Install from pip

```bash
pip install rmn

# or build from source
git clone [email protected]:phamquiluan/ResidualMaskingNetwork.git
cd ResidualMaskingNetwork
pip install -e .
```
- Run the demo in Python (with a webcam available)

```python
from rmn import RMN
m = RMN()
m.video_demo()
```
- Detect emotions in a single image

```python
import cv2
from rmn import RMN

m = RMN()
image = cv2.imread("some-image-path.png")
results = m.detect_emotion_for_single_frame(image)
print(results)
image = m.draw(image, results)
cv2.imwrite("output.png", image)
```
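If you need more control than `video_demo()` offers, the same single-frame API can be applied to a live stream manually. A minimal sketch, assuming a webcam at index 0 (press `q` to quit):

```python
import cv2
from rmn import RMN

m = RMN()
cap = cv2.VideoCapture(0)  # assumed webcam index
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # reuse the single-frame API from above on each captured frame
    results = m.detect_emotion_for_single_frame(frame)
    frame = m.draw(frame, results)
    cv2.imshow("rmn", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```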
- Recent Update
- Benchmarking on FER2013
- Benchmarking on ImageNet
- Installation
- Download datasets
- Training on FER2013
- Training on ImageNet
- Evaluation results
- Download dissertation and slides
## Recent Update

- [07/03/2023] Restructure repository, update README
- [05/05/2021] Release ver 2, add Colab
- [27/02/2021] Add paper
- [14/01/2021] Package the project and publish `rmn` on PyPI
- [27/02/2020] Update TensorBoard visualizations and Overleaf source
- [22/02/2020] Test-time augmentation implementation
- [21/02/2020] ImageNet training code and trained weights released
- [21/02/2020] ImageNet evaluation results released
- [10/01/2020] Verified that the demo and training procedure work on another machine
- [09/01/2020] Initial upload
## Benchmarking on FER2013

We benchmark our code thoroughly on two datasets: FER2013 and VEMO. Below are the results and trained weights:
| Model | Accuracy (%) |
| --- | --- |
| VGG19 | 70.80 |
| EfficientNet_b2b | 70.80 |
| Googlenet | 71.97 |
| Resnet34 | 72.42 |
| Inception_v3 | 72.72 |
| Bam_Resnet50 | 73.14 |
| Densenet121 | 73.16 |
| Resnet152 | 73.22 |
| Cbam_Resnet50 | 73.39 |
| ResMaskingNet | 74.14 |
| ResMaskingNet + 6 | 76.82 |
Results on the VEMO dataset can be found in my thesis or slides (linked below). `ResMaskingNet + 6` denotes the unweighted ensemble of ResMaskingNet with six other models; see the Evaluation results section.
## Benchmarking on ImageNet

We also benchmark our model on the ImageNet dataset.
| Model | Top-1 Accuracy (%) | Top-5 Accuracy (%) |
| --- | --- | --- |
| Resnet34 | 72.59 | 90.92 |
| CBAM Resnet34 | 73.77 | 91.72 |
| ResidualMaskingNetwork | 74.16 | 91.91 |
## Installation

- Install PyTorch by selecting your environment on the PyTorch website and running the appropriate command.
- Clone this repository and install the package prerequisites below.
- Then download the datasets by following the instructions below.

Prerequisites:

- Python 3.6+
- Dependencies listed in `setup.py`
## Download datasets

- FER2013 dataset: place it in `saved/data/fer2013`, e.g. `saved/data/fer2013/train.csv`
- ImageNet 1K dataset: make sure it can be loaded by `torchvision.datasets.ImageNet`
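For reference, the stock FER2013 CSV stores each 48x48 grayscale face as a space-separated pixel string. A minimal sketch of decoding one row, assuming the standard Kaggle layout with `emotion`, `pixels`, and `Usage` columns (this is not code from this repo):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("saved/data/fer2013/train.csv")
row = df.iloc[0]
label = int(row["emotion"])  # class index 0-6 (angry ... neutral)
# "pixels" holds 2304 space-separated grayscale values -> 48x48 image
face = np.asarray(row["pixels"].split(), dtype=np.uint8).reshape(48, 48)
print(label, face.shape)
```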
## Training on FER2013

- To train a network, specify the model name and other hyperparameters in a config file (located at `configs/*`), make sure the config is loaded in the main file, then start training by running the main file, for example:

```bash
python main_fer.py  # example for the fer2013_config.json file
```
- The best checkpoint is selected by validation accuracy and saved to `saved/checkpoints`.
- TensorBoard training logs are written to `saved/logs`; view them with `tensorboard --logdir saved/logs/`.
- By default, the `alexnet` model is trained; switch to another model (`resnet18`, `cbam_resnet50`, or my network `resmasking_dropout1`) by editing the `configs/fer2013_config.json` file, as sketched below.
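For illustration only, switching models amounts to changing one field of the JSON config before launching `main_fer.py`. A hypothetical sketch, assuming the architecture is stored under an `arch` key (check the actual config for the real key names):

```python
import json

with open("configs/fer2013_config.json") as f:
    config = json.load(f)

# "arch" is an assumed key name; e.g. "resnet18", "cbam_resnet50", "resmasking_dropout1"
config["arch"] = "resmasking_dropout1"

with open("configs/fer2013_config.json", "w") as f:
    json.dump(config, f, indent=2)
```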
## Training on ImageNet

To train resnet34 on 4 V100 GPUs on a single machine:

```bash
python ./main_imagenet.py -a resnet34 --dist-url 'tcp://127.0.0.1:12345' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
```
## Evaluation results

For students who care about the font family of the confusion matrix and would like to typeset it in LaTeX, below is an example of generating a striking confusion matrix (read this article for more information; there will be some bugs if you blindly run the code without reading it):

```bash
python cm_cbam.py
```
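If `cm_cbam.py` does not fit your setup, the general recipe is simple: compute the matrix with scikit-learn and render it with matplotlib under a serif/LaTeX-style font. A minimal sketch with placeholder labels and predictions (not the repo's script):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Serif font approximates LaTeX; set "text.usetex" to True for real LaTeX
# rendering (requires a local TeX installation).
plt.rcParams["font.family"] = "serif"

labels = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
y_true = np.random.randint(0, 7, size=200)  # placeholder ground truth
y_pred = np.random.randint(0, 7, size=200)  # placeholder predictions

cm = confusion_matrix(y_true, y_pred, normalize="true")
ConfusionMatrixDisplay(cm, display_labels=labels).plot(cmap="Blues", xticks_rotation=45)
plt.tight_layout()
plt.savefig("confusion_matrix.pdf")
```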
I used an unweighted sum/average ensemble to fuse 7 different models. To reproduce the results:

- Download all the required trained weights and place them in the `./saved/checkpoints/` directory. Download links can be found in the Benchmarking section.
- Edit the `gen_results` file and run it to generate offline results for each model.
- Run the `gen_ensemble.py` file to compute the accuracy of the example ensemble methods. A conceptual sketch follows below.
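Conceptually, the unweighted sum/average ensemble just averages each model's per-class probabilities and takes the argmax. A minimal sketch, assuming each model's offline results were saved as an `(N, 7)` array of softmax probabilities (the file names and layout here are hypothetical, not the exact output of `gen_results`):

```python
import numpy as np

# Hypothetical per-model probability dumps, shape (num_samples, 7) each
model_files = ["resnet34.npy", "densenet121.npy", "resmasking_dropout1.npy"]
probs = [np.load(f"saved/results/{name}") for name in model_files]

ensemble = np.mean(probs, axis=0)      # unweighted average over models
predictions = ensemble.argmax(axis=1)  # final predicted class per sample
```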
## Download dissertation and slides

- Dissertation PDF (in Vietnamese)
- Dissertation Overleaf source
- Presentation slides PDF (in English, with full appendix)
- Presentation slides Overleaf source
- Paper
Note: Unfortunately, I am currently engaged in a full-time job and conducting research on another topic. Therefore, I will do my best to keep things up to date, but I cannot guarantee that I will be able to do so. That being said, I am grateful to everyone for their continued help and feedback, as it is truly appreciated. I will endeavor to address everything as soon as possible.
Pham, Luan, The Huynh Vu, and Tuan Anh Tran. "Facial expression recognition using residual masking network." 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021.
```bibtex
@inproceedings{pham2021facial,
  title={Facial expression recognition using residual masking network},
  author={Pham, Luan and Vu, The Huynh and Tran, Tuan Anh},
  booktitle={2020 25th International Conference on Pattern Recognition (ICPR)},
  pages={4513--4519},
  year={2021},
  organization={IEEE}
}
```