MiVOLO: Multi-input Transformer for Age and Gender Estimation, Maksim Kuprashevich, Irina Tolstykh, 2023 arXiv 2307.04616
[Paper] [Demo] [BibTex] [Data]
Gender & Age recognition performance.
| Model | Type | Dataset | Age MAE | Age CS@5 | Gender Accuracy | download |
|---|---|---|---|---|---|---|
| volo_d1 | face_only, age | IMDB-cleaned | 4.29 | 67.71 | - | checkpoint |
| volo_d1 | face_only, age, gender | IMDB-cleaned | 4.22 | 68.68 | 99.38 | checkpoint |
| mivolo_d1 | face_body, age, gender | IMDB-cleaned | 4.24 [face+body], 6.87 [body] | 68.32 [face+body], 46.32 [body] | 99.46 [face+body], 96.48 [body] | checkpoint |
| volo_d1 | face_only, age | UTKFace | 4.23 | 69.72 | - | checkpoint |
| volo_d1 | face_only, age, gender | UTKFace | 4.23 | 69.78 | 97.69 | checkpoint |
| mivolo_d1 | face_body, age, gender | Lagenda | 3.99 [face+body] | 71.27 [face+body] | 97.36 [face+body] | demo |
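The age columns above can be reproduced from per-sample predictions. A minimal sketch of the two metrics (the function name and implementation are ours, not the repo's evaluation code): MAE is the mean absolute error in years, and CS@5 is the percentage of samples predicted within 5 years of the true age.

```python
def age_metrics(pred_ages, true_ages, k=5):
    """Age MAE and CS@k for paired prediction/ground-truth lists.

    MAE: mean absolute error in years.
    CS@k: cumulative score, i.e. percentage of samples with
    absolute error <= k years.
    """
    errors = [abs(p - t) for p, t in zip(pred_ages, true_ages)]
    mae = sum(errors) / len(errors)
    cs_k = 100.0 * sum(e <= k for e in errors) / len(errors)
    return mae, cs_k
```

For example, predictions `[30, 27, 51]` against ground truth `[32, 27, 44]` give errors of 2, 0, and 7 years, so MAE is 3.0 and CS@5 is 66.67.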
Please cite our paper if you use any of this data!
- Lagenda dataset: images and annotation.
- IMDB-clean: follow these instructions to get images and download our annotations.
- UTK dataset: original full images and our annotation: split from the article, random full split.
- Adience dataset: follow these instructions to get images and download our annotations.
After downloading them, your `data` directory should look something like this:

```
data
└── Adience
    ├── annotations (folder with our annotations)
    ├── aligned (will not be used)
    ├── faces
    ├── fold_0_data.txt
    ├── fold_1_data.txt
    ├── fold_2_data.txt
    ├── fold_3_data.txt
    └── fold_4_data.txt
```
We use coarse aligned images from the `faces/` dir. Using our detector, we found a face bbox for each image (see `tools/prepare_adience.py`).
This dataset has five folds. The performance metric is accuracy on five-fold cross validation.
| images before removal | fold 0 | fold 1 | fold 2 | fold 3 | fold 4 |
|---|---|---|---|---|---|
| 19,370 | 4,484 | 3,730 | 3,894 | 3,446 | 3,816 |

Not complete data:

| only age not found | only gender not found | SUM |
|---|---|---|
| 40 | 1,170 | 1,210 (6.2 %) |

Removed data:

| failed to process image | age and gender not found | SUM |
|---|---|---|
| 0 | 708 | 708 (3.6 %) |

Genders:

| female | male |
|---|---|
| 9,372 | 8,120 |

Ages (8 classes), after mapping to non-intersecting age intervals:

| 0-2 | 4-6 | 8-12 | 15-20 | 25-32 | 38-43 | 48-53 | 60-100 |
|---|---|---|---|---|---|---|---|
| 2,509 | 2,140 | 2,293 | 1,791 | 5,589 | 2,490 | 909 | 901 |
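The 8-class interval mapping above can be sketched as follows. This is an illustration only; the actual rule is in `tools/prepare_adience.py`, and the nearest-midpoint fallback for ages that fall between intervals is our own assumption.

```python
# The 8 non-intersecting Adience age intervals (inclusive bounds).
ADIENCE_INTERVALS = [(0, 2), (4, 6), (8, 12), (15, 20),
                     (25, 32), (38, 43), (48, 53), (60, 100)]

def to_adience_class(age):
    """Map a raw age to one of the 8 Adience interval classes.

    Ages inside an interval map directly; ages between intervals
    fall back to the interval with the nearest midpoint
    (our assumption, not necessarily the repo's rule).
    """
    for i, (lo, hi) in enumerate(ADIENCE_INTERVALS):
        if lo <= age <= hi:
            return i
    mids = [(lo + hi) / 2 for lo, hi in ADIENCE_INTERVALS]
    return min(range(len(mids)), key=lambda i: abs(age - mids[i]))
```

For example, age 30 maps to class 4 (25-32), while age 7 sits between intervals and falls back to class 1 (4-6), whose midpoint is nearest.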
- FairFace dataset: follow these instructions to get images and download our annotations.
After downloading them, your `data` directory should look something like this:

```
data
└── FairFace
    ├── annotations (folder with our annotations)
    ├── fairface-img-margin025-trainval (will not be used)
    │   ├── train
    │   └── val
    ├── fairface-img-margin125-trainval
    │   ├── train
    │   └── val
    ├── fairface_label_train.csv
    └── fairface_label_val.csv
```
We use aligned images from the `fairface-img-margin125-trainval/` dir. Using our detector, we found a face bbox for each image and added a person bbox where possible (see `tools/prepare_fairface.py`).
This dataset has 2 splits: train and val. The performance metric is accuracy on validation.
| images train | images val |
|---|---|
| 86,744 | 10,954 |

Genders for validation:

| female | male |
|---|---|
| 5,162 | 5,792 |

Ages for validation (9 classes):

| 0-2 | 3-9 | 10-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ |
|---|---|---|---|---|---|---|---|---|
| 199 | 1,356 | 1,181 | 3,300 | 2,330 | 1,353 | 796 | 321 | 118 |
Install PyTorch 1.13+ and other requirements:

```
pip install -r requirements.txt
pip install .
```
- Download the body + face detector model to `models/yolov8x_person_face.pt`
- Download the MiVOLO checkpoint to `models/mivolo_imbd.pth.tar`
```
wget https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg -O jennifer_lawrence.jpg

python3 demo.py \
    --input "jennifer_lawrence.jpg" \
    --output "output" \
    --detector-weights "models/yolov8x_person_face.pt" \
    --checkpoint "models/mivolo_imbd.pth.tar" \
    --device "cuda:0" \
    --with-persons \
    --draw
```
To run the demo on a YouTube video:
```
python3 demo.py \
    --input "https://www.youtube.com/shorts/pVh32k0hGEI" \
    --output "output" \
    --detector-weights "models/yolov8x_person_face.pt" \
    --checkpoint "models/mivolo_imbd.pth.tar" \
    --device "cuda:0" \
    --draw \
    --with-persons
```
To reproduce validation metrics:
- Download prepared annotations for imdb-clean / utk / adience / lagenda / fairface.
- Download a checkpoint.
- Run validation:
```
python3 eval_pretrained.py \
    --dataset_images /path/to/dataset/utk/images \
    --dataset_annotations /path/to/dataset/utk/annotation \
    --dataset_name utk \
    --split valid \
    --batch-size 512 \
    --checkpoint models/mivolo_imbd.pth.tar \
    --half \
    --with-persons \
    --device "cuda:0"
```
Supported dataset names: "utk", "imdb", "lagenda", "fairface", "adience".
15.08.2023 - 0.4.1dev
- Support for video streams, including YouTube URLs
- Instructions and explanations for various export types.
- Removed the CutOff operation from inference: it proved ineffective there while being quite costly. It is now used only during training.
As of now (11.08.2023), while ONNX export is technically feasible, it is not advisable because the resulting model performs poorly with batch processing. TensorRT and OpenVINO export is impossible due to their lack of support for the col2im operation.
If you are still committed to ONNX export, you can refer to these instructions.
The most recommended export method at present is TorchScript. You can achieve this with a single line of code:

```
torch.jit.trace(model, example_input)
```

This approach gives you a model that maintains its original speed and requires only a single file for usage, eliminating the need for additional code.
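A slightly fuller sketch of the TorchScript route, using a placeholder model and input shape rather than MiVOLO's real ones (the file name is also ours):

```python
import torch

# Placeholder model; substitute the loaded MiVOLO model here.
model = torch.nn.Sequential(torch.nn.Linear(8, 2)).eval()
example_input = torch.randn(1, 8)

# trace() records the ops executed on the example input and
# returns a self-contained ScriptModule.
traced = torch.jit.trace(model, example_input)
traced.save("model_traced.pt")

# The saved file can be loaded without the original Python class.
loaded = torch.jit.load("model_traced.pt")
with torch.no_grad():
    assert torch.allclose(loaded(example_input), model(example_input))
```

Note that `torch.jit.trace` records only the control-flow path taken for the given example input, so the example should match the shapes you will use at inference.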
Please see here.
If you use our models, code or dataset, we kindly ask you to cite the following paper and give the repository a ⭐
```
@article{mivolo2023,
   Author = {Maksim Kuprashevich and Irina Tolstykh},
   Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
   Year = {2023},
   Eprint = {arXiv:2307.04616},
}
```