GitHub - szhernovoy/MiVOLO: MiVOLO age & gender transformer neural network

MiVOLO: Multi-input Transformer for Age and Gender Estimation

MiVOLO: Multi-input Transformer for Age and Gender Estimation, Maksim Kuprashevich, Irina Tolstykh, 2023 arXiv 2307.04616

[Paper] [Demo] [BibTex] [Data]

MiVOLO pretrained models

Gender & Age recognition performance.

| Model | Type | Dataset | Age MAE | Age CS@5 | Gender Accuracy | Download |
| --- | --- | --- | --- | --- | --- | --- |
| volo_d1 | face_only, age | IMDB-cleaned | 4.29 | 67.71 | - | checkpoint |
| volo_d1 | face_only, age, gender | IMDB-cleaned | 4.22 | 68.68 | 99.38 | checkpoint |
| mivolo_d1 | face_body, age, gender | IMDB-cleaned | 4.24 [face+body], 6.87 [body] | 68.32 [face+body], 46.32 [body] | 99.46 [face+body], 96.48 [body] | checkpoint |
| volo_d1 | face_only, age | UTKFace | 4.23 | 69.72 | - | checkpoint |
| volo_d1 | face_only, age, gender | UTKFace | 4.23 | 69.78 | 97.69 | checkpoint |
| mivolo_d1 | face_body, age, gender | Lagenda | 3.99 [face+body] | 71.27 [face+body] | 97.36 [face+body] | demo |
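
For readers unfamiliar with the columns above, here is a minimal sketch of how the three metrics could be computed. It assumes that CS@5 means the percentage of samples whose absolute age error is at most 5 years (the usual "cumulative score" definition). The function names are hypothetical and this is not the repository's evaluation code.

```python
from typing import Sequence


def _check(predicted: Sequence, actual: Sequence) -> None:
    # Both sequences must describe the same, non-empty set of samples.
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def age_mae(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error between predicted and true ages, in years."""
    _check(predicted, actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def cumulative_score(predicted: Sequence[float], actual: Sequence[float],
                     threshold: float = 5.0) -> float:
    """Percentage of samples with absolute error <= threshold (assumed CS@5 definition)."""
    _check(predicted, actual)
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= threshold)
    return 100.0 * hits / len(actual)


def gender_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of samples whose predicted gender label matches the annotation."""
    _check(predicted, actual)
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * hits / len(actual)


if __name__ == "__main__":
    pred_ages, true_ages = [20, 30, 41, 58], [22, 30, 35, 50]
    print(age_mae(pred_ages, true_ages))           # absolute errors are 2, 0, 6, 8
    print(cumulative_score(pred_ages, true_ages))  # two of four are within 5 years
```

Lower is better for Age MAE; higher is better for CS@5 and gender accuracy.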

Dataset

Please cite our paper if you use any of this data!

Lagenda dataset: images and annotation.
IMDB-clean: follow these instructions to get images and download our annotations.
UTK dataset: original full images and our annotations: split from the article, random full split.
Adience dataset: follow these instructions to get images and download our annotations.

After downloading them, your data directory should look something like this:
```
data
└── Adience
    ├── annotations  (folder with our annotations)
    ├── aligned      (will not be used)
    ├── faces
    ├── fold_0_data.txt
    ├── fold_1_data.txt
    ├── fold_2_data.txt
    ├── fold_3_data.txt
    └── fold_4_data.txt
```
We use the coarsely aligned images from the faces/ directory.

Using our detector we found a face bbox for each image (see tools/prepare_adience.py).

This dataset has five folds. The performance metric is accuracy on five-fold cross validation.
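
A minimal sketch of the five-fold metric follows. It assumes the reported number is the unweighted mean of the per-fold accuracies; the source does not say whether folds are weighted by size, so treat that as an assumption. The function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Fold = Tuple[Sequence[str], Sequence[str]]  # (predicted labels, true labels)


def fold_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of samples in one fold whose predicted label equals the true label."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("a fold needs equally long, non-empty label lists")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def cross_validation_accuracy(folds: Sequence[Fold]) -> Tuple[float, List[float]]:
    """Return (mean accuracy over folds, list of per-fold accuracies)."""
    per_fold = [fold_accuracy(pred, true) for pred, true in folds]
    # Assumption: every fold contributes equally, regardless of its size.
    return mean(per_fold), per_fold


if __name__ == "__main__":
    toy_folds = [
        (["25-32", "0-2"], ["25-32", "0-2"]),    # both correct
        (["25-32", "4-6"], ["38-43", "4-6"]),    # one of two correct
    ]
    print(cross_validation_accuracy(toy_folds))
```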

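| images before removal | fold 0 | fold 1 | fold 2 | fold 3 | fold 4 |
| --- | --- | --- | --- | --- | --- |
| 19,370 | 4,484 | 3,730 | 3,894 | 3,446 | 3,816 |

Not complete data

| only age not found | only gender not found | SUM |
| --- | --- | --- |
| 40 | 1,170 | 1,210 (6.2 %) |

Removed data

| failed to process image | age and gender not found | SUM |
| --- | --- | --- |
| 0 | 708 | 708 (3.6 %) |

Genders

| female | male |
| --- | --- |
| 9,372 | 8,120 |

Ages (8 classes) after mapping to non-overlapping age intervals

| 0-2 | 4-6 | 8-12 | 15-20 | 25-32 | 38-43 | 48-53 | 60-100 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2,509 | 2,140 | 2,293 | 1,791 | 5,589 | 2,490 | 909 | 901 |
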
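
Because the eight Adience intervals leave gaps (for example, ages 3 and 7 belong to no class), a predicted age has to be assigned to a class by some rule. The sketch below assumes a nearest-boundary rule with ties going to the younger class. That rule is an illustration only and may differ from what the repository does; the names are hypothetical.

```python
from typing import List, Tuple

# The eight non-overlapping Adience age intervals listed above (inclusive bounds).
ADIENCE_INTERVALS: List[Tuple[int, int]] = [
    (0, 2), (4, 6), (8, 12), (15, 20), (25, 32), (38, 43), (48, 53), (60, 100),
]


def interval_distance(age: float, low: int, high: int) -> float:
    """Distance from an age to an interval; zero when the age lies inside it."""
    if age < low:
        return low - age
    if age > high:
        return age - high
    return 0.0


def nearest_adience_class(age: float) -> int:
    """Index of the closest interval (assumed rule; ties go to the younger class)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    distances = [interval_distance(age, lo, hi) for lo, hi in ADIENCE_INTERVALS]
    # list.index returns the first minimum, which is the younger class on a tie.
    return distances.index(min(distances))


if __name__ == "__main__":
    for age in (1, 3, 22, 57):
        low, high = ADIENCE_INTERVALS[nearest_adience_class(age)]
        print(f"age {age} -> class {low}-{high}")
```
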
FairFace dataset: follow these instructions to get images and download our annotations.

After downloading them, your data directory should look something like this:
```
data
└── FairFace
    ├── annotations  (folder with our annotations)
    ├── fairface-img-margin025-trainval   (will not be used)
    │   ├── train
    │   └── val
    ├── fairface-img-margin125-trainval
    │   ├── train
    │   └── val
    ├── fairface_label_train.csv
    └── fairface_label_val.csv
```
We use the aligned images from the fairface-img-margin125-trainval/ directory.

Using our detector, we found a face bbox for each image and added a person bbox where possible (see tools/prepare_fairface.py).

This dataset has 2 splits: train and val. The performance metric is accuracy on validation.

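| images train | images val |
| --- | --- |
| 86,744 | 10,954 |

Genders for validation

| female | male |
| --- | --- |
| 5,162 | 5,792 |

Ages for validation (9 classes):

| 0-2 | 3-9 | 10-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70+ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 199 | 1,356 | 1,181 | 3,300 | 2,330 | 1,353 | 796 | 321 | 118 |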
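
Unlike Adience, the nine FairFace classes are contiguous, so a predicted age maps to exactly one class. A minimal sketch of that mapping is shown below; the function name is hypothetical, and treating fractional ages as belonging to the bin of their integer part is an assumption.

```python
from bisect import bisect_right

# Lower bound of each FairFace age class, with the class labels from the table above.
FAIRFACE_STARTS = [0, 3, 10, 20, 30, 40, 50, 60, 70]
FAIRFACE_LABELS = ["0-2", "3-9", "10-19", "20-29", "30-39",
                   "40-49", "50-59", "60-69", "70+"]


def fairface_age_class(age: float) -> str:
    """Return the FairFace class label for an age in years."""
    if age < 0:
        raise ValueError("age must be non-negative")
    # bisect_right counts how many lower bounds are <= age;
    # subtracting one gives the index of the bin that contains the age.
    return FAIRFACE_LABELS[bisect_right(FAIRFACE_STARTS, age) - 1]


if __name__ == "__main__":
    for age in (2, 3, 19, 69, 70, 95):
        print(age, "->", fairface_age_class(age))
```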

Install

Install PyTorch 1.13+ and the other requirements:

pip install -r requirements.txt
pip install .
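
To confirm that the installed PyTorch meets the 1.13+ requirement, a small check like the following may help. This is a sketch, not part of the repository: the helper names are hypothetical, and it compares only the major and minor version numbers.

```python
import re
from typing import Tuple

MINIMUM_TORCH = (1, 13)  # from the requirement stated above


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.13.0' or '2.0.1+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int] = MINIMUM_TORCH) -> bool:
    """Compare numerically, so that '1.9' is correctly treated as older than '1.13'."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("torch")
    except PackageNotFoundError:
        print("torch is not installed")
    else:
        status = "OK" if meets_minimum(installed) else "older than 1.13"
        print(f"torch {installed}: {status}")
```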

Demo

Download body + face detector model to models/yolov8x_person_face.pt
Download mivolo checkpoint to models/mivolo_imbd.pth.tar

wget https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg -O jennifer_lawrence.jpg

python3 demo.py \
--input "jennifer_lawrence.jpg" \
--output "output" \
--detector-weights "models/yolov8x_person_face.pt" \
--checkpoint "models/mivolo_imbd.pth.tar" \
--device "cuda:0" \
--with-persons \
--draw

Validation

To reproduce validation metrics:

Download prepared annotations for IMDB-clean / utk / adience / lagenda / fairface.
Download checkpoint
Run validation:

python3 eval_pretrained.py \
  --dataset_images /path/to/dataset/utk/images \
  --dataset_annotations /path/to/dataset/utk/annotation \
  --dataset_name utk \
  --split valid \
  --batch-size 512 \
  --checkpoint models/mivolo_imbd.pth.tar \
  --half \
  --with-persons \
  --device "cuda:0"

Supported dataset names: "utk", "imdb", "lagenda", "fairface", "adience".
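
When validating several datasets, it can be convenient to assemble the command programmatically and reject unsupported names early. The sketch below uses only the flags shown in the example above. The helper name is hypothetical, and whether `--split valid` and `--with-persons` suit every dataset and checkpoint is an assumption to verify against the repository.

```python
import shlex
from typing import List

# Dataset names listed as supported in this README.
SUPPORTED_DATASETS = ("utk", "imdb", "lagenda", "fairface", "adience")


def build_eval_command(dataset_name: str, images_dir: str, annotations_dir: str,
                       checkpoint: str = "models/mivolo_imbd.pth.tar",
                       split: str = "valid", batch_size: int = 512,
                       device: str = "cuda:0") -> List[str]:
    """Build the eval_pretrained.py argument list for one dataset."""
    if dataset_name not in SUPPORTED_DATASETS:
        raise ValueError(
            f"unsupported dataset {dataset_name!r}; expected one of {SUPPORTED_DATASETS}"
        )
    return [
        "python3", "eval_pretrained.py",
        "--dataset_images", images_dir,
        "--dataset_annotations", annotations_dir,
        "--dataset_name", dataset_name,
        "--split", split,
        "--batch-size", str(batch_size),
        "--checkpoint", checkpoint.strip(),  # guard against stray whitespace in paths
        "--half",
        "--with-persons",
        "--device", device,
    ]


if __name__ == "__main__":
    command = build_eval_command(
        "utk", "/path/to/dataset/utk/images", "/path/to/dataset/utk/annotation"
    )
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(shlex.join(command))
```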

License

Please, see here

Citing

If you use our models, code, or dataset, we kindly request that you cite the following paper and give the repository a ⭐

@article{mivolo2023,
   Author = {Maksim Kuprashevich and Irina Tolstykh},
   Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
   Year = {2023},
   Eprint = {arXiv:2307.04616},
}
