3DV 2021: Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry
Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann, CGIT Lab at University of Southern California
[paper] [video] [project page]
This paper supersedes the previous version of M3-LRN.
👍 State of the art on 3D facial alignment, face orientation estimation, and 3D face modeling.
👍 Fast inference with 3000fps on a laptop RTX 2080.
👍 Simple implementation with only widely used operations.
(This project is built/tested on Python 3.8 and PyTorch 1.9 on a compatible GPU)
-
Clone
git clone https://github.com/choyingw/SynergyNet
cd SynergyNet
-
Use conda
conda create --name SynergyNet
conda activate SynergyNet
-
Install pre-requisite common packages
PyTorch 1.9 (should also be compatible with 1.0+ versions), Torchvision, OpenCV, SciPy, Matplotlib, Cython
-
Download data [here] and [here]. Extract these data under the repo root.
These data are processed from [3DDFA] and [FSA-Net].
Download pretrained weights [here]. Put the model under 'pretrained/'
-
Compile Sim3DR and FaceBoxes:
cd Sim3DR
./build_sim3dr.sh
cd ../FaceBoxes
./build_cpu_nms.sh
cd ..
-
Inference
python singleImage.py -f img
The default inference requires a compatible GPU. To run on a CPU instead, comment out the .cuda() calls and load the pretrained weights onto the CPU.
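The CPU path described above can be sketched as follows. This is a minimal illustration, not the repository's own loading code: `torch.load` with `map_location="cpu"` is the standard PyTorch way to remap GPU-saved tensors, while the helper names, the `'state_dict'` key, and the `module.` prefix handling (typical of checkpoints saved from `nn.DataParallel`) are assumptions about how the checkpoint is laid out.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict):
    """Drop the 'module.' prefix that nn.DataParallel adds to parameter names."""
    prefix = "module."
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


def load_weights_on_cpu(path):
    """Load a checkpoint onto the CPU regardless of the device it was saved from."""
    import torch  # imported lazily so strip_module_prefix works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    # Assumption: the weights may be nested under a 'state_dict' key.
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        checkpoint = checkpoint["state_dict"]
    return strip_module_prefix(checkpoint)
```

With a model instance already built (here a hypothetical `model`), the weights would then be applied with `model.load_state_dict(load_weights_on_cpu("pretrained/best.pth.tar"))`, leaving out any `.cuda()` call on the model and its inputs.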
-
Follow Single Image Inference Demo: Step 1-4
-
Benchmarking
python benchmark.py -w pretrained/best.pth.tar
Printed results and visualizations of the first 50 examples are stored under 'results/' (see 'demo/' for some pre-generated samples as references).
Update 12/14/2021: Best head pose estimation [pretrained model] that conforms to the one reported in the paper. Use -w to load different pretrained models.
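The results reported for AFLW2000-3D are NME of facial landmarks and MAE of Euler angles. The sketch below shows how such metrics are commonly computed; it is not the code in benchmark.py. Normalizing by the square root of the ground-truth bounding box area is a common convention for this dataset and is an assumption here, as are all function names. Angle wrap-around is not handled.

```python
import math


def bbox_norm(gt):
    """sqrt(width * height) of the 2D bounding box of the ground-truth landmarks."""
    xs = [point[0] for point in gt]
    ys = [point[1] for point in gt]
    return math.sqrt((max(xs) - min(xs)) * (max(ys) - min(ys)))


def nme(pred, gt, norm):
    """Mean per-landmark Euclidean error divided by a normalization length."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists) / norm


def euler_mae(pred_angles, gt_angles):
    """Mean absolute error over (yaw, pitch, roll), in the units given."""
    errors = [abs(p - g) for p, g in zip(pred_angles, gt_angles)]
    return sum(errors) / len(errors)


# Toy data: four landmarks on a 4x4 square, each prediction off by 1 pixel in x.
gt_landmarks = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
pred_landmarks = [(x + 1.0, y) for x, y in gt_landmarks]
landmark_nme = nme(pred_landmarks, gt_landmarks, bbox_norm(gt_landmarks))
angle_mae = euler_mae((10.0, 20.0, 30.0), (12.0, 17.0, 30.0))
```

In the toy data every landmark is off by one pixel and the bounding box is 4x4, so the normalized error is one quarter of the box side length.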
-
Follow Single Image Inference Demo: Step 1-4.
-
Download the training data from [3DDFA]: train_aug_120x120.zip (about 680K images). Extract the zip file under the root folder.
-
bash train_script.sh
-
Please refer to train_script.sh for hyperparameters such as learning rate, epochs, or GPU device. The default settings take ~19 GB of memory on a 3090 GPU and about 6 hours of training. If your GPU has less memory than this, please decrease the batch size and learning rate proportionally.
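The proportional adjustment mentioned above can be sketched as a small helper. It assumes that memory use grows roughly linearly with batch size, which is only an approximation. The function name and the numbers in the example are hypothetical placeholders; take the real defaults from train_script.sh.

```python
def scale_hyperparameters(base_batch_size, base_lr, base_memory_gb, available_memory_gb):
    """Scale batch size and learning rate by the ratio of available to reference GPU memory."""
    # Never scale up past the reference settings.
    ratio = min(1.0, available_memory_gb / base_memory_gb)
    batch_size = max(1, int(base_batch_size * ratio))
    # Keep the learning rate proportional to the batch size actually used.
    lr = base_lr * batch_size / base_batch_size
    return batch_size, lr


# Hypothetical defaults (batch size 128, learning rate 0.1) tuned for ~19 GB,
# adapted to a GPU with half that memory.
batch_size, lr = scale_hyperparameters(128, 0.1, 19.0, 9.5)
```

The learning rate is derived from the rounded batch size rather than from the raw memory ratio, so the two values stay exactly proportional.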
-
Follow Single Image Inference Demo: Step 1-5.
-
Download artistic faces data [here], which are from [AF-Dataset]. Download our predicted UV maps [here] by UV-texture GAN. Extract them under the root folder.
-
Whole folder:
python artistic.py -f art-all --png
Single image:
python artistic.py -f art-all/122.png
Note that this artistic face dataset contains faces at many different levels and styles of abstraction. If a test image is close to a real face, the result is much better than for highly abstract samples.
-
Follow Single Image Inference Demo: Step 1-5.
-
Download our predicted UV maps and real face images for AFLW2000-3D [here] by UV-texture GAN. Extract them under the root folder.
-
Whole folder:
python uv_texture_realFaces.py -f texture_data/real --png
Single image:
python uv_texture_realFaces.py -f texture_data/real/image00002_real_A.png
The results (3D meshes and renderings) are stored under 'inference_output'
We show a comparison with [DECA] using the top-3 largest roll angle samples in AFLW2000-3D.
Facial alignment on AFLW2000-3D (NME of facial landmarks):
Face orientation estimation on AFLW2000-3D (MAE of Euler angles):
Results on artistic faces:
Related Project
[Voice2Mesh] (analysis on relation for voice and 3D face)
Bibtex
If you find our work useful, please consider citing it.
@INPROCEEDINGS{wu2021synergy,
author={Wu, Cho-Ying and Xu, Qiangeng and Neumann, Ulrich},
booktitle={2021 International Conference on 3D Vision (3DV)},
title={Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry},
year={2021}
}
Acknowledgement
The project is developed on [3DDFA] and [FSA-Net]. We thank the authors for their wonderful work, and [3DDFA-V2] for the face detector and rendering code.