Haiyi Mei1, Chi Sing Leung2, Ziwei Liu4, Lei Yang1, 5, Zhongang Cai✉, 1, 4, 5,
3International Digital Economy Academy (IDEA),
4S-Lab, Nanyang Technological University, 5Shanghai AI Laboratory
AiOS performs human localization and SMPL-X estimation in a progressive manner. It is composed of (1) the Body Localization stage, which predicts coarse human locations; (2) the Body Refinement stage, which refines body features and produces face and hand locations; and (3) the Whole-body Refinement stage, which refines whole-body features and regresses SMPL-X parameters.
- download all datasets
- process all datasets into HumanData format. We provide the processed npz files, which can be downloaded from here.
- download SMPL-X
- download AiOS checkpoint
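After downloading the processed npz files, it can help to check what a cache file contains before using it. The sketch below relies only on the fact that an `.npz` file is a zip archive with one `.npy` member per array; `list_npz_keys` is an illustrative helper (not part of AiOS), and the keys you see depend on the file you downloaded.

```python
import zipfile
from pathlib import Path


def list_npz_keys(npz_path):
    """Return the array names stored in an .npz archive without loading the arrays."""
    with zipfile.ZipFile(npz_path) as archive:
        # Each array is stored as a zip member named "<key>.npy".
        return sorted(
            name[: -len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        )


if __name__ == "__main__":
    # Example path; substitute any cache file you downloaded.
    example = Path("data/preprocessed_npz/cache/bedlam_train_cache_080824.npz")
    if example.exists():
        print(list_npz_keys(example))
    else:
        print(f"{example} not found")
```

Listing the member names avoids loading large arrays into memory just to see what is inside.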
The file structure should be like:
AiOS/
├── config/
└── data
    ├── body_models
    │   ├── smplx
    │   │   ├── MANO_SMPLX_vertex_ids.pkl
    │   │   ├── SMPL-X__FLAME_vertex_ids.npy
    │   │   ├── SMPLX_NEUTRAL.pkl
    │   │   ├── SMPLX_to_J14.pkl
    │   │   ├── SMPLX_NEUTRAL.npz
    │   │   ├── SMPLX_MALE.npz
    │   │   └── SMPLX_FEMALE.npz
    │   └── smpl
    │       ├── SMPL_FEMALE.pkl
    │       ├── SMPL_MALE.pkl
    │       └── SMPL_NEUTRAL.pkl
    ├── preprocessed_npz
    │   └── cache
    │       ├── agora_train_3840_w_occ_cache_2010.npz
    │       ├── bedlam_train_cache_080824.npz
    │       ├── ...
    │       └── coco_train_cache_080824.npz
    ├── datasets
    │   ├── agora
    │   │   └── 3840x2160
    │   │       ├── train
    │   │       └── test
    │   ├── bedlam
    │   │   ├── train_images
    │   │   └── test_images
    │   ├── ARCTIC
    │   │   ├── s01
    │   │   ├── s02
    │   │   ├── ...
    │   │   └── s10
    │   ├── EgoBody
    │   │   ├── egocentric_color
    │   │   └── kinect_color
    │   └── UBody
    │       └── images
    └── checkpoint
        ├── edpose_r50_coco.pth
        └── aios_checkpoint.pth
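A quick way to confirm the layout is to check a few of the paths listed above. This is a minimal sketch: `EXPECTED_PATHS` is a hand-picked subset of the tree and `find_missing` is a hypothetical helper, so adjust the list to the datasets you actually use.

```python
from pathlib import Path

# Relative paths taken from the layout above; trim or extend as needed.
EXPECTED_PATHS = [
    "config",
    "data/body_models/smplx/SMPLX_NEUTRAL.npz",
    "data/body_models/smpl/SMPL_NEUTRAL.pkl",
    "data/preprocessed_npz/cache",
    "data/datasets",
    "data/checkpoint/aios_checkpoint.pth",
]


def find_missing(root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(".")  # run from the AiOS/ repository root
    if missing:
        print("Missing:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected paths are present.")
```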
# Create a conda virtual environment and activate it.
conda create -n aios python=3.8 -y
conda activate aios
# Install PyTorch and torchvision.
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
# Install PyTorch3D, build from source
git clone -b v0.6.1 https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d
pip install -v -e .
cd ..
# Install MMCV, build from source
git clone -b v1.6.1 https://github.com/open-mmlab/mmcv.git
cd mmcv
export MMCV_WITH_OPS=1
export FORCE_MLU=1
pip install -v -e .
cd ..
# Install other dependencies
conda install -c conda-forge ffmpeg
pip install -r requirements.txt
# Build the Deformable DETR ops
cd models/aios/ops
python setup.py build install
cd ../../..
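Once the steps above finish, a short check can confirm that the main packages are visible to the active environment. The sketch below only tests whether Python can locate each top-level module; it does not verify versions or that the compiled ops load correctly. `check_modules` is an illustrative helper.

```python
import importlib.util

# Import names of the packages installed by the steps above.
REQUIRED_MODULES = ["torch", "torchvision", "pytorch3d", "mmcv"]


def check_modules(names=REQUIRED_MODULES):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it inside the `aios` conda environment; a `MISSING` entry points to the installation step that needs to be repeated.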
- Place the mp4 video for inference under
AiOS/demo/
- Prepare the pretrained models to be used for inference under
AiOS/data/checkpoint
- Inference output will be saved in
AiOS/demo/{INPUT_VIDEO}_out
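As a small illustration of the naming rule, the expected output folder can be derived from the input file name. This assumes the `_out` suffix is appended to the video name without its extension, as in the `short_video.mp4` example in this README; `expected_output_dir` is a hypothetical helper, not part of AiOS.

```python
from pathlib import Path


def expected_output_dir(input_video, output_dir="demo"):
    """Return <output_dir>/<video name without extension>_out."""
    return Path(output_dir) / f"{Path(input_video).stem}_out"


print(expected_output_dir("short_video.mp4").as_posix())  # demo/short_video_out
```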
# CHECKPOINT: path to the checkpoint.
# INPUT_VIDEO: path to the input video.
# OUTPUT_DIR: path to the output directory.
# NUM_PERSON: expected number of persons to detect in the input (image or video).
# The default value is 1, meaning the algorithm will try to detect at least one person. If you know the maximum number of persons
# that can appear simultaneously, set this variable to that number to optimize the detection process (a lower threshold is recommended as well).
# THRESHOLD: score threshold for person detection. The default value is 0.5.
# A detection whose confidence score is lower than this threshold is discarded.
# Adjusting this threshold helps filter out false positives or keep only high-confidence detections.
# GPU_NUM: number of GPUs.
sh scripts/inference.sh {CHECKPOINT} {INPUT_VIDEO} {OUTPUT_DIR} {NUM_PERSON} {THRESHOLD} {GPU_NUM}
# Example: run inference on short_video.mp4 with NUM_PERSON=2, THRESHOLD=0.1, and GPU_NUM=8; the output is saved to demo/short_video_out
sh scripts/inference.sh data/checkpoint/aios_checkpoint.pth short_video.mp4 demo 2 0.1 8
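To make the interaction between NUM_PERSON and THRESHOLD concrete, here is a toy sketch of score-based filtering. It assumes that candidates are ranked by confidence, that at most NUM_PERSON of them are kept, and that those scoring below THRESHOLD are dropped. This illustrates the behaviour described above and is not the AiOS implementation; `select_detections` and the score values are made up.

```python
def select_detections(scores, num_person=1, threshold=0.5):
    """Return indices of the kept detections, highest score first."""
    # Rank candidate indices by descending confidence.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Keep at most num_person candidates, then drop low-confidence ones.
    return [i for i in ranked[:num_person] if scores[i] >= threshold]


scores = [0.92, 0.15, 0.48, 0.07]  # made-up confidence scores
print(select_detections(scores))                               # [0]
print(select_detections(scores, num_person=2, threshold=0.1))  # [0, 2]
print(select_detections(scores, num_person=4, threshold=0.1))  # [0, 2, 1]
```

Under these assumptions, raising NUM_PERSON without lowering THRESHOLD changes little when the extra candidates score below 0.5, which is one way to read the recommendation to lower the threshold as well.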
DATASETS | NMVE (FB) | NMVE (B) | NMJE (FB) | NMJE (B) | MVE (FB) | MVE (B) | MVE (F) | MVE (LH/RH) | MPJPE (FB) | MPJPE (B) | MPJPE (F) | MPJPE (LH/RH) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
BEDLAM | 87.6 | 57.7 | 85.8 | 57.7 | 83.2 | 54.8 | 26.2 | 28.1/30.8 | 81.5 | 54.8 | 26.2 | 25.9/28.0 |
AGORA-Test | 102.9 | 63.4 | 100.7 | 62.5 | 98.8 | 60.9 | 27.7 | 42.5/43.4 | 96.7 | 60.0 | 29.2 | 40.1/41.0 |
AGORA-Val | 105.1 | 60.9 | 102.2 | 61.4 | 100.9 | 60.9 | 30.6 | 43.9/45.6 | 98.1 | 58.9 | 32.7 | 41.5/43.4 |
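On the AGORA benchmark, the normalized errors are the raw errors divided by the detection F1 score (NMVE = MVE / F1, NMJE = MPJPE / F1). Assuming the table follows that convention, the implied F1 score can be recovered from the full-body columns; the helper name below is illustrative.

```python
def implied_f1(raw_error, normalized_error):
    """Recover F1 from a raw error and its F1-normalized counterpart."""
    return raw_error / normalized_error


# Full-body (FB) values copied from the table above: (MVE, NMVE, MPJPE, NMJE).
rows = {
    "BEDLAM": (83.2, 87.6, 81.5, 85.8),
    "AGORA-Test": (98.8, 102.9, 96.7, 100.7),
    "AGORA-Val": (100.9, 105.1, 98.1, 102.2),
}

for name, (mve, nmve, mpjpe, nmje) in rows.items():
    print(f"{name}: F1 from MVE = {implied_f1(mve, nmve):.2f}, "
          f"F1 from MPJPE = {implied_f1(mpjpe, nmje):.2f}")
```

Both estimates come out at about 0.95 for BEDLAM and 0.96 for the two AGORA splits, which is consistent with the normalization assumption.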
a. Make the test_result directory
mkdir test_result
b. AGORA Validation
Run the following command to generate a 'predictions/' result folder, which can be evaluated with the AGORA evaluation tool
sh scripts/test_agora_val.sh data/checkpoint/aios_checkpoint.pth agora_val
c. AGORA Test Leaderboard
Run the following command to generate a 'predictions.zip', which can be submitted to the AGORA Leaderboard
sh scripts/test_agora.sh data/checkpoint/aios_checkpoint.pth agora_test
d. BEDLAM
Run the following command to generate a 'predictions.zip', which can be submitted to the BEDLAM Leaderboard
sh scripts/test_bedlam.sh data/checkpoint/aios_checkpoint.pth bedlam_test
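If you run all three evaluations regularly, the commands above can be driven from one small script. This is a convenience sketch rather than part of AiOS: `EVAL_SCRIPTS`, `eval_command`, and `run_all` are hypothetical names, and the script paths and arguments are copied from the steps above.

```python
import subprocess

# Second script argument -> evaluation script, as listed in the steps above.
EVAL_SCRIPTS = {
    "agora_val": "scripts/test_agora_val.sh",
    "agora_test": "scripts/test_agora.sh",
    "bedlam_test": "scripts/test_bedlam.sh",
}


def eval_command(name, checkpoint="data/checkpoint/aios_checkpoint.pth"):
    """Build the shell command for one evaluation run."""
    return ["sh", EVAL_SCRIPTS[name], checkpoint, name]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for name in EVAL_SCRIPTS:
        cmd = eval_command(name)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

The default dry run only prints the commands, so nothing is executed until you pass `dry_run=False` from the repository root.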
Some of the code is based on MMHuman3D, ED-Pose, and SMPLer-X.