Commit
0 parents
commit f86860d
Showing 136 changed files with 10,857 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
__pycache__/ | ||
.idea/ | ||
.ipynb_checkpoints/ | ||
.DS_Store | ||
*.py[cod] | ||
*.so | ||
*.ply | ||
*.orig | ||
*.o | ||
*.json | ||
*.pth | ||
*.npy | ||
*.ipynb | ||
*.png | ||
*.jpg | ||
data |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
//////////////////////////////////////////////////////////////////////////// | ||
// Copyright 2022-2023 the 3D Vision Group at the State Key Lab of CAD&CG, | ||
// Zhejiang University. All Rights Reserved. | ||
// | ||
// For more information see <https://github.com/zju3dv/Im4D> | ||
// If you use this code, please cite the corresponding publications as | ||
// listed on the above website. | ||
// | ||
// Permission to use, copy, modify and distribute this software and its | ||
// documentation for educational, research and non-profit purposes only. | ||
// Any modification based on this work must be open source and prohibited | ||
// for commercial use. | ||
// You must retain, in the source form of any derivative works that you | ||
// distribute, all copyright, patent, trademark, and attribution notices | ||
// from the source form of this work. | ||
// | ||
// | ||
//////////////////////////////////////////////////////////////////////////// |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,162 @@ | ||
# Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes | ||
|
||
### [Project Page (Coming Soon)](https://zju3dv.github.io/im4d) | [Paper](https://drive.google.com/file/d/1MOixYy-TESDvcoL9Qj4V7tDvafqDmibh/view?usp=sharing) | ||
> [High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes](https://drive.google.com/file/d/1MOixYy-TESDvcoL9Qj4V7tDvafqDmibh/view?usp=sharing) \ | ||
> Haotong Lin, Sida Peng, Zhen Xu, Tao Xie, Xingyi He, Hujun Bao and Xiaowei Zhou \ | ||
> SIGGRAPH Asia 2023 conference track | ||
![DNA-Rendering](https://github.com/haotongl/imgbed/raw/master/im4d/renbody.gif) | ||
|
||
<!-- ![ENeRF-Outdoor](https://github.com/haotongl/imgbed/raw/master/im4d/enerf.gif) --> | ||
|
||
## Installation | ||
|
||
### Set up the python environment | ||
<details> <summary>Tested on an Ubuntu workstation (i9-12900K CPU, 3090 GPU)</summary> | ||
|
||
``` | ||
conda create -n im4d python=3.10 | ||
conda activate im4d | ||
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia # pytorch 2.0.1 | ||
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch | ||
pip install -r requirments.txt | ||
``` | ||
</details> | ||
|
||
### Set up datasets | ||
|
||
<details> <summary>0. Set up workspace</summary> | ||
|
||
The workspace is the disk directory that stores datasets, training logs, checkpoints and results. Please ensure it has enough disk space. | ||
|
||
``` | ||
export workspace=$PATH_TO_YOUR_WORKSPACE | ||
``` | ||
</details> | ||
|
||
<details> <summary>1. Prepare ZJU-MoCap and NHR datasets.</summary> | ||
|
||
Please refer to [mlp_maps](https://github.com/zju3dv/mlp_maps/blob/master/INSTALL.md) to download ZJU-MoCap and NHR datasets. | ||
After downloading, place them into `$workspace/zju-mocap` and `$workspace/NHR`, respectively. | ||
</details> | ||
<details> <summary>2. [TODO] Prepare the DNA-Rendering dataset.</summary> | ||
|
||
This dataset was originally released last year under the name RenBody, which is the version we used. It has since been renamed [DNA-Rendering](https://dna-rendering.github.io/index.html) and accepted by ICCV 2023. We are in communication with the dataset authors to check the latest data format and provide the relevant parsers. | ||
</details> | ||
|
||
<!-- <details> <summary>3. [TODO] Prepare the dynerf dataset.</summary> --> | ||
<!-- </details> --> | ||
|
||
<!-- <details> <summary>4. [TODO] Prepare the ENeRF-Outdoor dataset.</summary> --> | ||
<!-- </details> --> | ||
|
||
### Pre-trained models | ||
|
||
Download pre-trained models from [this link](https://drive.google.com/drive/folders/1_huSP1XOG-HttZwu-JxmICrsR9YQOpkm?usp=sharing) for a quick test. Place FILENAME.pth into\ | ||
`$workspace/trained_model/SCENE/im4d/FILENAME/latest.pth`. \ | ||
e.g., my_313.pth -> `$workspace/trained_model/my_313/im4d/my_313/latest.pth` \ | ||
my_313_demo.pth -> `$workspace/trained_model/my_313/im4d/my_313_demo/latest.pth`. | ||
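
The placement rule above can be sketched as a small helper. This is only an illustration and not part of the repository: `checkpoint_destination` and the `/data/workspace` path are hypothetical, and the scene name is passed explicitly because the text does not say how it is derived from the file name.

```python
from pathlib import Path


def checkpoint_destination(workspace: str, scene: str, filename: str) -> Path:
    """Return where FILENAME.pth should be placed, following the layout above."""
    # The experiment directory is the file name without its extension,
    # e.g. "my_313_demo.pth" -> "my_313_demo".
    exp_name = Path(filename).stem
    return Path(workspace) / "trained_model" / scene / "im4d" / exp_name / "latest.pth"


# The two examples from the text, using a placeholder workspace path.
print(checkpoint_destination("/data/workspace", "my_313", "my_313.pth"))
print(checkpoint_destination("/data/workspace", "my_313", "my_313_demo.pth"))
```

Copying the downloaded file to the returned path (creating parent directories first) reproduces the layout described above.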
|
||
## Testing | ||
|
||
<details> <summary>1. Reproduce the quantitative results in the paper.</summary> | ||
|
||
``` | ||
python run.py --type evaluate --cfg_file configs/exps/im4d/xx_dataset/xx_scene.yaml save_result True | ||
``` | ||
|
||
For the NHR dataset, please first download [the preprocessed data](https://drive.google.com/drive/folders/1rA1gzzub6TkGIuu-LaqYwwwiJm4svK2F?usp=sharing) and place them into `$workspace/evaluation`. This evaluation setting is taken from [mlp_maps](https://zju3dv.github.io/mlp_maps/). | ||
Then run one more command to report the PSNR metric: | ||
``` | ||
python scripts/evaluate/im4d/eval_nhr.py --gt_path $workspace/evaluation/sport_1_easymocap --output_path $workspace/result/sport_1_easymocap/im4d/sport1_release/default/step00999999/rgb_0 | ||
``` | ||
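
For reference, PSNR follows the standard definition `10 * log10(MAX^2 / MSE)`. The sketch below is a minimal pure-Python illustration of that formula only. It is not the code in `scripts/evaluate/im4d/eval_nhr.py`, and it assumes pixel values already scaled to `[0, 1]`; the function name `psnr` is chosen here for illustration.

```python
import math


def psnr(pred, gt, max_value=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


print(psnr([0.0, 0.5, 1.0], [0.1, 0.4, 0.9]))
```

The evaluation protocol (masks, crops, which views are compared) may differ in the actual script, so numbers from this sketch are not comparable to the paper's.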
</details> | ||
|
||
<details> <summary>2. Accelerate rendering.</summary> | ||
First, precompute the binary fields. | ||
|
||
``` | ||
python run.py --type cache_grid --cfg_file configs/exps/im4d/renbody/0013_01.yaml --configs configs/components/opts/cache_grid.yaml grid_tag default | ||
``` | ||
You may need to change the frames and grid_resolution to fit your scene. | ||
For example, a ZJU-MoCap scene has 300 frames and its height is along the z-axis: | ||
``` | ||
python run.py --type cache_grid --cfg_file configs/exps/im4d/zju/my_313.yaml --configs configs/components/opts/cache_grid.yaml grid_tag default grid_resolution 128,128,256 test_dataset.frame_sample 0,300,1 | ||
``` | ||
|
||
|
||
Then, render images with the precomputed binary fields. | ||
|
||
``` | ||
python run.py --type evaluate --cfg_file configs/exps/im4d/renbody/0013_01.yaml --configs configs/components/opts/fast_render.yaml grid_tag default save_result True | ||
``` | ||
|
||
</details> | ||
|
||
|
||
<details> <summary>3. Render a video with the selected trajectory.</summary> | ||
|
||
|
||
``` | ||
python run.py --type evaluate --cfg_file configs/exps/im4d/renbody/0013_01.yaml --configs configs/components/opts/render_path/renbody_path.yaml | ||
``` | ||
We can render it with the precomputed binary fields by adding one more argument: | ||
|
||
``` | ||
python run.py --type evaluate --cfg_file configs/exps/im4d/renbody/0013_01.yaml --configs configs/components/opts/render_path/renbody_path.yaml --configs configs/components/opts/fast_render.yaml | ||
``` | ||
|
||
For better performance, you can use our pre-trained demo models, which were trained on all camera views. | ||
|
||
``` | ||
python run.py --type evaluate --cfg_file configs/exps/im4d/zju/my_313.yaml --configs configs/components/opts/fast_render.yaml --configs configs/components/opts/render_path/zju_path.yaml exp_name_tag demo | ||
``` | ||
|
||
|
||
|
||
</details> | ||
|
||
## Training | ||
|
||
``` | ||
python train_net.py --cfg_file configs/exps/im4d/xx_dataset/xx_scene.yaml | ||
``` | ||
|
||
Training with multiple GPUs: | ||
``` | ||
export CUDA_VISIBLE_DEVICES=0,1,2,3 | ||
export NUM_GPUS=4 | ||
export LOG_LEVEL=WARNING # INFO, DEBUG, WARNING | ||
torchrun --nproc_per_node=$NUM_GPUS train_net.py --cfg_file configs/exps/im4d/xx_dataset/xx_scene.yaml --log_level $LOG_LEVEL distributed True | ||
``` | ||
|
||
|
||
<!-- ## Results --> | ||
<!-- We will release --> | ||
## Running on a custom dataset | ||
|
||
<details> <summary>[TODO] 1. Custom mocap datasets.</summary> | ||
</details> | ||
|
||
|
||
## Acknowledgements | ||
We would like to acknowledge the following inspiring prior work: | ||
- [IBRNet: Learning Multi-View Image-Based Rendering](https://ibrnet.github.io/) (Wang et al.) | ||
- [ENeRF: Efficient Neural Radiance Fields for Interactive Free-viewpoint Video](https://zju3dv.github.io/enerf) (Lin et al.) | ||
- [K-Planes: Explicit Radiance Fields in Space, Time, and Appearance](https://sarafridov.github.io/K-Planes/) (Fridovich-Keil et al.) | ||
|
||
Big thanks to [NeRFAcc](https://www.nerfacc.com/) (Li et al.) for their efficient implementation, which has significantly accelerated our rendering. | ||
|
||
Recently, in the course of refining our codebase, we have incorporated basic implementations of ENeRF and K-Planes. These additions, although not yet thoroughly tested or aligned with the official implementations, could serve as useful resources for further exploration and development. | ||
## Citation | ||
|
||
If you find this code useful for your research, please use the following BibTeX entry: | ||
|
||
``` | ||
@inproceedings{lin2023im4d, | ||
title={High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes}, | ||
author={Lin, Haotong and Peng, Sida and Xu, Zhen and Xie, Tao and He, Xingyi and Bao, Hujun and Zhou, Xiaowei}, | ||
booktitle={SIGGRAPH Asia Conference Proceedings}, | ||
year={2023} | ||
} | ||
``` |
54 changes: 54 additions & 0 deletions
configs/3rdparty/deeplabv3_config/_base_/datasets/ade20k.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# dataset settings | ||
dataset_type = 'ADE20KDataset' | ||
data_root = 'data/ade/ADEChallengeData2016' | ||
img_norm_cfg = dict( | ||
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) | ||
crop_size = (512, 512) | ||
train_pipeline = [ | ||
    dict(type='LoadImageFromFile'), | ||
    dict(type='LoadAnnotations', reduce_zero_label=True), | ||
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)), | ||
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), | ||
    dict(type='RandomFlip', prob=0.5), | ||
    dict(type='PhotoMetricDistortion'), | ||
    dict(type='Normalize', **img_norm_cfg), | ||
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), | ||
    dict(type='DefaultFormatBundle'), | ||
    dict(type='Collect', keys=['img', 'gt_semantic_seg']), | ||
] | ||
test_pipeline = [ | ||
    dict(type='LoadImageFromFile'), | ||
    dict( | ||
        type='MultiScaleFlipAug', | ||
        img_scale=(2048, 512), | ||
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75], | ||
        flip=False, | ||
        transforms=[ | ||
            dict(type='Resize', keep_ratio=True), | ||
            dict(type='RandomFlip'), | ||
            dict(type='Normalize', **img_norm_cfg), | ||
            dict(type='ImageToTensor', keys=['img']), | ||
            dict(type='Collect', keys=['img']), | ||
        ]) | ||
] | ||
data = dict( | ||
    samples_per_gpu=4, | ||
    workers_per_gpu=4, | ||
    train=dict( | ||
        type=dataset_type, | ||
        data_root=data_root, | ||
        img_dir='images/training', | ||
        ann_dir='annotations/training', | ||
        pipeline=train_pipeline), | ||
    val=dict( | ||
        type=dataset_type, | ||
        data_root=data_root, | ||
        img_dir='images/validation', | ||
        ann_dir='annotations/validation', | ||
        pipeline=test_pipeline), | ||
    test=dict( | ||
        type=dataset_type, | ||
        data_root=data_root, | ||
        img_dir='images/validation', | ||
        ann_dir='annotations/validation', | ||
        pipeline=test_pipeline)) |
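
As a reading aid for `img_norm_cfg`: the `mean` and `std` values are the widely used ImageNet channel statistics scaled to the 0-255 range, and the `Normalize` step computes `(pixel - mean) / std` per channel. The sketch below illustrates that arithmetic on a single RGB pixel. The helper `normalize_pixel` is hypothetical and is not how the pipeline is implemented, which operates on whole image arrays and, with `to_rgb=True`, is expected to convert BGR input to RGB first.

```python
# Per-channel statistics copied from img_norm_cfg (RGB order, 0-255 scale).
MEAN = [123.675, 116.28, 103.53]
STD = [58.395, 57.12, 57.375]


def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel given as three numbers."""
    return [(c - m) / s for c, m, s in zip(rgb, MEAN, STD)]


# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel(MEAN))
# The same statistics expressed on a 0-1 scale.
print([round(m / 255, 3) for m in MEAN], [round(s / 255, 3) for s in STD])
```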
14 changes: 14 additions & 0 deletions
configs/3rdparty/deeplabv3_config/_base_/default_runtime.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# yapf:disable | ||
log_config = dict( | ||
    interval=50, | ||
    hooks=[ | ||
        dict(type='TextLoggerHook', by_epoch=False), | ||
        # dict(type='TensorboardLoggerHook') | ||
    ]) | ||
# yapf:enable | ||
dist_params = dict(backend='nccl') | ||
log_level = 'INFO' | ||
load_from = None | ||
resume_from = None | ||
workflow = [('train', 1)] | ||
cudnn_benchmark = True |
44 changes: 44 additions & 0 deletions
configs/3rdparty/deeplabv3_config/_base_/models/deeplabv3_r50-d8.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# model settings | ||
norm_cfg = dict(type='SyncBN', requires_grad=True) | ||
model = dict( | ||
    type='EncoderDecoder', | ||
    pretrained='open-mmlab://resnet50_v1c', | ||
    backbone=dict( | ||
        type='ResNetV1c', | ||
        depth=50, | ||
        num_stages=4, | ||
        out_indices=(0, 1, 2, 3), | ||
        dilations=(1, 1, 2, 4), | ||
        strides=(1, 2, 1, 1), | ||
        norm_cfg=norm_cfg, | ||
        norm_eval=False, | ||
        style='pytorch', | ||
        contract_dilation=True), | ||
    decode_head=dict( | ||
        type='ASPPHead', | ||
        in_channels=2048, | ||
        in_index=3, | ||
        channels=512, | ||
        dilations=(1, 12, 24, 36), | ||
        dropout_ratio=0.1, | ||
        num_classes=19, | ||
        norm_cfg=norm_cfg, | ||
        align_corners=False, | ||
        loss_decode=dict( | ||
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), | ||
    auxiliary_head=dict( | ||
        type='FCNHead', | ||
        in_channels=1024, | ||
        in_index=2, | ||
        channels=256, | ||
        num_convs=1, | ||
        concat_input=False, | ||
        dropout_ratio=0.1, | ||
        num_classes=19, | ||
        norm_cfg=norm_cfg, | ||
        align_corners=False, | ||
        loss_decode=dict( | ||
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)), | ||
    # model training and testing settings | ||
    train_cfg=dict(), | ||
    test_cfg=dict(mode='whole')) |
9 changes: 9 additions & 0 deletions
configs/3rdparty/deeplabv3_config/_base_/schedules/schedule_160k.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# optimizer | ||
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) | ||
optimizer_config = dict() | ||
# learning policy | ||
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) | ||
# runtime settings | ||
runner = dict(type='IterBasedRunner', max_iters=160000) | ||
checkpoint_config = dict(by_epoch=False, interval=16000) | ||
evaluation = dict(interval=16000, metric='mIoU', pre_eval=True) |
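
To make the schedule concrete: a polynomial ("poly") decay is commonly computed as `(base_lr - min_lr) * (1 - iter / max_iters) ** power + min_lr`. The sketch below evaluates that expression with the values from this file. It is an illustrative helper, not the runner's actual hook, and it assumes the formula above matches the `policy='poly'` implementation being used.

```python
def poly_lr(iteration, base_lr=0.01, min_lr=1e-4, power=0.9, max_iters=160000):
    """Learning rate at a given iteration under polynomial decay."""
    coeff = (1 - iteration / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr


# Learning rate at the start, at each checkpoint interval, and at the end.
for it in range(0, 160001, 16000):
    print(it, poly_lr(it))
```

Under this formula the rate starts at `lr=0.01` and reaches `min_lr=1e-4` exactly at `max_iters`.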
2 changes: 2 additions & 0 deletions
configs/3rdparty/deeplabv3_config/deeplabv3_r101-d8_512x512_160k_ade20k.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
_base_ = './deeplabv3_r50-d8_512x512_160k_ade20k.py' | ||
model = dict(pretrained='open-mmlab://resnet101_v1c', backbone=dict(depth=101)) |
6 changes: 6 additions & 0 deletions
configs/3rdparty/deeplabv3_config/deeplabv3_r50-d8_512x512_160k_ade20k.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
_base_ = [ | ||
    './_base_/models/deeplabv3_r50-d8.py', './_base_/datasets/ade20k.py', | ||
    './_base_/default_runtime.py', './_base_/schedules/schedule_160k.py' | ||
] | ||
model = dict( | ||
    decode_head=dict(num_classes=150), auxiliary_head=dict(num_classes=150)) |
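
The `_base_` mechanism means the `model` dict here only overrides `num_classes` while every other key comes from the base files. A rough way to picture this is a recursive dictionary merge, sketched below. `merge_config` is a hypothetical helper written for illustration. The real config loader has additional behavior (for example, ways to delete inherited keys) that this sketch does not model, and the base dict is abbreviated to two keys per head.

```python
import copy


def merge_config(base: dict, override: dict) -> dict:
    """Recursively overlay `override` on `base` without mutating either."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Anything else replaces the inherited value.
            merged[key] = copy.deepcopy(value)
    return merged


# Abbreviated stand-in for the base model config (19 classes).
base_model = dict(
    decode_head=dict(type='ASPPHead', num_classes=19),
    auxiliary_head=dict(type='FCNHead', num_classes=19),
)
# The override from this file (150 classes for ADE20K).
child_model = dict(
    decode_head=dict(num_classes=150), auxiliary_head=dict(num_classes=150))

model = merge_config(base_model, child_model)
print(model['decode_head'])
```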
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
train_dataset_module: lib.datasets.volcap.base_dataset | ||
test_dataset_module: lib.datasets.volcap.base_dataset | ||
render_path: False # whether to render path | ||
num_pixels: 1024 # number of pixels to sample for each image during each training iteration | ||
white_bkgd: True | ||
dataset_cfg: &dataset_cfg | ||
    data_root: 'renbody' | ||
    img_dir: 'images' | ||
    img_frame_format: '{:06d}.jpg' | ||
    msk_dir: 'maskes' | ||
    msk_frame_format: '{:06d}.jpg' | ||
    resize_ratio: 0.5 | ||
    special_resize_ratio: 0.375 | ||
    special_views: [48, 60, 1, -1] # if -1 in special_views, then special_views = np.arange(48, 60, 1) else special_views = special_views | ||
    crop_h_w: [900, 600] | ||
    input_view_sample: [0, 60, 1, -1] | ||
    render_view_sample: [0, 60, 1, -1] | ||
    test_views: [11, 25, 37, 57] | ||
    preload_data: True | ||
    imgs_per_batch: 8 | ||
    ignore_dist_k3: False | ||
    bbox_type: 'RENBODY' | ||
    shift_pixel: False | ||
    near_far: [0.1, 100.] | ||
train_dataset: | ||
    <<: *dataset_cfg | ||
    split: 'train' | ||
    frame_sample: [0, 150, 1] | ||
test_dataset: | ||
    <<: *dataset_cfg | ||
    split: 'test' | ||
    frame_sample: [0, 150, 20] |
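
The sampling lists in this config can be read as compact range specifications. The sketch below expands them under two assumptions: the view rule follows the inline comment on `special_views` (a `-1` entry means "treat the first three numbers as `arange(start, end, step)`", otherwise the list is taken literally), and `frame_sample` is read as `[start, end, step]`, which the config does not state explicitly. The helper names are hypothetical and this is not the dataset loader's code.

```python
def expand_views(view_sample):
    """Expand a view spec following the rule in the special_views comment."""
    if -1 in view_sample:
        start, end, step = view_sample[:3]
        return list(range(start, end, step))
    return list(view_sample)


def expand_frames(frame_sample):
    """Expand [start, end, step] into frame indices (assumed meaning)."""
    start, end, step = frame_sample
    return list(range(start, end, step))


print(expand_views([48, 60, 1, -1]))       # special_views
print(expand_views([11, 25, 37, 57]))      # test_views, taken literally
print(len(expand_frames([0, 150, 1])))     # train_dataset.frame_sample
print(expand_frames([0, 150, 20]))         # test_dataset.frame_sample
```

Under this reading, training uses every frame in the range while testing uses every 20th frame.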
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
train_dataset_module: lib.datasets.volcap.ibr_dataset | ||
test_dataset_module: lib.datasets.volcap.ibr_dataset | ||
dataset_cfg: &dataset_cfg | ||
    train_input_views: [2, 3, 4, 5] | ||
    train_input_views_prob: [0.1, 0.35, 0.45, 0.1] | ||
    test_input_views: 4 | ||
    crop_srcinps: True | ||
    crop_padding: 5 | ||
    crop_align: 16 | ||
    imgs_per_batch: 1 | ||
train_dataset: | ||
    <<: *dataset_cfg | ||
test_dataset: | ||
    <<: *dataset_cfg |
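
One plausible reading of `train_input_views` and `train_input_views_prob` is that each training sample draws the number of source views from the first list, weighted by the second. The sketch below illustrates that interpretation with the standard library. It is an assumption about how the two keys are used, `sample_num_input_views` is a hypothetical helper, and the dataset code may sample differently.

```python
import random

# Values copied from the config above.
TRAIN_INPUT_VIEWS = [2, 3, 4, 5]
TRAIN_INPUT_VIEWS_PROB = [0.1, 0.35, 0.45, 0.1]


def sample_num_input_views(rng: random.Random) -> int:
    """Draw how many source views to use for one training sample."""
    return rng.choices(TRAIN_INPUT_VIEWS, weights=TRAIN_INPUT_VIEWS_PROB, k=1)[0]


rng = random.Random(0)  # seeded only to make the demo repeatable
counts = {n: 0 for n in TRAIN_INPUT_VIEWS}
for _ in range(10000):
    counts[sample_num_input_views(rng)] += 1
print(counts)
```

With these weights, 3 and 4 source views dominate during training, which is consistent with `test_input_views: 4` at test time.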