Flow-based-Video-Segmentation-Algorithm

We propose a novel flow-based encoder-decoder network that detects a human head and shoulders in a video and removes the background, producing clean media for videoconferencing and virtual reality applications.

This repository accompanies the paper Flow-based Video Segmentation for Human Head and Shoulders by Zijian Kuang and Xinran Tie.

Getting Started

You will need Python 3.6 and the packages specified in requirements.txt. We recommend setting up a virtual environment with pip and installing the packages there. The correlation layer is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository.

Install packages with:

$ pip install -r requirements.txt

On Windows, install PyTorch first, as described on the official PyTorch site, and then install the remaining packages:

$ pip install torch===1.6.0 torchvision===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
$ pip install -r requirements.txt
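After installation, a quick check that the main dependencies are importable can save a failed training run. The snippet below is a minimal sketch rather than part of the repository: the helper name missing_packages is hypothetical, and the package list is assumed from the text above (requirements.txt may list more).

```python
import importlib.util
import sys

# Package names assumed from the text above; requirements.txt may list more.
REQUIRED = ("torch", "torchvision", "cupy")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only confirms that the packages can be located; it does not verify that CuPy can reach a CUDA device.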

Dataset

We created our own video segmentation dataset from four green-screen videos recorded in an online-conference style. We extracted frames from each video, generated a ground truth mask for each person, and then applied virtual backgrounds to the frames to build our training and testing sets. You can download the dataset from this link. Examples are shown below:

Input image 1:	Input image 2:
Ground truth 1:	Ground truth 2:

To generate more video segmentation data and ground truth masks with our code, use the functions in dataset_generator.py.
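The dataset pipeline described above (green-screen frame, then ground truth mask, then virtual background) can be sketched with NumPy. This is an illustration only, not the code in dataset_generator.py: the function names, the green-dominance rule, and the margin value are all assumptions.

```python
import numpy as np


def green_screen_mask(frame, margin=40):
    """Return a uint8 mask: 255 for foreground, 0 for green-screen pixels.

    frame is an H x W x 3 RGB uint8 array. A pixel counts as background
    when its green channel exceeds both red and blue by more than margin
    (an assumed rule, chosen for simplicity).
    """
    rgb = frame.astype(np.int16)  # avoid uint8 wrap-around when subtracting
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    background = (g - r > margin) & (g - b > margin)
    return np.where(background, 0, 255).astype(np.uint8)


def apply_virtual_background(frame, mask, background):
    """Keep foreground pixels from frame and take the rest from background."""
    keep = (mask == 255)[..., None]  # broadcast over the colour channels
    return np.where(keep, frame, background).astype(np.uint8)


# Tiny 1 x 2 example: one green-screen pixel and one skin-toned pixel.
frame = np.array([[[0, 255, 0], [200, 150, 120]]], dtype=np.uint8)
virtual = np.full_like(frame, 30)
mask = green_screen_mask(frame)
composite = apply_virtual_background(frame, mask, virtual)
```

In this example the green pixel is replaced by the virtual background while the skin-toned pixel is kept, so the mask doubles as the ground truth for the composited frame.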

Configure and Run the Code

To train our model:

Create the folder structure shown in the picture below, then place the training data in the original_training folder and the ground truth data in the ground_truth_training folder:
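The folders can also be created from Python. The sketch below uses the folder names that appear in this README; placing the training folders under a dataset directory is an assumption based on the default paths of the prediction script, and create_dataset_folders is a hypothetical helper.

```python
from pathlib import Path

# Folder names come from this README; the parent "dataset" directory for the
# training folders is an assumption based on the prediction script defaults.
TRAINING_FOLDERS = ("original_training", "ground_truth_training")
TESTING_FOLDERS = ("original_testing", "ground_truth_testing", "mask_output")


def create_dataset_folders(root="dataset"):
    """Create the training and testing folders under root and return their paths."""
    root = Path(root)
    created = []
    for name in TRAINING_FOLDERS + TESTING_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        created.append(folder)
    return created
```

Calling create_dataset_folders() from the repository root would leave existing folders and their contents untouched.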

Run the training code:

python funet_train.py

optional arguments:
  -h, --help            show this help message and exit
  -e E, --epochs E      Number of epochs (default: 10)
  -b [B], --batch-size [B]
                        Batch size (default: 1)
  -l [LR], --learning-rate [LR]
                        Learning rate (default: 0.0001)
  -f LOAD, --load LOAD  Load model from a .pth file (default: False)
  -s SCALE, --scale SCALE
                        Downscaling factor of the dataset (default: 1)
  -v VAL, --validation VAL
                        Percent of the data that is used as validation (0-100)
                        (default: 20.0)
  -g GPU, --gpu GPU     Set the gpu for cuda (default: 0)
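To make the -v option concrete, the sketch below shows how a validation percentage translates into sample counts. The helper split_sizes is hypothetical, and rounding down is an assumption; funet_train.py may round differently.

```python
def split_sizes(n_samples, val_percent=20.0):
    """Return (n_train, n_val) for a dataset of n_samples items.

    val_percent is the share of the data held out for validation (0-100).
    Rounding down is an assumption made for this sketch.
    """
    if not 0 <= val_percent <= 100:
        raise ValueError("val_percent must be between 0 and 100")
    n_val = int(n_samples * val_percent / 100)
    return n_samples - n_val, n_val


# With the default of 20 percent, 100 frames give 80 for training and 20 for validation.
n_train, n_val = split_sizes(100)
```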

To predict using our model:

Place the testing data in the original_testing folder and the ground truth data in the ground_truth_testing folder.
Run the prediction code:

python funet_predict.py

optional arguments:
  -h, --help            show this help message and exit
  --model FILE, -m FILE
                        Specify the file in which the model is stored
                        (default: checkpoints/CP_epoch7.pth)
  --img INPUT [INPUT ...], -img INPUT [INPUT ...]
                        Path of original image dataset (default:
                        dataset/original_testing/)
  --mask INPUT [INPUT ...], -mask INPUT [INPUT ...]
                        Path of ground truth mask dataset (default:
                        dataset/ground_truth_testing/)
  --output INPUT [INPUT ...], -o INPUT [INPUT ...]
                        path of ouput dataset (default: dataset/mask_output/)
  --no-viz, -v          No visualize the dataset as they are processed
                        (default: False)
  --no-save, -n         Do not save the output masks (default: False)
  --no-eval, -e         Do not run evaluation. (default: False)
  --mask-threshold MASK_THRESHOLD, -t MASK_THRESHOLD
                        Minimum probability value to consider a mask pixel
                        white (default: 0.5)
  --scale SCALE, -s SCALE
                        Scale factor for the input dataset (default: 1)
  -g GPU, --gpu GPU     Set the gpu for cuda (default: 0)
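The --mask-threshold option turns per-pixel probabilities into a binary mask, which can then be compared with the ground truth. The sketch below illustrates the idea with NumPy. It is not taken from funet_predict.py or eval.py: the function names, the strict greater-than comparison, and the choice of the Dice coefficient as the score are assumptions.

```python
import numpy as np


def binarize(probabilities, threshold=0.5):
    """Map probabilities above threshold to white (255) and the rest to black (0)."""
    return np.where(probabilities > threshold, 255, 0).astype(np.uint8)


def dice_coefficient(predicted, target, eps=1e-7):
    """Dice overlap between two masks, treating any non-zero pixel as foreground."""
    pred = predicted > 0
    true = target > 0
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)


# 2 x 2 example: two predicted foreground pixels, one of which is correct.
probs = np.array([[0.9, 0.4], [0.6, 0.1]])
truth = np.array([[255, 0], [0, 0]], dtype=np.uint8)
mask = binarize(probs)
score = dice_coefficient(mask, truth)
```

Here the Dice score is about 0.67, since one of the two predicted foreground pixels matches the single ground truth pixel. Raising the threshold above 0.6 would drop the false positive and give a perfect score.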

Credits

We thank sniklaus for pytorch-pwc, which we use to estimate optical flow in our project.

Citation

[1]  @inproceedings{Sun_CVPR_2018,
         author = {Deqing Sun and Xiaodong Yang and Ming-Yu Liu and Jan Kautz},
         title = {{PWC-Net}: {CNNs} for Optical Flow Using Pyramid, Warping, and Cost Volume},
         booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
         year = {2018}
     }

[2]  @misc{pytorch-pwc,
         author = {Simon Niklaus},
         title = {A Reimplementation of {PWC-Net} Using {PyTorch}},
         year = {2018},
         howpublished = {\url{https://github.com/sniklaus/pytorch-pwc}}
    }

[3]  @misc{U-Net,
         author = {Olaf Ronneberger and Philipp Fischer and Thomas Brox},
         title = {{U-Net}: Convolutional Networks for Biomedical Image Segmentation},
         year = {2015},
         howpublished = {\url{https://arxiv.org/abs/1505.04597}}
     }

License

This project is licensed under the MIT License.

