This repository contains the source code of Predicting Next Local Appearance for Video Anomaly Detection, accepted for MVA 2021 by Pankaj Raj Roy, Guillaume-Alexandre Bilodeau and Lama Seoud. The corresponding slides, poster and video will be available soon.
We used TensorFlow/Keras to implement our proposed framework. Please follow the given instructions to run the code.
- Install Anaconda with Python 3.6 (our code uses f-strings)
- Install TensorFlow v1.12.0 and Keras v2.2.4
- Install keras-contrib (required for the InstanceNormalization layer and the SSIM loss):
pip install git+https://www.github.com/keras-team/keras-contrib.git
- Install any other necessary libraries: scikit-learn, pandas, matplotlib, opencv-python, etc.
- Follow the instructions for installing CenterNet. Note that we used PyTorch v1.0.1, CUDA v10 and GCC v4.9.1 to install CenterNet successfully.
- Define `CenterNet_ROOT` and `Pretrained_ROOT` in your `.bashrc`, containing the paths of the locally installed CenterNet repository and the pretrained weights respectively.
- Download the MS COCO pretrained weights corresponding to the DLA and HG backbones, which can be found in the Model zoo: ctdet_coco_dla_2x.pth and ctdet_coco_hg.pth. Store them in `$Pretrained_ROOT/CenterNet/`.
- Define `Datasets_ROOT` in your `.bashrc`, containing the path to the datasets.
- Use the link provided by StevenLiuWen or by BaiduYun (pass: i9b3) to manually download the four datasets: ped1.tar.gz, ped2.tar.gz, avenue.tar.gz and shanghaitech.tar.gz. Extract each archive and store the contents in `$Datasets_ROOT/`.
- Define `Estimations_ROOT` and `Temp_ROOT` in your `.bashrc`, containing the paths of the estimated bounding boxes and the temporary folder respectively.
- Run the script `extract_estimations.py` for each dataset (ped1, ped2, avenue and shanghaitech (ST)) and for each MOD backbone (DLA and HG). For example, use the following arguments to extract bounding boxes for ST using the DLA MOD backbone:
python extract_estimations.py -d shanghaitech --det_arch dla_34
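Before running the extraction for every dataset, it can help to confirm that the environment variables from the steps above are defined. The following is a minimal sketch and not part of the repository: the helper names `missing_roots` and `extraction_commands` are hypothetical, and only the `dla_34` value of `--det_arch` is taken from the example above. The value for the HG backbone is not stated here, so it is left out.

```python
import os
from typing import List, Mapping

# Environment variables the instructions above ask you to define in .bashrc.
REQUIRED_ROOTS = [
    "CenterNet_ROOT",
    "Pretrained_ROOT",
    "Datasets_ROOT",
    "Estimations_ROOT",
    "Temp_ROOT",
]

# Dataset names as passed to the -d argument.
DATASETS = ["ped1", "ped2", "avenue", "shanghaitech"]


def missing_roots(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_ROOTS if not env.get(name)]


def extraction_commands(det_arch: str = "dla_34") -> List[List[str]]:
    """Build one extract_estimations.py command per dataset (hypothetical helper)."""
    return [
        ["python", "extract_estimations.py", "-d", dataset, "--det_arch", det_arch]
        for dataset in DATASETS
    ]


if __name__ == "__main__":
    missing = missing_roots(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    # Print the commands instead of running them, so they can be reviewed first.
    for command in extraction_commands():
        print(" ".join(command))
```

The commands are only printed, so they can be inspected before being run by hand or passed to `subprocess.run`.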
- To train the generative model from scratch, run the script `train_test_model.py` with arguments specifying the name of the dataset and the values of the hyper-parameters. This first trains the model on the train set of the given dataset and then tests the trained model on the test set of the same dataset. For example:
python train_test_model.py -d shanghaitech
- Once training is complete, the models are saved in `models/YYYY-MM-DD_HHhMMmSSs/`, where `YYYY-MM-DD_HHhMMmSSs` is the time at which the script started. Once testing is complete, the results are saved in `results/train_test_model/YYYY-MM-DD_HHhMMmSSs/`.
- To reduce the resulting data size, increase the value of `frames_step`. For example, `--frames_step 10` takes one frame out of every 10 consecutive frames in each video.
- To increase adversarial training stability, we employ tricks such as `n_inner_epochs` and `change_random_subset`. For example, `--n_inner_epochs 4` alternates between training the discriminator for 4 epochs and training the generator for 4 epochs. `--change_random_subset` first trains the model on the complete train set for 20 epochs and then switches the training data to a randomly selected subset of the complete train set every 20 epochs.
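The options above can be illustrated with a short sketch. This is not the repository's code: the function names are hypothetical, the frame subsampling assumes that sampling starts at the first frame of each video, and the schedule only shows the order in which the two networks would be trained for a given `n_inner_epochs`.

```python
from datetime import datetime
from typing import List, Sequence


def run_directory_name(start: datetime) -> str:
    """Format a start time in the YYYY-MM-DD_HHhMMmSSs pattern."""
    return start.strftime("%Y-%m-%d_%Hh%Mm%Ss")


def subsample_frames(frames: Sequence, frames_step: int) -> list:
    """Keep one frame out of every `frames_step` consecutive frames.

    Assumption: sampling starts at the first frame of the video.
    """
    return list(frames[::frames_step])


def alternating_schedule(n_inner_epochs: int, n_rounds: int) -> List[str]:
    """List which network is trained in each epoch.

    Each round trains the discriminator for `n_inner_epochs` epochs,
    then the generator for `n_inner_epochs` epochs.
    """
    schedule = []
    for _round in range(n_rounds):
        for network in ("discriminator", "generator"):
            schedule.extend([network] * n_inner_epochs)
    return schedule


if __name__ == "__main__":
    print(run_directory_name(datetime(2021, 4, 6, 6, 8, 45)))
    print(subsample_frames(list(range(25)), frames_step=10))
    print(alternating_schedule(n_inner_epochs=4, n_rounds=1))
```

With `frames_step=10`, a 25-frame video is reduced to the frames at indices 0, 10 and 20, which is why larger values shrink the extracted data.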
- To test this without training from scratch, download the pretrained model on ST.
- For ped1, ped2 and avenue, use `n_frames_per_video` to specify the number of shots. For example, to test adaptation performance on ped2, use the following commands for 1-shot, 5-shot and 10-shot respectively:
Shots | Command |
---|---|
1-shot | python train_test_model.py --pretrained_model 2021-04-06_06h08m45s --random_subset_training -d ped2 --n_frames_per_video 1 --batch_size 4 |
5-shot | python train_test_model.py --pretrained_model 2021-04-06_06h08m45s --random_subset_training -d ped2 --n_frames_per_video 5 --batch_size 16 |
10-shot | python train_test_model.py --pretrained_model 2021-04-06_06h08m45s --random_subset_training -d ped2 --n_frames_per_video 10 |
- To only test a pretrained model on a given dataset (e.g. ped2), use the following command:
python train_test_model.py --inference_mode --pretrained_model 2021-04-06_06h08m45s -d ped2
- Note that the minimum MOD detection scores (`min_train_det_score` and `min_test_det_score`), which cause any bounding box with a MOD score lower than `min_train_det_score` or `min_test_det_score` to be ignored, might affect the VAD performance significantly.
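To make the effect of a minimum detection score concrete, here is a minimal sketch of score-based filtering. It is an illustration only: the `filter_detections` helper, the dictionary layout of a detection and the example values are all assumptions, not the data structures used by the repository.

```python
from typing import Dict, List


def filter_detections(detections: List[Dict], min_det_score: float) -> List[Dict]:
    """Drop every bounding box whose detection score is lower than the threshold."""
    return [det for det in detections if det["score"] >= min_det_score]


# Made-up detections, for illustration only: (x1, y1, x2, y2) boxes with scores.
example_detections = [
    {"box": (10, 20, 50, 80), "score": 0.92},
    {"box": (15, 25, 40, 60), "score": 0.35},
    {"box": (100, 40, 140, 120), "score": 0.50},
]

kept = filter_detections(example_detections, min_det_score=0.5)
print(len(kept), "of", len(example_detections), "boxes kept")
```

Raising the threshold removes low-confidence boxes, so fewer objects reach the model. This is one way such a setting could change the detection results noticeably.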
- This code will soon be converted to a recent version of TensorFlow (2.5), which gives a significant boost in the training speed of Keras models.
- More details coming soon.
Please cite our paper if you find this project useful.
@inproceedings{roy2021nlap,
title={Predicting Next Local Appearance for Video Anomaly Detection},
author={Pankaj Raj Roy and Guillaume-Alexandre Bilodeau and Lama Seoud},
booktitle={arXiv preprint arXiv:2106.06059},
year={2021}
}